Shandong Jianzhu University Undergraduate Thesis: Foreign-Language Literature and Translation
Title of source: Cluster Analysis: Basic Concepts and Algorithms
Source:
Publication date:
School: School of Civil Engineering
Major: Civil Engineering
Class:
Name:
Student ID:
Advisor:
Translation date:

Foreign-language literature:

Cluster Analysis: Basic Concepts and Algorithms

Cluster analysis divides data into groups (clusters) that are meaningful, useful, or both. If meaningful groups are the goal, then the clusters should capture the natural structure of the data. In some cases, however, cluster analysis is only a useful starting point for other purposes, such as data summarization. Whether for understanding or utility, cluster analysis has long played an important role in a wide variety of fields: psychology and other social sciences, biology, statistics, pattern recognition, information retrieval, machine learning, and data mining.

There have been many applications of cluster analysis to practical problems. We provide some specific examples, organized by whether the purpose of the clustering is understanding or utility.

Clustering for Understanding

Classes, or conceptually meaningful groups of objects that share common characteristics, play an important role in how people analyze and describe the world. Indeed, human beings are skilled at dividing objects into groups (clustering) and assigning particular objects to these groups (classification). For example, even relatively young children can quickly label the objects in a photograph as buildings, vehicles, people, animals, plants, etc. In the context of understanding data, clusters are potential classes, and cluster analysis is the study of techniques for automatically finding classes. The following are some examples:

• Biology. Biologists have spent many years creating a taxonomy (hierarchical classification) of all living things: kingdom, phylum, class, order, family, genus, and species. Thus, it is perhaps not surprising that much of the early work in cluster analysis sought to create a discipline of mathematical taxonomy that could automatically find such classification structures. More recently, biologists have applied clustering to analyze the large amounts of genetic information that are now available. For example, clustering has been used to find groups of genes that have similar functions.

• Information Retrieval. The World Wide Web consists of billions of Web pages, and the results of a query to a search engine can return thousands of pages. Clustering can be used to group these search results into a small number of clusters, each of which captures a particular aspect of the query. For instance, a query of “movie” might return Web pages grouped into categories such as reviews, trailers, stars, and theaters. Each category (cluster) can be broken into subcategories (sub-clusters), producing a hierarchical structure that further assists a user’s exploration of the query results.

• Climate. Understanding the Earth’s climate requires finding patterns in the atmosphere and ocean. To that end, cluster analysis has been applied to find patterns in the atmospheric pressure of polar regions and areas of the ocean that have a significant impact on land climate.

• Psychology and Medicine. An illness or condition frequently has a number of variations, and cluster analysis can be used to identify these different subcategories. For example, clustering has been used to identify different types of depression. Cluster analysis can also be used to detect patterns in the spatial or temporal distribution of a disease.

• Business. Businesses collect large amounts of information on current and potential customers. Clustering can be used to segment customers into a small number of groups for additional analysis and marketing activities.

Clustering for Utility

Cluster analysis provides an abstraction from individual data objects to the clusters in which those data objects reside. Additionally, some clustering techniques characterize each cluster in terms of a cluster prototype; i.e., a data object that is representative of the other objects in the cluster. These cluster prototypes can be used as the basis for a number of data analysis or data processing techniques. Therefore, in the context of utility, cluster analysis is the study of techniques for finding the most representative cluster prototypes.

• Summarization. Many data analysis techniques, such as regression or PCA, have a time or space complexity of O(m^2) or higher (where m is the number of objects), and thus are not practical for large data sets. However, instead of applying the algorithm to the entire data set, it can be applied to a reduced data set consisting only of cluster prototypes. Depending on the type of analysis, the number of prototypes, and the accuracy with which the prototypes represent the data, the results can be comparable to those that would have been obtained if all the data could have been used.

• Compression. Cluster prototypes can also be used for data compression. In particular, a table is created that consists of the prototypes for each cluster; i.e., each prototype is assigned an integer value that is its position (index) in the table. Each object is represented by the index of the prototype associated with its cluster. This type of compression is known as vector quantization and is often applied to image, sound, and video data, where (1) many of the data objects are highly similar to one another, (2) some loss of information is acceptable, and (3) a substantial reduction in the data size is desired.

• Efficiently Finding Nearest Neighbors. Finding nearest neighbors can require computing the pairwise distance between all points. Often clusters and their cluster prototypes can be found much more efficiently. If objects are relatively close to the prototype of their cluster, then we can use the prototypes to reduce the number of distance computations that are necessary to find