On the convergence of spectral clustering on rando

On the Convergence of Spectral Clustering on Random Samples: the Normalized Case

Ulrike von Luxburg1, Olivier Bousquet1, and Mikhail Belkin2

1 Max Planck Institute for Biological Cybernetics, Tübingen, Germany
{ulrike.luxburg,olivier.bousquet}@tuebingen.mpg.de
2 The University of Chicago, Department of Computer Science
misha@cs.uchicago.edu

Abstract. Given a set of $n$ randomly drawn sample points, spectral clustering in its simplest form uses the second eigenvector of the graph Laplacian matrix, constructed on the similarity graph between the sample points, to obtain a partition of the sample. We are interested in the question of how spectral clustering behaves for growing sample size $n$. In case one uses the normalized graph Laplacian, we show that spectral clustering usually converges to an intuitively appealing limit partition of the data space. We argue that in case of the unnormalized graph Laplacian, equally strong convergence results are difficult to obtain.

1 Introduction

Clustering is a widely used technique in machine learning. Given a set of data points, one is interested in partitioning the data based on a certain similarity among the data points. If we assume that the data is drawn from some underlying probability distribution, which often seems to be the natural mathematical framework, the goal becomes to partition the probability space into certain regions with high similarity among points. In this setting the problem of clustering is two-fold:

– Assuming that the underlying probability distribution is known, what is a desirable clustering of the data space?
– Given finitely many data points sampled from an unknown probability distribution, how can we reconstruct that optimal partition empirically on the finite sample?

Interestingly, while extensive literature exists on clustering and partitioning, to the best of our knowledge very few algorithms have been analyzed or shown to converge for increasing sample size. Some exceptions are the k-means algorithm (cf. Pollard, 1981), the single linkage algorithm (cf. Hartigan, 1981), and the clustering algorithm suggested by Niyogi and Karmarkar (2000). The goal of this paper is to investigate the limit behavior of a class of spectral clustering algorithms.

Spectral clustering is a popular technique going back to Donath and Hoffman (1973) and Fiedler (1973). It has been used for load balancing (Van Driessche and Roose, 1995), parallel computations (Hendrickson and Leland, 1995), and VLSI design (Hagen and Kahng, 1992). Recently, Laplacian-based clustering algorithms have found success in applications to image segmentation (cf. Shi and Malik, 2000). Methods based on graph Laplacians have also been used for other problems in machine learning, including semi-supervised learning (cf. Belkin and Niyogi, to appear; Zhu et al., 2003). While theoretical properties of spectral clustering have been studied (e.g., Guattery and Miller (1998), Weiss (1999), Kannan et al. (2000), Meila and Shi (2001); also see Chung (1997) for a comprehensive theoretical treatment of spectral graph theory), we do not know of any results discussing the convergence of spectral clustering or the spectra of graph Laplacians for increasing sample size. For kernel matrices, however, the convergence of the eigenvalues and eigenvectors has already attracted some attention (cf. Williams and Seeger, 2000; Shawe-Taylor et al., 2002; Bengio et al., 2003).

2 Background and notations

Let $(\mathcal{X}, \mathrm{dist})$ be a metric space, $\mathcal{B}$ the Borel $\sigma$-algebra on $\mathcal{X}$, $P$ a probability measure on $(\mathcal{X}, \mathcal{B})$, and $L_2(P) := L_2(\mathcal{X}, \mathcal{B}, P)$ the space of square-integrable functions. Let $k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ be a measurable, symmetric, non-negative function that computes the similarity between points in $\mathcal{X}$. For given sample points $X_1, \dots, X_n$ drawn iid according to the (unknown) distribution $P$ we denote the empirical distribution by $P_n$. We define the similarity matrix as $K_n := (k(X_i, X_j))_{i,j=1,\dots,n}$ and the degree matrix $D_n$ as the diagonal matrix with diagonal entries $d_i := \sum_{j=1}^{n} k(X_i, X_j)$. The unnormalized discrete Laplacian matrix is defined as $L_n := D_n - K_n$. For symmetric and non-negative $k$, $L_n$ is a positive semi-definite linear operator on $\mathbb{R}^n$. Let $a = (a_1, \dots, a_n)$ be the second eigenvector of $L_n$. Here, "second eigenvector" refers to the eigenvector belonging to the second smallest eigenvalue, where the eigenvalues $\lambda_1 \le \lambda_2 \le \dots \le \lambda_n$ are counted with multiplicity. In a nutshell, spectral clustering in its simplest form partitions the sample points $(X_i)_i$ into two (or several) groups by thresholding the second eigenvector of $L_n$: point $X_i$ belongs to cluster 1 if $a_i > b$, and to cluster 2 otherwise, where $b \in \mathbb{R}$ is some appropriate constant. An intuitive explanation of why this works is discussed in Section 4.

Often, spectral clustering is also performed with a normalized version of the matrix $L_n$. Two common ways of normalizing are $L'_n := D_n^{-1/2} L_n D_n^{-1/2}$ or $L''_n := D_n^{-1} L_n$. The eigenvalues and eigenvectors of both matrices are closely related. Define the normalized similarity matrices $H'_n := D_n^{-1/2} K_n D_n^{-1/2}$ and $H''_n := D_n^{-1} K_n$. It can be seen by multiplying the eigenvalue equation $L'_n v = \lambda v$ from the left with $D_n^{-1/2}$ that $v \in \mathbb{R}^n$ is an eigenvector of $L'_n$ with eigenvalue $\lambda$ iff $D_n^{-1/2} v$ is an eigenvector of $L''_n$ with eigenvalue $\lambda$. Furthermore, rearranging the eigenvalue equations for $L'_n$ and $L''_n$ shows that $v \in \mathbb{R}^n$ is an eigenvector of $L'_n$ with eigenvalue $\lambda$ iff $v$ is an eigenvector of $H'_n$ with eigenvalue $(1 - \lambda)$, and that $v \in \mathbb{R}^n$ is an eigenvector of $L''_n$ with eigenvalue $\lambda$ iff $v$ is an eigenvector of $H''_n$ with eigenvalue $(1 - \lambda)$. Thus, properties about the spectrum of one of the matrices $L'_n$, $L''_n$, $H'_n$, or $H''_n$ can be reformulated for the three other matrices as well.

In the following we want to recall some definitions and facts from perturbation theory for bounded operators. The standard reference for general perturbation theory is Kato (1966), for perturbation theory in Hilbert spaces we also rec
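
The constructions of Section 2 (similarity matrix, degree matrix, the Laplacians, and thresholding of the second eigenvector) can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the Gaussian similarity function, its bandwidth `sigma`, the threshold `b = 0`, the synthetic two-blob sample, and all function names are illustrative choices that do not come from the paper, which only requires $k$ to be measurable, symmetric, and non-negative. The sketch thresholds the second eigenvector of the normalized matrix $L''_n$, obtained from that of $L'_n$ through the relation stated above.

```python
import numpy as np

def gaussian_similarity(X, sigma=1.0):
    # Assumed similarity k(x, y) = exp(-|x - y|^2 / (2 sigma^2)); any
    # symmetric, non-negative k would fit the setting of Section 2.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def laplacians(K):
    # Degrees d_i = sum_j k(X_i, X_j), then L_n = D_n - K_n.
    d = K.sum(axis=1)
    L = np.diag(d) - K
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L_sym = D_inv_sqrt @ L @ D_inv_sqrt   # L'_n  = D^{-1/2} L D^{-1/2}
    L_rw = np.diag(1.0 / d) @ L           # L''_n = D^{-1} L
    return L, L_sym, L_rw, d

def spectral_bipartition(X, sigma=1.0, b=0.0):
    K = gaussian_similarity(X, sigma)
    _, L_sym, L_rw, d = laplacians(K)
    # eigh returns eigenvalues in ascending order, so column 1 belongs
    # to the second smallest eigenvalue of the symmetric matrix L'_n.
    eigvals, eigvecs = np.linalg.eigh(L_sym)
    lam, v = eigvals[1], eigvecs[:, 1]
    # D^{-1/2} v is an eigenvector of L''_n with the same eigenvalue.
    a = v / np.sqrt(d)
    # Cluster 1 if a_i > b, cluster 2 otherwise. The sign of an
    # eigenvector is arbitrary, so which group gets label 1 is too.
    labels = np.where(a > b, 1, 2)
    return labels, lam, a, L_rw

# Synthetic sample: two well-separated blobs of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[-2.0, 0.0], scale=0.3, size=(20, 2)),
    rng.normal(loc=[2.0, 0.0], scale=0.3, size=(20, 2)),
])
labels, lam, a, L_rw = spectral_bipartition(X)
print(labels)
```

Working with the symmetric matrix $L'_n$ allows the use of a symmetric eigensolver, and the rescaling by $D_n^{-1/2}$ then recovers the corresponding eigenvector of $L''_n$ without a separate non-symmetric eigendecomposition.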
