您好,欢迎访问三七文档
当前位置:首页 > 商业/管理/HR > 营销创新 > 基于信息瓶颈的社区发现
31420084CHINESEJOURNALOFCOMPUTERSVol.31No.4Apr.2008:2007212210.(2004CB318109,2007CB311100)IST20072Web210(FY072RES2THEME2067).,,1982,,.E2mail:shenhuawei@software.ict.ac.cn.,,1971,,,P2P.,,1979,,.,,1971,,,Web.1),2)1)1),2)1)1)(100080)2)(100049),.,,.,.,.,,,,,,),2)CHENGXue2Qi1)CHENHai2Qiang1),2)LIUYue1)1)(InstituteofComputingTechnology,ChineseAcademyofSciences,Beijing100080)2)(GraduateUniversityofChineseAcademyofSciences,Beijing100049)AbstractThispaperproposesaprojectionmethodtotransformaunipartitenetworkintoabi2partitenetwork.Astotheobtainedbipartitenetwork,itpresentsaninformation2bottleneck2basedmethodforcommunitydetectionundertheinformation2theoreticframework.Thismethoddetectsthecommunitystructureofnetworksbyfindinganefficientcompressionofthenetwork.Theefficientcompressionholdstheregularityoftheoriginalnetworkasmanyaspossible.Appli2cationsonthecomputer2generatednetworksandmanyreal2worldnetworksdemonstratethatthismethodisveryeffectiveatcommunitydetectionofnetworks.Asforthecommunitydetectionofdirectednetworks,existingmethodsneglectthedirectionofedgesandtreatthemasundirectednetworks.Theinformationprovidedbydirectionalityislostinthisprocess.Usingtheprojectionmethodproposedinthispaper,thedirectionofedgescanberetainedandthusthemethodismoresuitabletodetectthecommunitystructureindirectednetworks,suchastheworld2wide2web.Andsomenewknowledgecanbefoundbythemethod.Inaddition,themethodcanbedirectlyap2pliedtothedetectionofcommunitystructureinbipartitenetworks,whicharecommoninrealworld.Keywordscommunitydetection;informationbottleneck;modularity1,,Internet[122].,,,,.,[3].,..,[4].,:,.,,.,.:(1);(2).,,NN,.,.,,.,,NC(),,.,,.,.,Kernighan2Lin[5],O(n3).,,,..,,;,..,,[6][7][8][9][10][11],(modulari2ty)[11].Modularity[7]Newman,.Modularity(Q)Q=i(eii-a2i)(1),eijij;ai=jeiji.Modularity:,,,Q.,Q013017.modularityQ,,Q.Stirling[11],.Q,[12][13][14]Q,8762008.,modularity,.,,,[15].,[15216],,[16],.,.,.,;,,,.,,,.,.2;3,,;4,;5,,.2,.Tishby[17].:XYp(x,y),X,I(X,Y).,XYp(x,y)p(x)p(y),X(Y)Y(X).XC,I(C,Y)FI(X,Y).,XY.XCp(c|x),p(c|x)XxCc.,xCc,,xc,.,L[p(c|x)]=I(C,X)-I(C,Y)(2),,CY.p(x,y),(2).3:1Cp(c);2XCp(c|x);3p(y|c),CY.:p(c|x)=p(c)Z(,x)exp(-DKL[p(y|x)p(y|c)])p(y|c)=1p(c)xp(c|x)p(x)p(y|x)p(c)=xp(c|x)p(x)(3),Z(,x),,DKL[p(y|x)p(y|c)]p(y|x)p(y|c)Kullback2Leibler,XCY.(3),.,,xc.,(2)2,1,(3)p(c|x)=1,xc0,p(y|c)=1p(c)xcp(x,y)p(c)=xcp(x)(4)(4)[18].,x,X.,9764:,,,I(C,Y).cicj,c3,(5).p(c3|x)=1,xcixcj0,p(y|c3)=p(ci)p(c3)p(y|ci)+p(cj)p(c3)p(y|cj)p(c3)=p(ci)+p(cj)(5),I(ci,cj)I(Cbefore;Y)-I(Cafter;Y)(6),I(ci,cj)(p(ci)+p(cj))DJS[p(y|ci),p(y|cj)](7),DJSJensen2Shannon(JS),DJS[pi,pj]=iDKL[pi^p]+jDKL[pj^p](8),{pi,pj}{p(y|ci),p(y|cj)}{i,j}p(ci)p(c3),p(cj)p(c3)^p=ip(y|ci)+jp(y|cj)(9)JS,0,1,.,JS.33.1G=(V,E),B.:(1)Gvi,Buiwi;(2)G(vi,vj)E,B(ui,wj)(uj,wi);(3)(ui,wj)(uj,wi)(vi,vj).1(a).G=(V,E),B.:(1)Gvi,Buiwi;(2)Gvi,vjE,Bui,wj,;(3)ui,wjvi,vj.1(b).13.2,M,,.Mij(ui,wj)ui,wj,uiwj,0.X,Y,XYM.MTW,TWM,(TotalWeight).,.,.,.,.,.,.,,.1,12,112,1608620081379,712461012,.,.,,(16)(712);,,(13,79)(46,1012).,.1OutIn137946101216YesNo712NoYes.X,XXYI(X,Y),XY..,,,.:,,,,..dendrogram,(2),..mod2ularity,modularity.1.1..:::1.,XYp(x,y).2.CX.3.i,j=1X,ij,dij(p(ci)+p(cj))DJS[p(y|ci),p(y|cj)]:Form=X-11diji,jci,cjc3Cci,cj,c3Cc3dijmodularity,End:1.modularity2..2Zachary,(dendrogram)dij,ci,cj,dij,cicj.,..N,M.,1864:dijO(N),ci,cjdij,O(MN).,N-1,,dijO(M),O(N),dijc3dij,O(N2),modularityO(N2)..,O(N3).,.,5000,.,,.4.4.1,,,.41111[7].128,4.,pin,pout.poutzout,pinztot16.zout0,,.4.1.2,,.,CF,CR,C3R,CFC3R.,FVIC(FractionofVerticesIdentifiedCorrectly).zout,FVIC,FVICzout.,,.,4,4,,,.(NMI)[4].N,,.NNijij.iNi.,jN.j,NS.CR,CF.,FR:NMI(R,F)=-2CRi=1CFj=1NijlogNijSNi.N.jCRi=1Ni.logNi.S+CFj=1N.jlogN.jS,,NMI1.,NMI0.,FVICNMI.4.1.3,zout08,zout,,100,,FVICNMI,,zoutFVICNMI.3,,zout6,95%,NMI0190;zout7,80%;zout8,,.,,NewmanFast[11],NewmanFast.3,FVICNMI2862008,,New2manFast.,,.3FVICNMIzout(NFNew2manFast[11],IB)4.2,,,,.,,;5000,.41211Zachary()4ZacharyZachary(4).,34.ZacharyWayne,34,.,,,.,,.4,.2Zachary.dendrogram,,modularity.5,modularity01392,,modularity.,,.2,,2,,3,2,,3.,2,modularity01360,.41212(WordAssociationNetwork)(UniversityofSouthFlorida),.5019,6000.,,.755019.,.A,B,#GA,#PBA,ABFSG=#P#G,B,AB(BSG)BA.5019,,,3864:TheUniversityofSouthFloridawordassociation,rhyme,andwordfragmentnorms.(BA)AB(BA).50195019,(01025),.5019.18(5(a)),modularity01423,.(5(a)),,,.,3,17().5(b),,modularity01691,.,17,..5(c),.,.,power2law[122].,5019,,power2law,power2law.,,,.5,,,.,.,.,,.,4862008.,,.,,.,,,.,,.,,.,,,.,,,,.,..P2P,,.,![1]AlbertR,BarabÀsiA2L.Statisticalmechanicsofcomplexnetworks.ReviewsofModernPhysics,2002,74(1):47297[2]NewmanMEJ.Thestructureandfunctionofcomplexnet2works.SIAMReview,2003,45(2):1672256[3]NewmanMEJ.Detectingcommunitystructureinnetworks.EuropeanPhysicalJournalB,2004,38(2):3212330[4]DanonL,Daz2GuileraA,DuchJ,ArenasA.Comparingcommunitystructureidentification.JournalofStatisticalMe2chanics,2005,P09008[5]KernighanBW,LinS.Anefficientheu
本文标题:基于信息瓶颈的社区发现
链接地址:https://www.777doc.com/doc-658743 .html