您好,欢迎访问三七文档
arXiv:0711.0491v1[physics.soc-ph]4Nov2007CommunityDetectioninComplexNetworksUsingGeneticAlgorithmsMurselTasgin,1AmacHerdagdelen,1andHalukBingol11DepartmentofComputerEngineeringBogaziciUniversity,Turkey(Dated:February2,2008)Communitydetectionisanimportantresearchtopicincomplexnetworks.Wepresenttheem-ploymentofageneticalgorithmtodetectcommunitiesincomplexnetworkswhichisbasedonoptimizingnetworkmodularity.Itdoesnotneedanypriorknowledgeaboutthenumberofcommu-nities.Itsperformanceistestedontworeallifenetworkswithknowncommunitystructuresandasetofsyntheticnetworks.Astheperformancemeasureaninformationtheoreticalmetric,variationofinformation,isused.Theresultsarepromisingandinsomecasesbetterthanpreviouslyreportedstudies.I.INTRODUCTIONCommunitystructuredetectionisoneofthehottopicsthathavecreatedagreatinterestincomplexnetworkstudies.Acommunityislooselydefinedasagroupofverticeswithahighdensityofin-groupandalowdensityofout-groupedgesandtheiridentificationincomplexnetworkscallsfortechniquesborrowedfromphysicsandcomputersciences[1,2,3].Differentmethodsandalgorithmshavebeenproposedtorevealtheunderlyingcommunitystructureincom-plexnetworks.Acrucialpartofthealgorithmsishowtheydefineacommunity[4].Therearedifferentformaldefinitionsofacommunity[5].Inthispaper,weuseaquantitativedefinitionproposedbyGirvanandNewmanwhichmakesuseofameasurecallednetworkmodular-ity[6].ThenetworkmodularityQisdefinedasQ=Xi(eii−a2i)(1)wheretheindexirunsoverallcommunities,eiiisthefractionofedgesthatconnecttwonodeswithingroupi,whileaiisthefractionofedgesthathaveatleastoneendpointwithinthegroup.Someoftherecentcom-munitydetectionalgorithmslikeNewman’sfastalgo-rithmfordetectingcommunities,thealgorithmforverylargenetworks,andthealgorithmusingExtremalOp-timizationusethenetworkmodularityasqualitymet-ric[2,4,7].Calculationofthenetworkmodularityislesstimeconsumingthantheedgebetweennesscentral-ityusedinGirvan-Newman(GN)algorithm[6].Inthispaper,weproposeanewcommunitydetectionalgorithmwhichtriestooptimizethenetworkmodular-itybyemployinggeneticalgorithms.Unlikethepreviousmethods,thenewalgorithmdoesnotrequirethenumberofcommunitiespresentinagraph.Thenumberofcom-munitiescomesasanemergentresultasthemodularityvalueisoptimized.II.BACKGROUNDOuralgorithmisbasedontheoptimizationofnetworkmodularityQbyemployingageneticalgorithm.Thealgorithmproducesapartitiononthesetofverticesofgraph.Successofacommunitydetectionalgorithmisdefinedastheclosenessofthepartitiongeneratedbythealgorithmtothepartitioncorrespondingtotherealcom-munitystructure.Thereforebeforedescribingthealgo-rithm,letusintroducetheterminologyonpartitionsthatwewillusethroughoutthispaper.A.RepresentationWearegivenanundirectedgraphG(V,E)with|V|=nverticesand|E|=eedges.AssumingafixedorderingoftheverticesV={v1,v2,···,vn},anypartitionΩofVcanberepresentedbyavectorκ=κ1κ2···κnofndimensionswhereκi∈{1,2,···,|Ω|}istheindexoftheclusterofthevertexvi.Theverticesviandvjareinthesameclusterifandonlyifκi=κj.Notethatthisisnotacanonicalrepresentation,meaningthattherearemanydifferentvectorscorrespondingtothesamepartition.Anexamplewouldbehelpful:LetV={v1,v2,v3,v4}bethesetofverticesinthegivenorderandconsidertheparti-tionΩ={{v1,v3},{v2},{v4}}.Then,althoughtheyaredifferent,thevectorsκ1=[1213],κ2=[4243]andκ3=[2124]representthepartition.LetΓbethepartitionofVthatcorrespondstherealcommunitystructureoftheunderlyingthegraph.Anel-ementγofΓiscalledacommunity.Intotal,thereare|Γ|elementscorrespondingto|Γ|communities.ThepurposeofthecommunitydetectionistobuildanestimationofthepartitionΓbasedonthetopologyofthegraph.LetΩbethepartitionthatrepresentsourestimatedcom-munitystructure.AnelementωofΩiscalledacluster.Notethatthenumberofclusters|Ω|doesnothavetobeequaltothenumberofcommunities|Γ|.Inthemostgeneralcase,theoutputofouralgorithmwillbeavectorκoflengthnwhichcorrespondstoourestimatedpartitionΩofthevertexset.Weassumethatthenumberofcommunities|Γ|isunknownbuthastobe2estimatedbythealgorithm.B.DistanceMetricforPartitionsTheoutputofouralgorithmisapartitionΩoftheverticesinthegraph.Theevaluationofaresultingclus-teringΩandagivencommunitystructureΓofanetworkisnotstraightforwardbecauseitisnotalwaysclearwhichclustercorrespondstowhichcommunityandhowtodealwithmixedclusterswhichcontainmembersoftwoormorecommunities.Eventhenumberofclusters|Ω|andthenumberofcommunities|Γ|maydiffer.Twoimportantissuesregardingtheevaluationofaclusteringalgorithmistheaccuracyandtheprecisionofthealgorithm.Accuracyisameasureofthesuccessofanalgorithminclusteringthemembersofthesamecommunitytogetherwithoutanyseparation(i.e.intra-clusterscatter).Precisionisameasureofthesuccessofanalgorithmincreatinghomogeneousclusterswhichcontainthemembersofthesamecommunities(i.e.inter-clusterscatter).Inordertounderstandtheseconcepts,considertwoextremecases.Thepartitionofsingletons,whereeachvertexisaclusterbyitself,thatis|Ω|=n,isveryprecisesincenoclustercontainselementsofmorethenonecommunities.Ontheotherhanditisveryinaccuratesincetheelementsofanycommunityarescatteredintomanyclusters.Thesecondextremeisthecasewherethepartitioniscomposedofsingl
本文标题:Community-Detection-in-Complex-Networks-Using-Gene
链接地址:https://www.777doc.com/doc-4086197 .html