A New Validity Function for Fuzzy Clustering

Yang Li, Fusheng Yu*
School of Mathematical Sciences, Beijing Normal University
Laboratory of Mathematics and Complex Systems, Ministry of Education
Beijing 100875, The People's Republic of China
e-mail: liyangbnu@mail.bnu.edu.cn, yufusheng@263.net
*Corresponding author

Abstract: This paper first gives a new validity function for fuzzy clustering, then presents a method for optimally selecting the cluster number in the standard fuzzy c-means clustering algorithm, and finally outlines a fuzzy c-means clustering algorithm with self-adapted parameters. Experimental results on a synthetic data set and on a data set with an actual background illustrate the performance of the new validity function and the corresponding fuzzy clustering algorithm.

Keywords: Fuzzy C-Means; fuzzy clustering analysis; cluster number; clustering validity function

I. INTRODUCTION

The Fuzzy C-Means (FCM) clustering algorithm can find a structure in a given data set for a cluster number that must be given in advance. This means that the structure found depends on the given cluster number. Does the structure found accord with the one that the data set exhibits? This leads to the issue of evaluating fuzzy clustering. To address it, a validity function is usually defined to find an optimal cluster number. How to define a good validity function has thus become an important topic that attracts increasing attention from researchers.

Many different validity functions already exist [1,2,3,6]. Some are defined by means of the partition matrix alone, for example the Partition Index (PI) and the Partition Entropy (PE), while others are defined by means of both the partition matrix and the data, for example the Xie and Beni index [4,7]. In spite of their different formulations, they mainly take account of the inter-cluster distances and (or) the intra-cluster distances. A good partition should satisfy two requirements: (a) divergence: the inter-cluster distances should be as large as possible; (b) compactness: the intra-cluster distances should be as small as possible. The ratio of the compactness to the divergence can serve as the criterion of clustering validity. The Xie and Beni index is such a clustering validity function [7].

In this paper, we propose a new validity function that adopts a new manner of constructing the ratio of compactness and divergence. The examples and experiments presented in this paper illustrate the good performance of the new validity function for fuzzy clustering.

The rest of the paper is arranged as follows. Section II presents the necessary preliminaries of our study. Section III gives the new validity function and the corresponding self-adapting FCM algorithm; an example is also presented there. In Section IV, we carry out two experiments to show the performance of the new validity function.

II. PRELIMINARIES

In this section, we briefly introduce the FCM algorithm and list the main existing validity functions in the literature.

A. FCM Algorithm

Dunn extended Hard C-Means (HCM) to the fuzzy case, based on the fuzzy partition of a set defined by Ruspini. Bezdek extended it to a more general setting and gave a general description of fuzzy clustering [5]. The fuzzy clustering problem can be formulated as the following mathematical programming problem [9]:

$$\min J_m(U,V) = \sum_{i=1}^{c}\sum_{j=1}^{n} u_{ij}^{m}\, d_{ij}^{2}$$

$$\sum_{i=1}^{c} u_{ij} = 1, \qquad 1 \le j \le n$$

where $m$ ($m > 1$) is the fuzzy weighting exponent, $c$ is the number of clusters to be explored, $n$ is the cardinality of the data set $X$, and $d_{ij}$ is the distance between pattern $x_j$ and prototype $v_i$. Bezdek gave an iterative algorithm to obtain the optimal solution of the above mathematical programming problem. The iterative formulae in the algorithm are:

$$u_{ij}^{(k)} = \frac{1}{\sum_{r=1}^{c}\left(d_{ij}^{(k)} / d_{rj}^{(k)}\right)^{2/(m-1)}}, \qquad v_i^{(k+1)} = \frac{\sum_{j=1}^{n}\left(u_{ij}^{(k)}\right)^{m} x_j}{\sum_{j=1}^{n}\left(u_{ij}^{(k)}\right)^{m}}$$

This algorithm is convergent, and at the solution point (i.e. the partition matrix $U$ and the prototypes $V$) the objective function attains a local minimum.

B. Main Validity Functions

The commonly used validity functions are the Partition Index (PI) and the Partition Entropy (PE), respectively defined by the following formulas [5]:

$$PI = \frac{1}{n}\sum_{i=1}^{c}\sum_{j=1}^{n} u_{ij}^{2}, \qquad PE = -\frac{1}{n}\sum_{i=1}^{c}\sum_{j=1}^{n} u_{ij}\ln(u_{ij})$$

Both indexes, being defined by the partition matrix alone, tend to be monotonic, which limits their applications. Xie and Beni defined a validity function called the Xie-Beni index [7] as follows:

$$L_{XB} = \frac{\sum_{i=1}^{c}\sum_{j=1}^{n} u_{ij}^{2}\,\|x_j - v_i\|^{2}}{n \cdot \min_{i \ne k}\|v_i - v_k\|^{2}}$$

where $x_j$ ($j = 1, 2, \ldots, n$) is a pattern in the data set $X$, $v_i$ ($i = 1, 2, \ldots, c$) is the prototype of the $i$-th cluster, and $\|\cdot\|$ stands for a kind of distance measure.

III. A NEW VALIDITY FUNCTION AND CORRESPONDING SELF-ADAPTING FCM

In this section, we present a new validity function and the corresponding algorithm. Finally, we use four groups of synthetic data to test the algorithm.

A. The New Validity Function

A good partition should satisfy two requirements: (a) divergence: the inter-cluster distances should be as large as possible; (b) compactness: the intra-cluster distances should be as small as possible. The ratio of the compactness to the divergence can serve as the criterion of clustering validity. According to this guideline and the F-statistic, we construct the following new validity function:

$$L(c) = \frac{\sum_{i=1}^{c}\sum_{j=1}^{n} u_{ij}^{m}\,\|v_i - \bar{x}\|^{2} \,/\, (c-1)}{\sum_{i=1}^{c}\sum_{j=1}^{n} u_{ij}^{m}\,\|x_j - v_i\|^{2} \,/\, (n-c)}$$

where $m$ ($m > 1$) is the fuzzy weighting exponent, $c$ is the number of clusters to be explored, $n$ is the cardinality of the data set $X$, $x_j$ ($j = 1, 2, \ldots, n$) is a pattern in the data set $X$, $v_i$ ($i = 1, 2, \ldots, c$) is the prototype of the $i$-th cluster, and $\|\cdot\|$ stands for a kind of distance measure. $\bar{x}$ is the central vector of the overall data:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{c}\sum_{j=1}^{n} u_{ij}^{m}\, x_j$$

The numerator of $L(c)$ denotes the sum of the distances between classes, and the denominator of $L(c)$ denotes the sum of the intra-cluster distances of all the clusters. So the bigger $L(c)$ is, the more reliable the clustering result is. The cluster number $c$ is the best one when $L(c)$ reaches its maximum value.

B. Algorithm

Now, w
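As a rough illustration of how the FCM update formulae of Section II and the validity function L(c) of Section III.A fit together, the following sketch runs standard FCM for each candidate cluster number and keeps the one with the largest L(c). It is not the authors' self-adapting algorithm, whose description is cut off in this excerpt. Everything beyond the formulas is an assumption made for the example: the squared Euclidean distance, the deterministic initialization from evenly spaced patterns, the function names (`fcm`, `validity_L`, `select_c`, `demo_data`), the stopping rule, and the toy data.

```python
def sqdist(a, b):
    """Squared Euclidean distance (assumed; the paper leaves the measure open)."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def memberships(X, V, m):
    """Membership update: u_ij = 1 / sum_r (d_ij / d_rj)^(2/(m-1))."""
    U = [[0.0] * len(X) for _ in V]
    for j, x in enumerate(X):
        d2 = [max(sqdist(x, v), 1e-12) for v in V]  # clamp avoids division by zero
        for i in range(len(V)):
            # (d_ij/d_rj)^(2/(m-1)) equals (d_ij^2/d_rj^2)^(1/(m-1))
            U[i][j] = 1.0 / sum((d2[i] / d2[r]) ** (1.0 / (m - 1.0))
                                for r in range(len(V)))
    return U


def fcm(X, c, m=2.0, max_iter=200, tol=1e-10):
    """Standard FCM iteration; returns (U, V) with U[i][j] = u_ij."""
    n, dim = len(X), len(X[0])
    # Assumed deterministic start: c evenly spaced patterns as prototypes.
    V = [list(X[k * n // c]) for k in range(c)]
    for _ in range(max_iter):
        U = memberships(X, V, m)
        new_V = []
        for i in range(c):
            w = [u ** m for u in U[i]]
            total = sum(w)
            new_V.append([sum(wj * x[k] for wj, x in zip(w, X)) / total
                          for k in range(dim)])
        shift = max(sqdist(a, b) for a, b in zip(V, new_V))
        V = new_V
        if shift < tol:  # assumed stopping rule: prototypes stopped moving
            break
    return memberships(X, V, m), V


def validity_L(X, U, V, m=2.0):
    """L(c): between-cluster term / (c-1) over within-cluster term / (n-c)."""
    n, c, dim = len(X), len(V), len(X[0])
    pairs = [(i, j) for i in range(c) for j in range(n)]
    # Central vector x_bar = (1/n) * sum_i sum_j u_ij^m x_j, as defined in the text.
    x_bar = [sum(U[i][j] ** m * X[j][k] for i, j in pairs) / n for k in range(dim)]
    between = sum(U[i][j] ** m * sqdist(V[i], x_bar) for i, j in pairs) / (c - 1)
    within = sum(U[i][j] ** m * sqdist(X[j], V[i]) for i, j in pairs) / (n - c)
    return between / within


def select_c(X, c_max, m=2.0):
    """Evaluate L(c) for c = 2..c_max and return (argmax, all scores)."""
    scores = {}
    for c in range(2, c_max + 1):
        U, V = fcm(X, c, m)
        scores[c] = validity_L(X, U, V, m)
    return max(scores, key=scores.get), scores


def demo_data():
    """Toy data (assumed): three tight groups of eight 2-D points each."""
    offsets = [(-0.2, 0.0), (0.2, 0.0), (0.0, -0.2), (0.0, 0.2),
               (-0.1, -0.1), (0.1, 0.1), (-0.1, 0.1), (0.1, -0.1)]
    centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    return [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]


best_c, scores = select_c(demo_data(), c_max=5)
print("selected c:", best_c)
for c, value in sorted(scores.items()):
    print(c, value)
```

Because FCM only reaches a local minimum, the value of L(c) for a given c can depend on the initial prototypes; a more careful experiment would probably restart FCM from several initializations per candidate c before comparing the scores.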