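K-means Algorithm Based on Minimum Deviation Initialized Clustering Centers

XIE Juan-ying, WANG Yan-e
(School of Computer Science, Shaanxi Normal University, Xi'an 710062, China)

Computer Engineering, Vol. 40, No. 8, August 2014
Article ID: 1000-3428(2014)08-0205-07    Document code: A    CLC number: TP18
DOI: 10.3969/j.issn.1000-3428.2014.08.039
Received: 2013-05-10    Revised: 2013-08-12    E-mail: xiejuany@snnu.edu.cn
Grant numbers: 31372250; 2013K12-03-24; GK201102007

[Abstract] The clustering of the traditional K-means algorithm depends on randomly chosen seeds, and the clustering of improved K-means algorithms is unstable because their parameters are selected arbitrarily. To overcome both deficiencies, a novel K-means clustering algorithm is proposed in this paper. The new algorithm uses the pattern information of the exemplars in a data set and computes the deviation of each sample. It relies on the well-known principle that the deviation of a sample reflects how densely exemplars are gathered around it: the smaller the deviation, the more densely exemplars are gathered around that sample. The proposed algorithm chooses the first K samples that have minimum deviation and are far away from each other as the initial cluster centers. It is tested on UCI data sets and on synthetic data sets with different proportions of noise. The experimental results demonstrate that the proposed algorithm not only achieves a very promising and stable clustering, but is also robust to noise.

[Keywords] clustering; K-means algorithm; deviation; intensive degree; initialized clustering centers

1, 2
Sections 1 and 2 introduce clustering [1-2] and review the K-means algorithm and related work [3-18], including Forgy's method [5] and RPCL-based K-means [18].

3.1
The basic idea, citing [19], is the one stated in the abstract: a sample with a small variance lies in a dense region, so samples with minimum variance that are far away from each other are taken as the K initial cluster centers.

3.2
Let X = {x_i | x_i ∈ R^p, i = 1, 2, …, n} be the data set to be clustered. C_1, C_2, …, C_K are the K initial cluster centers, W_1, W_2, …, W_K are the sample sets of the K clusters, and W is the set of all samples.

Definition 1. The distance between samples x_i and x_j:
$$ d(x_i, x_j) = \sqrt{(x_i - x_j)^{T}(x_i - x_j)} \tag{1} $$

Definition 2. The mean distance from sample x_i to all samples:
$$ m_i = \frac{1}{n} \sum_{j=1}^{n} d(x_i, x_j) \tag{2} $$

Definition 3. The variance of sample x_i:
$$ var_i = \frac{1}{n-1} \sum_{j=1}^{n} \left( d(x_i, x_j) - m_i \right)^{2} \tag{3} $$

Definition 4. The mean distance between the samples of the data set:
$$ cmean = \frac{2}{n(n+1)} \sum_{i=1}^{n} \sum_{j=1}^{i} d(x_i, x_j) \tag{4} $$

Definition 5. The clustering error sum of squares:
$$ E = \sum_{j=1}^{K} \sum_{x_l \in C_j} \left\| x_l - C_j \right\|^{2} \tag{5} $$
where C_j stands for the j-th cluster in the summation index and for its center inside the norm.

3.3
Step 1. Select the initial cluster centers.
(1) Compute the variance of every sample by Definitions 1~3. Take the sample x_{1i} with minimum variance in W as the first cluster center C_1. Compute cmean by Definition 4 and set c = 1. Let W_1 be the set of samples x_j (j = 1, 2, …, n) whose distance d(x_j, x_{1i}) is within cmean, and set W = W - W_1.
(2) If c < K, set c = c + 1 and take the sample x_{ci} with minimum variance in W as the c-th cluster center C_c. Let W_c be the set of samples x_j (j = 1, 2, …, n) whose distance d(x_j, x_{ci}) is within cmean and which do not belong to any W_r, r = 1, 2, …, c-1, and set W = W - W_c. Otherwise the K initial cluster centers C_1, C_2, …, C_K have been found; go to Step 2.
(3) Go to (2).

Step 2. Construct the initial partition.
(1) Using Definition 1, assign every sample to its nearest cluster center.
(2) Recompute the center of each cluster.
(3) Compute the clustering error sum of squares E by Definition 5.

Step 3. Iterate.
(1) Using Definition 1, reassign every sample to its nearest cluster center.
(2) Recompute the center of each cluster.
(3) Compute the clustering error sum of squares E' by Definition 5.
(4) If |E' - E| < 10^{-10}, stop; otherwise set E = E' and go to Step 3.

4
The proposed algorithm is tested on UCI data sets and on synthetic data sets, and compared with the algorithms of [11], [13], [15] and [16] and with the traditional K-means algorithm. The traditional K-means algorithm is run 100 times. The evaluation measures include the Rand index, the Jaccard coefficient and the Adjusted Rand Index [16, 20-24].

4.1
Ten data sets from the UCI machine learning repository [25] are used, as listed in Table 1.

Table 1  UCI data sets
Data set                  Samples   Attributes   Classes
Iris                      150       4            3
Wine                      178       13           3
Pima Indians Diabetes     768       8            2
Ionosphere                351       34           2
Wdbc                      569       30           2
Soybean-small             47        35           4
Haberman                  306       3            2
New_thyroid               215       5            3
Balance-scale             625       4            3
Seeds                     210       7            3

Eleven synthetic data sets are generated, with noise proportions of 0%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45% and 50%. Each data set has 3 classes and 600 samples, 200 per class. Class i has mean (μ_ix, μ_iy) and standard deviation σ_i, and the noise of the 3 classes has standard deviation σ_l. The parameters are given in Table 2.

Table 2  Parameters of the synthetic data sets
Class   μ_x   μ_y   σ
1       0     0     1.5
2       6     -1    0.5
3       6     2     0.5
Noise: σ_l = 2

4.2
4.2.1
Tables 3 and 4 give the clustering error sum of squares and the clustering time of the six algorithms on the 10 UCI data sets.

Table 3  Clustering error sum of squares on the UCI data sets
Data set                K-means       [11]          [13]          [15]          [16]          Proposed
Iris                    92.5932       78.9451       78.9451       78.9451       78.9408       78.9451
Wine                    2.4206e+006   2.6336e+006   2.6336e+006   2.3707e+006   2.3707e+006   2.3707e+006
Pima Indians Diabetes   5.1424e+006   5.1424e+006   5.1424e+006   5.1424e+006   5.1424e+006   5.1424e+006
Ionosphere              2.3823e+003   2.3873e+003   2.3873e+003   2.3873e+003   2.4194e+003   2.3873e+003
Wdbc                    7.7943e+007   7.7943e+007   7.7943e+007   4.6134e+017   7.7943e+007   7.7943e+007
Soybean-small           235.9480      219.9455      234.6923      222.7330      222.1148      218.9455
Haberman                5.0594e+004   4.8393e+004   4.8393e+004   4.8393e+004   4.3460e+004   4.3514e+004
New_thyroid             2.9909e+004   2.9097e+004   2.8774e+004   2.9175e+004   2.9097e+004   2.8158e+004
Balance-scale           3.6709e+003   3.6872e+003   3.6772e+003   3.6772e+003   3.5431e+003   3.6592e+003
Seeds                   203.2061      203.2061      203.2061      203.2061      203.2061      203.2061

Table 4  Clustering time on the UCI data sets (s)
Data set                K-means   [11]     [13]     [15]      [16]     Proposed
Iris                    0.0049    2.0362   1.4541   6.1369    0.0113   0.1409
Wine                    0.0047    1.1564   3.6087   4.4434    0.0553   0.0962
Pima Indians Diabetes   0.0055    3.2670   0.7442   32.6342   0.2256   0.2140
Ionosphere              0.0037    2.0530   1.7696   6.5491    0.1011   0.1014
Wdbc                    0.0043    2.3624   1.4979   14.3919   0.1164   0.1243
Soybean-small           0.0025    3.2286   2.7960   3.6784    0.0777   0.0968
Haberman                0.0026    2.8616   2.8537   5.3761    0.0902   0.0886
New_thyroid             0.0035    3.0008   2.8018   5.4264    0.0980   0.1139
Balance-scale           0.0077    2.9672   0.8707   18.3870   0.1402   0.1646
Seeds                   0.0054    2.1602   1.1375   4.2827    0.1052   0.1150

In Table 3, the traditional K-means algorithm averaged over 100 runs gives the lowest error on Ionosphere, where the algorithm of [16] is worse than those of [11], [13] and [15]. The algorithm of [16] gives the lowest error on Iris, Haberman and Balance-scale, and the proposed algorithm gives the lowest error on Soybean-small and New_thyroid. In Table 4, the proposed algorithm is faster than the algorithms of [11], [13] and [15] on every data set and slower than the traditional K-means algorithm. Compared with the algorithm of [16], it is faster only on the 2 data sets Pima Indians Diabetes and Haberman.

Figure 1  Comparison of the algorithms on the UCI data sets: (a) clustering accuracy, (b) Rand index, (c) Jaccard coefficient, (d) Adjusted Rand Index.

4.2.2
Tables 5 and 6 give the clustering error sum of squares and the clustering time of the traditional K-means algorithm, the algorithms of [11], [13], [15] and [16], and the proposed algorithm on the synthetic data sets.

Table 5  Clustering error sum of squares on the synthetic data sets
Noise/%   K-means       [11]          [13]          [15]          [16]          Proposed
0         1.2942e+003   1.6758e+003   968.5374      1.0979e+003   1.0979e+003   1.0979e+003
5         1.4027e+003   1.2125e+003   1.0214e+003   1.2125e+003   1.2125e+003   1.2125e+003
10        1.3092e+003   1.3623e+003   989.5620      1.1595e+003   1.1595e+003   1.1595e+003
15        1.2655e+003   1.2936e+003   957.2553      1.0660e+003   1.0660e+003   1.0660e+003
20        1.5386e+003   1.5965e+003   1.0230e+003   1.3579e+003   1.3579e+003   1.3579e+003
25        1.4964e+003   1.3921e+003   1.1091e+003   1.3421e+003   1.3421e+003   1.3421e+003
30        1.5524e+003   1.4402e+003   1.1079e+003   1.3402e+003   1.3402e+003   1.3402e+003
35        1.6723e+003   1.6274e+003   1.5274e+003   1.5273e+003   1.5274e+003   1.5274e+003
40        1.5576e+003   1.3968e+003   1.2723e+003   1.3368e+003   1.3368e+003   1.3368e+003
45        1.7579e+003   1.6200e+003   1.3436e+003   1.5660e+003   1.5660e+003   1.5660e+003
50        1.7204e+003   1.6041e+003   1.3508e+003   1.5040e+003   1.5040e+003   1.5040e+003

In Table 5, the proposed algorithm and the algorithms of [15] and [16] give the same error in every row except the 35% row, where the algorithm of [15] gives 1.5273e+003.

Table 6  Clustering time on the synthetic data sets (s)
Noise/%   K-means   [11]     [13]
0         0.0033    0.2323   0…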
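The center selection built on definitions (1) to (4), followed by the usual K-means iteration with the |E' - E| < 10^-10 stopping test, can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names `min_variance_centers` and `kmeans` are hypothetical, the neighborhood test is taken to be d <= cmean, the `max_iter` cap is an added safeguard, and fewer than K centers are returned if the candidate set runs out.

```python
import math


def dist(a, b):
    """Euclidean distance between two samples, as in definition (1)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def min_variance_centers(X, k):
    """Pick up to k initial centers: lowest-variance samples, mutually far apart."""
    n = len(X)
    d = [[dist(a, b) for b in X] for a in X]
    # Definition (2): mean distance from each sample to all samples.
    m = [sum(row) / n for row in d]
    # Definition (3): variance of each sample's distances.
    var = [sum((dij - m[i]) ** 2 for dij in d[i]) / (n - 1) for i in range(n)]
    # Definition (4): mean distance over pairs with j <= i.
    cmean = 2.0 * sum(d[i][j] for i in range(n) for j in range(i + 1)) / (n * (n + 1))
    remaining = set(range(n))
    centers = []
    while len(centers) < k and remaining:
        # Lowest-variance sample still available (ties broken by index).
        i = min(remaining, key=lambda t: (var[t], t))
        centers.append(X[i])
        # Assumption: drop every sample within cmean of the new center.
        remaining -= {j for j in remaining if d[i][j] <= cmean}
    return centers


def kmeans(X, centers, tol=1e-10, max_iter=100):
    """Standard K-means from given centers; stops when |E' - E| < tol."""
    centers = [list(c) for c in centers]
    prev = None
    for _ in range(max_iter):
        # Assign each sample to its nearest center.
        labels = [min(range(len(centers)), key=lambda c: dist(x, centers[c])) for x in X]
        # Recompute each center as the mean of its members.
        for c in range(len(centers)):
            members = [x for x, l in zip(X, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        # Definition (5): clustering error sum of squares.
        E = sum(dist(x, centers[l]) ** 2 for x, l in zip(X, labels))
        if prev is not None and abs(prev - E) < tol:
            break
        prev = E
    return labels, centers, E
```

A call such as `kmeans(X, min_variance_centers(X, 3))` runs the whole pipeline on a list of equal-length numeric tuples. The full pairwise distance matrix costs O(n^2) time and memory, which is acceptable for data sets of the size used in the experiments (at most 768 samples) but would need rethinking for much larger inputs.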
Document title: K-means Algorithm Based on Minimum Deviation Initialized Clustering Centers