Journal of Chinese Computer Systems, Vol. 33, No. 5, May 2012

Advances in Machine Learning Based Network Traffic Classification

WANG Tao (1), YU Shun-zheng (2)
1 Department of Network and Information Engineering, Guangdong University of Technology, Guangzhou 510006, China
2 Department of Electronic and Communication Engineering, Sun Yat-Sen University, Guangzhou 510006, China
E-mail: wangtaosea@msn.com

Received: 2010-12-28. Grant numbers: U0735002, 60970146, 2009A080207008. Author biographies: […] 1983 […]; […] 1958 […].
CLC number: TP393. Document code: A. Article ID: 1000-1220(2012)05-1034-07.

Abstract: Machine learning (ML) uses statistical characteristics of network flows to assist in identifying and classifying IP traffic, unlike traditional methods that depend on well-known application port numbers or on deep inspection of packet payloads. ML-based network traffic classification has been widely researched and has developed rapidly. This survey reviews the significant works published since 2004 and categorizes, analyzes, and compares them according to their choice of ML strategy: supervised, unsupervised, or semi-supervised learning. We focus on the directions and challenges of deploying ML-based traffic classifiers in operational IP networks. More specifically, we discuss key issues such as the sample labeling bottleneck, skewed data distribution, real-time and continuous classification, and the scalability of classification algorithms.

Keywords: machine learning; network flow; network traffic classification; statistical characteristics

1 […]
[…] P2P […] QoS […] intrusion detection system (IDS) [1] […] IANA [2] […] [3, 4] P2P, FTP […] [5] […] IP […] flow […]

2 […]
2.1 […]
[…] [6, 7] […] flow […] IP […]
1. $C = \{C_1, C_2, \ldots, C_k\}$ […] $T = \{t_1, t_2, \ldots, t_n\}$ […] $t_i$ […] $A_i = \{A_{i1}, A_{i2}, \ldots, A_{im}\}$ […]
2. $f: T \to C$ […]

Fig. 1 Training process of supervised ML-based IP traffic classification

[…]

Fig. 2 Data flow within an operational supervised ML traffic classifier

2.2 […]
[…] filter […] wrapper […]. Filter […]: CFS [8], CON [9], FCBF [10]. Wrapper [11] […] wrapper [12] […]. Dash […] filter [13] […]

2.3 […]
[…] n […] m […] i […]

Table 1 The contingency table for category i

                                 Expert judgments
Category i                       True       False
Classifier judgments  Positive   TP_i       FP_i
                      Negative   FN_i       TN_i

The TP (true positive), FN (false negative), and FP (false positive) counts for category i are denoted TP_i, FN_i, and FP_i. Recall, precision, and overall recall are given by Eqs. (1)-(3):

$$\mathrm{recall}_i = \frac{TP_i}{TP_i + FN_i} \tag{1}$$

$$\mathrm{precision}_i = \frac{TP_i}{TP_i + FP_i} \tag{2}$$

$$\mathrm{overall\_recall} = \frac{\sum_{i=1}^{m} TP_i}{\sum_{i=1}^{m} (TP_i + FN_i)} \tag{3}$$

[…]

3 […]
3.1 […]
3.1.1 […]
Huang [14]: K-nearest neighbor (KNN) […] 6 […] 90% […] K […]. Roughan [15]: K-[…] 5 […]

3.1.2 […]
Moore and Zuev [16]: Naïve Bayes (NB) […] 65% […] Moore […] 95% […]. Moore [17]: […] 2 […] 99% […] 95% […]

3.1.3 […]
[18]: C4.5, boosting […] 2 […] C4.5 […]. [19]: C4.5 […] C4.5 […] NB […] C4.5 […]. Li [20]: C4.5, AdaBoost+C4.5 […] 12 […] TCP […] 5-6 […] P2P […] 99% […]. Raahemi [21]: CVFDT […] P2P […] C4.5 […]

3.1.4 […]
[22]: support vector machine (SVM) […] structural risk minimization (SRM) […] SVM […] Moore […] 247 […]. Rui W [23]: V-SVM […] P2P […] P2P, non-P2P […] SVM […]

3.1.5 […]
Shen [24]: BP […] P2P […] BP […] P2P, non-P2P […] 4.96.3% […] BP […]. Sun [25]: probabilistic neural network (PNN) […] PNN, RBF […] PNN […] BP […] PNN […] web, P2P […]. Raahemi [26]: fuzzy ARTMAP […] P2P […] BP […] 80% […] 78% […]

3.1.6 […]
Nguyen [27]: […]. Nguyen [28]: EM […]. Huang [29]: Naïve Bayes, SMO, Bayesian network, partial decision tree, C4.5 […] partial decision tree […] 97.24% […]

3.1.7 […]
Wang [30]: Sun Yat-Sen […] 7 […]: FTP, MSN, PPLive, QQ, Web, Thunder, QQGame. Naïve Bayes (NB), Decision Tree, Bayesian Neural Network (BNN), Naïve Bayes Tree (NBT), AdaBoost […] 5 […] 90% […]. Williams [31]: Naïve Bayes with discretisation (NBD), Naïve Bayes with kernel density estimation (NBK), C4.5, Bayesian network, Naïve Bayes tree (NBT) […] 5 […] NLANR […] 225 […] C4.5, NBD, […], NBT, NBK […] NBT, C4.5, […], NBD, NBK […]

3.2 […]
3.2.1 EM […]
McGregor [32]: expectation maximization (EM) […] HTTP, FTP, SMTP, IMAP, NTP, DNS […] EM […]. Erman [33]: EM […] Bayes […]. Zander [34]: AutoClass […] EM (Expectation Maximization) […] 86.5% […]

3.2.2 K-Means […]
Bernaille [35]: TCP […] K-Means […] Bernaille […]. [36]: K-Means […] X-Means […] K-Means […] X-Means […] 32 […] Naïve Bayes, C4.5 […] 93.96% […] 92.72% […]. Hirvonen [37]: K-Means […] 4 […]

3.2.3 DBSCAN […]
Yang [38]: DBSCAN […] DBSCAN […] 3 […] DBSCAN […] 87% […]

3.2.4 […]
Erman [39]: 3 […] K-Means, DBSCAN, AutoClass […] 2 […] Auckland, Calgary […] 94 […] AutoClass […] K-Means […] K […] DBSCAN […]

3.3 […]
[40]: K-Means […] d […] Naïve Bayes (NB), K-Means […] K-Means […] NB […]. Erman [41], Shrivastav [42]: K-Means […]

4 […]
4.1 […]
4.2 […]
[…] web, mail, p2p […] [22]: SVM, NB […] SVM […] NB […] SVM […] NB […]
4.3 […]
[…] IP […] Huang [29], Bernaille [35] […] Nguyen [27-28] […]
4.4 […]
[…] SVM [22] […] O(m^3) […] O(m^2) […] m […]

5 […]

References
[1] Snort [EB/OL]. http://www.snort.org, 2008.
[2] Internet Assigned Numbers Authority [EB/OL]. http://www.iana.org, 2008.
[3] Karagiannis T, Broido A, Brownlee N, et al. Is P2P dying or just hiding? [C]. Global Telecommunications Conference, 2004, 3: 1532-1538.
[4] Madhukar A, Williamson C. A longitudinal study of P2P traffic classification [C]. In: Proc. of the 14th IEEE Int'l Symp. on Modeling, Analysis, and Simulation. Monterey, 2006.
[5] Moore A W, Papagiannaki K. Toward the accurate identification of network applications [A]. In: Dovrolis C, ed. Proc. of the PAM 2005. LNCS 3431 [C]. Heidelberg: Springer-Verlag, 2005: 41-54.
[6] Fisher H D, Pazzani J M, Langley P. Concept formation: knowledge and experience in unsupervised learning [M]. Morgan Kaufmann, 1991.
[7] Witten I, Frank E. Data mining: practical machine learning tools and techniques with Java implementations (second edition) [M]. Morgan Kaufmann Publishers, 2005.
[8] Hall M. Correlation-based feature selection for machine learning [D]. New Zealand: Department of Computer Science, Waikato University, 1998.
[9] Dash M, Liu H. Consistency-based search in feature selection [J]. Artificial Intelligence, 2003, 151(1-2): 155-176.
[10] Yu Lei, Liu Huan. Feature selection for high-dimensional data: a fast correlation-based filter solution [C]. Proceedings of the 20th International Conference on Machine Learning, 2003: 856-863.
[11] Kohavi R, John G H. Wrappers for feature subset selection [J]. Artificial Intelligence, 1997, 97(1-2): 273-324.
[12] Park J, Tyan H R, K C-C J. GA-based Internet traffic classification technique for QoS provisioning [C]. In: Proc. 2006 International Conference on Intelligent Information Hiding and Multimedia Signal Processing, Pasadena, California, December 2006.
[13] Dash M, Choi K, S
Document title: Advances in Machine Learning Based Network Traffic Classification (Wang Tao)
Link: https://www.777doc.com/doc-4535394 .html