Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines

John C. Platt
Microsoft Research
jplatt@microsoft.com

Technical Report MSR-TR-98-14
April 21, 1998
© 1998 John Platt

ABSTRACT

This paper proposes a new algorithm for training support vector machines: Sequential Minimal Optimization, or SMO. Training a support vector machine requires the solution of a very large quadratic programming (QP) optimization problem. SMO breaks this large QP problem into a series of smallest possible QP problems. These small QP problems are solved analytically, which avoids using a time-consuming numerical QP optimization as an inner loop. The amount of memory required for SMO is linear in the training set size, which allows SMO to handle very large training sets. Because matrix computation is avoided, SMO scales somewhere between linear and quadratic in the training set size for various test problems, while the standard chunking SVM algorithm scales somewhere between linear and cubic in the training set size. SMO's computation time is dominated by SVM evaluation, hence SMO is fastest for linear SVMs and sparse data sets. On real-world sparse data sets, SMO can be more than 1000 times faster than the chunking algorithm.

1. INTRODUCTION

In the last few years, there has been a surge of interest in Support Vector Machines (SVMs) [19][20][4]. SVMs have empirically been shown to give good generalization performance on a wide variety of problems such as handwritten character recognition [12], face detection [15], pedestrian detection [14], and text categorization [9].

However, the use of SVMs is still limited to a small group of researchers. One possible reason is that training algorithms for SVMs are slow, especially for large problems. Another explanation is that SVM training algorithms are complex, subtle, and difficult for an average engineer to implement.

This paper describes a new SVM learning algorithm that is conceptually simple, easy to implement, is generally faster, and has better scaling properties for difficult SVM problems than the standard SVM training algorithm. The new SVM learning algorithm is called Sequential Minimal Optimization (or SMO). Unlike previous SVM learning algorithms, which use numerical quadratic programming (QP) as an inner loop, SMO uses an analytic QP step.

This paper first provides an overview of SVMs and a review of current SVM training algorithms. The SMO algorithm is then presented in detail, including the solution to the analytic QP step, heuristics for choosing which variables to optimize in the inner loop, a description of how to set the threshold of the SVM, some optimizations for special cases, the pseudo-code of the algorithm, and the relationship of SMO to other algorithms. SMO has been tested on two real-world data sets and two artificial data sets. This paper presents the results for timing SMO versus the standard "chunking" algorithm for these data sets, and presents conclusions based on these timings. Finally, there is an appendix that describes the derivation of the analytic optimization.

1.1 Overview of Support Vector Machines

Vladimir Vapnik invented Support Vector Machines in 1979 [19]. In its simplest, linear form, an SVM is a hyperplane that separates a set of positive examples from a set of negative examples with maximum margin (see figure 1). In the linear case, the margin is defined by the distance of the hyperplane to the nearest of the positive and negative examples. The formula for the output of a linear SVM is

    u = \vec{w} \cdot \vec{x} - b,    (1)

where \vec{w} is the normal vector to the hyperplane and \vec{x} is the input vector. The separating hyperplane is the plane u = 0. The nearest points lie on the planes u = ±1. The margin m is thus

    m = \frac{1}{\|\vec{w}\|_2}.    (2)

Maximizing margin can be expressed via the following optimization problem [4]:

    \min_{\vec{w},b} \frac{1}{2}\|\vec{w}\|^2 \quad \text{subject to} \quad y_i(\vec{w} \cdot \vec{x}_i - b) \ge 1, \; \forall i,    (3)

[Figure 1: A linear Support Vector Machine. The figure shows the space of possible inputs, with the separating hyperplane maximizing the distances to the nearest positive and negative examples.]

where \vec{x}_i is the i-th training example, and y_i is the correct output of the SVM for the i-th training example. The value y_i is +1 for the positive examples in a class and -1 for the negative examples.

Using a Lagrangian, this optimization problem can be converted into a dual form which is a QP problem where the objective function \Psi is solely dependent on a set of Lagrange multipliers \alpha_i,

    \min_{\vec{\alpha}} \Psi(\vec{\alpha}) = \min_{\vec{\alpha}} \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} y_i y_j (\vec{x}_i \cdot \vec{x}_j) \alpha_i \alpha_j - \sum_{i=1}^{N} \alpha_i    (4)

(where N is the number of training examples), subject to the inequality constraints,

    \alpha_i \ge 0, \; \forall i,    (5)

and one linear equality constraint,

    \sum_{i=1}^{N} y_i \alpha_i = 0.    (6)

There is a one-to-one relationship between each Lagrange multiplier and each training example. Once the Lagrange multipliers are determined, the normal vector \vec{w} and the threshold b can be derived from the Lagrange multipliers:

    \vec{w} = \sum_{i=1}^{N} y_i \alpha_i \vec{x}_i, \qquad b = \vec{w} \cdot \vec{x}_k - y_k \text{ for some } \alpha_k > 0.    (7)

Because \vec{w} can be computed via equation (7) from the training data before use, the amount of computation required to evaluate a linear SVM is constant in the number of non-zero support vectors.

Of course, not all data sets are linearly separable. There may be no hyperplane that splits the positive examples from the negative examples. In the formulation above, the non-separable case would correspond to an infinite solution. However, in 1995, Cortes & Vapnik [7] suggested a modification to the original optimization statement (3) which allows, but penalizes, the failure of an example to reach the correct margin. That modification is:

    \min_{\vec{w},b,\vec{\xi}} \frac{1}{2}\|\vec{w}\|^2 + C \sum_{i=1}^{N} \xi_i \quad \text{subject to} \quad y_i(\vec{w} \cdot \vec{x}_i - b) \ge 1 - \xi_i, \; \forall i,    (8)

where \xi_i are slack variables