Chapter 3
SVM: Support Vector Machines
Hui Xue, Qiang Yang, and Songcan Chen

Contents
3.1 Support Vector Classifier
3.2 SVC with Soft Margin and Optimization
3.3 Kernel Trick
3.4 Theoretical Foundations
3.5 Support Vector Regressor
3.6 Software Implementations
3.7 Current and Future Research
    3.7.1 Computational Efficiency
    3.7.2 Kernel Selection
    3.7.3 Generalization Analysis
    3.7.4 Structural SVM Learning
3.8 Exercises
References

Support vector machines (SVMs), including the support vector classifier (SVC) and the support vector regressor (SVR), are among the most robust and accurate methods of all well-known data mining algorithms. SVMs, originally developed by Vapnik in the 1990s [1-11], have a sound theoretical foundation rooted in statistical learning theory, require only as few as a dozen examples for training, and are often insensitive to the number of dimensions. In the past decade, SVMs have developed at a fast pace both in theory and in practice.

3.1 Support Vector Classifier

For a two-class linearly separable learning task, the aim of SVC is to find a hyperplane that separates the two classes of given samples with a maximal margin, which has been proved to offer the best generalization ability. Generalization ability means that the classifier not only performs well (e.g., in accuracy) on the training data, but also guarantees high predictive accuracy on future data drawn from the same distribution as the training data.

[Figure 3.1: Illustration of the optimal hyperplane in SVC for a linearly separable case. The figure shows the optimal hyperplane w^T x + b = 0 in the (x_1, x_2) plane, the shortest distances r^* from the closest point of each class, and the margin of separation \rho.]

Intuitively, a margin can be defined as the amount of space, or separation, between the two classes as defined by a hyperplane. Geometrically, the margin corresponds to the shortest distance from the closest data points to any point on the hyperplane. Figure 3.1 illustrates a geometric construction of the corresponding optimal hyperplane under the above conditions for a two-dimensional input space.

Let w and b denote the weight vector and the bias of the optimal hyperplane, respectively. The hyperplane can be defined as

    w^T x + b = 0    (3.1)

The desired directional (signed) geometrical distance from a sample x to the optimal hyperplane [12, 13] is

    r = \frac{g(x)}{\|w\|}    (3.2)

where g(x) = w^T x + b is the discriminant function [7] defined by the hyperplane, also called the functional margin of x given w and b. Consequently, SVC aims to find the parameters w and b of an optimal hyperplane that maximize the margin of separation [\rho in Equation (3.5)], which is determined by the shortest geometrical distances r^* from the two classes; for this reason SVC is also called the maximal margin classifier. Now, without loss of generality, we fix the functional margin [7] to be equal to 1; that is, given a training set \{(x_i, y_i)\}_{i=1}^{n} \subset \mathbb{R}^m \times \{\pm 1\}, we have

    w^T x_i + b \ge +1 \quad \text{for } y_i = +1
    w^T x_i + b \le -1 \quad \text{for } y_i = -1    (3.3)

The particular data points (x_i, y_i) for which the equality in the first or second part of Equation (3.3) holds are called support vectors; they are exactly the closest data points to the optimal hyperplane [13]. The geometrical distance from a support vector x^* to the optimal hyperplane is then

    r^* = \frac{g(x^*)}{\|w\|} = \begin{cases} \dfrac{1}{\|w\|} & \text{if } y^* = +1 \\ -\dfrac{1}{\|w\|} & \text{if } y^* = -1 \end{cases}    (3.4)
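To make Equations (3.2)-(3.4) concrete, the following minimal Python sketch computes the functional margin g(x) = w^T x + b and the signed geometric distance r = g(x)/\|w\| for a hypothetical hyperplane; the weight vector, bias, and sample points below are illustrative assumptions, not values taken from the chapter.

```python
# A minimal sketch (with assumed, illustrative values) of the functional
# margin g(x) = w^T x + b and the geometric distance r = g(x)/||w|| from
# Equation (3.2), assuming NumPy is installed.
import numpy as np

w = np.array([0.5, 0.5])   # hypothetical weight vector of the hyperplane
b = -1.0                   # hypothetical bias

def functional_margin(x):
    # g(x) = w^T x + b, the discriminant function of the hyperplane.
    return w @ x + b

def geometric_distance(x):
    # Signed distance from x to the hyperplane, Equation (3.2).
    return functional_margin(x) / np.linalg.norm(w)

x_plus = np.array([3.0, 1.0])    # lies on the margin boundary g(x) = +1
x_minus = np.array([1.0, -1.0])  # lies on the margin boundary g(x) = -1

print(geometric_distance(x_plus), geometric_distance(x_minus))
```

With these values the two sample points sit exactly on the margin boundaries g(x) = \pm 1, so their signed distances \pm 1/\|w\| match r^* in Equation (3.4).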
From Figure 3.1, the margin of separation \rho is clearly

    \rho = 2 r^* = \frac{2}{\|w\|}    (3.5)

To ensure that the maximum margin hyperplane is found, SVC maximizes \rho with respect to w and b:

    \max_{w,b} \frac{2}{\|w\|} \quad \text{s.t.} \quad y_i (w^T x_i + b) \ge 1, \; i = 1, \dots, n    (3.6)

Equivalently,

    \min_{w,b} \frac{1}{2} \|w\|^2 \quad \text{s.t.} \quad y_i (w^T x_i + b) \ge 1, \; i = 1, \dots, n    (3.7)

Here we use \|w\|^2 instead of \|w\| for the convenience of carrying out the subsequent optimization steps. Generally, we solve the constrained optimization problem in Equation (3.7), known as the primal problem, by the method of Lagrange multipliers. We construct the following Lagrange function:

    L(w, b, \alpha) = \frac{1}{2} w^T w - \sum_{i=1}^{n} \alpha_i \left[ y_i (w^T x_i + b) - 1 \right]    (3.8)

where \alpha_i is the Lagrange multiplier associated with the ith inequality constraint. Differentiating L(w, b, \alpha) with respect to w and b and setting the results equal to zero, we obtain the following two conditions of optimality:

    \frac{\partial L(w, b, \alpha)}{\partial w} = 0, \qquad \frac{\partial L(w, b, \alpha)}{\partial b} = 0    (3.9)

from which

    w = \sum_{i=1}^{n} \alpha_i y_i x_i, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0    (3.10)

Substituting Equation (3.10) into the Lagrange function in Equation (3.8), we obtain the corresponding dual problem:

    \max_{\alpha} \; W(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j x_i^T x_j
    \text{s.t.} \quad \sum_{i=1}^{n} \alpha_i y_i = 0, \qquad \alpha_i \ge 0, \; i = 1, \dots, n    (3.11)

At the same time, the Karush-Kuhn-Tucker complementary condition is

    \alpha_i \left[ y_i (w^T x_i + b) - 1 \right] = 0, \quad i = 1, \dots, n    (3.12)

Consequently, only the support vectors (x_i, y_i), which are the closest data points to the optimal hyperplane and determine the maximal margin, correspond to nonzero \alpha_i; all the other \alpha_i equal zero. The dual problem in Equation (3.11) is a typical convex quadratic programming optimization problem. In many cases, it can be solved efficiently to the global optimum by adopting appropriate optimization techniques.
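As an illustration of how the dual problem might be solved, the sketch below optimizes Equation (3.11) for a tiny hypothetical linearly separable dataset using SciPy's general-purpose SLSQP solver, then recovers w from Equation (3.10) and b from the KKT condition (3.12). The dataset, variable names, and solver choice are assumptions made for this example; dedicated QP or SMO-style solvers are normally used in practice.

```python
# A minimal sketch (not the chapter's own code) of solving the dual problem
# in Equation (3.11) for a toy dataset, assuming NumPy and SciPy are installed.
import numpy as np
from scipy.optimize import minimize

# Hypothetical 2-D linearly separable data: two points per class.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
n = len(y)

# Q_ij = y_i y_j x_i^T x_j, the matrix appearing in W(alpha).
Q = np.outer(y, y) * (X @ X.T)

def neg_dual(alpha):
    # Negative of W(alpha); we minimize because SciPy minimizes by default.
    return 0.5 * alpha @ Q @ alpha - alpha.sum()

constraints = [{"type": "eq", "fun": lambda a: a @ y}]  # sum_i alpha_i y_i = 0
bounds = [(0.0, None)] * n                              # alpha_i >= 0

res = minimize(neg_dual, np.zeros(n), bounds=bounds,
               constraints=constraints, method="SLSQP")
alpha = res.x

# Recover w from Equation (3.10) and b from the KKT condition (3.12),
# using the training point with the largest multiplier as a support vector.
w = (alpha * y) @ X
sv = np.argmax(alpha)
b = y[sv] - w @ X[sv]

print("alpha =", np.round(alpha, 4))
print("w =", w, " b =", float(b))
```

On this toy data, only the two points nearest the separating hyperplane should receive appreciably nonzero multipliers, consistent with the support vector interpretation above.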