Experiments with a New Boosting Algorithm

Machine Learning: Proceedings of the Thirteenth International Conference, 1996.

Yoav Freund, Robert E. Schapire
AT&T Laboratories
600 Mountain Avenue
Murray Hill, NJ 07974-0636
{yoav, schapire}@research.att.com

Abstract. In an earlier paper, we introduced a new “boosting” algorithm called AdaBoost which, theoretically, can be used to significantly reduce the error of any learning algorithm that consistently generates classifiers whose performance is a little better than random guessing. We also introduced the related notion of a “pseudo-loss” which is a method for forcing a learning algorithm of multi-label concepts to concentrate on the labels that are hardest to discriminate. In this paper, we describe experiments we carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems.

We performed two sets of experiments. The first set compared boosting to Breiman’s “bagging” method when used to aggregate various classifiers (including decision trees and single attribute-value tests). We compared the performance of the two methods on a collection of machine-learning benchmarks. In the second set of experiments, we studied in more detail the performance of boosting using a nearest-neighbor classifier on an OCR problem.

1 INTRODUCTION

“Boosting” is a general method for improving the performance of any learning algorithm. In theory, boosting can be used to significantly reduce the error of any “weak” learning algorithm that consistently generates classifiers which need only be a little bit better than random guessing. Despite the potential benefits of boosting promised by the theoretical results, the true practical value of boosting can only be assessed by testing the method on real machine learning problems. In this paper, we present such an experimental assessment of a new boosting algorithm called AdaBoost.

Boosting works by repeatedly running a given weak¹ learning algorithm on various distributions over the training data, and then combining the classifiers produced by the weak learner into a single composite classifier. The first provably effective boosting algorithms were presented by Schapire [20] and Freund [9]. More recently, we described and analyzed AdaBoost, and we argued that this new boosting algorithm has certain properties which make it more practical and easier to implement than its predecessors [10]. This algorithm, which we used in all our experiments, is described in detail in Section 2.

Home page: “”. Expected to change to “˜uid” sometime in the near future (for uid yoav, schapire).

¹ We use the term “weak” learning algorithm, even though, in practice, boosting might be combined with a quite strong learning algorithm such as C4.5.

This paper describes two distinct sets of experiments. In the first set of experiments, described in Section 3, we compared boosting to “bagging,” a method described by Breiman [1] which works in the same general fashion (i.e., by repeatedly rerunning a given weak learning algorithm, and combining the computed classifiers), but which constructs each distribution in a simpler manner. (Details given below.) We compared boosting with bagging because both methods work by combining many classifiers. This comparison allows us to separate out the effect of modifying the distribution on each round (which is done differently by each algorithm) from the effect of voting multiple classifiers (which is done the same by each).

In our experiments, we compared boosting to bagging using a number of different weak learning algorithms of varying levels of sophistication. These include: (1) an algorithm that searches for very simple prediction rules which test on a single attribute (similar to Holte’s very simple classification rules [14]); (2) an algorithm that searches for a single good decision rule that tests on a conjunction of attribute tests (similar in flavor to the rule-formation part of Cohen’s RIPPER algorithm [3] and Fürnkranz and Widmer’s IREP algorithm [11]); and (3) Quinlan’s C4.5 decision-tree algorithm [18]. We tested these algorithms on a collection of 27 benchmark learning problems taken from the UCI repository.

The main conclusion of our experiments is that boosting performs significantly and uniformly better than bagging when the weak learning algorithm generates fairly simple classifiers (algorithms (1) and (2) above). When combined with C4.5, boosting still seems to outperform bagging slightly, but the results are less compelling.

We also found that boosting can be used with very simple rules (algorithm (1)) to construct classifiers that are quite good relative, say, to C4.5. Kearns and Mansour [16] argue that C4.5 can itself be viewed as a kind of boosting algorithm, so a comparison of AdaBoost and C4.5 can be seen as a comparison of two competing boosting algorithms. See Dietterich, Kearns and Mansour’s paper [4] for more detail on this point.

In the second set of experiments, we test the performance of boosting on a nearest neighbor classifier for handwritten digit recognition. In this case the weak learning algorithm is very simple, and this lets us gain some insight into the interaction between the boosting algorithm and the nearest neighbor classifier. We show that the boosting algorithm is an effective way for finding a small subset of prototypes that performs almost as well as the complete set. We also show that it compares favorably to the standard method of Condensed Nearest Neighbor [13] in terms of its test error.

There seem to be two separate reasons for the improvement in performance that is achieved by boosting. The first and better understood effect of boosting is that it generates a hypothesis whose error on the training set is small by combining many hypotheses whose erro