Classifier Selection for Majority Voting

Dymitr Ruta∗ and Bogdan Gabrys†

∗ Computational Intelligence Group, BTExact Technologies, Orion Building 1st floor, pp12, Adastral Park, Martlesham Heath, Ipswich IP5 3RE, UK, dymitr.ruta@bt.com
† Computational Intelligence Research Group, Bournemouth University, School of Design, Engineering & Computing, Poole House, Talbot Campus, Fern Barrow, Poole BH12 5BB, United Kingdom, bgabrys@bournemouth.ac.uk

ABSTRACT

Individual classification models have recently been challenged by combined pattern recognition systems, which often show better performance. In such systems the optimal set of classifiers is first selected and then combined by a specific fusion method. For a small number of classifiers optimal ensembles can be found exhaustively, but the burden of exponential complexity of such a search limits its practical applicability for larger systems. As a result, simpler search algorithms and/or selection criteria are needed to reduce the complexity. This work provides a revision of the classifier selection methodology and evaluates the practical applicability of diversity measures in the context of combining classifiers by majority voting. A number of search algorithms are proposed and adjusted to work properly with a number of selection criteria, including majority voting error and various diversity measures. Extensive experiments carried out with 15 classifiers on 27 datasets indicate the inappropriateness of diversity measures used as selection criteria, in favour of the direct combiner error based search. Furthermore, the results prompted a novel design of multiple classifier systems in which selection and fusion are recurrently applied to a population of best combinations of classifiers rather than the individual best. The improvement of the generalisation performance of such a system is demonstrated experimentally.

KEYWORDS

Classifier Fusion, Classifier Selection, Diversity, Search Algorithms, Majority Voting, Generalisation

1 Introduction

Given a large pool of different classifiers there are a number of possible combining strategies to follow, and it is usually not clear which one may be optimal for a particular problem. The simplest strategy could be to select the single, best performing classifier on the training data and apply it to the previously unseen patterns [26]. Such an approach, although the simplest, does not guarantee optimal performance [28]. Moreover, there is a possibility that at least some subsets of classifiers could jointly outperform the best classifier if suitably combined. To ensure optimal performance, a multiple classifier design should be able to select the subset of classifiers that is optimal in the sense that it produces the highest possible performance for a particular combiner.

On one hand, it is clear that combining identical classifiers contributes nothing but increased complexity of the system. On the other hand, different but much worse performing classifiers are unlikely to bring any benefit in combined performance. It is believed that the optimal combinations of classifiers should have good individual performances and at the same time a sufficient level of diversity [35]. Many recent works have shown, however, that neither individual performances [27], [40] nor diversity [37], [29] on their own provide a reliable diagnostic tool able to detect when the combiner outperforms the individual best classifier. As noted by Rogova [27], individual classifier performances do not relate well to combined performance, as they miss the important information about the team strength of the classifiers. In turn, diversity, due to problems with measuring and even perceiving it, also does not provide a reliable selection criterion that would be well correlated with combiner performance [31]. Some attempts at including both components to jointly guide selection proved to be highly complex while offering only relatively small improvements [40], [30].

A little more successful have been selection attempts based on specific similarity measures devised in conjunction with the combiner for which the classifiers are selected. The fault majority presented in [29] and the similarity S3h measure presented in [16] are just two examples that have shown high correlation with majority voting performance. Unlike general statistically driven diversity measures, measures exploiting the combiner definition take into account information about what makes a particular combiner work, and selection guided by such a combiner naturally has a greater chance of being successful. All these findings point to the combined performance as a relevant selection criterion.

Effectively, the most reliable strategy seems to be the evaluation of as many different designs as possible and subsequent selection of the best performing model. A difficulty, however, is that such a wide open scale of evaluation is computationally intractable. To realise this, it is sufficient to note that, assuming a chosen combiner, evaluation of all subsets from an ensemble of redundant classifiers is a process growing exponentially with the number of classifiers. On top of that, for large numbers of classifiers the performance based search space becomes increasingly flat, which makes selection even more difficult [40].

In the light of such difficulties, a modular decomposition model of combining seems advisable, particularly if only one locally best classifier is to be selected for a particular subtask or local input subspace. A number of dynamic selection models [7], [8], [10] or cluster and select based approaches [18], [24] illustrate that advantage and in some cases even show substantial improvement compared with the individual best classifier. However, by analogy to redundant combining, in general, improvement may also be sought in combining many classifiers within each subtask or input subspace, which com
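
The exponential cost of exhaustive selection noted in the introduction can be made concrete with a small sketch. The Python fragment below is an illustration only and is not taken from the paper. It assumes that each classifier's output is reduced to a correct/incorrect flag per sample, that ties in even-sized ensembles count as errors, and that the names `majority_vote_error` and `exhaustive_selection` are hypothetical. For M classifiers it evaluates all 2^M - 1 non-empty subsets.

```python
from itertools import combinations


def majority_vote_error(correct, members):
    """Fraction of samples on which no strict majority of `members` is correct.

    correct[i][j] is 1 when classifier i labels sample j correctly, else 0.
    Assumption: a tie in an even-sized ensemble is counted as an error.
    """
    n_samples = len(correct[0])
    errors = 0
    for j in range(n_samples):
        votes = sum(correct[i][j] for i in members)
        if 2 * votes <= len(members):  # no strict majority of correct votes
            errors += 1
    return errors / n_samples


def exhaustive_selection(correct):
    """Evaluate every non-empty subset and keep the lowest majority voting error."""
    n = len(correct)
    best_subset, best_error, evaluated = None, float("inf"), 0
    for size in range(1, n + 1):
        for members in combinations(range(n), size):
            evaluated += 1
            err = majority_vote_error(correct, members)
            if err < best_error:
                best_subset, best_error = members, err
    return best_subset, best_error, evaluated


# Toy data (made up): three classifiers, six samples, each wrong on two samples.
correct = [
    [1, 1, 1, 0, 0, 1],
    [1, 0, 1, 1, 1, 0],
    [0, 1, 0, 1, 1, 1],
]
best_subset, best_error, evaluated = exhaustive_selection(correct)
print(best_subset, best_error, evaluated)  # (0, 1, 2) 0.0 7
```

In this toy case every individual classifier has an error of 2/6, yet the three together make no majority voting errors, which illustrates how a subset can jointly outperform the individual best. The subset count doubles with each added classifier, so the 15 classifiers mentioned in the abstract already give 2^15 - 1 = 32767 subsets to evaluate per dataset.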