ABSTRACT Classifier Selection for Majority Voting

Dymitr Ruta∗ and Bogdan Gabrys†

∗ Computational Intelligence Group, BTExact Technologies, Orion Building 1st floor, pp12, Adastral Park, Martlesham Heath, Ipswich IP5 3RE, UK, dymitr.ruta@bt.com
† Computational Intelligence Research Group, Bournemouth University, School of Design, Engineering & Computing, Poole House, Talbot Campus, Fern Barrow, Poole BH12 5BB, United Kingdom, bgabrys@bournemouth.ac.uk

ABSTRACT

Individual classification models are recently challenged by combined pattern recognition systems, which often show better performance. In such systems the optimal set of classifiers is first selected and then combined by a specific fusion method. For a small number of classifiers optimal ensembles can be found exhaustively, but the burden of exponential complexity of such search limits its practical applicability for larger systems. As a result, simpler search algorithms and/or selection criteria are needed to reduce the complexity. This work provides a revision of the classifier selection methodology and evaluates the practical applicability of diversity measures in the context of combining classifiers by majority voting. A number of search algorithms are proposed and adjusted to work properly with a number of selection criteria including majority voting error and various diversity measures. Extensive experiments carried out with 15 classifiers on 27 datasets indicate inappropriateness of diversity measures used as selection criteria in favour of the direct combiner error based search. Furthermore, the results prompted a novel design of multiple classifier systems in which selection and fusion are recurrently applied to a population of best combinations of classifiers rather than the individual best. The improvement of the generalisation performance of such system is demonstrated experimentally.

KEYWORDS

Classifier Fusion, Classifier Selection, Diversity, Search Algorithms, Majority Voting, Generalisation

1 Introduction

Given a large pool of different classifiers there are a number of possible combining strategies to follow and it is usually not clear which one may be the optimal for a particular problem. The simplest strategy could be to select the single, best performing classifier on the training data and applying it to the previously unseen patterns [26]. Such an approach, although the simplest, does not guarantee the optimal performance [28]. Moreover, there is a possibility that at least some subsets of classifiers could jointly outperform the best classifier if suitably combined.

To ensure the optimal performance, a multiple classifier design should be able to select the subset of classifiers that is optimal in the sense that it produces the highest possible performance for a particular combiner. On one hand, it is clear that combining the same classifiers does not contribute to anything but the increased complexity of a system. On the other hand, different but much worse performing classifiers are unlikely to bring any benefits in combined performance. It is believed that the optimal combinations of classifiers should have good individual performances and at the same time sufficient level of diversity [35].

In many recent works it has been shown however that neither individual performances [27], [40] nor diversity [37], [29] on their own provide a reliable diagnostic tool able to detect when combiner outperforms the individual best classifier. As noted by Rogova [27], individual classifier performances do not relate well to combined performance as they miss out the important information about the team strength of the classifiers. In turn, diversity, due to problems with measuring and even perceiving it, also does not provide a reliable selection criterion that would be well correlated with combiner performance [31]. Some attempts at including both components jointly guiding selection proved to be highly complex while offering only relatively small improvements [40], [30]. A little more successful have been selection attempts based on specific similarity measures devised in conjunction with the combiner for which the classifiers are selected. The fault majority presented in [29] or similarity S3h measure presented in [16] are just two examples that have shown high correlation with majority voting performance. Unlike general statistically driven diversity measures, measures exploiting combiner definition take into account information of what makes a particular combiner work and selection guided by such a combiner naturally have greater chances of being successful. All these findings point to the combined performance as a relevant selection criterion.

Effectively the most reliable strategy seems to be evaluation of as many different designs as possible and subsequent selection of the best performing model. A difficulty however is that such a wide open scale of evaluation is computationally intractable. To realise this, it is sufficient to note that assuming a chosen combiner, evaluation of all subsets from an ensemble of redundant classifiers is a process growing exponentially with the number of classifiers. On top of that, for large numbers of classifiers the performance based search space becomes increasingly flat which makes selection even more difficult [40].

In the light of such difficulties, a modular decomposition model of combining seems advisable, particularly if only one locally best classifier is to be selected for a particular subtask or local input subspace. A number of dynamic selection models [7], [8], [10] or cluster and select based approaches [18], [24] illustrate that advantage and in some cases show even substantial improvement compared with the individual best classifier. However, by analogy to redundant combining, in general, improvement may be also sought in combining many classifiers within each subtask or input subspace, which com
