POP: Patchwork of Parts Models for Object Recognition

Yali Amit∗ and Alain Trouvé†

January 15, 2007

∗Yali Amit is with the Department of Statistics and the Department of Computer Science, University of Chicago, Chicago, IL, 60637. Email: amit@marx.uchicago.edu. Supported in part by NSF ITR DMS-0219016.
†Alain Trouvé is with the CMLA at the Ecole Normale Supérieure, Cachan.

Abstract

We formulate a deformable template model for objects with an efficient mechanism for computation and parameter estimation. The data consists of binary oriented edge features, robust to photometric variation and small local deformations. The template is defined in terms of probability arrays for each edge type. A primary contribution of this paper is the definition of the instantiation of an object in terms of shifts of a moderate number of local submodels (parts), which are subsequently recombined using a patchwork operation to define a coherent statistical model of the data. Object classes are modeled as mixtures of POP models that are discovered sequentially as more class data is observed. We define the notion of the support associated with an instantiation, and use this to formulate statistical models for multi-object configurations including possible occlusions. All decisions on the labeling of the objects in the image are based on comparing likelihoods. The combination of a deformable model with an efficient estimation procedure yields competitive results in a variety of applications with very small training sets, without the need to train decision boundaries: only data from the class being trained is used. Experiments are presented on the MNIST database, reading zip codes, and face detection.

1 Introduction

Two directions of research, categorization and detection, have dominated the field of shape and view based object recognition. The first, categorization, refers to the classification between several object classes based on segmented data (see Vapnik (1995), Amit & Geman (1997), LeCun et al. (1998), Hastie & Simard (1998), Belongie et al. (2002)), and the second, detection, to finding instances of a particular object class in large images (see Leung et al. (1995), Rowley et al. (1998), Viola & Jones (2004), Amit & Geman (1999), Burl et al. (1998), Torralba et al. (2004)). The latter is often considered as a problem of classification between object and background. Both subjects are viewed as building blocks towards more general algorithms for the analysis of complex scenes containing multiple objects.

The challenge of computer vision is the analysis of images with multiple interacting objects and clutter, requiring some methodology for integrating the different detectors and classifiers in one framework, as well as sequentially learning additional object classes from new examples, without access to earlier training sets. Imagine running detectors for each object class at low false negative rates. This will typically yield quite a large number of false positives as well as multiple hits (for different detectors) in the same region. It is then necessary to classify among these and eliminate false positives. Furthermore, if several objects can be present in the scene, one needs to choose among multiple candidate interpretations, i.e. different assignments of labels, locations, and instantiations for a number of objects, possibly occluding each other. This cannot be performed based on pre-trained classifiers among the virtually infinite number of possible configurations, and requires online procedures. The same issue would arise if bottom-up segmentation or saliency detection is used to determine candidate regions or locations of the objects of interest. Competing segmentations/classifications need to be resolved.

We propose to address these challenges in a coherent statistical framework, based on a novel family of deformable object models, which can be composed to define models for multi-object configurations. The data at each pixel, in our case binary oriented edges, is assumed independent conditional on the instantiation, which consists of a non-linear deformation of the model. The basic idea is to describe the deformation in terms of shifts of a moderate number of local submodels, parts, which are subsequently recombined using a patchwork operation to define a coherent model of the data, hence the name patchwork of parts (POP) model. The optimal deformation and associated likelihood of the data can be efficiently computed through iterative optimization on the shifts.

Training is a challenge in models with high dimensional instantiation parameters, because these are typically unobserved. The specific form of the proposed deformable object model motivates an approximate estimation procedure, where each of the parts is estimated separately and for each part the only unobserved variable is a local shift. This procedure is only approximate; however, it is very fast and yields very good estimates.

Given an instantiated object model we introduce the notion of the support, and the visible support, the non-occluded subset of the support. This leads to another contribution of this paper: a well defined mechanism for composing instantiated objects, online, into a data model for an interpretation, i.e. a configuration of objects with occlusions (see figure 3). All decisions are then based on likelihood ratios between competing classes or competing interpretations. Most existing object detection or categorization approaches do not have this modular capability (see section 1.1).

An important advantage of using statistical models is that training can be performed one class at a time. There is no need to see all the classes ahead of time in order to compute decision boundaries. Moreover, due to the explicit modeling of object deformations, state of the art performance can be achieved with much smaller training sets.

1.1 Other work

Deformab