Pictorial Structures for Object Recognition

Pedro F. Felzenszwalb
Artificial Intelligence Lab, Massachusetts Institute of Technology
pff@ai.mit.edu

Daniel P. Huttenlocher
Computer Science Department, Cornell University
dph@cs.cornell.edu

Abstract

In this paper we present a computationally efficient framework for part-based modeling and recognition of objects. Our work is motivated by the pictorial structure models introduced by Fischler and Elschlager. The basic idea is to represent an object by a collection of parts arranged in a deformable configuration. The appearance of each part is modeled separately, and the deformable configuration is represented by spring-like connections between pairs of parts. These models allow for qualitative descriptions of visual appearance, and are suitable for generic recognition problems. We address the problem of using pictorial structure models to find instances of an object in an image as well as the problem of learning an object model from training examples, presenting efficient algorithms in both cases. We demonstrate the techniques by learning models that represent faces and human bodies and using the resulting models to locate the corresponding objects in novel images.

keywords: part-based object recognition, statistical models, energy minimization.

1 Introduction

Research in object recognition is increasingly concerned with the ability to recognize generic classes of objects rather than just specific instances. In this paper, we consider both the problem of recognizing objects using generic part-based models and the problem of learning such models from example images. Our work is motivated by the pictorial structure representation introduced by Fischler and Elschlager [16] thirty years ago, where an object is modeled by a collection of parts arranged in a deformable configuration. Each part encodes local visual properties of the object, and the deformable configuration is characterized by spring-like connections between certain pairs of parts. The best match of such a model to an image is found by minimizing an energy function that measures both a match cost for each part and a deformation cost for each pair of connected parts.

Figure 1: Sample results for detection of a face (a); and a human body (b). Each image shows the globally best location for the corresponding object, as computed by our algorithms. The object models were learned from training examples.

While the pictorial structure formulation is appealing in its simplicity and generality, several shortcomings have limited its use: (i) the resulting energy minimization problem is hard to solve efficiently, (ii) the model has many parameters, and (iii) it is often desirable to find more than a single best (minimum energy) match. In this paper we address these limitations, providing techniques that are practical for a broad range of object recognition problems.

We illustrate the method for two quite different generic recognition tasks, finding faces and finding people. For faces, the parts are features such as the eyes, nose and mouth, and the spring-like connections allow for variation in the relative locations of these features. For people, the parts are the limbs, torso and head, and the spring-like connections allow for articulation at the joints. Matching results with these two models are illustrated in Figure 1.

The main contributions of this paper are three-fold. First, we provide an efficient algorithm for the classical pictorial structure energy minimization problem described in [16], for the case where the connections between parts do not form any cycles and are of a particular (but quite general) type. Many objects, including faces, people and animals can be represented by such acyclic multi-part models. Second, we introduce a method for learning these models from training examples. This method learns all the model parameters, including the structure of connections between parts. Third, we develop techniques for finding multiple good hypotheses for the location of an object in an image rather than just a single best solution. Finding multiple hypotheses is important for tasks where there may be several instances of an object in an image, as well as for cases where imprecision in the model may result in the desired match not being the one with the minimum energy. We address the problems of learning models from examples and of hypothesizing multiple matches by expressing the pictorial structure framework in a statistical setting.

1.1 Pictorial Structures

A pictorial structure model for an object is given by a collection of parts with connections between certain pairs of parts. The framework is quite general, in the sense that it is independent of the specific scheme used to model the appearance of each part as well as the type of connections between parts. A natural way to express such a model is in terms of an undirected graph $G = (V, E)$, where the vertices $V = \{v_1, \ldots, v_n\}$ correspond to the $n$ parts, and there is an edge $(v_i, v_j) \in E$ for each pair of connected parts $v_i$ and $v_j$. An instance of the object is given by a configuration $L = (l_1, \ldots, l_n)$, where each $l_i$ specifies the location of part $v_i$. Sometimes we refer to $L$ simply as the object location, but "configuration" emphasizes the part-based representation. The location of each part can simply specify its position in the image, but more complex parameterizations are also possible. For example, for the person model in Section 6 the location of a part specifies a position, orientation and an amount of foreshortening.

In [16] the problem of matching a pictorial structure to an image is defined in terms of an energy function to be minimized. The cost or energy of a particular configuration depends both on how well each part matches the image data at its location, and how well the relative locations of the parts agree with the deformable model. Given an image, let $m_i(l_i)$ be a function measu
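
As an illustration of the model described in this section, the following minimal Python sketch stores a pictorial structure as a graph of parts and scores a configuration by adding a match cost for each part to a deformation cost for each connected pair. Everything specific in it is an assumption made for illustration and is not taken from the paper: the names `PictorialStructure`, `toy_match`, `rest_offset` and `stiffness`, the choice of a plain (x, y) position as the location of a part, the quadratic spring cost, and the toy match cost that stands in for real image evidence.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Assumption: a part location is just an (x, y) image position.
Location = Tuple[float, float]


@dataclass
class PictorialStructure:
    """Hypothetical container for the graph G = (V, E) and its costs."""
    parts: List[str]                                    # vertices v_1..v_n
    edges: List[Tuple[str, str]]                        # connected pairs (v_i, v_j)
    match_cost: Dict[str, Callable[[Location], float]]  # per-part cost m_i(l_i)
    rest_offset: Dict[Tuple[str, str], Location]        # assumed ideal l_j - l_i
    stiffness: float = 1.0                              # assumed spring constant

    def deformation_cost(self, edge: Tuple[str, str],
                         li: Location, lj: Location) -> float:
        # Assumed quadratic spring: penalize deviation of (l_j - l_i)
        # from the ideal offset stored for this edge.
        dx, dy = self.rest_offset[edge]
        ex = (lj[0] - li[0]) - dx
        ey = (lj[1] - li[1]) - dy
        return self.stiffness * (ex * ex + ey * ey)

    def energy(self, config: Dict[str, Location]) -> float:
        # Match cost summed over parts, plus deformation cost summed over edges.
        total = sum(self.match_cost[p](config[p]) for p in self.parts)
        total += sum(self.deformation_cost((a, b), config[a], config[b])
                     for a, b in self.edges)
        return total


def toy_match(target: Location) -> Callable[[Location], float]:
    # Toy stand-in for image evidence: squared distance from `target`.
    return lambda l: (l[0] - target[0]) ** 2 + (l[1] - target[1]) ** 2


# A three-part "face" whose edges form a tree (no cycles) rooted at the nose.
model = PictorialStructure(
    parts=["left_eye", "right_eye", "nose"],
    edges=[("nose", "left_eye"), ("nose", "right_eye")],
    match_cost={
        "left_eye": toy_match((10.0, 10.0)),
        "right_eye": toy_match((30.0, 10.0)),
        "nose": toy_match((20.0, 20.0)),
    },
    rest_offset={
        ("nose", "left_eye"): (-10.0, -10.0),
        ("nose", "right_eye"): (10.0, -10.0),
    },
)

good = {"left_eye": (10.0, 10.0), "right_eye": (30.0, 10.0), "nose": (20.0, 20.0)}
bad = dict(good, nose=(20.0, 25.0))  # nose displaced 5 pixels downward

print(model.energy(good))  # 0.0
print(model.energy(bad))   # 75.0
```

Moving the nose raises the energy in two ways: its own match cost grows, and both springs attached to it are stretched. Matching a model to an image amounts to searching for the configuration with the lowest such energy; this sketch only evaluates the energy of a given configuration and does not implement the efficient minimization algorithm the paper develops.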