© Copyright by Ying Wu, 2001

VISION AND LEARNING FOR INTELLIGENT HUMAN-COMPUTER INTERACTION

BY
YING WU
B.E., Huazhong University of Science and Technology, 1994
M.E., Tsinghua University, 1997

THESIS
Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Electrical Engineering in the Graduate College of the University of Illinois at Urbana-Champaign, 2001

Urbana, Illinois

ABSTRACT

It was a dream to make computers see. Research in computer vision provides promising technologies to capture, analyze, transmit, retrieve, and interpret visual information. However, due to the richness and large variations of visual inputs, the practice of many statistical learning techniques for visual motion capturing and recognition is confronted by similar problems, so that making intelligent and visually capable machines is still a challenging task.

This dissertation concentrates on two important problems: capturing and recognizing human motion in video sequences, which are crucial for the research and applications of intelligent human-computer interaction, multimedia communication, and smart environments. This dissertation presents three effective techniques for visual motion analysis tasks: non-stationary color model adaptation for efficient localization, multiple visual cues integration for robust tracking, and learning motion models for capturing articulated hand motion. In addition, this dissertation describes a novel statistical learning method, the Discriminant-EM (D-EM) algorithm, in the framework of the self-supervised learning paradigm. D-EM employs both labeled and unlabeled training data and brings together supervised and unsupervised learning. Many topics in the dissertation are unified by the four problems of self-supervised learning, i.e., transduction, co-transduction, model transduction, and co-inferencing. Extensive experiments and two prototype systems have validated the proposed approaches in the domain of vision-based human-computer interaction.

To my parents and to Jindan

ACKNOWLEDGMENTS

Above all, I would like to express my sincere thanks to my advisor, Professor Thomas S. Huang, for his insightful guidance, enlightening advice, and endless encouragement throughout my Ph.D. study, which has given me a great opportunity to explore various difficult but interesting problems. I was lucky, and am proud, to be a student of his, a great man with extraordinary vision and wisdom.

Especially, I would like to thank my two mentors at Microsoft Research, Dr. Kentaro Toyama and Dr. Zhengyou Zhang, for their selfless discussions and suggestions, without which I could not have made this work possible.

I would also like to thank my Ph.D. advisory committee members, Professor Narendra Ahuja, Professor David Kriegman, and Dr. Kentaro Toyama, for their inspiring and constructive discussions during my study.

I would also like to thank all my colleagues in the Image Formation and Processing Group and many of my friends at Microsoft Research. In particular, I would like to thank Dr. Steve Shafer, Dr. Ying Shan, Dr. Harry Shum, Dr. John Krumm, Dr. Rick Szeliski, Erik Hanson, Dr. Vladimir Pavlovic, Greg Berry, Dr. Nebojsa Jojic, Dr. Qiong Liu, John Y. Lin, Qi Tian, Sean Xiang Zhou, and Yunqiang Chen. Special thanks to John Y. Lin for his hard work in collecting finger motion data and his selfless help with finger tracking experiments and paper proofreading.

I wish to thank my family for all their endless love, support, and encouragement through all the time of my study abroad.

Last, but not least, I would like to express my deep thanks to my dear wife, Jindan, for all her love, sacrifice, understanding, and help, which can be felt in every word of this work.

TABLE OF CONTENTS

CHAPTER

1 INTRODUCTION
  1.1 Background
    1.1.1 Virtual environments
    1.1.2 Human-computer interaction
    1.1.3 Vision-based human-computer interaction
    1.1.4 Gesture interfaces
    1.1.5 Visual learning
  1.2 Motivation
  1.3 Organization
  1.4 Contributions

2 VISION-BASED GESTURE INTERFACES: A REVIEW
  2.1 Introduction
  2.2 Gesture Representation
  2.3 Hand Modeling
    2.3.1 Modeling the shape
    2.3.2 Modeling the kinematic structure
    2.3.3 Modeling the dynamics
  2.4 Capturing Human Hand Motion
    2.4.1 Formulating hand motion
    2.4.2 Localizing hands in video sequences
    2.4.3 Selecting image features
    2.4.4 Capturing hand motion in full DOF
  2.5 Data Preparation for Recognition
    2.5.1 Features for gesture recognition
    2.5.2 Data collection for recognition
  2.6 Static Hand Posture Recognition
    2.6.1 3-D Model-based approaches
    2.6.2 Appearance-based approaches
  2.7 Temporal Gesture Recognition
    2.7.1 Recognizing low-level motion
    2.7.2 Recognizing high-level motion
    2.7.3 Gesture recognition by HMM