Department of Computer Science
Series of Publications A
Report A-2007-4

Statistical and Information-Theoretic Methods for Data Analysis

Teemu Roos

To be presented, with the permission of the Faculty of Science of the University of Helsinki, for public criticism in the auditorium of Arppeanum (Helsinki University Museum, Snellmaninkatu 3) on June 9th, at 12 o’clock noon.

University of Helsinki
Finland

Contact information
Postal address:
Department of Computer Science
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FI-00014 University of Helsinki
Finland
Email address: postmaster@cs.Helsinki.FI (Internet)
URL:
Telephone: +358 9 1911
Telefax: +358 9 191 51120

Copyright © 2007 Teemu Roos
ISSN 1238-8645
ISBN 978-952-10-3988-1 (paperback)
ISBN 978-952-10-3989-8 (PDF)
Computing Reviews (1998) Classification: G.3, H.1.1, I.2.6, I.2.7, I.4, I.5
Helsinki 2007
Helsinki University Printing House

Statistical and Information-Theoretic Methods for Data Analysis

Teemu Roos
Department of Computer Science
P.O. Box 68, FI-00014 University of Helsinki, Finland
teemu.roos@cs.helsinki.fi
fi/teemu.roos/

PhD Thesis, Series of Publications A, Report A-2007-4
Helsinki, March 2007, 82+75 pages
ISSN 1238-8645
ISBN 978-952-10-3988-1 (paperback)
ISBN 978-952-10-3989-8 (PDF)

Abstract

In this Thesis, we develop theory and methods for computational data analysis. The problems in data analysis are approached from three perspectives: statistical learning theory, the Bayesian framework, and the information-theoretic minimum description length (MDL) principle. Contributions in statistical learning theory address the possibility of generalization to unseen cases, and regression analysis with partially observed data with an application to mobile device positioning. In the second part of the Thesis, we discuss so-called Bayesian network classifiers, and show that they are closely related to logistic regression models. In the final part, we apply the MDL principle to tracing the history of old manuscripts, and to noise reduction in digital signals.

Computing Reviews (1998) Categories and Subject Descriptors:
G.3 Probability and Statistics: correlation and regression analysis, nonparametric statistics
H.1.1 Systems and Information Theory
I.2.6 Learning: concept learning, induction, parameter learning
I.2.7 Natural Language Processing: text analysis
I.4 Image Processing and Computer Vision
I.5 Pattern Recognition

General Terms:
data analysis, statistical modeling, machine learning

Additional Key Words and Phrases:
information theory, statistical learning theory, Bayesianism, minimum description length principle, Bayesian networks, regression, positioning, stemmatology, denoising

Preface

“We are all shaped by the tools we use, in particular: the formalisms we use shape our thinking habits, for better or for worse [...]”
Edsger W. Dijkstra (1930–2002)

This Thesis is about data analysis: learning and making inferences from data. What do the data have to say? To simplify, this is the question we would ultimately like to answer. Here the data may be whatever observations we make, be it in the form of labeled feature vectors, text, or images; all of these formats are encountered in this work. Here, as usual, the computer scientist’s modus operandi is to develop rules and algorithms that can be implemented in a computer. In addition to computer science, there are many other disciplines that are relevant to data analysis, such as statistics, philosophy of science, and various applied sciences, including engineering and bioinformatics. Even these are divided into various sub-fields. For instance, the Bayesian versus non-Bayesian division related to the interpretation of probability exists in many areas.

Diversity characterizes also the present work. The six publications that make the substance of this Thesis contain only one cross-reference between each other (the fifth paper is cited in the sixth one). The advantage of diversity is that with more tools than just a hammer (or a support vector machine), all problems do not have to be nails. Of course, one could not even hope to be comprehensive and all-inclusive. In all of the following, probability plays a central role, often together with its cousin, the code-length. This defines ad hoc the scope and the context of this Thesis. Hence also its title.

In order to cover the necessary preliminaries and background for the actual work, three alternative paradigms for data analysis are encountered before reaching the back cover of this work. The Thesis is divided accordingly into three parts: each part includes a brief introduction to one of the paradigms, followed by contributions in it. These parts are: 1. Statistical Learning Theory; 2. the Bayesian Approach; and 3. Minimum Description Length Principle. The structure of the Thesis is depicted in Figure 1.

[Figure 1 diagram:
Part I: Statistical Learning Theory (Chapter 1 Preliminaries; Chapter 2 Regression Estimation with the EM Algorithm; Chapter 3 Generalization to Unseen Cases)
Part II: the Bayesian Approach (Chapter 4 Preliminaries; Chapter 5 Discriminative Bayesian Network Classifiers)
Part III: Minimum Description Length Principle (Chapter 6 Preliminaries; Chapter 7 Compression-Based Stemmatic Analysis; Chapter 8 MDL Denoising)
Papers 1–6 are attached to individual chapters.]
Figure 1: The relationships between the chapters and original publications (Papers 1–6) of the Thesis.

As this is not a textbook intended to be self-contained, many basic concepts are assumed known. Standard references are, for instance, in probability and statistics [28], in machine learning [26, 83], in Bayesian methods [7], and in information theory [19, 37].

Acknowledgments: I am grateful to my advisors, Professors Petri Myllymäki and Henry Tirri, for their advice, for their efforts in man