Bayesian Inference for Categorical Data Analysis

Alan Agresti
Department of Statistics
University of Florida
Gainesville, Florida, USA 32611-8545
Phone USA (352) 392-1941, Fax (352) 392-5175
e-mail aa@stat.ufl.edu

David B. Hitchcock
Department of Statistics
University of South Carolina
Columbia, SC, USA 29208
e-mail hitchcock@stat.sc.edu

Summary

This article surveys Bayesian methods for categorical data analysis, with primary emphasis on contingency table analysis. Early innovations were proposed by Good (1953, 1956, 1965) for smoothing proportions in contingency tables and by Lindley (1964) for inference about odds ratios. These approaches primarily used conjugate beta and Dirichlet priors. Altham (1969, 1971) presented Bayesian analogs of small-sample frequentist tests for 2×2 tables using such priors. An alternative approach using normal priors for logits received considerable attention in the 1970s by Leonard and others (e.g., Leonard 1972). Adopted usually in a hierarchical form, the logit-normal approach allows greater flexibility and scope for generalization. The 1970s also saw considerable interest in loglinear modeling. The advent of modern computational methods since the mid-1980s has led to a growing literature on fully Bayesian analyses with models for categorical data, with main emphasis on generalized linear models such as logistic regression for binary and multi-category response variables.

Keywords: Beta distribution; Binomial distribution; Dirichlet distribution; Empirical Bayes; Graphical models; Hierarchical models; Logistic regression; Loglinear models; Markov chain Monte Carlo; Matched pairs; Multinomial distribution; Odds ratio; Smoothing.

1 Introduction

1.1 A brief history up to 1965

The purpose of this article is to survey Bayesian methods for analyzing categorical data. The starting place is the landmark work by Bayes (1763) and by Laplace (1774) on estimating a binomial parameter. They both used a uniform prior distribution for the binomial parameter. Dale (1999) and Stigler (1986, pp. 100-136) summarized this work, Stigler (1982) discussed what Bayes implied by his use of a uniform prior, and Hald (1998) discussed later developments.

For contingency tables, the sample proportions are ordinary maximum likelihood (ML) estimators of multinomial cell probabilities. When data are sparse, these can have undesirable features. For instance, for a cell with a sampling zero, 0.0 is usually an unappealing estimate. Early applications of Bayesian methods to contingency tables involved smoothing cell counts to improve estimation of cell probabilities with small samples.

Much of this appeared in various works by I. J. Good. Good (1953) used a uniform prior distribution over several categories in estimating the population proportions of animals of various species. Good (1956) used log-normal and gamma priors in estimating association factors in contingency tables. For a particular cell, the association factor is defined to be the probability of that cell divided by its probability assuming independence (i.e., the product of the marginal probabilities). Good's (1965) monograph summarized the use of Bayesian methods for estimating multinomial probabilities in contingency tables, using a Dirichlet prior distribution. Good also was innovative in his early use of hierarchical and empirical Bayesian approaches. His interest in this area apparently evolved out of his service as the main statistical assistant in 1941 to Alan Turing on intelligence issues during World War II (e.g., see Good 1980).

In an influential article, Lindley (1964) focused on estimating summary measures of association in contingency tables. For instance, using a Dirichlet prior distribution for the multinomial probabilities, he found the posterior distribution of contrasts of log probabilities, such as the log odds ratio.

Early critics of the Bayesian approach included R. A. Fisher. For instance, in his book Statistical Methods and Scientific Inference in 1956, Fisher challenged the use of a uniform prior for the binomial parameter, noting that uniform priors on other scales would lead to different results. (Interestingly, Fisher was the first to use the term "Bayesian," starting in 1950. See Fienberg (2005) for a detailed discussion of the evolution of the term. Fienberg notes that the modern growth of Bayesian methods followed the popularization in the 1950s of the term "Bayesian" by, in particular, L. J. Savage, I. J. Good, H. Raiffa and R. Schlaifer.)

1.2 Outline of this article

Leonard and Hsu (1994) selectively reviewed the growth of Bayesian approaches to categorical data analysis since the groundbreaking work by Good and by Lindley. Much of this review focused on research in the 1970s by Leonard that evolved naturally out of Lindley (1964). An encyclopedia article by Albert (2004) focused on more recent developments, such as model selection issues. Of the many books published in recent years on the Bayesian approach, the most complete coverage of categorical data analysis is the chapter of O'Hagan and Forster (2004) on discrete data models and the text by Congdon (2005).

The purpose of our article is to provide a somewhat broader overview, in terms of covering a much wider variety of topics than these published surveys. We do this by organizing the sections according to the structure of the categorical data. Section 2 begins with estimation of binomial and multinomial parameters, continuing into estimation of cell probabilities in contingency tables and related parameters for loglinear models (Section 3). Section 4 discusses Bayesian analogs of some classical confidence intervals and significance tests. Section 5 deals with extensions to the regression modeling of categorical response variables. Computational aspects are d