1.
… [1][2] … Web … [3][4][5][6][7][8] …

2.
2.1 … overfitting … underfitting … generative (discriminative) …
2.2 … wrapper approach [9] … [10][11] … RELIEFF [12] … (Markov blankets) [13] … Chow [14] …

3.
3.1 … [15][16][17] …

$$w_1^N = w_1 \ldots w_n \ldots w_N, \qquad c_1^N = c_1 \ldots c_n \ldots c_N$$

$$\hat{c}_1^N = \arg\max_{c_1^N} \{ \Pr(c_1^N \mid w_1^N) \}$$

$$\Pr(c_1^N \mid w_1^N) = \prod_{n=1}^{N} \Pr(c_n \mid c_1^{n-1}, w_1^N) = \prod_{n=1}^{N} p_{\mathrm{model}}(c_n \mid c_{n-2}^{n-1}, w_{n-2}^{n+2})$$

… $p(c_n \mid c_{n-2}^{n-1}, w_{n-2}^{n+2})$ … $f_i(c_{n-2}^{n-1}, c_n, w_{n-2}^{n+2})$ … $f_i$ … $\lambda_i$ …

$$p_\lambda(c_n \mid c_{n-2}^{n-1}, w_{n-2}^{n+2}) = \frac{1}{z_\lambda(c_{n-2}^{n-1}, w_{n-2}^{n+2})} \exp\Big( \sum_{i=1}^{M} \lambda_i f_i(c_{n-2}^{n-1}, c_n, w_{n-2}^{n+2}) \Big)$$

$$z_\lambda(c_{n-2}^{n-1}, w_{n-2}^{n+2}) = \sum_{c'} \exp\Big( \sum_{i=1}^{M} \lambda_i f_i(c_{n-2}^{n-1}, c', w_{n-2}^{n+2}) \Big)$$

3.2 … [18][19][20] … HMMs … $s_t$ …

Figure 1. HMM

3.3 … MEMMs … HMMs [21] … $P(s \mid s', o)$ … $s'$ … $o$ … $s$ …

$$P_{s'}(s \mid o) = \frac{1}{Z(o, s')} \exp\Big( \sum_{a} \lambda_a f_a(o, s) \Big)$$

… MEMMs … Viterbi [20] … $\delta_t(s)$ …

$$\delta_{t+1}(s) = \max_{s' \in S} \delta_t(s') \, P_{s'}(s \mid o_{t+1})$$

… MEMMs … HMMs [21] …

Figure 2. MEMMs (states S1, S2, S3, …, Sn-1, Sn; observations O1, O2, O3, …, On-1, On)

… MEMMs … label bias problem …

3.4 … Conditional Random Fields … MEMMs … Lafferty … CRFs [22] … $O$ … $S$ … $C(O, S)$ … cliques … Hammersley-Clifford … CRFs … potential function …

$$P_\Lambda(s \mid o) = \frac{1}{Z_o} \prod_{c \in C(s, o)} \Phi_c(s_c, o_c)$$

$$\Phi_c(s_c, o_c) = \exp\Big( \sum_{k=1}^{K} \lambda_k f_k(s_c, o_c) \Big)$$

… $Z_o$ …

Figure 3. CRFs (states S1, S2, S3, …, Sn-1, Sn; observations O1, O2, O3, …, On-1, On)

… CRFs … FSMs … MEMMs … $o = (o_1, o_2, \ldots, o_T)$ … $S$ … FSM … $s = (s_1, s_2, \ldots, s_T)$ … $s_{t-1}, s_t$ … $o$ … CRFs …

$$P_\Lambda(s \mid o) = \frac{1}{Z_o} \exp\Big( \sum_{t=1}^{T} \sum_{k=1}^{K} \lambda_k f_k(s_{t-1}, s_t, o, t) \Big)$$

… $f_k(s_{t-1}, s_t, o, t)$ … $o_1, o_2, \ldots, o_T$ …

$$\alpha_{t+1}(s) = \sum_{s'} \alpha_t(s') \exp\Big( \sum_{k=1}^{K} \lambda_k f_k(s', s, o, t) \Big)$$

… CRFs … MEMMs … McCallum [23][24] …

3.5 … (kernel) … $f_1, \ldots, f_N$ … $N$ … [25] … $X$ … $K$ …

$$K : X \times X \to [0, \infty]$$

… $x, y$ … positive semidefinite … $f(\cdot) = (f_1(\cdot), f_2(\cdot), \ldots)$, $f_i : X \to \mathbb{R}$ …

$$K(x, y) = \langle f(x), f(y) \rangle$$

… [26][27] … Zelenko [28] … SVM … Voted Perceptron … (SVMs) …

4.
… [29] … boosting, bagging, stacking [32][33] … [30] … Florian [31] … $n$ …

$$P(C \mid w, C_1^n) = f\big( (P_i(C \mid w, C_i))_{i=1 \ldots n} \big)$$

$$P(C \mid w, C_1^n) = \sum_{i=1}^{n} P(C \mid w, i, C_i) \cdot P(i \mid w) = \sum_{i=1}^{n} \lambda_i(w) \cdot P_i(C \mid w, C_i)$$

… $\lambda_i(w)$ … $w$ … $i$ … $P_i(C \mid w, C_i)$ … $i$ … $w$ … $C_i$ … $C$ … Florian [31] …

5.
… bootstrapping … Blum [34] … Co-Training … Views … Co-Training … [35][36][37][38][39] …

6. References
[1] Gaizauskas R, Wilks Y. Information Extraction: Beyond Document Retrieval. Journal of Documentation, 1997.
[2] Douglas E. Appelt, David J. Israel. Introduction to Information Extraction Technology. IJCAI-99.
[3] The ACE 2003 Evaluation Plan, […]
[4] […]: Description of the IE2 system used for MUC-7. In Proceedings of MUC-7, 1998.
[5] S. Miller, M. Crystal, H. Fox, L. Ramshaw, R. Schwartz, R. Stone, and R. Weischedel. Algorithms that learn to extract information - BBN: Description of the SIFT system as used for MUC-7. In Proceedings of MUC-7, 1998.
[6] Dayne Freitag. Machine Learning for Information Extraction in Informal Domains. PhD thesis, Carnegie Mellon University, 1998.
[7] F. Ciravegna. Adaptive information extraction from text by rule induction and generalisation. In Proceedings of the Seventeenth International Joint Conf. on Artificial Intelligence, 2001.
[8] Mary Elaine Califf and Raymond J. Mooney. Relational learning of pattern-match rules for information extraction. In Proceedings of the Sixteenth National Conf. on Artificial Intelligence, pages 328–334, 1999.
[9] Ron Kohavi and George H. John. Wrappers for feature subset selection. Artificial Intelligence, 97(1-2):273–324, 1997.
[10] A. S. Weigend, D. E. Rumelhart, and B. A. Huberman. Generalization by weight-elimination with application to forecasting. Adv. Neural Inf. Proc. Sys. 3, 875–882, Morgan Kaufmann, 1991.
[11] J. R. Quinlan. C4.5: Programs for machine learning. Morgan Kaufmann, 1993.
[12] Igor Kononenko, Edvard Šimec, and Marko Robnik-Šikonja. Overcoming the myopia of inductive learning algorithms with RELIEFF. Applied Intelligence, 7(1):39–55, 1997.
[13] Daphne Koller and Mehran Sahami. Toward optimal feature selection. In Proc. 13th Int. Conf. Machine Learning, pages 284–292. Morgan Kaufmann, 1996.
[14] C. Chow and C. Liu. Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory, 14:462–467, 1968.
[15] A. L. Berger, S. A. Della Pietra, and V. J. Della Pietra. A maximum entropy approach to natural language processing. Computational Linguistics, 22(1):39–72, March 1996.
[16] Oliver Bender, Franz Josef Och, and Hermann Ney. Maximum Entropy Models for Named Entity Recognition. Proceedings of the Seventh CoNLL conference, Edmonton, May-June 2003.
[17] Hai Leong Chieu, Hwee Tou Ng. Named Entity Recognition with a Maximum Entropy Approach. Proceedings of the Seventh CoNLL conference, pp. 160–163, Edmonton, May-June 2003.
[18] Freitag D, McCallum A. Information Extraction with HMM Structures Learned by Stochastic Optimization. Proceedings of AAAI-2000.
[19] Freitag D, McCallum A K. Information Extraction with HMMs and Shrinkage. AAAI 99.
[20] L. R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2), February 1989.
[21] McCallum, A., Freitag, D., & Pereira, F. Maximum entropy Markov models for information extraction and segmentation. Proc. ICML 2000 (pp. 591–598). Stanford, California.
[22] John Lafferty, Andrew McCallum, and Fernando Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proc. ICML.
[23] Andrew McCallum, Wei Li. Early Results for Named Entity Recognition with Conditional Random Fields, Feature Induction and Web-Enhanced Lexicons. Proceedings of the Seventh CoNLL conference, Edmonton, May-June 2003.
[24] Fei Sha and Fernando Pereira. Shallow parsing with conditional random fields. In Proceedings of Human Language Technology, NAACL.
[25] T. Furey, N. Cristianini, N. Duffy, D. Bednarski, M. Schummer, and D. Haussler. Support vector machine classification and validation of cancer tissue samples using microarray expression. Bioinformatics, 16, 2000.
[26] H. Lodhi, C. Saunders, J. Shawe-Taylor, N. Cristianini, and Chris Watkins. Text classification using string
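The Viterbi recursion for MEMMs in Section 3.3, δ_{t+1}(s) = max_{s'∈S} δ_t(s') · P_{s'}(s | o_{t+1}), can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the paper: the function names `memm_viterbi` and `toy_trans_prob`, the `START` pseudo-state, the two labels `O` and `NAME`, and the hand-set probabilities, which stand in for a trained maximum entropy model.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical interface: trans_prob(prev_state, obs) returns the
# distribution P(state | prev_state, obs) as a dict.
TransProb = Callable[[str, str], Dict[str, float]]


def memm_viterbi(observations: Sequence[str], states: Sequence[str],
                 start_state: str, trans_prob: TransProb) -> Tuple[List[str], float]:
    """Return the most probable state path and its probability."""
    if not observations:
        return [], 1.0
    # Initialisation: delta_1(s) = P(s | start_state, o_1)
    first = trans_prob(start_state, observations[0])
    delta = {s: first.get(s, 0.0) for s in states}
    backpointers: List[Dict[str, str]] = []
    for obs in observations[1:]:
        # One conditional distribution per previous state for this observation
        dists = {sp: trans_prob(sp, obs) for sp in states}
        new_delta: Dict[str, float] = {}
        pointer: Dict[str, str] = {}
        for s in states:
            # delta_{t+1}(s) = max over s' of delta_t(s') * P(s | s', o_{t+1})
            best_prev = max(states, key=lambda sp: delta[sp] * dists[sp].get(s, 0.0))
            new_delta[s] = delta[best_prev] * dists[best_prev].get(s, 0.0)
            pointer[s] = best_prev
        backpointers.append(pointer)
        delta = new_delta
    # Backtrack from the best final state
    last = max(states, key=lambda s: delta[s])
    path = [last]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    path.reverse()
    return path, delta[last]


def toy_trans_prob(prev_state: str, obs: str) -> Dict[str, float]:
    """Hand-set toy distribution (an assumption, not a trained model)."""
    if obs[0].isupper():
        if prev_state == "NAME":
            return {"NAME": 0.8, "O": 0.2}
        return {"NAME": 0.6, "O": 0.4}
    return {"NAME": 0.1, "O": 0.9}


path, prob = memm_viterbi(["met", "John", "Smith", "today"],
                          ["O", "NAME"], "START", toy_trans_prob)
print(path, prob)  # path is ['O', 'NAME', 'NAME', 'O'], prob is about 0.3888
```

Because each P(s | s', o) is normalised separately for every previous state s', all probability mass leaving a state must be spread over its successors regardless of the observation. This is the property that Section 3.3 names as the label bias problem.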
Title: Research on Machine Learning Methods in Natural Language Information Extraction
Link: https://www.777doc.com/doc-4955183 .html