1 What is Natural Language Processing (NLP)
• The process of computer analysis of input provided in a human language (natural language), and conversion of this input into a useful form of representation.
• The field of NLP is primarily concerned with getting computers to perform useful and interesting tasks with human languages.
• The field of NLP is secondarily concerned with helping us come to a better understanding of human language.

2 Forms of Natural Language
• The input/output of an NLP system can be:
  – written text
  – speech
• We will mostly be concerned with written text (not speech).
• To process written text, we need:
  – lexical, syntactic, semantic knowledge about the language
  – discourse information, real world knowledge
• To process spoken language, we need everything required to process written text, and must also handle the challenges of speech recognition and speech synthesis.

3 Components of NLP
• Natural Language Understanding
  – Mapping the given input in the natural language into a useful representation.
  – Different levels of analysis required: morphological analysis, syntactic analysis, semantic analysis, discourse analysis, …
• Natural Language Generation
  – Producing output in the natural language from some internal representation.
  – Different levels of synthesis required: deep planning (what to say), syntactic generation
• NL Understanding is much harder than NL Generation. Still, both of them are hard.

4 Why is NL Understanding hard?
• Natural language is extremely rich in form and structure, and very ambiguous.
  – How to represent meaning,
  – Which structures map to which meaning structures.
• One input can mean many different things. Ambiguity can be at different levels.
  – Lexical (word level) ambiguity -- different meanings of words
  – Syntactic ambiguity -- different ways to parse the sentence
  – Interpreting partial information -- how to interpret pronouns
  – Contextual information -- context of the sentence may affect the meaning of that sentence.
• Many inputs can mean the same thing.
• Interaction among components of the input is not clear.

5 Knowledge of Language
• Phonology – concerns how words are related to the sounds that realize them.
• Morphology – concerns how words are constructed from more basic meaning units called morphemes. A morpheme is the primitive unit of meaning in a language.
• Syntax – concerns how words can be put together to form correct sentences, and determines what structural role each word plays in the sentence and what phrases are subparts of other phrases.
• Semantics – concerns what words mean and how these meanings combine in sentences to form sentence meaning. The study of context-independent meaning.

6 Knowledge of Language (cont.)
• Pragmatics – concerns how sentences are used in different situations and how use affects the interpretation of the sentence.
• Discourse – concerns how the immediately preceding sentences affect the interpretation of the next sentence. For example, interpreting pronouns and interpreting the temporal aspects of the information.
• World Knowledge – includes general knowledge about the world. What each language user must know about the other’s beliefs and goals.

7 Ambiguity
I made her duck.
• How many different interpretations does this sentence have?
• What are the reasons for the ambiguity?
• The categories of knowledge of language can be thought of as ambiguity resolving components.
• How can each ambiguous piece be resolved?
• Does speech input make the sentence even more ambiguous?
  – Yes – deciding word boundaries

8 Ambiguity (cont.)
• Some interpretations of: I made her duck.
  1. I cooked duck for her.
  2. I cooked duck belonging to her.
  3. I created a toy duck which she owns.
  4. I caused her to quickly lower her head or body.
  5. I used magic and turned her into a duck.
• duck – morphologically and syntactically ambiguous: noun or verb.
• her – syntactically ambiguous: dative or possessive.
• make – semantically ambiguous: cook or create.
• make – syntactically ambiguous:
  – Transitive – takes a direct object. (interpretation 2)
  – Di-transitive – takes two objects. (interpretation 5)
  – Takes a direct object and a verb. (interpretation 4)

9 Ambiguity in a Turkish Sentence
• Some interpretations of: Adamı gördüm.
  1. I saw the man.
  2. I saw my island.
  3. I visited my island.
  4. I bribed the man.
• Morphological Ambiguity:
  – ada-m-ı    ada+P1SG+ACC
  – adam-ı     adam+ACC
• Semantic Ambiguity:
  – gör    to see
  – gör    to visit
  – gör    to bribe

10 Resolve Ambiguities
• We will introduce models and algorithms to resolve ambiguities at different levels.
• part-of-speech tagging -- Deciding whether duck is verb or noun.
• word-sense disambiguation -- Deciding whether make is create or cook.
• lexical disambiguation -- Resolution of part-of-speech and word-sense ambiguities are two important kinds of lexical disambiguation.
• syntactic ambiguity -- her duck is an example of syntactic ambiguity, and can be addressed by probabilistic parsing.

11 Resolve Ambiguities (cont.)
I made her duck
Parse 1: [S [NP I] [VP [V made] [NP her] [NP duck]]]
Parse 2: [S [NP I] [VP [V made] [NP [DET her] [N duck]]]]

12 Models to Represent Linguistic Knowledge
• We will use certain formalisms (models) to represent the required linguistic knowledge.
• State Machines -- FSAs, FSTs, HMMs, ATNs, RTNs
• Formal Rule Systems -- Context Free Grammars, Unification Grammars, Probabilistic CFGs.
• Logic-based Formalisms -- first order predicate logic, some higher order logic.
• Models of Uncertainty -- Bayesian probability theory.

13 Algorithms to Manipulate Linguistic Knowledge
• We will use algorithms to manipulate the models of linguistic knowledge to produce the desired behavior.
• Most of the algorithms we will study are transducers and parsers.
  – These algorithms construct some structure based on their input.
• Since the language is ambiguous at all levels, these algorithms are never simple processes.
• Most of the algorithms that will be used fall into the following categories:
  – state space search
  – dynamic programming

14 Language and Intel
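
Looking back at the two parse trees for "I made her duck" and the mention of dynamic programming: the sketch below shows one way a chart parser can count the parses of a sentence. It is a minimal illustration, not the method of the slides. The toy grammar, the lexicon, the helper category VX (used only to binarize the two-object verb phrase), and the function name count_parses are all assumptions introduced here. The verb reading of "duck" (interpretation 4) is deliberately left out to keep the grammar small.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (not taken from the slides).
# Each rule is (parent, left_child, right_child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),    # transitive: made [her duck]
    ("VP", "VX", "NP"),   # ditransitive: [made her] [duck]
    ("VX", "V", "NP"),    # helper category used only for binarization
    ("NP", "DET", "N"),   # possessive "her" + noun
]

# Hypothetical lexicon: each word maps to its possible categories.
LEXICON = {
    "i": {"NP"},
    "made": {"V"},
    "her": {"NP", "DET"},   # dative pronoun or possessive determiner
    "duck": {"N", "NP"},    # noun, or a bare noun phrase
}


def count_parses(words, start="S"):
    """Count parse trees rooted in `start` with CKY-style dynamic programming."""
    n = len(words)
    # chart[(i, j)][category] = number of ways to derive words[i:j] from category
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for category in LEXICON.get(word.lower(), ()):
            chart[(i, i + 1)][category] += 1
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):  # split point between the two children
                left, right = chart[(i, k)], chart[(k, j)]
                for parent, l_cat, r_cat in BINARY_RULES:
                    if l_cat in left and r_cat in right:
                        chart[(i, j)][parent] += left[l_cat] * right[r_cat]
    return chart[(0, n)][start]


if __name__ == "__main__":
    print(count_parses("I made her duck".split()))
```

Under this toy grammar the sentence receives two parses, corresponding to the two-object reading and the possessive reading. A probabilistic parser would attach a probability to each rule and keep the highest-scoring tree instead of a count.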
Document title: Natural Language Processing