STOCHASTIC MODELS OF LANGUAGE EVOLUTION AND AN APPLICATION TO THE INDO-EUROPEAN FAMILY OF LANGUAGES

TANDY WARNOW, STEVEN N. EVANS, DON RINGE, AND LUAY NAKHLEH

ABSTRACT. We propose several models of how languages evolve, and discuss statistical estimation of evolution under these models. We also discuss issues of identifiability and statistical consistency under these models.

Date: April 16, 2004. TW supported by NSF grant BCS-0312830. SNE supported in part by NSF grant DMS-0071468. DR supported in part by NSF grant BCS-0312911.

1. INTRODUCTION

In recent months several methods for estimating evolutionary histories of languages have been described and used on Indo-European (IE) datasets in order to estimate dates at which languages diversified. Implicit in these methods are stochastic models of how languages evolve (Forster & Toth, 2003; Gray & Atkinson, 2003). We agree that a carefully considered stochastic model can be of tremendous use to historical linguistics: if sufficiently realistic, inference under the model can reveal much about the history of the language family, and examinations of how reconstruction methods perform under these models (via simulation, in particular) can help us quantify the reliability of a reconstruction method. Since our own interest in this is primarily motivated by the IE family, we will formulate this model so as to reflect what we believe is likely to be true about IE's evolution. Much, however, should be appropriate for other families, and we will discuss extensions to other families at the end of the paper.

2. MODELS

In this section we explain what is meant by a stochastic model of language evolution, and we present some specific models that are worth examining in the context of IE evolution.

We begin by explaining what linguistic characters are, since the evolutionary model describes how each character evolves.

2.1. Linguistic characters. A (linguistic) character is any feature of languages that can take one or more forms; these different forms are called the states of the character. Thus, our characters include lexical characters, where the different states are the cognate classes, so that two languages exhibit the same state for the lexical character if and only if they have cognates for the meaning associated with the lexical character. Other characters include phonological characters (the appearance of a sound change within the language or its ancestry) and morphological characters (e.g., inflectional markers).

Thus, a character defines an equivalence relation on the language family, where two languages are equivalent if they exhibit the same state for the character. Given a partition of a set into disjoint subsets, we can define an equivalence relation by making two languages equivalent if and only if they are in the same subset; thus, a partition of a set into disjoint subsets defines an equivalence relation (and the converse holds as well).

Our first simplifying assumption is that all the characters are monomorphic, which means that every language exhibits only one state of each character. The contrasting phenomenon is a character which has two or more states for some languages; examples of such characters include the semantic slot 'rock', for which English contains at least two equivalents: rock and stone. Because we do not understand in enough detail how polymorphism arises, we will exclude polymorphic characters from our model.

Simplifying assumption #1: there is no polymorphism (i.e., the appearance of two or more states for a given character in a given language).

For each character, we can assign numbers to the states of the character so that the character is defined to be a function that assigns every language in a set of languages a real number; the number assigned to the language is called the state of the character for that language. Thus, the states of all our characters are real numbers, and when we write c(L) for a language L and a character c, we mean the state of the character c exhibited by the language L. However, the particular real number used to label a state is irrelevant, and all that matters is whether two states are equal or different.

2.2. Tree models. Languages can evolve in a purely treelike fashion (the Stammbaum model), or with enough contact between languages that undetected (or undetectable) borrowing occurs between lineages, so that it becomes difficult (or inappropriate) to define a genetic tree for the family. Many conditions can make evolution non-treelike; creoles (hybrid languages) are one, dialect continua are another, but more generally contact itself between divergent lineages can also lead to trees being inappropriate (or just difficult to infer). All of these conditions can be loosely grouped under the category of reticulate evolution.

We will initially describe the model for the case where there is no reticulate evolution, since most of the concepts are more familiar in that context; later we will show how the model extends to the case where we permit reticulate evolution.

In the case where there is no reticulate evolution, the evolutionary history of the languages is described by a rooted tree T, in which the leaves represent the languages in the family, and the internal nodes represent ancestral languages at particular points in time; this is the genetic tree for the family. Every node v in T has a time t_v associated to it, with times at nodes increasing as one moves away from the root of the tree. All of the internal nodes in the tree will have at least two edges issuing from them (that is, they will have out-degree at least two) so that nodes can also be thought of as representing diversification events. Therefore, an edge within the tree represents the development of the language over a period of time between diversification events.

2.3. The evo