TP391

The Basic Processing of Contemporary Chinese Corpus at Peking University
SPECIFICATION

YU Shi-wen, DUAN Hui-ming, ZHU Xue-feng, Bing SWEN
(Institute of Computational Linguistics, Peking University, Beijing, 100871)

Abstract: The Institute of Computational Linguistics, Peking University has completed the basic processing of a contemporary Chinese corpus that has 27 million Chinese characters. In addition to word segmentation and part-of-speech tagging, the processing involves the tagging of proper nouns (person names, place names, organization names and so on), morpheme subcategories and the special usages of verbs and adjectives. The success of this large-scale language engineering is attributed to the SPECIFICATION, which had been made beforehand and was being perfected while in use. We are hereby making an introduction to the SPECIFICATION through this publication, thus inviting the comments from all the experts and our colleagues for the improvement of it.

Key words: contemporary Chinese; corpus; word segmentation; part-of-speech tagging; specification
Title (Chinese): 北京大学现代汉语语料库基本加工规范