您好,欢迎访问三七文档
当前位置:首页 > 商业/管理/HR > 公司方案 > Lazy Functional Parser Combinators in Java
LazyFunctionalParserCombinatorsinJavaAtzeDijkstra,DoaitseS.SwierstraUU-CS-2001-18June,2001LazyFunctionalParserCombinatorsinJavaAtzeDijkstra1andDoaitseS.Swierstra21InstituteofInformationandComputingSciences,UtrechtUniversity,P.O.Box80.089,3508TBUtrecht,TheNetherlands,atze@cs.uu.nl,~atze2doaitse@cs.uu.nl,~doaitseAbstract.Aparserisaprogramthatchecksifatextisasentenceofthelanguageasdescribedbyagrammar.Traditionally,theprogramtextofaparserisgeneratedfromagrammardescription,afterwhichitiscompiledandsubsequentlyrun.Thelanguageacceptedbysuchaparseris,bythenatureofthisprocess,hardcodedintheprogram.Anotherapproach,primarilytakeninthecontextoffunctionallanguages,allowsparserstobeconstructedatruntime,thusdynamicallycreatingparsersbycombiningelementsfromlibrariesofhigherlevelparsingconcepts;thisexplaninsthethename“parsercombinators”.Efficientimplementationofthisconceptreliesheavilyonthelazinessthatisavailableinmodernfunctionallanguages[13,14].Thispapershowshowtouseparsercom-binatorsinafunctionallanguageaswellasJava,andshowshowparsercombinatorscanbeimplementedinJava.Implementingparsercombi-natorsisaccomplishedbycombiningtwolibraries.Thefirstone,writteninHaskell,defineserror-correctingandanalysingparsercombinators[2].ThesecondoneconsistsofasmallJavalibraryimplementinglazyfunc-tionalbehavior.ThecombinatorlibraryisstraightforwardlycodedinJava,usinglazybehaviorwherenecessary.Inthispaperallthreeas-pects,thetwolibrariesanditscombination,areexplained.1IntroductionCreatingaparserforagrammarnormallyisatwostepprocess.Asastartingpointsometoolspecificnotationforagrammarspecificationisused.Fromthisgrammarspecificationanexecutablespecificationinaprogramminglanguageisgeneratedusingaparsergenerator,subsequentlycompiledintoanexecutableformat.Forexample,usingJava[4],onecoulduseJavaCC[1]togenerateaJavaprogramtobecompiledsubsequently.Thistwostepprocessmakesitpossibletocreatehighlyefficientparsers.Theparsergenerationphaseusuallyanalysesagrammarandtakesadvantageofprogramminglanguageproperties.However,thisefficiencydoesnotcomewithoutaprice.Generally,allinformationabouttheoriginalgrammarhasbeenlostwhenaseparateprogramisgenerated.Thoughthisinformationcanbeaddedas‘debug’informationtothegeneratedparser,itwillnotrepairthefactthatthestructureofagrammarhasbeenhardcodedintothegeneratedprogram.Ingeneral,thispreventstheprogrammatic(runtime)manipulationoftheoriginalparser(andgrammar)asthiswouldrequire(runtime)rebuildingofthegeneratedinforma-tion.Losingtheabilitytoperformruntimemanipulationandcreationofparsersoftenisnotsuchahighpricetopay,exceptinsituationswheretheinputlanguagedescribedbyagrammarmayinfluencethegrammaritself.Forexample,Haskell[5,13]allowstheprogrammertodefineoperatorswiththeirpriorityaswellasassociativity.Thefollowinglines,specifyingthepriorityandleftassociativityofsomeoperators,arefromthepreludeofHugs[9]:infixl7*,/,‘quot‘,‘rem‘,‘div‘,‘mod‘,:%,%infixl6+,-Thesedeclarationsinfluencethewayexpressionsareparsedandtheabstractsyntaxtreeforexpressionsisconstructed.Aparserneedsthisruntimegatheredinformationabouttheprecedenceofoperatorstodoitsjob.Thiscanbeaccom-plishedviaprecedenceparsing(seee.g.[3])intheformofadditionalinformationsteeringtheparsingprocess.However,suchasolutionistailoredforthisspe-cialproblem,thatis,parsingexpressions.Amoregeneralsolutionwouldbetoconstructaparseratruntimeusingthedeclaredfixityandpriorityinformationabouttheoperators.Thisisgenerallynotpossiblewhenageneratorisused.Incontrast,runtimemanipulationofparsersisoneofthestrongpointsofparsercombinators[7,6,8,10,16,15].Inthecontextofparsercombinators,aparserisafirstclassvalue,tobeusedorcombinedaspartofotherparsersandpassedaroundlikeanyother(programming)languagevalue.Anotheradvantageofparsersbeingfirstclasscitizensisthatitallowsthedefinitionofbuildingblocksandtheconstructionofabstractgrammaticalcon-structsoutofthesebuildingblocks.Thisissimilartoregularexpressions(seee.g.[3])whereregularexpressionscanbeoptional(using?),begrouped(using(,))andrepeated(using*,+).Parsercombinatorsallowustogoonestepfurtherbylettingtheprogrammerdefinehisownabstractions.Theseabstractionsthencaneasetheconstructionofaparserforagrammar.Theflexibilityofparsercombinatorshoweverhasitsusualprice:decreasedperformance.Straightforwardimplementationsofparsercombinators[7,6]relyonbacktracking,todeterminewhichalternativesinagrammarproductionrulearetheonesmatchingagiveninput.Becauseofthenon-linear(w.r.t.inputlengthand/orgrammarsize)runtimecoststhisisunacceptableexceptfordemon-strationpurposes.Monadbasedparsercombinators[8,10]allowtherestrictionofbacktrackingbutthisrequiresacarefulgrammardesign.Onlywhen(runtime)analysisisdoneoncombinedparsersbacktrackingcanbeavoided[16,15]andausefulwayoferrorrecoverycanbeoffered.Parsercombinatorimplementationswhichprovideerrorcorrectionandre-portingaswellasgrammaranalysisthusofferasolutiontothegrammarandparserwriterinwhichbothflexibilityandperformanceareatanacceptablelevel.EfficientimplementationsforparsercombinatorshoweveraregenerallywritteninafunctionallanguagelikeHaske
本文标题:Lazy Functional Parser Combinators in Java
链接地址:https://www.777doc.com/doc-6496527 .html