Layered Learning in Multi-Agent Systems

Peter Stone
December 15, 1998
CMU-CS-98-187

School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213-3891

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Thesis Committee:
Manuela M. Veloso, Chair
Andrew W. Moore
Herbert A. Simon
Victor R. Lesser (University of Massachusetts, Amherst)

Copyright © 1998 Peter Stone

The work has been supported through the generosity of the NASA Graduate Student Research Program (GSRP). This research is also sponsored in part by the Defense Advanced Research Projects Agency (DARPA), and Rome Laboratory, Air Force Materiel Command, USAF, under agreement numbers F30602-95-1-0018, F30602-97-2-0250 and F30602-98-2-0135 and in part by the Department of the Navy, Office of Naval Research under contract number N00014-95-1-0591. Views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of NASA, the Defense Advanced Research Projects Agency (DARPA), the Air Force Research Laboratory (AFRL), the Department of the Navy, Office of Naval Research, or the U.S. Government.

Keywords: Multi-agent systems, machine learning, multi-agent learning, control learning, hierarchical learning, reinforcement learning, decision tree learning, neural networks, robotic soccer, network routing

Abstract

Multi-agent systems in complex, real-time domains require agents to act effectively both autonomously and as part of a team. This dissertation addresses multi-agent systems consisting of teams of autonomous agents acting in real-time, noisy, collaborative, and adversarial environments. Because of the inherent complexity of this type of multi-agent system, this thesis investigates the use of machine learning within multi-agent systems. The dissertation makes four main contributions to the fields of Machine Learning and Multi-Agent Systems.

First, the thesis defines a team member agent architecture within which a flexible team structure is presented, allowing agents to decompose the task space into flexible roles and allowing them to smoothly switch roles while acting. Team organization is achieved by the introduction of a locker-room agreement as a collection of conventions followed by all team members. It defines agent roles, team formations, and pre-compiled multi-agent plans. In addition, the team member agent architecture includes a communication paradigm for domains with single-channel, low-bandwidth, unreliable communication. The communication paradigm facilitates team coordination while being robust to lost messages and active interference from opponents.

Second, the thesis introduces layered learning, a general-purpose machine learning paradigm for complex domains in which learning a mapping directly from agents’ sensors to their actuators is intractable. Given a hierarchical task decomposition, layered learning allows for learning at each level of the hierarchy, with learning at each level directly affecting learning at the next higher level.

Third, the thesis introduces a new multi-agent reinforcement learning algorithm, namely team-partitioned, opaque-transition reinforcement learning (TPOT-RL). TPOT-RL is designed for domains in which agents cannot necessarily observe the state changes when other team members act. It exploits local, action-dependent features to aggressively generalize its input representation for learning and partitions the task among the agents, allowing them to simultaneously learn collaborative policies by observing the long-term effects of their actions.

Fourth, the thesis contributes a fully functioning multi-agent system that incorporates learning in a real-time, noisy domain with teammates and adversaries. Detailed algorithmic descriptions of the agents’ behaviors as well as their source code are included in the thesis.

Empirical results validate all four contributions within the simulated robotic soccer domain. The generality of the contributions is verified by applying them to the real robotic soccer, and network routing domains. Ultimately, this dissertation demonstrates that by learning portions of their cognitive processes, selectively communicating, and coordinating their behaviors via common knowledge, a group of independent agents can work towards a common goal in a complex, real-time, noisy, collaborative, and adversarial environment.

Acknowledgements

I would like to thank many people for their support, encouragement and guidance during my years as a graduate student here at CMU. First and foremost, this dissertation represents a great deal of time and effort not only on my part, but on the part of my advisor, Manuela Veloso. She has helped me shape my research from day one, pushed me to get through the inevitable research setbacks, and encouraged me to achieve to the best of my ability. Without Manuela, this dissertation would not have happened.

I also thank my other three committee members, Andrew Moore, Herb Simon, and Victor Lesser for valuable discussions and comments regarding my research.

Almost all research involving robots is a group effort. The members of the CMU robosoccer lab have all contributed to making my research possible. Sorin Achim, who has been with our project almost from the beginning has tirelessly experimented with different robot architectures, always managing to pull things together and create working hardware in time for competitions. Kwun Han was a partner in the software development of the CMUnited-97 team, as well as an instrumental hardware developer for CMUnited-98. Mike Bowling successfully created a new software approach for the CMUnited-98 robots. He also collaborated on an early simulator agent implementation.

