您好,欢迎访问三七文档
ArtificialIntelligence455.AdversarialSearchContents:5.1.Games5.2.OptimalDecisionsinGames5.3.Alpha-BetaPruning5.4.ImperfectReal-timeDecisions5.5.StochasticGames5.6.Monte-CarloMethodsArtificialIntelligence::Searching::AdversarialSearch46Whatisstochasticgame什么是随机博弈Itisadynamicgamewithprobabilistictransitionsplayedbyoneormoreplayers,introducedintheearly1950s.是一种具有概率转换的动态博弈,有一个或多个玩家,于1950年代初提出的。Inreallife,manyunpredictableeventscanputusintounforeseensituations.在现实生活中,许多无法预测的事件可以使我们陷入始料不及的处境。Manygamesmirrorthisunpredictabilitybyincludingarandomelement,suchasthethrowingofdice.许多博弈通过引入一种随机元素来仿照这种不可预测性,例如掷骰子。Applications应用economics,evolutionarybiology,computernetworks.经济学、进化生物学、计算机网络。StochasticGame随机博弈5.5.StochasticGamesArtificialIntelligence::Searching::AdversarialSearch47Beatypicalgamethatcombinesluckandskill,andoneoftheoldestclassesofboardgamesfortwoplayers.是一种典型的运气与技巧并存的博弈,并且是一个最老的两个玩家的棋盘博弈。Theplayingpiecesaremovedaccordingtotherollofdice,andaplayerwinsbyremovingalloftheirpiecesfromtheboardbeforetheiropponent.走棋是根据掷骰子来决定的,在对手之前将所有的棋子移到棋盘外的玩家则获胜。Backgammon西洋双陆棋5.5.StochasticGamesWitheachrollofthedice,playersmustchoosefromnumerousoptionsformovingtheircheckersandanticipatepossiblecounter-movesbytheopponent.随着每次掷骰子,玩家们必须从许多选项中选择如何移动棋子,并且要预见对手可能的对攻棋。ArtificialIntelligence::Searching::AdversarialSearch48Whitemovesclockwisetoward25,andBlackmovescounterclockwisetoward0.Thegoalistomoveallpiecesofftheboard.白棋顺时针移到25,然后黑棋逆时针移到0。目标是将所有的棋子移到棋盘外。Apiececanmovetoanypositionunlessmultipleopponentpiecesarethere;ifthereisoneopponent,itiscapturedandmuststartover.一个棋子可以移到任意位置,除非在那里有多个对手的棋子;如果有一个对手的棋子,它就被抓住、然后必须重新开始。ATypicalBackgammonPosition一盘典型的西洋双陆棋棋局5.5.StochasticGamesInthepositionshown,Whitehasrolled6-5andmustchooseamongfourlegalmoves:如该棋局所示,白棋已经掷了6-5,因而必须从四种合法的走棋中选择:(5-10,5-11),(5-11,19-24),(5-10,10-16),(5-11,11-16).ArtificialIntelligence::Searching::AdversarialSearch49GameTreeforaBackgammonPosition西洋双陆棋棋局的博弈树5.5.StochasticGamesThereare36waystorolltwodice,eachequallylikely;butbecausea(6,5)isthesameasa(5,6),thereareonly21distinctrolls.投掷两个骰子有36种方式,每种都有同样可能;但是由于(6,5)与(5,6)相同,故仅有21种不同的投掷。Thesixdoubles,(1,1)through(6,6),eachhaveaprobabilityof1/36,soP(1,1)=1/36.六对儿双数,即(1,1)到(6,6),每对儿的概率为1/36,故:P(1,1)=1/36。Theother15distinctrollseachhavea1/18probability.其他15种不同的掷骰子,每种有1/18的概率。ArtificialIntelligence::Searching::AdversarialSearch50Expectedminimaxvalueforgameswithchancenodes.博弈的期望Minimax值具有机率节点。Forchancenodeswecomputetheexpectedvalue,whichisthesumofthevalueoveralloutcomes,weightedbytheprobabilityofeachchanceaction:对于机率节点,我们计算该期望值,它是涵盖所有结果值之和,由每个机率动作的概率来加权。ExpectedMinimaxValue期望Minimax值5.5.StochasticGamesWhererrepresentsapossiblediceroll(orotherchanceevent),andRESULT(s,r)isthesamestateass,withtheadditionalfactthattheresultofthedicerollisr.其中,r表示一种可能的骰子投掷(或其他机率事件),而RESULT(s,r)与s的状态相同、且具有附加的骰子投掷结果为r的事实。functionEXPECTI-MINIMAX(s)returnsanactionifTERMINAL-TEST(s)thenreturnUTILITY(s)ifPLAYER(s)=MAXthenreturnmaxaEXPECTI-MINIMAX(RESULT(s,a))ifPLAYER(s)=MINthenreturnminaEXPECTI-MINIMAX(RESULT(s,a))ifPLAYER(s)=CHANCEthenreturn∑rP(r)EXPECTI-MINIMAX(RESULT(s,r))ArtificialIntelligence::Searching::AdversarialSearch51AnOrder-preservingTransformation一种保序变换5.5.StochasticGamesAnorder-preservingtransformationonleafvalueschangesthebestmove:assignsthevalues[1,2,3,4]totheleaves,movea1isbest;withvalues[1,20,30,400],movea2isbest.一种叶节点值的保序变换改变最佳移动:分配叶节点值为[1,2,3,4]时,移动a1最佳;而值为[1,20,30,400]时,则移动a2最佳。ArtificialIntelligence::Searching::AdversarialSearch52Agamblerfacesseveralslotmachines(one-armedbandits),thatlookidenticalbutproducedifferentexpectedwinnings.一个赌徒面对着几台角子机(单臂老虎机),它们看起来完全相同,但具有不同的期望赢率。Theissueisthetrade-offbetweenacquiringnewinformationandcapitalizingontheinformationavailablesofar.问题是如何在获取新的信息和利用以前的可用信息之间权衡。Multi-armedBandit多臂老虎机5.5.StochasticGamesOneaspectthatweareparticularlyinterestedinconcernsmodelingandefficientlyusingvarioustypesofsideinformationthatmaybeavailabletothealgorithm.一个我们特别感兴趣的方面是关注建模以及有效地使用对算法或许有用的各种辅助信息。
本文标题:人工智能原理-北京大学-5--PartIISearchingChapter5Adversari-(5
链接地址:https://www.777doc.com/doc-6717437 .html