Deep Learning for Control: The State of the Art and Prospects

Vol. 42, No. 5, ACTA AUTOMATICA SINICA, May, 2016

DUAN Yan-Jie 1, LV Yi-Sheng 1, ZHANG Jie 1,2, ZHAO Xue-Liang 1, WANG Fei-Yue 1

Abstract  Deep learning has shown great potential and advantage in feature extraction and model fitting. It is significant to use deep learning for control problems involving high dimension data. Currently, there have been some investigations focusing on deep learning in control. This paper is a review of related work including control object recognition, state feature extraction, system parameter identification and control strategy calculation. Besides, this paper describes the approaches and ideas of deep control, adaptive dynamic programming and parallel control related to deep learning in control. Also, this paper summarizes the main functions and existing problems of deep learning in control, presents some prospects of future work.

Key words  Deep learning, control, feature, adaptive dynamic programming (ADP)

Citation  Duan Yan-Jie, Lv Yi-Sheng, Zhang Jie, Zhao Xue-Liang, Wang Fei-Yue. Deep learning for control: the state of the art and prospects. Acta Automatica Sinica, 2016, 42(5): 643-654

DOI  10.16383/j.aas.2016.c160019

Manuscript received December 26, 2015; accepted March 26, 2016
Supported by National Natural Science Foundation of China (71232006, 61233001, 71402178)
Recommended by Associate Editor HOU Zhong-Sheng
1. The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190
2. Qingdao Academy of Intelligent Industries, Shandong 266000

Introduction
Recoverable terms: deep learning [1]; [2].

1
Recoverable terms: restricted Boltzmann machine (RBM); deep belief network (DBN) [3-4]; autoencoder (AE); stacked autoencoders (SAE) [5]; convolutional neural networks (CNN) [6]; recurrent neural networks (RNN) [7]; [1, 8-11].

1.1 DBN
Recoverable terms: RBM; artificial neural network; pre-training; fine tuning; unsupervised learning; BP (backpropagation).
Fig. 1 The structure of DBN

1.2 SAE
Recoverable terms: DBN; AE; RBM; BP.
Fig. 2 The structure of SAE

1.3 CNN
Recoverable terms: pooling [6]; BP.
Fig. 3 The structure of CNN [6]

1.4 RNN
Recoverable terms: DBN; SAE; CNN; RNN [12]; RNN [13]; RBM; AE.
Fig. 4 The structure of RNN

2
Fig. 5 The application of deep learning in control system

2.1
Recoverable terms: [14]; CNN; $n$; $c_i$.

$$C = \sum_{i=1}^{n} c_i$$

Fig. 6 Robotic grasping system [14]

2.2
Recoverable terms: Lange [15-16] (SAE); Mnih [17] (CNN); Atari [17-18]; CNN; Q.
Fig. 7 Playing Atari with deep learning

2.3
Recoverable terms: [19] (ReLU); [20] (RNN); [21]; $s_t$; $a_t$; $Q(s_t, a_t)$; $(s_t, a_t, s_{t+1} - s_t)$.
Fig. 8 Neural network for learning state prediction and Q function [20]

2.4
Recoverable terms: PID [22]; DBN; RNN [23]; [24]; SAE; AE.
Fig. 9 Deep neural network for motor control function

3
Recoverable terms: [25]; Saridis [26-27].

3.1
Recoverable terms: SN; RN; CN; if-then; [25]; Memetic neuro-fuzzy [28]; [29-30].
Fig. 10 Neuro-fuzzy network

3.2
Recoverable terms: Saridis [31].

4
Recoverable terms: Bellman [32-33]; [33-34]; 1977, Werbos [35]; forward-in-time; adaptive dynamic programming (ADP), also called adaptive critic designs, approximate dynamic programming, asymptotic dynamic programming, relaxed dynamic programming, neuro-dynamic programming, neural dynamic programming [36]; [37]; [38]; [39].

Bellman equation:

$$J^*(x(k)) = \min_{u(k)} \left\{ l(x(k), u(k)) + J^*(x(k+1)) \right\} \tag{1}$$

$$J(x(k)) = \sum_{i=k}^{\infty} l(x(i), u(i)) \tag{2}$$

$$\text{s.t.} \quad x(k+1) = F(x(k), u(k)), \quad k = 0, 1, \cdots \tag{3}$$

4.1
Recoverable terms: heuristic dynamic programming (HDP) [40]; Werbos; Bellman.

$$E_c = \left| J(k) - l(k) - J(k+1) \right|^2 \tag{4}$$

$$E_a = \left\| \frac{\partial l(x_k, u_k)}{\partial u_k} + \frac{\partial J(x_{k+1})}{\partial x_{k+1}} \frac{\partial F(x_k, u_k)}{\partial u_k} \right\| \tag{5}$$

Fig. 11 The network structure of adaptive dynamic programming

4.2
Recoverable terms: 2002, Murray [41]; ADP; [42-44]; $J_0^*(x_k) = 0$; $i = 0, 1, \cdots$.

$$u_i(x_k) = \arg\min_{u_k} \left\{ l(x_k, u_k) + J_i^*(x_{k+1}) \right\} \tag{6}$$

$$J_{i+1}^*(x_k) = l(x_k, u_i(x_k)) + J_i^*(F(x_k, u_i(x_k))) \tag{7}$$

Recoverable terms: Wei [45]; [41, 46-48]; Liu [49]; $u_0(x_k)$; $i = 0, 1, \cdots$; $V_i$.

$$J_i^*(x_k) = l(x_k, u_i(x_k)) + J_i^*(F(x_k, u_i(x_k))) \tag{8}$$

$$u_{i+1}(x_k) = \arg\min_{u_k} \left\{ l(x_k, u_k) + J_i^*(x_{k+1}) \right\} \tag{9}$$

4.3
Recoverable terms: [50]; ACP (Artificial societies, Computational experiments, Parallel execution) [51-55]; $u_1, \cdots, u_n$; agent-based control (ABC) [56-58], [59-60], [61-62].
Fig. 12 Parallel control systems [50]

$$x(k+1) = F(x(k), u(k), u_1(k), \cdots, u_n(k)) \tag{10}$$

5
Recoverable terms: PID; Nature [63]; AlphaGo; 4:1.

5.1

5.2
Recoverable terms: unmanned aerial vehicle (UAV).

5.3

5.4
Recoverable terms: ACP.

5.5

References
1 LeCun Y, Bengio Y, Hinton G. Deep learning. Nature, 2015, 521(7553): 436-444
2 Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks. In: Proceedings of the 2012 Advances in Neural Information Processing Systems. Lake Tahoe, Nevada, USA: Curran Associates, Inc., 2012. 1097-1105
3 Hinton G E, Salakhutdinov R R. Reducing the dimensionality of data with neural networks. Science, 2006, 313(5786): 504-507
4 Hinton G E, Osindero S, Teh Y W.
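
As an illustration of the value-iteration scheme in Eqs. (6) and (7) of Section 4.2, the following is a minimal tabular sketch, not the implementation used in the cited works. It assumes a finite state set, a finite control set, and known dynamics $F$ and stage cost $l$; the toy system (integer states from -5 to 5, controls -1, 0, 1, cost $x^2 + u^2$) and the function name `value_iteration` are hypothetical choices made only for this example.

```python
import numpy as np

def value_iteration(states, actions, F, l, n_iter=50):
    """Tabular form of iterations (6)-(7), starting from J_0 = 0."""
    index = {x: n for n, x in enumerate(states)}
    J = np.zeros(len(states))              # J_0(x) = 0 for every state
    policy = [actions[0]] * len(states)
    for _ in range(n_iter):
        J_next = np.empty_like(J)
        for n, x in enumerate(states):
            # Eq. (6): control that minimizes l(x, u) + J_i(F(x, u))
            costs = [l(x, u) + J[index[F(x, u)]] for u in actions]
            best = int(np.argmin(costs))
            policy[n] = actions[best]
            # Eq. (7): J_{i+1}(x) = l(x, u_i(x)) + J_i(F(x, u_i(x)))
            J_next[n] = costs[best]
        J = J_next
    return J, policy

# Hypothetical toy system: move along the integers toward the origin.
states = list(range(-5, 6))
actions = [-1, 0, 1]

def F(x, u):
    # Dynamics x(k+1) = F(x(k), u(k)), clipped to the state set.
    return max(-5, min(5, x + u))

def l(x, u):
    # Stage cost, zero only at the origin with zero control.
    return x ** 2 + u ** 2

J, policy = value_iteration(states, actions, F, l)
```

In this toy setting the iterates stop changing after a few sweeps: the cost at the origin stays 0, and the resulting controls step each state toward the origin. A neural-network approximation of $J$ and $u$, as in ADP, would replace the lookup table used here.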
