On the Choice of Random Directions for Stochastic Approximation Algorithms

James Theiler and Jarod Alper

This work was supported by the Laboratory Directed Research and Development (LDRD) program at Los Alamos National Laboratory. J. Theiler is a technical staff member in the Space and Remote Sensing Sciences Group, MS-B244, Los Alamos National Laboratory, Los Alamos, NM 87545. Email: jt@lanl.gov (Phone: 505-665-5682; fax: 505-665-4414). J. Alper recently graduated from Brown University with a degree in Computer Science; he was a graduate research assistant at Los Alamos when this work was done. Email: JarodAlper@alumni.brown.edu.

June 27, 2003. DRAFT

Abstract

We investigate variants of the Kushner-Clark Random Direction Stochastic Approximation (RDSA) algorithm for optimizing noisy loss functions in high-dimensional spaces. These variants employ different strategies for choosing random directions. The most popular approach is random selection from a Bernoulli distribution, which for historical reasons also goes by the name Simultaneous Perturbation Stochastic Approximation (SPSA). But viable alternatives include an axis-aligned distribution, a normal distribution, and a uniform distribution on a spherical shell. Although there are special cases where the Bernoulli distribution is optimal, there are other cases where it performs worse than the alternatives. We find that for generic loss functions that are not aligned to the coordinate axes, the average asymptotic performance depends only on the radial fourth moment of the distribution of directions, and is identical for the Bernoulli, axis-aligned, and spherical shell distributions. Of these variants, the spherical shell is optimal in the sense of minimum variance over random orientations of the loss function with respect to the coordinate axes. We also show that for unaligned loss functions, the performance of the Kiefer-Wolfowitz-Blum Finite Difference Stochastic Approximation (FDSA) algorithm is asymptotically equivalent to that of the RDSA algorithms, and we observe numerically that the pre-asymptotic performance of FDSA is often superior. We also introduce a "quasirandom" selection process which exhibits the same asymptotic performance, but empirically is observed to converge to the asymptote more rapidly.

Index Terms

stochastic approximation, optimization, noisy loss function, random direction, finite difference, simultaneous perturbation

I. INTRODUCTION

Stochastic approximation provides a simple and effective approach for finding roots and minima of functions whose evaluations are contaminated with noise. Consider a smooth^1 p-dimensional loss function L: ℝ^p → ℝ, with gradient g: ℝ^p → ℝ^p. Assume that L has a unique^2 local (and therefore global) minimum x* ∈ ℝ^p. That is, L(x*) ≤ L(x) for all x ∈ ℝ^p, and g(x) = 0 iff x = x*. If a direct (but possibly noisy) estimator ĝ(x) of the gradient function is available, then the Robbins-Monro [1] algorithm (as extended by Blum [2] to multidimensional systems) estimates a root of g(x) with the following recursion:

    x_{k+1} = x_k - a_k ĝ(x_k),                                                      (1)

where a_k is a sequence of positive numbers that satisfies Σ_{k=1}^∞ a_k = ∞ and lim_{k→∞} a_k = 0. In particular, a_k = a_o/k^α with 0 < α ≤ 1 satisfies these conditions. If the estimator is unbiased, that is, E{ĝ(x)} = g(x), then x_k will converge to the root of g. In particular, it can be shown that E{(x_k - x*)²} = O(k^{-α}) for large k.

Kiefer and Wolfowitz [3] introduced an algorithm in which the gradient is estimated by finite differences (Blum [2] also extended this result to multiple dimensions). This finite difference stochastic approximation (FDSA) algorithm employs an estimator for the gradient whose ith component is given by

    ĝ_i(x) = [L̂(x + c e_i) - L̂(x - c e_i)] / (2c),                                  (2)

where e_i is the unit vector along the ith axis, and L̂ is a noisy measurement of the loss function. Since this is done for each component, it requires 2p measurements of the loss function for each iteration. For c > 0, Eq. (2) is in general a biased estimator of the gradient. Convergence is achieved by providing a decreasing sequence c_k with lim_{k→∞} c_k = 0 so that the bias is eventually eliminated. However, the cost of using a smaller c is a larger variance, so the rate at which c_k → 0 must be carefully chosen.

In the FDSA algorithm, separate estimates are computed for each component of the gradient. This means that a p-dimensional problem requires at least 2p evaluations of the loss function per iteration. By contrast, the random direction stochastic approximation (RDSA) algorithms estimate only one component of the gradient per iteration. Let δ ∈ ℝ^p be a direction vector. In Kushner and Clark [4], δ is treated as a unit vector with |δ|² = Σ_i δ_i² = 1, and since it is a random direction, it satisfies E{δδᵀ} = I/p. Chin [5] prefers the convention that δ have radius^3 √p, so that |δ|² = Σ_i δ_i² = p and E{δδᵀ} = I. Regardless of the convention for δ, both authors write the RDSA formula as

    x_{k+1} = x_k - a_k [ (L̂(x_k + c_k δ_k) - L̂(x_k - c_k δ_k)) / (2 c_k) ] δ_k,      (3)

but it bears remarking that the formulas are not equivalent. Using the |δ|² = p convention, the above formula corresponds directly to the Robbins-Monro formulation in Eq. (1). With the |δ|² = 1 convention, however, the above formula corresponds to x_{k+1} = x_k - (1/p) a_k ĝ(x_k), which by a simple rescaling of a_k is equivalent^4 to Eq. (1). To facilitate comparisons with the more recent work, we will take the convention that |δ|² = p. Several choices are available for choosing the random distribution

Footnotes:
^1 To simplify exposition, we take L to be infinitely differentiable, but remark that many of the results only require that L be s-times differentiable, where s depends on the particular result.
^2 Stochastic approximation algorithms can still be useful for loss functions with multiple local minima, but formal results are more readily obtained if there is a single local minimum.
^3 Chin [5] mistakenly says the radius is p.
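To make the cost contrast between the two estimators concrete, the recursions in Eqs. (1)-(3) can be sketched in a few lines of code. The following Python illustration is not part of the paper: the quadratic test loss, the gain constants (a_o, α, c_o, γ), and all function names are our own assumptions, chosen only to show that FDSA spends 2p loss evaluations per iteration where Bernoulli RDSA (i.e., SPSA, in the |δ|² = p convention) spends 2.

```python
import numpy as np

def fdsa_gradient(L, x, c, rng):
    # FDSA estimator (Eq. 2): a central difference along each coordinate
    # axis e_i. Costs 2p evaluations of the noisy loss per iteration.
    p = len(x)
    g = np.zeros(p)
    for i in range(p):
        e = np.zeros(p)
        e[i] = 1.0
        g[i] = (L(x + c * e) - L(x - c * e)) / (2.0 * c)
    return g

def rdsa_gradient(L, x, c, rng):
    # RDSA estimator (Eq. 3) with a Bernoulli +/-1 direction (the SPSA
    # choice), which satisfies the |delta|^2 = p convention. Costs only
    # 2 evaluations of the noisy loss per iteration, regardless of p.
    delta = rng.choice([-1.0, 1.0], size=len(x))
    return (L(x + c * delta) - L(x - c * delta)) / (2.0 * c) * delta

def minimize(L, x0, grad, n_iter=2000,
             a0=0.1, alpha=1.0, c0=0.5, gamma=1.0 / 6.0, seed=0):
    # Robbins-Monro recursion (Eq. 1): x_{k+1} = x_k - a_k * ghat(x_k),
    # with assumed gain sequences a_k = a0/k^alpha and c_k = c0/k^gamma.
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for k in range(1, n_iter + 1):
        x = x - (a0 / k**alpha) * grad(L, x, c0 / k**gamma, rng)
    return x

# Toy problem (our assumption): noisy quadratic loss in p = 5 dimensions,
# with unique minimum x* = 0 and additive measurement noise.
noise = np.random.default_rng(1)
def noisy_quadratic(x):
    return float(np.sum(x**2) + 0.01 * noise.normal())

x_rdsa = minimize(noisy_quadratic, np.ones(5), rdsa_gradient)  #   2 loss calls/iter
x_fdsa = minimize(noisy_quadratic, np.ones(5), fdsa_gradient)  # 2*5 loss calls/iter
# Both norms should end up well below the starting norm sqrt(5) ~ 2.24.
print(np.linalg.norm(x_rdsa), np.linalg.norm(x_fdsa))
```

Note that for this purely quadratic loss the central difference is exact up to measurement noise, so both estimators obey the same expected recursion E{x_{k+1}} = (1 - 2 a_k) E{x_k}; they differ in per-iteration cost and in the variance contributed by the random directions, which is exactly the trade-off the paper studies.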