An Empirical Study of Learning Speed in Back-Propagation Networks

Scott E. Fahlman
September 1988
CMU-CS-88-162

Abstract

Most connectionist or neural network learning systems use some form of the back-propagation algorithm. However, back-propagation learning is too slow for many applications, and it scales up poorly as tasks become larger and more complex. The factors governing learning speed are poorly understood. I have begun a systematic, empirical study of learning speed in backprop-like algorithms, measured against a variety of benchmark problems. The goal is twofold: to develop faster learning algorithms and to contribute to the development of a methodology that will be of value in future studies of this kind.

This paper is a progress report describing the results obtained during the first six months of this study. To date I have looked only at a limited set of benchmark problems, but the results on these are encouraging: I have developed a new learning algorithm that is faster than standard backprop by an order of magnitude or more and that appears to scale up very well as the problem size increases.

This research was sponsored in part by the National Science Foundation under Contract Number EET-8716324 and by the Defense Advanced Research Projects Agency (DOD), ARPA Order No. 4976 under Contract F33615-87-C-1499 and monitored by the Avionics Laboratory, Air Force Wright Aeronautical Laboratories, Aeronautical Systems Division (AFSC), Wright-Patterson AFB, OH 45433-6543. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of these agencies or of the U.S. Government.

1. Introduction

Note: In this paper I will not attempt to review the basic ideas of connectionism or back-propagation learning. See [3] for a brief overview of this area and [10], chapters 1-8, for a detailed treatment. When I refer to standard back-propagation in this paper, I mean the back-propagation algorithm with momentum, as described in [9].

The greatest single obstacle to the widespread use of connectionist learning networks in real-world applications is the slow speed at which the current algorithms learn. At present, the fastest learning algorithm for most purposes is the algorithm that is generally known as back-propagation or backprop [6, 7, 9, 18]. The back-propagation learning algorithm runs faster than earlier learning methods, but it is still much slower than we would like. Even on relatively simple problems, standard back-propagation often requires the complete set of training examples to be presented hundreds or thousands of times. This means that we are limited to investigating rather small networks with only a few thousand trainable weights. Some problems of real-world importance can be tackled using networks of this size, but most of the tasks for which connectionist technology might be appropriate are much too large and complex to be handled by our current learning-network technology.

One solution is to run our network simulations on faster computers or to implement the network elements directly in VLSI chips. A number of groups are working on faster implementations, including a group at CMU that is using the 10-processor Warp machine [13]. This work is important, but even if we had a network implemented directly in hardware our slow learning algorithms would still limit the range of problems we could attack. Advances in learning algorithms and in implementation technology are complementary. If we can combine hardware that runs several orders of magnitude faster and learning algorithms that scale up well to very large networks, we will be in a position to tackle a much larger universe of possible applications.

Since January of 1988 I have been conducting an empirical study of learning speed in simulated networks. I have studied the standard backprop algorithm and a number of variations on standard back-propagation, applying these to a set of moderate-sized benchmark problems. Many of the variations that I have investigated were first proposed by other researchers, but until now there have been no systematic studies to compare these methods, individually and in various combinations, against a standard set of learning problems. Only through such systematic studies can we hope to understand which methods work best in which situations.

This paper is a report on the results obtained in the first six months of this study. Perhaps the most important result is the identification of a new learning method -- actually a combination of several ideas -- that on a range of encoder/decoder problems is faster than standard back-propagation by an order of magnitude or more. This new method also appears to scale up much better than standard backprop as the size and complexity of the learning task grows.

I must emphasize that this is a progress report. The learning-speed study is far from complete. Until now I have concentrated most of my effort on a single class of benchmarks, namely the encoder/decoder problems. Like any family of benchmarks taken in isolation, encoder/decoder problems have certain peculiarities that may bias the results of the study. Until a more comprehensive set of benchmarks has been run, it would be premature to draw any sweeping conclusions or make any strong claims about the widespread applicability of these techniques.

2. Methodology

2.1. What Makes a Good Benchmark?

At present there is no widely accepted methodology for measuring and comparing the speed of various connectionist learning algorithms. Some researchers have proposed new algorithms based only on a theoretical analysis of the problem. It is sometimes hard to determine how well these theoretical models fit actual practice. Other researchers implement their ideas and run one or two benchmarks to demonstrate the speed of the resulting s