Technical Report No. 9607, Department of Statistics, University of Toronto

Factor Analysis Using Delta-Rule Wake-Sleep Learning

Radford M. Neal
Department of Statistics and Department of Computer Science
University of Toronto
radford@stat.utoronto.ca

Peter Dayan
Department of Brain and Cognitive Sciences
Massachusetts Institute of Technology
dayan@ai.mit.edu

24 July 1996

We describe a linear network that models correlations between real-valued visible variables using one or more real-valued hidden variables — a factor analysis model. This model can be seen as a linear version of the "Helmholtz machine", and its parameters can be learned using the "wake-sleep" method, in which learning of the primary "generative" model is assisted by a "recognition" model, whose role is to fill in the values of hidden variables based on the values of visible variables. The generative and recognition models are jointly learned in "wake" and "sleep" phases, using just the delta rule. This learning procedure is comparable in simplicity to Oja's version of Hebbian learning, which produces a somewhat different representation of correlations in terms of principal components. We argue that the simplicity of wake-sleep learning makes factor analysis a plausible alternative to Hebbian learning as a model of activity-dependent cortical plasticity.

1 Introduction

Activity-dependent plasticity in the vertebrate brain has typically been modeled in terms of Hebbian learning (Hebb 1949), in which weight changes are based on the covariance of pre-synaptic and post-synaptic activity (e.g., von der Malsburg 1973; Linsker 1986; Miller, Keller, and Stryker 1989). These models derive support from neurobiological evidence of long-term potentiation (see, for example, Collingridge and Bliss (1987), and for a recent review, Baudry and Davis (1994)). They have also been seen as performing a reasonable function, namely extracting the statistical structure amongst a collection of inputs in terms of principal components (Linsker 1988). In this paper, we suggest the statistical technique of factor analysis as an interesting alternative to principal components analysis, and show how to implement it using an algorithm whose demands on synaptic plasticity are as local as those of the Hebb rule.

Factor analysis is a model for real-valued data in which correlations are "explained" by postulating the presence of one or more underlying "factors". These factors play the role of "latent" or "hidden" variables, which are not directly observable, but which allow the dependencies between the "visible" variables to be expressed in a convenient way. Everitt (1984) gives a good introduction to latent variable models in general, and to factor analysis in particular. These models are widely used in psychology and the social sciences as a way of exploring whether observed patterns in data might be explainable in terms of a small number of unobserved factors. Our interest in these models stems from their potential as a way of building high-level representations from sensory data.

Oja's version of Hebbian learning (Oja and Karhunen 1985; Oja 1989, 1992) is a particularly convenient counterpoint. This rule applies to a linear unit with weight vector w that computes an output y = w^T x when presented with a real-valued input vector x (which, for convenience, is assumed to have mean zero). After each presentation of an input vector, the weights for the unit are changed by an amount given by the following proportionality:

    \Delta w \propto y (x - y w) = y x - y^2 w    (1)

The first term in this weight increment, y x, is of Hebbian form. The second term, -y^2 w, tends to push the weights towards zero, balancing the positive feedback in plain Hebbian learning, which would otherwise increase the magnitude of the weights without bound.
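As a concrete illustration of equation (1), the following short Python/NumPy sketch applies Oja's rule online to a single linear unit trained on zero-mean Gaussian inputs. The learning rate, input covariance, and number of presentations are illustrative choices of ours, not values from this report; the sketch simply shows the update y x - y^2 w in action.

```python
import numpy as np

# Minimal sketch of Oja's rule (equation 1): a single linear unit with
# weight vector w computes y = w^T x and, after each input presentation,
# is updated by
#     delta_w = eta * y * (x - y * w) = eta * (y * x - y**2 * w).
# The learning rate eta and the input distribution are assumptions made
# for illustration only.

rng = np.random.default_rng(0)

# Zero-mean Gaussian inputs; the direction of largest variance is (1, 1)/sqrt(2).
cov = np.array([[2.0, 1.5],
                [1.5, 2.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=5000)

w = 0.1 * rng.standard_normal(2)   # small random initial weights
eta = 0.01                         # learning rate (illustrative)

for x in X:
    y = w @ x                      # output of the linear unit
    w += eta * y * (x - y * w)     # Hebbian term y*x minus decay term y^2 * w

# The weights end up close to a unit vector along the direction of highest
# input variance, i.e. the principal eigenvector of the input covariance.
print("learned w:", w, "  norm:", np.linalg.norm(w))
```

With a small enough learning rate, the norm of w settles near 1 and its direction matches the leading eigenvector of the input covariance, which is the convergence result discussed next.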
Wyatt and Elfadel (1995) give an explicit analysis of learning based on equation (1), showing that with reasonable starting conditions, w converges to the principal eigenvector of the covariance matrix of the inputs — that is, it converges to a unit vector pointing in the direction of highest variance in the input space. Extracting the subsidiary eigenvectors of the covariance matrix of the inputs is somewhat more challenging, requiring some form of inhibition between successive output units (Sanger 1989; Földiák 1989; Plumbley 1993).

Linsker (1988) views Hebbian learning as a way of maximising the information retained by y about x. Under the simplifying assumption that the distribution of the inputs is Gaussian, setting the output of a unit to the projection of its input onto the first principal component of the input covariance matrix conveys as much information as possible on average (see also Plumbley 1993). This goal seems reasonable for the very early stages of sensory processing, where information bottlenecks such as the optic nerve may plausibly be present. Note, however, that it implicitly assumes that all information is equally important. Maximizing information transfer seems less compelling as a goal for subsequent levels of processing, once sensory signals have reached cortex. Several other computational goals have been suggested from this stage upwards, including factorial coding (Barlow 1989), sparsification (Olshausen and Field 1995), and various methods for encouraging the cortex to respect reasonable invariances, such as translation or scale invariance for visual processing (Li and Atick 1994).

In this paper, we pursue the suggestion of Hinton and Zemel (1994) (see also Grenander 1976-1981; Mumford 1994; Dayan, Hinton, Neal, and Zemel 1995) that the cortex might be constructing a hierarchical stochastic "generative" model of its input in the top-down connections, while implementing in the bottom-up connections a "recognition" model that in a sense is the inverse of the generative model. The recognition model provides high-