Deep Sparse Rectifier Neural Networks

Xavier Glorot
DIRO, Université de Montréal, Montréal, QC, Canada
glorotxa@iro.umontreal.ca

Antoine Bordes
Heudiasyc, UMR CNRS 6599, UTC, Compiègne, France
and DIRO, Université de Montréal, Montréal, QC, Canada
antoine.bordes@hds.utc.fr

Yoshua Bengio
DIRO, Université de Montréal, Montréal, QC, Canada
bengioy@iro.umontreal.ca

Abstract

While logistic sigmoid neurons are more biologically plausible than hyperbolic tangent neurons, the latter work better for training multi-layer neural networks. This paper shows that rectifying neurons are an even better model of biological neurons and yield equal or better performance than hyperbolic tangent networks, in spite of the hard non-linearity and non-differentiability at zero, creating sparse representations with true zeros, which seem remarkably suitable for naturally sparse data. Even though they can take advantage of semi-supervised setups with extra unlabeled data, deep rectifier networks can reach their best performance without requiring any unsupervised pre-training on purely supervised tasks with large labeled datasets. Hence, these results can be seen as a new milestone in the attempts at understanding the difficulty in training deep but purely supervised neural networks, and at closing the performance gap between neural networks learnt with and without unsupervised pre-training.

Appearing in Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AISTATS) 2011, Fort Lauderdale, FL, USA. Volume 15 of JMLR: W&CP 15. Copyright 2011 by the authors.

1 Introduction

Many differences exist between the neural network models used by machine learning researchers and those used by computational neuroscientists. This is in part because the objective of the former is to obtain computationally efficient learners that generalize well to new examples, whereas the objective of the latter is to abstract out neuroscientific data while obtaining explanations of the principles involved, providing predictions and guidance for future biological experiments. Areas where both objectives coincide are therefore particularly worthy of investigation, pointing towards computationally motivated principles of operation in the brain that can also enhance research in artificial intelligence. In this paper we show that two common gaps between computational neuroscience models and machine learning neural network models can be bridged by using the piecewise-linear activation max(0, x), called the rectifier (or hinge) activation function (see the short sketch below). Experimental results will show appealing training behavior of this activation function, especially for deep architectures (see Bengio (2009) for a review), i.e., where the number of hidden layers in the neural network is 3 or more.

Recent theoretical and empirical work in statistical machine learning has demonstrated the importance of learning algorithms for deep architectures. This is in part inspired by observations of the mammalian visual cortex, which consists of a chain of processing elements, each of which is associated with a different representation of the raw visual input. This is particularly clear in the primate visual system (Serre et al., 2007), with its sequence of processing stages: detection of edges, then primitive shapes, moving up to gradually more complex visual shapes. Interestingly, it was found that the features learned in deep architectures resemble those observed in the first two of these stages (in areas V1 and V2 of visual cortex) (Lee et al., 2008), and that they become increasingly invariant to factors of variation (such as camera movement) in higher layers (Goodfellow et al., 2009).

Regarding the training of deep networks, something that can be considered a breakthrough happened in 2006, with the introduction of Deep Belief Networks (Hinton et al., 2006), and more generally the idea of initializing each layer by unsupervised learning (Bengio et al., 2007; Ranzato et al., 2007). Some authors have tried to understand why this unsupervised procedure helps (Erhan et al., 2010), while others investigated why the original training procedure for deep neural networks failed (Bengio and Glorot, 2010).
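As a concrete illustration of the rectifier activation introduced above, here is a minimal NumPy sketch (not from the paper) of max(0, x) and the subgradient conventionally used in backpropagation; setting the derivative to 0 at the non-differentiable point x = 0 is a common convention, not something the paper prescribes:

```python
import numpy as np

def rectifier(x):
    """Rectifier (hinge) activation: max(0, x), applied elementwise."""
    return np.maximum(0.0, x)

def rectifier_subgradient(x):
    """Subgradient for backpropagation: 1 where x > 0, else 0.
    The rectifier is non-differentiable at exactly x = 0; taking 0
    there is a common convention."""
    return (x > 0).astype(x.dtype)

# A batch of pre-activations. Note the true zeros in the output,
# which make the resulting hidden representation sparse.
x = np.array([[-1.5, 0.0, 2.3],
              [0.7, -0.2, -3.0]])
print(rectifier(x))              # [[0.  0.  2.3], [0.7 0.  0. ]]
print(rectifier_subgradient(x))  # [[0. 0. 1.],   [1. 0. 0.]]
```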
From the machine learning point of view, this paper brings additional results to these lines of investigation.

We propose to explore the use of rectifying non-linearities as alternatives to the hyperbolic tangent or sigmoid in deep artificial neural networks, in addition to using an L1 regularizer on the activation values to promote sparsity and to prevent potential numerical problems with unbounded activations (a minimal sketch appears at the end of this section). Nair and Hinton (2010) present promising results on the influence of such units in the context of Restricted Boltzmann Machines, compared to logistic sigmoid activations, on image classification tasks. Our work extends this to the case of pre-training using denoising auto-encoders (Vincent et al., 2008) and provides an extensive empirical comparison of the rectifying activation function against the hyperbolic tangent on image classification benchmarks, as well as an original derivation for the text application of sentiment analysis.

Our experiments on image and text data indicate that training proceeds better when the artificial neurons are either off or operating mostly in a linear regime. Surprisingly, rectifying activation allows deep networks to achieve their best performance without unsupervised pre-training. Hence, our work is a new contribution to the trend of understanding and closing the performance gap between deep networks learnt with and without unsupervised pre-training (Erhan et al., 2010; Bengio and Glorot, 2010). Still, rectifier networks can benefit from unsupervised pre-training in the context of semi-supervised setups with extra unlabeled data.
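To make the L1 activation regularizer described above concrete, here is a minimal sketch, not the paper's implementation: a single rectifier hidden layer with a squared-error loss, where the layer sizes, the penalty strength l1_strength, and the loss choice are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes and penalty strength, chosen only for illustration.
n_in, n_hidden, n_out = 10, 32, 1
W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_out))
b2 = np.zeros(n_out)
l1_strength = 1e-3

def forward(x):
    """One rectifier hidden layer followed by a linear output layer."""
    h = np.maximum(0.0, x @ W1 + b1)  # sparse hidden code: many true zeros
    return h, h @ W2 + b2

def penalized_loss(x, target):
    """Squared error plus an L1 penalty on the hidden activations.
    The penalty promotes sparsity and discourages the unbounded
    activations that the rectifier would otherwise permit."""
    h, y = forward(x)
    return np.mean((y - target) ** 2) + l1_strength * np.sum(np.abs(h))

# Example usage on a random batch.
x = rng.normal(size=(4, n_in))
t = rng.normal(size=(4, n_out))
print(penalized_loss(x, t))
```

Penalizing the activations (rather than the weights) directly targets the sparsity of the representation, which is the property the paper argues makes rectifier networks well suited to naturally sparse data.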