Speech Recognition Model Based on Recurrent Neural Networks - ZHU Xiao-Yan

CHINESE J. COMPUTERS, Vol. 24 No. 2, Feb. 2001

ZHU Xiao-Yan, WANG Yu, XU Wei
(State Key Laboratory of Intelligent System and Technology, Tsinghua University, Beijing 100084)
(Department of Computer Science and Technology, Tsinghua University, Beijing 100084)

Received 1999-12-21. Grant numbers: 69982005, G199803050703.
Author notes: [...] 1957 [...]; [...] 1975 [...]; [...] 1974 [...] CMU [...].
Classification number: TP391

Abstract  To overcome some weaknesses of the hidden Markov model (HMM) in speech recognition, many researchers have explored HMM/NN hybrid systems in recent years. In previous HMM/NN hybrid systems, the neural networks adopted are mostly multilayer perceptrons (MLP). In our system, recurrent neural networks (RNN) replace the MLP as the syllable probability estimator. An RNN is an MLP with feedback connections that carry the output of some neurons to other neurons or back to themselves. Adding feedback to an MLP lets the network process the context information of a time sequence efficiently, which is especially useful for speech recognition. In this paper, the architecture of the RNN is modified and a corresponding training scheme is presented. The following techniques are adopted in our system.

1. A single-layer network is adopted, but the content of the feedback differs from the networks used by previous researchers: the external output is included in the feedback, not just the internal state output.

2. The training algorithm is back-propagation through time (BPTT). In the common BPTT algorithm, the initial feedback values are set arbitrarily according to experience, so they are not specific to the problem at hand. It is preferable if the initial feedback values can also be trained. In our training algorithm, this is achieved by adding an additional layer to the unfolded network.

3. To train the network, proper target values must be given. To obtain them, we make use of HMMs that have been trained to recognize the same syllables. This avoids the difficulty and inaccuracy of hand-set teacher signals and gives a smooth transition between two adjacent states.

4. To make the network learn faster and generalize better, the network is trained in stages. At first, short fragments of speech sequences are presented. Once the error on these short pieces is small enough, longer fragments are used. Finally, whole sequences are learned. Experimental results show that this method accelerates training and also improves recognition performance.

Keywords  speech recognition, hidden Markov model, recurrent neural networks

1  [...]

[...] hidden Markov model (HMM) [...] maximum likelihood (ML) [...] neural networks (NN) [...] NN/HMM [1-5, 8]. [...] NN/HMM [...] multilayer perceptrons (MLP) [1-3]. NN/HMM [8, 9] [...] MLP, [9] [...] MLP. [...] 1990s [...] recurrent neural networks (RNN) [...] MLP [5]. [...]

2  Recurrent neural networks (RNN)

[...] Fig. 1(a), (b) [...] RNN [5, 7]. [...] 1990s [...] RNN [4, 5]. [...]

2.1  RNN [...]

[...] Fig. 1(a) [...] Fig. 2 [...] I [...] RNN [...] [7] [...] I [...] Fig. 3 [...] I [...] L [...] I.

2.2  [...] Back-Propagation Through Time (BPTT)

[...] [6] [...] Fig. 2 [...] Fig. 4 [...] I [...] L_I [...] Fig. 4 [...] N [...]:

(1) Set x(0) to I and present u(0). Compute y(0) and x(1).

(2) For t > 0, compute y(t) and x(t+1) from x(t) and u(t):

$$z(t) = \begin{bmatrix} 1 \\ u(t) \\ x(t) \end{bmatrix} \qquad (1)$$

$$y_i(t+1) = f(W_i z(t)) \qquad (2)$$

where W_i is the weight vector of unit i and $f(x) = \tanh(x) = \frac{2}{1 + e^{-2x}} - 1$. [...] until t = N-1.

(3) [...] y(N-1) [...] MLP [...] x(N-1) [...]:

$$e_i(N-1) = \begin{cases} f'(W_i z(N-1)) \, \bigl(y_i(N-1) - o_i(N-1)\bigr), & 0 \le i < C \\ 0, & \text{otherwise} \end{cases} \qquad (3)$$

where C is the number of external outputs and o_i(t) is the target value.

(4) For 0 ≤ t ≤ N-2, [...] y(t) [...] t+1 [...] x(t) [...]:

$$e_i(t) = \begin{cases} f'(W_i z(t)) \, \Bigl(y_i(t) - o_i(t) + \sum_j w_{ij} e_j(t+1)\Bigr), & 0 \le i < C \\ f'(W_i z(t)) \sum_j w_{ij} e_j(t+1), & \text{otherwise} \end{cases} \qquad (4)$$

(5) At time 0, the error is propagated to L_I:

$$e^I_i = f'(w^I_i) \sum_j w_{ij} e_j(0) \qquad (5)$$

(6) [...]:

$$\Delta w_{ij} = T \sum_{t=0}^{N-1} z_i(t) \, e_j(t) \qquad (6)$$

$$\Delta w^I_i = T e^I_i \qquad (7)$$

[...] (4) [...] 600 [...] 600 [...] 1, 3, 5, 7, ... [...]; [...] 0, 2, 4, 6, ... [...] (1) [...] RNN [...]

3  RNN/HMM [...]

[...] NN/HMM [2-5, 8] [...] HMM [...] NN/HMM [...] HMM. RNN/HMM [...] HMM [...] HMM. RNN/HMM [...] HMM [...] HMM [...] P(u_t | s_i) [...] RNN/HMM [...]:

$$P(u_t \mid s_i) = \frac{P(s_i \mid u_t) \, P(u_t)}{P(s_i)} \qquad (8)$$

where P(s_i) [...] s_i, P(u_t) [...], P(s_i | u_t) [...]. [...] HMM. s_i [...] i. s_ij [...] i [...] j, P(s_ij | s_i). [...] Baum-Welch. RNN/HMM [...] Fig. 5 [...] HMM.

4  [...]

[...] CIDS [...] 60 [...] 40 [...] 20 [...] 11.025 kHz [...] 16+16+1 [...]

1. [...] RNN [...] Fig. 1(b) [...] L1 [...] L2 [...] 40 [...] 6 [...] 8 [...] RNN [...]

2. [...] RNN [...] Anew, Bnew, Cnew, Dnew [...] Aold, Bold, Cold, Dold [...]:

· Anew: [...] L [...]
· Aold: [...] L [...]
· Bnew: [...]
· Bold: [...] 0.
· Cnew: [...] 1, -1.
· Cold: [...] 0.
· Dnew: [...] 1, 3, 5, 7, ... [...] 0, 2, 4, 6, ... [...]
· Dold: [...]

[...] Ⅰ [...]; Ⅱ [...]; Ⅲ [...]; Ⅳ [...]; Ⅴ [...] Table 1.

Table 1  RNN [...]

Experiment   Configuration            [...]     [...]
Ⅰ            Aold+Bnew+Cnew+Dold      1.0%      19.5%
Ⅱ            Anew+Bold+Cnew+Dold      6.8%      18.0%
Ⅲ            Anew+Bnew+Cold+Dold      0.0%      21.5%
Ⅳ            Anew+Bnew+Cnew+Dold      0.25%     15.0%
Ⅴ            Anew+Bnew+Cnew+Dnew      0.25%     13.5%

Experiments Ⅰ and Ⅳ differ only in A (Aold vs. Anew) [...] 75% [...]. Experiments Ⅱ and Ⅳ differ only in B (Bold vs. Bnew) [...]. Experiments Ⅲ and Ⅳ differ only in C (Cold vs. Cnew) [...]. Experiments Ⅳ and Ⅴ differ only in D (Dold vs. Dnew) [...].

5  [...]

[...] BPTT [...] HMM [...] HMM [...]

References

1  Bourlard H, Morgan N. Continuous Speech Recognition: A Hybrid Approach. Norwell, Massachusetts: Kluwer Academic Publishers, 1994
2  Abrash V, Franco H, Sankar A et al. Connectionist speaker normalization and adaptation. In: Proc 4th European Conference of Speech Communication and Technology (Eurospeech 95), Madrid, Spain, 1995
3  Cohen M, Rumelhart D, Morgan N et al. Combining neural networks and hidden Markov models for continuous speech recognition. In: Proc DARPA Speech and Natural Language Workshop, Harriman, NY, 1992
4  Tebelskis J. Speech recognition using neural networks. Carnegie Mellon University: Technical Report CMU-CS-95-142, 1995
5  Robinson T. An application of recurrent nets to phone probability estimation. IEEE Trans Neural Networks, 1994, 5(3): 298-305
6  Werbos P J. Backpropagation through time: What it does and how to do it. Proceedings of the IEEE, 1990, 78(10): 1550-1560
7  Senior A W. Off-line cursive handwriting recognition using recurrent neural networks [PhD dissertation]. Cambridge: University of Cambridge, 1994
8  Yu Tie-Cheng, Zhou Jian-Lai, Song Yan-Tao. An overview of speech reco
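
The forward pass and error recursion of the BPTT procedure, Eqs. (1)-(7), can be sketched in plain Python as below. This is a minimal sketch under stated assumptions, not the authors' implementation. The names `forward`, `loss`, `bptt`, `W`, and `wI` are illustrative. Every unit's output at frame t is assumed to be tanh(W_i z(t)) and is fed back as x(t+1), the first C units are taken as the external outputs, the initial feedback is assumed to be x_i(0) = tanh(wI_i), and the loss is assumed to be half the summed squared error against the targets.

```python
import math

def forward(W, wI, inputs):
    """Run the network over a sequence; return the z(t) vectors and unit outputs."""
    x = [math.tanh(w) for w in wI]      # assumed trainable initial feedback x(0) = f(w^I)
    zs, outs = [], []
    for u in inputs:
        z = [1.0] + list(u) + x         # Eq. (1): bias, input frame, feedback
        s = [math.tanh(sum(w * v for w, v in zip(row, z))) for row in W]
        zs.append(z)
        outs.append(s)
        x = s                           # all unit outputs, external ones included, feed back
    return zs, outs

def loss(outs, targets, C):
    """Assumed loss: half the squared error on the first C (external) outputs."""
    return 0.5 * sum((s[i] - o[i]) ** 2
                     for s, o in zip(outs, targets) for i in range(C))

def bptt(W, wI, inputs, targets, C):
    """Return gradients of the loss w.r.t. W and wI via the recursion of Eqs. (3)-(5)."""
    n, d = len(W), len(inputs[0])
    zs, outs = forward(W, wI, inputs)
    gW = [[0.0] * len(W[0]) for _ in range(n)]
    e_next = [0.0] * n                  # no error arrives from beyond the last frame
    for t in reversed(range(len(inputs))):
        e = []
        for i in range(n):
            # sum_j w_ij e_j(t+1): weight from feedback component i into unit j
            back = sum(W[j][1 + d + i] * e_next[j] for j in range(n))
            direct = outs[t][i] - targets[t][i] if i < C else 0.0
            e.append((1.0 - outs[t][i] ** 2) * (direct + back))  # tanh' = 1 - tanh^2
        for j in range(n):
            for i, zi in enumerate(zs[t]):
                gW[j][i] += zi * e[j]   # summand of Eq. (6)
        e_next = e
    # Eq. (5): push the frame-0 errors one step further, into the initial-state weights
    gI = [(1.0 - math.tanh(wI[i]) ** 2)
          * sum(W[j][1 + d + i] * e_next[j] for j in range(n))
          for i in range(n)]
    return gW, gI
```

A finite-difference comparison on a small network is a quick way to confirm that the recursion returns the gradient of the assumed loss with respect to both `W` and the initial-feedback weights `wI`. The weights would then be moved against these gradients with a step size of the user's choosing.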
