CHINESE J. COMPUTERS, Vol. 24, No. 2, Feb. 2001

Speech Recognition Model Based on Recurrent Neural Networks

ZHU Xiao-Yan, WANG Yu, XU Wei
(State Key Laboratory of Intelligent System and Technology, Tsinghua University, Beijing 100084)
(Department of Computer Science and Technology, Tsinghua University, Beijing 100084)

Received: 1999-12-21. Grant numbers: 69982005, G199803050703. Author notes: [...] 1957 [...]; [...] 1975 [...]; [...] 1974 [...] CMU [...].
Classification number: TP391

Abstract  To overcome some weaknesses of the hidden Markov model (HMM) in speech recognition, HMM/NN hybrid systems have been explored by many researchers in recent years. In previous HMM/NN hybrid systems, the neural networks adopted were mostly multilayer perceptrons (MLP). In our system, recurrent neural networks (RNN) are used in place of the MLP as the syllable probability estimator. An RNN is an MLP with feedback connections that carry the output of some neurons to other neurons or back to themselves. Adding feedback to an MLP lets the network process the context information of a time sequence efficiently, which is especially useful for speech recognition. In this paper, the architecture of the RNN is modified and a corresponding training scheme is presented. The following techniques have been adopted in our system.
1. A network with a single layer is adopted, but the content of the feedback differs from that of the networks used by previous researchers: the external output is included in the feedback, not just the internal state output.
2. The training algorithm adopted in our system is back-propagation through time (BPTT). In the common BPTT algorithm, the initial feedback values are set arbitrarily according to experience, so the initial feedback is not specific to the problem being solved. It is therefore preferable to train the initial feedback values as well. In our training algorithm, this is achieved by adding an additional layer to the unfolded network.
3. To train the network, proper target values must be given. To acquire them, we make use of HMMs that have been trained to recognize the same syllables. This method avoids the difficulty and inaccuracy of hand-set teacher signals, and it gives a smooth transition between two adjacent states.
4. To make the network learn faster and generalize better, a strategy that trains the network in stages is used. At first, short fragments of speech sequences are given. After a small enough error has been achieved on these short pieces, longer fragments are used. Finally, whole sequences are learned. Experimental results show that this method accelerates training and also improves recognition performance.

Keywords  speech recognition, hidden Markov model, recurrent neural networks

1 [...]

[...] hidden Markov model (HMM) [...]. HMM [...]; maximum likelihood (ML) [...]. [...] neural networks (NN) [...]. [...] NN [...] HMM, NN/HMM [1-5, 8]. [...] NN/HMM [...] multilayer perceptrons (MLP) [1-3]. [...] NN/HMM [8, 9] [...] MLP, [9] [...] MLP. [...] 90 [...] recurrent neural networks (RNN) [...] MLP [5]. RNN [...].

2 [...] (RNN) [...]

[...] 1(a), (b) [...] RNN [5, 7]. [...] 90 [...] RNN [4, 5]. [...]

2.1 RNN [...]

[...] 1(a). [...] 2. [...] 2, I [...]. RNN [...] [7]. [...] I. [...] 3, [...] I [...] 1, [...] I [...] 1. [...] L [...]. [...] I.

2.2 [...] (Back-Propagation Through Time, BPTT) [6]

[...] 2 [...] 4. [...] I. [...] I [...] L ([...]). [...] 4, [...] N, [...]:

(1) [...] x(0) [...] I, [...] u(0). [...] y(0) [...] x(1).

(2) [...] t > 0, [...] x(t), u(t), [...] y(t) [...] x(t+1).

$$z(t) = \begin{bmatrix} 1 \\ u(t) \\ x(t) \end{bmatrix} \quad (1)$$

$$y_i(t+1) = f(W_i z(t)) \quad (2)$$

[...] $W_i$ [...], $f(x) = \tanh(x) = \frac{2}{1+e^{-x}} - 1$ [as printed; note that $\frac{2}{1+e^{-x}} - 1 = \tanh(x/2)$, whereas $\tanh(x) = \frac{2}{1+e^{-2x}} - 1$]. [...] t = N-1.

(3) [...] y(N-1). [...] MLP, [...] x(N-1).

$$e_i(N-1) = \begin{cases} f'(W_i z(N-1)) \times \bigl(y_i(N-1) - o_i(N-1)\bigr), & 0 \le i < C \\ 0, & \text{otherwise} \end{cases} \quad (3)$$

[...] C [...], $o_i(t)$ [...].

(4) [...] 0 ≤ t ≤ N-2, [...] y(t), [...] t+1. [...] x(t).

$$e_i(t) = \begin{cases} f'(W_i z(t)) \times \bigl(y_i(t) - o_i(t) + \sum_j w_{ij} e_j(t+1)\bigr), & 0 \le i < C \\ f'(W_i z(t)) \sum_j w_{ij} e_j(t+1), & \text{otherwise} \end{cases} \quad (4)$$

(5) [...] 0, [...] L [...] I.

$$e^I_i = f'(w^I_i) \sum_j w_{ij} e_j(0) \quad (5)$$

(6) [...]

$$\Delta w_{ij} = T \sum_{t=0}^{N-1} z_i(t)\, e_j(t) \quad (6)$$

$$\Delta w^I_i = T e^I_i \quad (7)$$

[...] (4) [...]. [...] 600 [...] 600 [...]. [...] 1, 3, 5, 7, … [...]; [...] 0, 2, 4, 6, … [...]. [...] (1) [...]. [...] RNN [...].

3 RNN/HMM [...]

[...] NN/HMM [2-5, 8], [...] HMM, [...] NN/HMM [...] HMM. [...] RNN/HMM [...] HMM, [...] HMM. [...] RNN/HMM [...] HMM [...] HMM [...] $P(u_t \mid s_i)$, [...] RNN/HMM [...]:

$$P(u_t \mid s_i) = \frac{P(s_i \mid u_t)\, P(u_t)}{P(s_i)} \quad (8)$$

[...] $P(s_i)$ [...] $s_i$ [...], $P(u_t)$ [...], $P(s_i \mid u_t)$ [...]. [...] HMM. [...] $s_i$ [...] i. [...] $s_{ij}$ [...] i [...] j, [...] $P(s_{ij} \mid s_i)$. [...] Baum-Welch [...]. [...] RNN/HMM [...] 5, [...] HMM.

4 [...]

[...] CIDS, [...] 60 [...]. [...] 40 [...], 20 [...]. [...] 11.025 kHz, [...] 16+16+1 [...].

1. [...] RNN [...] 1(b). [...] 1 [...] L15, L22. [...] L2 [...] L1. [...] -10, 11 [...]: 0, 1. [...]: 40 ([...]) [...]; [...] 6, 8 [...]. [...] RNN [...].

2. [...]
[...] RNN [...]. [...] Anew, Bnew, Cnew, Dnew [...] Aold, Bold, Cold, Dold [...]:

· Anew: [...] L [...].
· Aold: [...] L [...].
· Bnew: [...].
· Bold: [...] 0.
· Cnew: [...] 1, [...] -1.
· Cold: [...] 0.
· Dnew: [...] 1, 3, 5, 7, … [...] 0, 2, 4, 6, … [...].
· Dold: [...].

[...]: Ⅰ [...]; Ⅱ [...] I [...] I; Ⅲ [...]; Ⅳ [...]. [...] Ⅳ [...] Ⅴ [...].

Table 1 [...] RNN [...]
(Column headings and the description column did not survive extraction; the two percentage columns are given in their original order.)

No.   Configuration              [...]     [...]
Ⅰ     Aold+Bnew+Cnew+Dold        1.0%      19.5%
Ⅱ     Anew+Bold+Cnew+Dold        6.8%      18.0%
Ⅲ     Anew+Bnew+Cold+Dold        0.0%      21.5%
Ⅳ     Anew+Bnew+Cnew+Dold        0.25%     15.0%
Ⅴ     Anew+Bnew+Cnew+Dnew        0.25%     13.5%

Ⅰ [...] Ⅳ: Anew [...] 75% [...] 75% [...]. Ⅱ [...] Ⅳ: Bnew [...]. Ⅲ [...] Ⅳ: Cnew [...]; Ⅳ [...] Ⅴ: Dnew [...]; [...] 4.

5 [...]

[...] BPTT [...] HMM [...]. [...] HMM [...].

References

1 Bourlard H, Morgan N. Continuous Speech Recognition: A Hybrid Approach. Norwell, Massachusetts: Kluwer Academic Publishers, 1994
2 Abrash V, Franco H, Sankar A et al. Connectionist speaker normalization and adaptation. In: Proc 4th European Conference of Speech Communication and Technology (Eurospeech 95), Madrid, Spain, 1995
3 Cohen M, Rumelhart D, Morgan N et al. Combining neural networks and hidden Markov models for continuous speech recognition. In: Proc DARPA Speech and Natural Language Workshop, Harriman, NY, 1992
4 Tebelskis J. Speech using neural networks. Carnegie Mellon University: Technical Report CMU-CS-95-142, 1995
5 Robinson T. An application of recurrent nets to phone probability estimation. IEEE Trans Neural Networks, 1994, 5(3): 298-305
6 Werbos P J. Backpropagation through time: What it does and how to do it. Proceedings of the IEEE, 1990, 78(10): 1550-1560
7 Senior A W. Off-line cursive handwriting recognition using recurrent neural networks [PhD dissertation]. Cambridge: University of Cambridge, 1994
8 Yu Tie-Cheng, Zhou Jian-Lai, Song Yan-Tao. An overview of speech reco
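The error recursion of Section 2.2 (Eqs. 1 to 7) can be illustrated with a small sketch in plain Python. This is a hedged reading of the surviving equations, not the authors' implementation. It assumes a single layer of tanh units whose outputs are all fed back, that the first `n_out` units are the external outputs (C in Eq. 3), that the trainable initial feedback is x_i(0) = tanh(w^I_i), and that the cost is a squared error. The time indexing y(t) = f(W z(t)), x(t+1) = y(t) follows the step descriptions rather than the index printed in Eq. (2). The coefficient printed as T in Eqs. (6) and (7) is replaced by an ordinary gradient-descent step with a hypothetical learning rate `lr`. All function names are illustrative.

```python
import math

def forward(W, w_init, inputs):
    """Run the unfolded network; return z(t) and y(t) for every step."""
    x = [math.tanh(w) for w in w_init]         # trainable initial feedback x(0)
    zs, ys = [], []
    for u in inputs:
        z = [1.0] + list(u) + x                # Eq. (1): bias, input, feedback
        y = [math.tanh(sum(w * v for w, v in zip(row, z))) for row in W]
        zs.append(z)
        ys.append(y)
        x = y                                  # all unit outputs are fed back
    return zs, ys

def loss(W, w_init, inputs, targets, n_out):
    """Squared error summed over the external outputs (an assumption)."""
    _, ys = forward(W, w_init, inputs)
    return 0.5 * sum((y[i] - o[i]) ** 2
                     for y, o in zip(ys, targets) for i in range(n_out))

def bptt_gradients(W, w_init, inputs, targets, n_out):
    """Back-propagate through time; return gradients for W and w_init."""
    n_units = len(W)
    fb = 1 + len(inputs[0])                    # column offset of feedback in z
    zs, ys = forward(W, w_init, inputs)
    grad_W = [[0.0] * len(W[0]) for _ in W]
    e_next = [0.0] * n_units                   # nothing flows in after step N-1
    for t in reversed(range(len(inputs))):
        e = []
        for i in range(n_units):
            back = sum(W[j][fb + i] * e_next[j] for j in range(n_units))
            direct = ys[t][i] - targets[t][i] if i < n_out else 0.0
            e.append((1.0 - ys[t][i] ** 2) * (direct + back))  # Eqs. (3), (4)
        for j in range(n_units):
            for k, zk in enumerate(zs[t]):
                grad_W[j][k] += zk * e[j]      # summand of Eq. (6)
        e_next = e
    grad_init = []
    for i in range(n_units):                   # Eq. (5): extra layer for x(0)
        back = sum(W[j][fb + i] * e_next[j] for j in range(n_units))
        grad_init.append((1.0 - math.tanh(w_init[i]) ** 2) * back)
    return grad_W, grad_init

def train_step(W, w_init, inputs, targets, n_out, lr=0.05):
    """One gradient-descent update of the weights and the initial feedback."""
    grad_W, grad_init = bptt_gradients(W, w_init, inputs, targets, n_out)
    for j, row in enumerate(W):
        for k in range(len(row)):
            row[k] -= lr * grad_W[j][k]
    for i in range(len(w_init)):
        w_init[i] -= lr * grad_init[i]
```

Here `W` has one row per unit and `1 + len(u) + n_units` columns, matching the layout of z(t) in Eq. (1). Treating `w_init` as an extra set of weights is one way to realize the "additional layer" that the abstract describes for training the initial feedback values. The staged training of the abstract would call `train_step` first on short fragments and later on whole sequences.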
Title: Speech Recognition Model Based on Recurrent Neural Networks (基于循环神经网络的语音识别模型), Zhu Xiaoyan
Link: https://www.777doc.com/doc-2869824 .html