The Xiph.Org Foundation & The Mozilla Corporation

Opus: The Swiss-Army Knife of Audio Codecs
Jean-Marc Valin, Koen Vos, Gregory Maxwell, and Timothy B. Terriberry

Outline
● Introduction and Motivation
● Opus Design
  – SILK
  – CELT
● Conclusion

Lossy Audio Codecs
● Two common types:
  – Speech/communication (G.72x, GSM, AMR, Speex)
    ● Low delay (15-30 ms)
    ● Low sampling rate (8 kHz to 16 kHz): limited fidelity
    ● No support for music
  – General purpose (MP3, AAC, Vorbis)
    ● High sampling rates (44.1 kHz or higher)
    ● CD-quality music
    ● High delay (100 ms)
  – We want both: high fidelity with very low delay

Coding Latency
● Low delay is critical to live interaction
  – Prevents collisions during conversation
  – Reduces the need for echo cancellation
    ● Good for small, embedded devices without much CPU
  – Higher sense of presence
  – Allows synchronization for live music
    ● Need less than 25 ms total delay (Carôt 2006)
    ● Equivalent to sitting 8 m apart (farther requires a conductor)
● Lower delay in the codec increases range
  – 1 ms = 200 km in fiber
[Comparison: high delay (~250 ms) vs. low delay (~15 ms)]

Opus vs. the Competition: Latency
[Figure]

Opus vs. the Competition: Quality
[Figure]

Opus Features
● Sampling rate: 8...48 kHz (narrowband to fullband)
● Bitrates: 6...510 kbps
● Frame sizes: 2.5...20 ms
● Mono and stereo support
● Speech and music support
● Seamless switching between all of the above
● Combine multiple streams for up to 255 channels
● It just works for everything
[Demo: adaptive sweep, 8...64 kbps]

Outline
● Introduction
● Opus Design
  – SILK
  – CELT
● Conclusion

Opus Characteristics
● Standardized by the IETF (RFC 6716)
  – First free, state-of-the-art audio codec standardized
● Built out of two separate codecs
  – SILK: a linear prediction (speech) codec
    ● In development by Skype (now Microsoft) since Jan. 2007
  – CELT: an MDCT (music) codec
    ● In development by Xiph since November 2007
  – Both were modified a lot to form Opus
● Standardization saw contributions from Mozilla, Microsoft (Skype), Xiph, Broadcom, Octasic, Google, etc.

Opus Operating Modes
● SILK-only: Narrowband (NB), Mediumband (MB) or Wideband (WB) speech
● Hybrid: Super-wideband (SWB) or Fullband (FB) speech
● CELT-only: NB to FB music

Outline
● Introduction
● Opus Design
  – SILK
    ● Linear Prediction
    ● Short-term Prediction (LPCs)
    ● Long-term Prediction (LTP)
  – CELT
● Conclusion

SILK
● Linear prediction
  – Short-term prediction via a linear IIR filter
    ● 10 or 16 coefficients (for NB or MB/WB respectively)
    ● Good for speech: filter coefficients are directly related to the cross-sectional area of the human vocal tract
  – Long-term prediction via a "pitch" filter
    ● Good for "periodic" signals from 55.6 Hz to 500 Hz
● Variable bitrate
  – Quantization level controls rate indirectly
  – Range (arithmetic) coding with fixed probabilities

Linear Prediction
● IIR filter:
  $y[i] = x[i] + \sum_{k=0}^{D-1} a[k]\, y[i-k-1]$
● Analysis "whitens" a signal
● Quantization (lossy compression) adds noise
● Synthesis "shapes" the noise the same as the spectrum

Linear Prediction
● SILK: different analysis and synthesis filters
● De-emphasizes spectral valleys
  – Distortion is least noticeable there
  – Reduces entropy (distance between signal and noise floor)
    ● Uses fewer bits

LPC Coefficients
● The filter a[k] needs to be quantized and transmitted
  – Quantizing the filter coefficients directly is bad
    ● Drastically changes the frequency response of the filter
● Convert to "line spectral frequencies" (LSFs)
  – Split filter into two polynomials with roots on the unit circle (Itakura 1975)
    ● Each root represents a frequency (0...π)
    ● Math at
  – SILK quantizes LSFs using vector quantization (VQ) + scalar quantization

Vector Quantization
● Approximates a multidimensional distribution with a finite number of codewords (vectors)
[Figure: Scalar Quantization (2 bits/dim), RMS error = 0.89; Vector Quantization (2 bits/dim), RMS error = 0.71 (20% better)]

Vector Quantization
● Easily scales to less than 1 bit per dimension (Opus uses VQ with up to 176 dims)
[Figure: Scalar Quantization (0.5 bits/dim), RMS error = 2.93; Vector Quantization (0.5 bits/dim), RMS error = 1.63 (44% better)]

Quantizing LSFs: Stage 1
● Use a trained, 32-entry VQ codebook
  – Just search a big table for the best entry
  – 4.27 (NB) to 4.49 (WB) bits on average
● Good quality: less than 1 dB spectral distortion (SD)
  $\frac{1}{2\pi} \int_{-\pi}^{\pi} \left[ 10\log(S(\omega)) - 10\log(\hat{S}(\omega)) \right]^2 d\omega$
● We have 10 or 16 LSFs arranged arbitrarily on a circle (ignoring order): 32 entries is not enough

Quantizing LSFs: Stage 2
● Scalar quantization of error from stage 1
  – Also uses additional first-order prediction of error
● Error in LSFs has a non-uniform effect on SD
  – LSFs bunched close together are more important
● SILK: Use LSFs from stage 1 to compute approximate weights (Laroia 1991)
● Weights determine scalar quantization step size
  $w[k] = \frac{1}{c[k] - c[k-1]} + \frac{1}{c[k+1] - c[k]}$

Long-Term Prediction
● LPC residual not really white (still periodic)
Picture blatantly stolen from
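
The Stage 2 weighting formula (Laroia 1991) given earlier can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `lsf_weights` is hypothetical, and the boundary convention (treating the neighbors of the first and last LSF as 0 and π) is an assumption made for this example, not something taken from the slides or from the SILK reference code.

```python
import math


def lsf_weights(lsf):
    """Inverse-spacing weights: w[k] = 1/(c[k]-c[k-1]) + 1/(c[k+1]-c[k]).

    `lsf` is a strictly increasing sequence of line spectral frequencies
    in radians, each strictly between 0 and pi.
    """
    c = list(lsf)
    if not c:
        raise ValueError("need at least one LSF")
    if c[0] <= 0.0 or c[-1] >= math.pi:
        raise ValueError("LSFs must lie strictly between 0 and pi")
    if any(b <= a for a, b in zip(c, c[1:])):
        raise ValueError("LSFs must be strictly increasing")

    # Assumption for this sketch: pad with the ends of the frequency
    # range so the first and last LSF also have two neighbors.
    padded = [0.0] + c + [math.pi]

    return [
        1.0 / (padded[k] - padded[k - 1]) + 1.0 / (padded[k + 1] - padded[k])
        for k in range(1, len(padded) - 1)
    ]


if __name__ == "__main__":
    # Two LSFs bunched together (0.5, 0.6) and one isolated (2.0).
    for freq, weight in zip([0.5, 0.6, 2.0], lsf_weights([0.5, 0.6, 2.0])):
        print(f"c = {freq:.2f} rad  ->  w = {weight:.3f}")
```

Closely spaced LSFs receive larger weights than isolated ones, which matches the slide's point that bunched LSFs matter more for spectral distortion. How a weight maps to an actual quantizer step size is deliberately left out here, since the slides do not specify it.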