您好,欢迎访问三七文档
1ErrorResilientLZ’77DataCompression:Algorithms,Analysis,andExperimentsStefanoLonardiMember,IEEE,WojciechSzpankowskiFellow,IEEE,MarkDanielWardMember,IEEEAbstract—Weproposeajointsource-channelcodingalgorithmcapableofcorrectingsomeerrorsinthepopularLempel-Ziv’77schemewithoutintroducinganymeasurabledegradationinthecompressionperformance.ThiscanbeachievedbecausetheLZ’77encoderdoesnotcompletelyeliminatetheredundancypresentintheinputsequence.OnesourceofredundancycanbeobservedwhenanLZ’77phrasehasmultiplematches.Inthiscase,LZ’77canissueapointertoanyofthosematches,andaparticularchoicecarriessomeadditionalbitsofinformation.WecallaschemewithembeddedredundantinformationtheLZS’77algorithm.Weanalyzethenumberoflongestmatchesinsuchaschemeandprovethatitfollowsthelogarithmicseriesdistributionwithmean1/h(plussomefluctuations),wherehisthesourceentropy.Thus,thedistributionassociatedwiththenumberofredundantbitsiswellconcentratedarounditsmean,ahighlydesirablepropertyforerrorcorrection.Theseanalyticresultsareprovedbyacombinationofcombinatorial,probabilisticandanalyticmethods(e.g.,Mellintransform,depoissonization,combinatoricsonwords).Infact,weanalyzeLZS’77bystudyingthemultiplicitymatchingparameterinasuffixtree,whichinturnisanalyzedviacomparisontoitsindependentversion,calledtrie.Finally,wepresentanalgorithminwhichachannelcoder(e.g.,Reed-Solomoncoder)succinctlyusestheinherentadditionalredundancyleftbytheLZS’77encodertodetectandcorrectalimitednumberoferrors.WecallsuchaschemetheLZRS’77algorithm.LZRS’77isperfectlybackward-compatiblewithLZ’77,thatis,afilecompressedwithourerror-resistantLZRS’77canstillbedecompressedbyagenericLZ’77decoder.IndexTerms—Lempel-Ziv’77scheme,multiplematches,jointsource-channelcoding,Reed-Solomoncode,suffixtrees,tries,Mellintransform,depoissonization,patternmatching,autocor-relationpolynomial,combinatoricsonwords.I.INTRODUCTIONERROR-RESILIENTadaptivelosslessdatacompressionisaparticularlychallengingproblembecauseoftwoopposing“forces.”Sourcecodingtriestodecorrelateasmuchaspossibletheinputsequence(i.e.,byremovingredundantinformation),whilechannelcodingintroducesadditionalcor-relation(i.e.,byaddingredundantinformation)inordertoprotectagainsterrors.Thedevastatingeffectoferrorsinadaptivedatacompressionisalong-standingopenproblem[25].Infact,inmanyapplications,apracticaldrawbackofadaptivedatacompressionalgorithmsistheirlackofresistanceS.LonardiiswiththeDepartmentofComputerScienceandEngineering,UniversityofCalifornia,Riverside,CA92521.W.SzpankowskiiswiththeDepartmentofComputerSciences,PurdueUniversity,WestLafayette,IN47907.M.D.WardiswiththeDepartmentofMathematics,UniversityofPennsylvania,Philadelphia,PA19104.PreliminaryversionsofportionsofthispaperwerepresentedatDCC’03,Snowbird,UTandISIT’04,Chicago,2004.toerrors.Jointsource-channelcodinghasemergedasaviablesolutiontothisproblem.TheseparationprincipleformulatedbyShannondividesacommunicationsystemintoseparatesourcecodingandchannelcodingsubsystemsthatrunindependently;however,intoday’scommunicationtechnologythisrigidseparationisverylimiting.Inparticular,thisprincipleignoresmanyimperfectionsofrealcommunicationsystems,suchasthefactthatchannelcodingisincapableofcorrectingallerrors.Uncorrectableerrorsareinevitable;designingencoderswhileignoringthisfactsimplyleadstoextremelyfragilesourcecodes,inwhichonesingleerrorcanpotentiallyyieldcatas-trophicfailures.Jointsource-channelcodingstrikesabalancebetweensourcebitsvs.channelbits,whichinturnrequiressomeadjustmentsinboththesourcecodingandchannelcodingstrategies.Ourapproachissomewhatorthogonaltomostworksinthisarea.Weuseredundancybitsleftbythesourcecodertoprotectagainsterrorswithoutdegradingthecompressionrate.Thepricewepayisthatweonlycorrectafewerrors,andwedonotachieveapositiveerrorbitrate(i.e.,weareunabletocorrectanumberoferrorsproportionaltothesizeofablock).Wedonotaddresshereerrorpropagation(cf.[25]);however,byeliminatingerrors,ouralgorithmimplicitlyprotectsagainstlimitederrorpropagation.Inthispaperwedealwithoneofthebest-knownadaptivedatacompressionschemes,namelythatofZivandLempelpublishedintheir1977seminalpaper[33].ThepopularLZ’77compressionschemeworkson-line.Itcompressesphrasesbyconsecutivelyreplacingthelongestprefixofthenon-compressedportionofafilewithapointerandthelengthoftheprefix.Thelackoferror-resistanceofLZ’77isawell-recognizedproblem.Afewyearsagowereadthefollowingpostingonthecomp.compressionnewsgroup:“...I’macasualtyofcorrupttar’d1gzippedfilesonSolaris8.(gzip1.3)...Isthereareasonwhytherearenocompressionutilitiesthatallowcontrolledamountsofredundancyforerrorcorrection?...Howmuchoverheadwouldbeneededtocorrectthese?”Indeed,weaskedourselves,howmuchoverheadisneededinLZ’77tocorrecterrors?ThesurprisingansweristhatthereisnoneedforadditionaloverheadinordertocorrectsomeerrorsinLZ’77.ThisseeminglyimpossiblegoalisachievedinpracticethankstothefactthattheLZ’77encoderisunabletocompletelydecorrelatetheinputsequence.Someimplicitredundancy,whichwepreciselyquantifyinthispaper,isstillpresentinthecompressedstreamandcanbeexploitedbythe1tarisacommonarchiverund
本文标题:Error-Resilient-LZ’77-Data-Compression-Algorithms-
链接地址:https://www.777doc.com/doc-8143164 .html