Continuous Program Optimization: A Case Study

THOMAS KISTLER and MICHAEL FRANZ
University of California, Irvine

Much of the software in everyday operation is not making optimal use of the hardware on which it actually runs. Among the reasons for this discrepancy are hardware/software mismatches, modularization overheads introduced by software engineering considerations, and the inability of systems to adapt to users' behaviors.

The obvious solution to these problems is to delay code generation until load time. This is the earliest point at which a piece of software can be fine-tuned to the actual capabilities of the hardware on which it is about to be executed, and also the earliest point at which modularization overheads can be overcome by global optimization.

A still better match between software and hardware can be achieved by replacing the already executing software at regular intervals by new versions constructed on-the-fly using a background code re-optimizer. This not only enables the use of live profiling data to guide optimization decisions, but also facilitates adaptation to changing usage patterns and the late addition of dynamic link libraries.

This paper presents a system that provides code generation at load-time and continuous program optimization at run-time. First, the architecture of the system is presented. Then, two optimization techniques are discussed that were developed specifically in the context of continuous optimization. The first of these optimizations continually adjusts the storage layouts of dynamic data structures to maximize data cache locality, while the second performs profile-driven instruction re-scheduling to increase instruction-level parallelism. These two optimizations have very different cost/benefit ratios, presented in a series of benchmarks. The paper concludes with an outlook to future research directions and an enumeration of some remaining research problems.

The empirical results presented in this paper make a case in favor of continuous optimization, but indicate that it needs to be applied judiciously. In many situations, the costs of dynamic optimizations outweigh their benefit, so that no break-even point is ever reached. In favorable circumstances, on the other hand, speed-ups of over 120% have been observed. It appears as if the main beneficiaries of continuous optimization are shared libraries, which at different times can be optimized in the context of the currently dominant client application.

Categories and Subject Descriptors: D.3.4 [Programming Languages]: Processors - Run-time Environments; D.3.4 [Programming Languages]: Processors - Code Generation; D.3.4 [Programming Languages]: Processors - Compilers; D.3.4 [Programming Languages]: Processors - Optimization

General Terms: Dynamic Code Generation, Continuous Program Optimization, Dynamic Re-Optimization

Authors' current addresses: Thomas Kistler, Transmeta Corporation, 3940 Freedom Circle, Santa Clara, CA 95054. Michael Franz, Department of Information and Computer Science, University of California at Irvine, Irvine, CA 92697-3425.

Parts of this work are funded by a CAREER award from the National Science Foundation (CCR-97014000) and by the California MICRO Program with industrial sponsor Microsoft Research (Project No. 99-039).

A precursor to the chapter on the similarity of profiling data previously appeared as a refereed contribution in the Proceedings of the Workshop on Profile and Feedback-Directed Optimization, Paris, France, October 1998.

1. INTRODUCTION

In the wake of dramatic improvements in processor speed, it is often overlooked that much of the software in everyday operation is not making the best use of the hardware on which it actually runs. The vast majority of computers are either running application programs that have been optimized for earlier versions of the target architecture, or, worse still, are emulating an entirely different architecture in order to support legacy code.

The first reason why hardware and software are often mismatched is linked to the speed of technology evolution. Users demand backward compatibility and are often unwilling to give up existing software when upgrading hardware. As a result of this, an immense amount of legacy code is in use every day: 16-bit software on 32-bit processors, emulated MC680x0 code on PowerPC Macintosh computers, and soon also IA32 code on IA64 hardware. Simultaneously, purely logistical constraints make it unfeasible for software vendors to provide separate versions of every program for every particular hardware implementation of a processor architecture. Just consider: there are several major manufacturers of IA32-compatible CPUs, and each of these has a product line spanning several processors; the total variability is far too great to manage in a centralized fashion.

The second reason why the capabilities of the hardware are not exploited to the fullest has to do with software engineering concerns. Increasingly, software is developed and distributed as smaller components that are linked together dynamically only at the end-user's site. Unfortunately, there is a modularization cost associated with separate compilation: since neither the end-user's configuration nor the components' interaction schemes are known at compile-time, many traditional global code optimizations cannot be applied across component boundaries. The added benefits of component-orientation are usually so great that this drawback is readily accepted by software developers as well as users.

The obvious solution to overcoming both of these performance impediments simultaneously is to delay code generation at least until load time. Not only are the hardware characteristics of the target machine definite at this point, but load-time code generation also makes it possible to perfor