William Stallings
Computer Organization and Architecture
5th Edition

Chapter 13
Instruction Level Parallelism and Superscalar Processors

What is Superscalar?
• The term superscalar refers to a machine that is designed to improve the performance of the execution of scalar instructions.
• A superscalar processor has multiple independent instruction pipelines.
• Each pipeline consists of multiple stages, so each can handle multiple instructions at a time.
• Multiple pipelines introduce a new level of parallelism, enabling multiple streams of instructions to be processed at a time.
• A superscalar processor fetches multiple instructions at a time.
• It attempts to find nearby instructions that are independent of one another and can be executed in parallel.
• The essence of the superscalar approach is the ability to execute instructions independently in different pipelines.
• Common instructions (arithmetic, load/store, conditional branch) can be initiated and executed independently in a superscalar processor.
• Equally applicable to RISC and CISC
• In practice usually RISC

Why Superscalar?
• Most operations are on scalar quantities (see RISC notes)
• Improve these operations to get an overall improvement

General Superscalar Organization
• Two integer, two floating-point, and one memory (either load or store) operations can be executing at the same time.

Superpipelined
• Many pipeline stages need less than half a clock cycle
• Doubling the internal clock speed gets two tasks done per external clock cycle

Superscalar v Superpipeline
• Base machine
• Superpipeline
• Superscalar

Limitations
• Instruction level parallelism
  - Compiler based optimisation
  - Hardware techniques
• Limited by
  - True data dependency
  - Procedural dependency
  - Resource conflicts
  - Output dependency
  - Antidependency

True Data Dependency
• ADD r1, r2 (r1 := r1 + r2;)
• MOVE r3, r1 (r3 := r1;)
• Can fetch and decode second instruction in parallel with first
• Can NOT execute second instruction until first is finished
• Also called flow dependency or write-read dependency

Procedural Dependency
• Cannot execute instructions after a branch in parallel with instructions before a branch
• Also, if instruction length is not fixed, instructions have to be decoded to find out how many fetches are needed
• This prevents simultaneous fetches

Resource Conflict
• Resources
  - Memories, caches, buses, register-file ports, functional units
• Two or more instructions requiring access to the same resource at the same time
  - e.g. two arithmetic instructions
• Can duplicate resources
  - e.g. have two arithmetic units

Effect of Dependencies

Design Issues
• Instruction level parallelism
  - Instructions in a sequence are independent
  - Execution can be overlapped
  - Governed by data and procedural dependency
• Machine parallelism
  - Ability to take advantage of instruction level parallelism (a measure of how much support the processor provides for it)
  - Governed by number of parallel pipelines
• E.g. (left: three independent instructions; right: each instruction needs the result of the one before it)
  - Load R1 ← R2(23)       Add R3 ← R3, "1"
  - Add R3 ← R3, "1"       Add R4 ← R3, R2
  - Add R4 ← R4, R2        Store [R4] ← R0

Instruction Issue Policy
• Order in which instructions are fetched
• Order in which instructions are executed
• Order in which instructions change registers and memory

In-Order Issue In-Order Completion
• Issue instructions in the order they occur
• Not very efficient
• May fetch more than one instruction
• Instructions must stall if necessary

In-Order Issue In-Order Completion (Diagram)

In-Order Issue Out-of-Order Completion
• Output dependency
  - R3 := R3 + R5; (I1)
  - R4 := R3 + 1; (I2)
  - R3 := R5 + 1; (I3)
  - I2 depends on the result of I1: true data dependency
  - If I3 completes before I1, R3 is left holding I1's result instead of I3's: output (write-write) dependency

In-Order Issue Out-of-Order Completion (Diagram)

Out-of-Order Issue Out-of-Order Completion
• Decouple decode pipeline from execution pipeline
• Can continue to fetch and decode until this pipeline is full
• When a functional unit becomes available an instruction can be executed
• Since instructions have been decoded, processor can look ahead

Out-of-Order Issue Out-of-Order Completion (Diagram)

Antidependency
• Read-write dependency
  - R3 := R3 + R5; (I1)
  - R4 := R3 + 1; (I2)
  - R3 := R5 + 1; (I3)
  - R7 := R3 + R4; (I4)
  - I3 cannot complete before I2 starts, as I2 needs a value in R3 and I3 changes R3

Register Renaming
• Output and antidependencies occur because register contents may not reflect the correct ordering from the program
• May result in a pipeline stall
• Registers allocated dynamically
  - i.e. registers are not specifically named

Register Renaming example
• Register renaming
  - R3b := R3a + R5a (I1)
  - R4b := R3b + 1 (I2)
  - R3c := R5a + 1 (I3)
  - R7b := R3c + R4b (I4)
• A name without a subscript refers to the logical register in the instruction
• A name with a subscript is the hardware register allocated
• Note R3a, R3b, R3c: three hardware registers stand in for the one logical register R3

Machine Parallelism
• Three hardware techniques
  - Duplication of resources
  - Out of order issue
  - Renaming
• Figure 13.5 shows simulation results
• Not worth duplicating functions without register renaming
• Register renaming eliminates antidependencies and output dependencies
• Need instruction window large enough (more than 8)

Branch Prediction
• 80486 fetches both next sequential instruction after branch and branch target instruction
• Gives two cycle delay if branch taken

RISC - Delayed Branch
• Calculate result of branch before unusable instructions pre-fetched
• Always execute single instruction immediately following branch
• Keeps pipeline full while fetching new instruction stream
• Not as good for superscalar
  - Multiple instructions need to execute in delay slot
  - Instruction dependence problems
• Revert to branch prediction

Superscalar Execution

Superscalar Implementation
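
Worked example: finding the dependencies in I1 to I4
The I1 to I4 sequence on the Antidependency slide contains all three register dependencies named in the Limitations list. The Python sketch below is one way to find them mechanically; it is an illustration, not something taken from the slides. The tuple layout `(name, dest, sources)` and the function name `find_dependencies` are assumptions made for this example, and immediate operands such as the constant 1 are left out because they are not registers.

```python
# Each instruction is (name, destination register, source registers).
# This encoding is an assumption for illustration; immediates are omitted.
PROGRAM = [
    ("I1", "R3", ("R3", "R5")),  # R3 := R3 + R5
    ("I2", "R4", ("R3",)),       # R4 := R3 + 1
    ("I3", "R3", ("R5",)),       # R3 := R5 + 1
    ("I4", "R7", ("R3", "R4")),  # R7 := R3 + R4
]


def find_dependencies(program):
    """Return (earlier, later, kind, register) tuples in program order.

    kind is "true" (write-read), "anti" (read-write) or "output" (write-write).
    """
    last_writer = {}  # register -> instruction that most recently wrote it
    readers = {}      # register -> instructions that read it since that write
    deps = []
    for name, dest, sources in program:
        # True dependency: a source was produced by an earlier instruction.
        for reg in sources:
            if reg in last_writer:
                deps.append((last_writer[reg], name, "true", reg))
        # Antidependency: an earlier instruction still has to read the
        # old value of the register this instruction overwrites.
        for reader in readers.get(dest, []):
            deps.append((reader, name, "anti", dest))
        # Output dependency: an earlier instruction writes the same register.
        if dest in last_writer:
            deps.append((last_writer[dest], name, "output", dest))
        # Record this instruction's reads, then its write.
        for reg in sources:
            readers.setdefault(reg, []).append(name)
        last_writer[dest] = name
        readers[dest] = []
    return deps


for earlier, later, kind, reg in find_dependencies(PROGRAM):
    print(f"{later} has a {kind} dependency on {earlier} via {reg}")
```

The scan tracks only the most recent writer of each register, so I4 is reported as depending on I3 for R3 rather than on I1, whose value I3 has already overwritten.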
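
Worked example: register renaming for I1 to I4
The renamed listing on the Register Renaming example slide can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions: the `(name, dest, sources)` tuple layout and the function name `rename` are invented for the illustration, hardware registers are modelled simply by appending a letter to the logical name (a for the initial value, b for the first new allocation, and so on, as on the slide), and no attempt is made to model a finite register pool or freeing of registers.

```python
# Each instruction is (name, destination register, source registers).
# This encoding is an assumption for illustration; immediates are omitted.
PROGRAM = [
    ("I1", "R3", ("R3", "R5")),  # R3 := R3 + R5
    ("I2", "R4", ("R3",)),       # R4 := R3 + 1
    ("I3", "R3", ("R5",)),       # R3 := R5 + 1
    ("I4", "R7", ("R3", "R4")),  # R7 := R3 + R4
]


def rename(program):
    """Give every register write a fresh hardware register name."""
    version = {}  # logical register -> index of its current version (0 = 'a')

    def current(reg):
        # Name of the hardware register currently holding this logical one.
        # Simplification: supports at most 26 versions per register.
        return reg + chr(ord("a") + version.setdefault(reg, 0))

    renamed = []
    for name, dest, sources in program:
        # Sources are looked up BEFORE the destination is re-allocated,
        # so an instruction such as R3 := R3 + R5 reads the old R3.
        new_sources = tuple(current(reg) for reg in sources)
        version[dest] = version.get(dest, 0) + 1  # allocate a fresh register
        renamed.append((name, current(dest), new_sources))
    return renamed


for name, dest, sources in rename(PROGRAM):
    print(f"{name}: {dest} := f({', '.join(sources)})")
```

Because every write gets a name that no earlier instruction has read or written, the output and antidependencies on R3 disappear; only the true dependencies (I2 on I1, I4 on I3 and I2) remain.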