New Anticipatory Load Balancing Strategies for Par

To be published in American Mathematical Society's Proc. in the DIMACS Series on Discrete Mathematics and Theoretical Computer Sc., Apr. 1995.

New Anticipatory Load Balancing Strategies for Parallel A* Algorithms

Nihar R. Mahapatra and Shantanu Dutt
{mahapatra, dutt}@ee.umn.edu
Department of Electrical Engineering, University of Minnesota, Minneapolis, MN 55455

This research was funded in part by a Grant-in-Aid from the University of Minnesota and in part by NSF grant MIP-9210049. Sandia National Labs provided access to their 1024-processor nCUBE2 parallel computer.

Abstract

In this paper, we develop load balancing strategies for scalable high-performance parallel A* algorithms suitable for distributed-memory machines. In parallel A* search, inefficiencies such as processor starvation and search of non-essential spaces (search spaces not explored by the sequential algorithm) grow with the number of processors P used, thus restricting its scalability. To alleviate this effect, we propose a novel parallel startup phase and an efficient dynamic load balancing strategy called the quality equalizing (QE) strategy. Our new parallel startup scheme executes optimally in Θ(log P) time and, in addition, achieves good initial load balance. The QE strategy employs near-neighbor quantitative and qualitative load balancing schemes to achieve load balance. These schemes utilize anticipatory mechanisms to detect and correct load imbalance before its actual occurrence; such mechanisms are particularly useful at lower work densities (the ratio of the problem size to P) and for lower granularity applications. The QE strategy possesses certain unique load balancing properties that enable it to significantly reduce starvation and non-essential work, and that make its performance robust across applications with different cost distributions for search-space nodes. Consequently, we obtain a highly scalable parallel A* algorithm with an almost-linear speedup. The startup and load balancing schemes were employed in parallel A* algorithms to solve the Traveling Salesman Problem on an nCUBE2 hypercube multicomputer. The QE strategy yields average speedup improvements of about 20-185% and 15-120% at low and intermediate work densities, respectively, over three well-known load balancing methods: the round-robin (RR), the random communication (RC) and the neighborhood averaging (NA) strategies. The average speedup observed on 1024 processors is about 985, representing a very high efficiency of 0.96. We also tested the effect of including an anticipatory qualitative load balancing scheme in the QE strategy and found that it reduces the average execution time by 3.32% and 8.77% on 256 and 512 processors, respectively, at lower work densities. Finally, we present analytical and empirical results on the scalability of parallel A* algorithms in terms of the isoefficiency metric. Our analytical results include (1) a Θ(P log P) lower bound on the isoefficiency function of any parallel A* algorithm, and (2) a general expression for the upper bound on the isoefficiency function of our parallel A* algorithm using the QE strategy on any topology; for the hypercube and 2-D mesh architectures the upper bounds on the isoefficiency function are found to be Θ(P log² P) and Θ(P√P), respectively. Experimental results validate our analysis, and also show that parallel A* search using the QE load balancing strategy has better scalability than when using the RR, RC or NA strategies.

1 Introduction

The A* algorithm [21] is a well-known, generalized branch-and-bound search procedure, widely used in the solution of many computationally demanding combinatorial optimization problems (COPs) [4, 23]. Its operation, as detailed later, can be viewed essentially as a best-first search of a state space graph. Parallelization of branch-and-bound methods provides an effective means to meet the computational needs of many practical search problems [3, 8]. The aim of our work is to develop scalable high-performance parallel A* algorithms for solving COPs on distributed-memory machines.

However, parallelization of A* introduces a number of inefficiencies. (1) First, the time required initially to split the whole search space among all P processors, i.e., the startup phase time, can be a significant fraction of the total execution time at low work densities (the ratio of the problem size to P). Therefore the startup phase needs to be executed efficiently. Also, it is desirable to have a good initial load balance to reduce idling at the beginning of parallel A*. (2) In search algorithms such as A*, the amount of work corresponding to different search subspaces is very difficult to estimate and can vary widely. Hence some form of dynamic, quantitative load balancing is crucial to reducing the idling that would otherwise occur. (3) Finally, processors performing best-first search of their local subspaces in parallel A* may search spaces that a sequential A* algorithm will not explore. This can lead to substantial "non-essential" work. To address this problem, it is imperative to perform dynamic qualitative load balancing so that at all times different processors search spaces that are comparably promising.

In addition to the above inefficiencies, duplicated work among processors can occur when the search space is a graph.¹ This problem can be tackled by using efficient duplicate pruning techniques [9, 17, 18]. However, since the focus of this paper is on load balancing strategies, we will restrict our attention to tree search spaces so that performance comparison of parallel A* algorithms employing different load balancing methods reflects the effectiveness of these algorithms in achieving load balanc

¹ We use the term "graph" mainly to denote graphs that are not trees, but sometimes we use it more generally to mean trees as well; this will be clear from the context.
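
To illustrate the view of A* as a best-first search described in the Introduction, the following is a minimal sequential sketch in Python. It is not the parallel algorithm or the QE strategy of this paper; the function name `best_first_tree_search`, the toy tree `TREE`, the goal set `GOALS`, and the zero heuristic are all assumptions made purely for illustration. The sketch orders nodes by f = g + h with a priority queue and counts node expansions, so it shows which subtrees a sequential search never visits.

```python
import heapq
from typing import Callable, Iterable, Optional, Tuple, TypeVar

S = TypeVar("S")


def best_first_tree_search(
    root: S,
    expand: Callable[[S], Iterable[Tuple[S, float]]],
    heuristic: Callable[[S], float],
    is_goal: Callable[[S], bool],
) -> Tuple[Optional[S], float, int]:
    """Best-first search of a tree, ordered by f = g + h.

    Returns (goal_state, path_cost, nodes_expanded). With an admissible
    heuristic, the first goal removed from the queue has minimum cost.
    """
    counter = 0  # tie-breaker so the heap never has to compare states
    open_list = [(heuristic(root), counter, 0.0, root)]
    expanded = 0
    while open_list:
        _f, _tie, g, state = heapq.heappop(open_list)
        if is_goal(state):
            return state, g, expanded
        expanded += 1
        for child, step_cost in expand(state):
            counter += 1
            g_child = g + step_cost
            heapq.heappush(
                open_list, (g_child + heuristic(child), counter, g_child, child)
            )
    return None, float("inf"), expanded


# Hypothetical toy search tree: node -> list of (child, edge cost).
TREE = {
    "root": [("a", 1.0), ("b", 4.0)],
    "a": [("a1", 5.0), ("a2", 2.0)],
    "b": [("b1", 1.0)],
}
GOALS = {"a1", "a2", "b1"}  # leaves are treated as complete solutions

goal, cost, expanded = best_first_tree_search(
    "root",
    expand=lambda s: TREE.get(s, []),
    heuristic=lambda s: 0.0,  # trivially admissible; an assumption
    is_goal=lambda s: s in GOALS,
)
print(goal, cost, expanded)  # a2 3.0 2
```

In this toy run the sequential search expands only `root` and `a`; the subtree under `b` is never expanded. A processor assigned that subtree in a parallel search would be doing what the Introduction calls "non-essential" work, which is the motivation given for qualitative load balancing.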
