您好,欢迎访问三七文档
FACULDADEDEINFORM´ATICAPUCRS-Brazilˆea,R.Chanin,A.Sales,R.Scheer,A.ZorzoTECHNICALREPORTSERIES—————————————————————————————————Number048July,2005Contact:mcorrea@inf.pucrs.brchanin@inf.pucrs.brasales@inf.pucrs.brroque.scheer@hp.comzorzo@inf.pucrs.brCopyrightcFaculdadedeInform´atica-PUCRSPublishedbyPPGCC-FACIN-PUCRSAv.Ipiranga,668190619-900PortoAlgre-RS-Brasil1IntroductionThedemandforcomputationalpowerhasbeenincreasingthroughoutthepastyears.Severalsolutionsarebeingusedinordertosupplysuchdemand,e.g.,clustersofworkstations[6]andsharedmemorymultiprocessors.Althoughclustershavelowercost,theiruseimpliesinagreatspecializedprogrammingeffort.ItisnecessarytobuildnewapplicationsorporttheexistingonestoexecuteintheseenvironmentsthroughtheuseofspecificAPI’s,suchasMPI(MessagePassingInterface)[23].Ontheotherhand,sharedmemorymultiprocessorcomputersaremoreexpensive,butsimplertouse,sinceallresourcesaremanagedbyasingleoperatingsystem.SharedmemorymultiprocessorcomputerscanbeclassifiedasUMA(UniformMemoryAccess)orNUMA(Non-UniformMemoryAccess)computers[16].InUMAcomputerseachprocessorcanaccessanymemoryareawiththesameaveragecost.Thispropertysimplifiestheloadbalancing:aprocesscanbemovedtoanyprocessorwithoutanyimpacttoitsmemoryaccesstime,iftheprocessisnotcache-hot.ThemajordrawbackofUMAarchitecturesisthatthenumberofprocessorsislimitedbythecontentiononaccesstomemory,whichbecamesabottleneck,sincetheyallsharethesamememorybus.NUMAarchitecturesallowagreaternumberofprocessorsbecauseprocessorsandmemoryaredistributedinnodes.Memoryaccesstimesdependontheprocessorthataprocessisexecutingandontheaccessedmemoryarea.Thus,theloadbalancingonthesemachinesismorecomplex,sincemovingaprocesstoanodethatisdistantfromitsmemoryareacanincreaseprocessexecutiontime.Loadbalancingforparallelenvironmentsisaproblemthathasbeendeeplystud-iedforalongtime.However,mostofthesestudiesarefocusedonuser-levelloadbalancing,whereusersofthesystemmustknowtheirapplicationsbehaviorandprovideinformationtotheloadbalancingalgorithm.Inthissense,therearemanyproposalsfordifferentplatforms,forexampleclusters[2,5,26]andcomputationalgrids[13,24].SomeauthorshavealsopresentedsolutionsorstudiesfortheloadbalancingproblemonNUMAcomputers.Zhu[25],forinstance,proposesaclusterqueuestructureforprocessesbasedonahierarchicalstructureinordertosolvethelimitationofsinglequeuesystemsandtheloadimbalanceproblemthatresultsofdistributedqueues.Focht[12],ontheotherhand,describesanalgorithmbasedontheLinuxloadbalancingalgorithmthattriestoattractprocessesbacktotheirorig-inalnodeswhentheyaremigrated.TherehasbeenalsosomestudiespresentinganalysisofloadbalancingalgorithmsonNUMAmachines[7].In[10]weproposedanalgorithmthatallowsLinuxtoperformmultilevelloadbalancinginNUMAcomputers.ThecurrentLinuxloadbalancingalgorithmusesastructurecalledscheddomaintobuildahierarchythatrepresentsthemachine’s3topology.Basedonthishierarchy,Linuxtriestokeepprocessesclosertotheirmemoryareas,movingthemamongprocessorsinthesamenodebeforeperform-inginter-nodemigration[4].ThealgorithmthatisresponsibleforconstructingthishierarchyassumesthatallNUMAmachineshaveonlytwomemoryaccesslevels.However,thereareNUMAcomputerswithmorethantwomemoryaccesslevels.Thehierarchybuiltforthesemachines,therefore,doesnotrepresenttheirtopologycorrectly,causingunappropriateloadbalancing.Tocopewiththisproblem,wepro-posedagenericalgorithmtobuildamultilevelscheddomainhierarchy,accordingtothenumberofmemoryaccesslevelsthattheNUMAcomputercontains.InthispaperweevaluatetheperformanceoftheLinuxloadbalancingalgorithmindifferentNUMAarchitectures,usingthe2-levelscheddomainhierarchybuiltbythecurrentLinuxversionandusingann-levelhierarchybuiltbyourproposedalgo-rithm.Toperformthisevaluationweusetwodifferentapproaches:simulationandanalyticalmodels.ThesimulationmodelisdevelopedusingtheJavaSimsimula-tiontool[17],andtheanalyticalmodelisdescribedusingtheStochasticAutomataNetworks(SAN)formalism[22].Thispaperisorganizedasfollows.Section2describesthecurrentLinuxloadbalancingalgorithmandourproposaltoallowLinuxtoperformmultilevelloadbalancing.Section3describestheimplementationofourproposal.Sections4and5presentthesimulationandanalyticalmodelsrespectively,whichwereusedtocompareLinuxloadbalancingperformanceusingthecurrentalgorithmandthepro-posedsolution.Section6showstheresultsofbothevaluationmodelsanddemon-stratethatmultilevelloadbalancingcanpresentabetterperformanceintermsofaverageprocessesexecutiontimethanthecurrentLinuxloadbalancingalgorithm.Finally,Section7assessesfutureworkandemphasizesthemaincontributionsofthispaper.2LoadBalancinginNUMAComputersInaNUMAcomputer,processorsandmainmemoryaredistributedinnodes.Eachprocessorcanaccesstheentirememoryaddressspace,butwithdifferentlatencytimes[16].Ingeneral,ifthesystemhasasmallnumberofprocessors,themachinehasonlytwomemoryaccesslevels.Forexample,Figure1showsthearchitectureofaHPIntegritySuperdomeserver[14]with4nodesand16processors.thismachinehastwodifferentmemorylatencies:whenaproc
本文标题:Performance Evaluation of a Multilevel Load Balanc
链接地址:https://www.777doc.com/doc-3762671 .html