The Peregrine High-Performance RPC System

David B. Johnson¹    Willy Zwaenepoel

Department of Computer Science
Rice University
P.O. Box 1892
Houston, Texas 77251-1892

dbj@cs.cmu.edu, willy@cs.rice.edu

A version of this paper appeared in Software—Practice & Experience, 23(2):201–221, February 1993.

This work was supported in part by the National Science Foundation under Grants CDA-8619893 and CCR-9116343, and by the Texas Advanced Technology Program under Grant No. 003604014.

¹ Author’s current address: School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213-3891.

Summary

The Peregrine RPC system provides performance very close to the optimum allowed by the hardware limits, while still supporting the complete RPC model. Implemented on an Ethernet network of Sun-3/60 workstations, a null RPC between two user-level threads executing on separate machines requires 573 microseconds. This time compares well with the fastest network RPC times reported in the literature, ranging from about 1100 to 2600 microseconds, and is only 309 microseconds above the measured hardware latency for transmitting the call and result packets in our environment. For large multi-packet RPC calls, the Peregrine user-level data transfer rate reaches 8.9 megabits per second, approaching the Ethernet’s 10 megabit per second network transmission rate. Between two user-level threads on the same machine, a null RPC requires 149 microseconds. This paper identifies some of the key performance optimizations used in Peregrine, and quantitatively assesses their benefits.

Keywords: Peregrine, remote procedure call, interprocess communication, performance, distributed systems, operating systems

1. Introduction

The Peregrine remote procedure call (RPC) system is heavily optimized for providing high-performance interprocess communication, while still supporting the full generality and functionality of the RPC model [3, 10], including arguments and result values of arbitrary data types. The semantics of the RPC model provides ample opportunities for optimizing the performance of interprocess communication, some of which are not available in message-passing systems that do not use RPC. This paper describes how Peregrine exploits these and other opportunities for performance improvement, and presents Peregrine’s implementation and measured performance.

We concentrate primarily on optimizing the performance of network RPC, between two user-level threads executing on separate machines, but we also support efficient local RPC, between two user-level threads executing on the same machine. High-performance network RPC is important for shared servers and for parallel computations executing on networks of workstations.

Peregrine provides RPC performance that is very close to the hardware latency. For network RPCs, the hardware latency is the sum of the network penalty [6] for sending the call and the result message over the network. The network penalty is the time required for transmitting a message of a given size over the network from one machine to another, and is measured without operating system overhead or interrupt latency. The network penalty is greater than the network transmission time for packets of the same size because the network penalty includes additional network, device, and processor latencies involved in sending and receiving packets. Latency for local RPCs is determined by the processor and memory architecture, and includes the expense of the required local procedure call, kernel trap handling, and context switching overhead [2].

We have implemented Peregrine on a network of Sun-3/60 workstations, connected by a 10 megabit per second Ethernet. These workstations each use a 20-megahertz Motorola MC68020 processor and an AMD Am7990 LANCE Ethernet network controller. The implementation uses an RPC packet protocol similar to Cedar RPC [3], except that a blast protocol [20] is used for multi-packet messages. The RPC protocol is layered directly on top of the IP Internet datagram protocol [13].

In this implementation, the measured latency for a null RPC with no arguments or return values between two user-level threads executing on separate Sun-3/60 workstations on the Ethernet is 573 microseconds. This time compares well with the fastest null network RPC times reported in the literature, ranging from about 1100 to 2600 microseconds [3, 12, 8, 15, 17, 19], and is only 309 microseconds above the measured hardware latency defined by the network penalty for the call and result packets in our environment. A null RPC with a single 1-kilobyte argument requires 1397 microseconds, showing an increase over the time for null RPC with no arguments of just the network transmission time for the additional bytes of the call packet. This time is 338 microseconds above the network penalty, and is equivalent to a user-level data transfer rate of 5.9 megabits per second. For large multi-packet RPC calls, the network user-level data transfer rate reaches 8.9 megabits per second, achieving 89 percent of the hardware network bandwidth and 95 percent of the maximum achievable transmission bandwidth based on the network penalty. Between two user-level threads executing on the same machine, a null RPC with no arguments or return values requires 149 microseconds.

In Section 2 of this paper, we present an overview of the Peregrine RPC system. Section 3 discusses some of the key performance optimizations used in Peregrine. In Section 4, we describe the Peregrine implementation, including single-packet network RPCs, multi-packet network RPCs, and local RPCs. The measured performance of Peregrine RPC is presented in Section 5. In Section 6, we quantify the effectiveness of the optimizations mentioned in Section 3. Section 7 compares our work to other RPC systems, and Section 8 presents our conclusions.

2. Overview
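As a cross-check on the latency and throughput figures quoted in the introduction, the arithmetic connecting them can be sketched as follows. This is only an illustrative calculation, not part of Peregrine: the constant and function names are hypothetical, and the sketch assumes that a "kilobyte" means 1024 bytes and that the derived hardware latency and network penalty are simply the quoted totals minus the quoted differences.

```python
# Figures quoted in the introduction; times are in microseconds.
NULL_RPC_US = 573              # null network RPC
NULL_OVER_HARDWARE_US = 309    # excess of the null RPC over hardware latency
ONE_KB_RPC_US = 1397           # RPC with a single 1-kilobyte argument
ONE_KB_OVER_PENALTY_US = 338   # excess of that RPC over the network penalty
ETHERNET_MBPS = 10.0           # raw Ethernet transmission rate
BULK_RATE_MBPS = 8.9           # user-level rate for large multi-packet calls
BULK_FRACTION_OF_MAX = 0.95    # quoted share of the penalty-limited maximum

KILOBYTE = 1024  # assumption: the paper's "kilobyte" is 1024 bytes


def rate_mbps(payload_bytes, elapsed_us):
    """Bits per microsecond, which is numerically megabits per second."""
    return payload_bytes * 8 / elapsed_us


def wire_time_us(payload_bytes, link_mbps):
    """Time in microseconds to clock payload_bytes onto the link."""
    return payload_bytes * 8 / link_mbps


# Implied hardware latency (call plus result packet) for the null RPC.
hardware_latency_null_us = NULL_RPC_US - NULL_OVER_HARDWARE_US    # 264
# Implied network penalty for the 1-kilobyte call and its result.
network_penalty_1kb_us = ONE_KB_RPC_US - ONE_KB_OVER_PENALTY_US   # 1059
# User-level transfer rate for the 1-kilobyte argument.
rate_1kb_mbps = rate_mbps(KILOBYTE, ONE_KB_RPC_US)
# Extra time over the null RPC, compared with pure wire time for 1024 bytes.
extra_time_1kb_us = ONE_KB_RPC_US - NULL_RPC_US                   # 824
extra_wire_time_us = wire_time_us(KILOBYTE, ETHERNET_MBPS)
# Maximum rate implied by "95 percent of the maximum achievable bandwidth".
penalty_limited_max_mbps = BULK_RATE_MBPS / BULK_FRACTION_OF_MAX

if __name__ == "__main__":
    print(f"hardware latency, null RPC: {hardware_latency_null_us} us")
    print(f"network penalty, 1 KB RPC:  {network_penalty_1kb_us} us")
    print(f"1 KB transfer rate:         {rate_1kb_mbps:.1f} Mbit/s")
    print(f"extra time vs. wire time:   {extra_time_1kb_us} vs. "
          f"{extra_wire_time_us:.1f} us")
    print(f"fraction of Ethernet rate:  {BULK_RATE_MBPS / ETHERNET_MBPS:.2f}")
    print(f"implied achievable maximum: {penalty_limited_max_mbps:.2f} Mbit/s")
```

Under these assumptions the 1-kilobyte rate rounds to the quoted 5.9 megabits per second, and the 824-microsecond increase over the null RPC is within a few microseconds of the time needed to transmit 1024 additional bytes at 10 megabits per second, consistent with the claim that the increase is essentially just network transmission time.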