您好,欢迎访问三七文档
Streamline:ASchedulingHeuristicforStreamingApplicationsontheGridBikashAgarwalla,NovaAhmed,DavidHilley,UmakishoreRamachandranCollegeofComputing,GeorgiaInstituteofTechnology801AtlanticDriveNW,Atlanta,GA30332-0280,USAEmail:fbikash,nova,davidhi,ramag@cc.gatech.eduAbstractStreamingapplicationssuchasvideo-basedsurveil-lance,habitatmonitoring,andemergencyresponsearegoodcandidatesforexecutingonhigh-performancecom-puting(HPC)resources,duetotheirhighcomputationandcommunicationneeds.Suchanapplicationcanberepre-sentedasacoarse-graindataflowgraph,eachnodecor-respondingtoastageofthepipelineoftransformationsthatmaybeappliedtothedataasitcontinuouslystreamsthrough.MappingsuchapplicationstoHPCresourceshastobesensitivetothecomputationandcommunicationneedsofeachstageofthepipelinetoensureQoScriteriasuchaslatencyandthroughput.Duetothedynamicna-tureofsuchapplications,theyareidealcandidatesforus-ingambientHPCresourcesmadeavailableviathegrid.Sincegridhasevolvedoutoftraditionalhigh-performancecomputing,thetoolsavailable,especiallyforscheduling,tendtobemoreappropriateforbatch-orientedapplica-tions.Wehavedevelopedascheduler,calledStreamline,thattakesintoaccountdynamicnatureofthegridandrunsperiodicallytoadaptschedulingdecisionsusingapplica-tionrequirements(per-stagecomputationandcommunica-tionneeds),applicationconstraints(suchasco-locationofstagesonthesamenode),andcurrentresourceavailability.Theschedulerisdesignedtobeintegratedwiththeexist-inggridframeworkusingGlobusToolkit.TheperformanceofStreamlineiscomparedwithanOptimalplacementforsmallnumberofresourcesandapproximationalgorithmsusingSimulatedAnnealingforlargeresourcesanddataflowgraphs.WehavealsocomparedStreamlinewithabaselinegridscheduler,E-Condor,builtontopofCondorforstream-ingapplications.Forkernelsofsuchstreamingapplica-tions,weshowthatourheuristicperformsclosetoOptimal,andcanbenearlyanorderofmagnitudebetterthantheE-Condorundernon-uniformloadconditions.WeconsideredtwoSimulatedAnnealingalgorithmswithdifferentexecu-tiontimes,andshowthatneighbor-selectionandannealingschedulehavearelativelylargerimpactontheperformanceofSimulatedAnnealingforcommunication-intensiveker-nelsthanforcomputationintensivekernels.WehavealsoconductedscalabilitystudiesandshowthatourschedulerismoreeffectivethanE-Condor,andperformsclosetoSim-ulatedAnnealingalgorithms,withsmallerschedulingtime,inallocatingresourcesforalargestreamingapplication.1IntroductionAdvancesinsensingcapabilities,andcomputingandcommunicationinfrastructuresarepavingthewayfornewanddemandingapplications.Video-basedsurveillance,emergencyresponse,disasterrecovery,habitatmonitor-ing,andtelepresenceareallexamplesofsuchapplications.Theseapplicationsintheirfullformarecapableofstress-ingtheavailablecomputingandcommunicationinfrastruc-turestotheirlimits.Streamingapplications,aswerefertosuchapplicationsinthispaper,havethefollowingcharac-teristics:(1)theyarecontinuousinnature,(2)theyrequireefcienttransportofdatafrom/todistributedsources/sinks,and(3)theyrequiretheefcientuseofhigh-performancecomputingresourcestocarryoutcompute-intensivetasksinatimelymanner.Thefocusofthisworkisinaddressingthethirdcom-ponentoftheabovecharacteristics,namely,theuseofhigh-performancecomputing(HPC)resourcestocarryoutcompute-intensivetasks.Considerforexample,avideo-basedsurveillanceapplication.Thecomputeintensivepartofsuchanapplicationmayconsistofanalyzingmultiplecamerafeedsfromaregiontoextracthigherlevelinfor-mationsuchasmotion,presenceorabsenceofahu-manface,orpresenceorabsenceofanykindofsuspi-ciousactivity.Suchanapplicationcanberepresentedasacoarse-graindataowgraph,whereinthenodesrepresentincreasingsophisticationofcomputationsthatmayneedtobeperformedonthedatastreamtofacilitatetheextractionofhigh-levelinformation.Atsomelevelsuchcoarse-graindataowgraphsresem-bletask-graphsthathavebeenthefocusofmultiprocessorschedulingworkfromthe70's[2,11,23,17,24,16,20,26].However,thereareseveralcrucialdifferences.Inmultipro-cessorscheduling,atask-graph(adirectedacyclicgraph)isusedasanorderingmechanismtoshowthedependen-ciesamongtheindividualtasksofaparallelcomputation.Thesedependenciesarerespectedandexploitedinarriv-ingatamappingheuristic(sincetheschedulingproblemisknowntobeNP-Complete)thatmaximizestheutiliza-tionofthecomputationalresourcesandreducesthecom-pletiontimeoftheapplication.Thecoarse-graindataowgraphofastreamingapplication,ontheotherhand,isarepresentationoftheprocessingthatiscarriedoutonthedataduringitspassagethroughthepipelineofstages.Inthesteadystate,allthestagesofthepipelineareworkingondifferentsnapshotsofthestreamdata.Forexampleinavideo-basedsurveillanceapplication,whenthen-thstageisworkingonthe(hithertotransformedresultsofthe)ithframeofvideo,therststageofthepipelineisworkingonthe(n+i)thframe.Sotheschedulingofsuchstreamingapplicationsisnotanorderingissue;rather,itisamatterofmappingthedifferentstagesofthepipelinetotheavailableresourcesrespectingthecomputationandcommunicationrequirementsofeachstagewithaviewtooptimizingthelatencyandthroughputmetricsoftheentirepipeline.Aninter
本文标题:Streamline A Scheduling Heuristic for Streaming Ap
链接地址:https://www.777doc.com/doc-3422729 .html