TORQUE® Administrator Guide version 3.0.2

TORQUE Admin Manual version 3.0.2

Legal Notices
Preface
Documentation Overview
Introduction
Glossary
1.0 Overview
1.1 Installation
1.2 Initialize/Configure TORQUE on the Server (pbs_server)
1.3 Advanced Configuration
1.4 Manual Setup of Initial Server Configuration
1.5 Server Node File Configuration
1.6 Testing Server Configuration
1.7 TORQUE on NUMA Systems
1.8 TORQUE Multi-MOM
2.0 Submitting and Managing Jobs
2.1 Job Submission
2.2 Monitoring Jobs
2.3 Canceling Jobs
2.4 Job Preemption
2.5 Keeping Completed Jobs
2.6 Job Checkpoint and Restart
2.7 Job Exit Status
2.8 Service Jobs
3.0 Managing Nodes
3.1 Adding Node
3.2 Configuring Node Properties
3.3 Changing Node State
3.4 Host Security
3.5 Linux Cpuset Support
3.6 Scheduling Cores
3.7 Scheduling GPUs
4.0 Setting Server Policies
4.1 Queue Configuration
4.2 Server High Availability
5.0 Interfacing with a Scheduler
5.1 Integrating Schedulers for TORQUE
6.0 Configuring Data Management
6.1 SCP/RCP Setup
6.2 NFS and Other Networked Filesystems
6.3 File Stage-In/Stage-Out
7.0 Interfacing with Message Passing
7.1 MPI (Message Passing Interface) Support
8.0 Managing Resources
8.1 Monitoring Resources
9.0 Accounting
9.1 Accounting Records
10.0 Logging
10.1 Job Logging
11.0 Troubleshooting
11.1 Troubleshooting
11.2 Compute Node Health Check
11.3 Debugging
Appendices
Appendix A: Commands Overview
Client Commands
momctl
pbsdsh
pbsnodes
qalter
qchkpt
qdel
qhold
qmgr
qrerun
qrls
qrun
qsig
qstat
qsub
qterm
tracejob
Server Commands
pbs_mom
pbs_server
pbs_track
Appendix B: Server Parameters
Appendix C: MOM Configuration
Appendix D: Error Codes and Diagnostics
Appendix E: Considerations Before Upgrading
Appendix F: Large Cluster Considerations
Appendix G: Prologue and Epilogue Scripts
Appendix H: Running Multiple TORQUE Servers and Moms on the Same Node
Appendix I: Security Overview
Appendix J: Submit Filter (aka qsub Wrapper)
Appendix K: torque.cfg File
Appendix L: TORQUE Quick Start Guide
Changelog

Legal Notices

Copyright © 2011 Adaptive Computing Enterprises, Inc. All rights reserved. Distribution of this document for commercial purposes in either hard or soft copy form is strictly prohibited without prior written consent from Adaptive Computing Enterprises, Inc.

Trademarks

Adaptive Computing, Cluster Resources, Moab, Moab Workload Manager, Moab Cluster Manager, Moab Cluster Suite, Moab Grid Scheduler, Moab Grid Suite, Moab Access Portal, and other Adaptive Computing products are either registered trademarks or trademarks of Adaptive Computing Enterprises, Inc. The Adaptive Computing logo and the Cluster Resources logo are trademarks of Adaptive Computing Enterprises, Inc. All other company and product names may be trademarks of their respective companies.

Acknowledgments

TORQUE includes software developed by NASA Ames Research Center, Lawrence Livermore National Laboratory, and Veridian Information Solutions, Inc. Visit ... (optional) qmgr options necessary to get the system up and running. System Testing is also covered.

The 2.0 Submitting and Managing Jobs section covers the actions that apply to jobs. The first section, 2.1 Job Submission, details how to submit a job and request resources (nodes, software licenses, and so forth) and provides several examples. Other actions include monitoring, canceling, preemption, and keeping completed jobs.

The 3.0 Managing Nodes section covers administrator tasks relating to nodes, which include adding nodes, changing node properties, and identifying node state. Section 3.4 Host Security also explains how to configure restricted user access to nodes.

The 4.0 Setting Server Policies section details server-side configuration of queues and high availability.

The 5.0 Interfacing with a Scheduler section offers information about using the native scheduler versus an advanced scheduler.

The 6.0 Configuring Data Management section deals with issues of data management. For non-network file systems, the SCP/RCP Setup section details setting up SSH keys and nodes to automate transferring data. The NFS and Other Networked Filesystems section covers configuration for these file systems. This chapter also addresses the use of File Stage-In/Stage-Out using the stagein and stageout directives of the qsub command.

The 7.0 Interfacing with Message Passing section offers details on supporting MPI (Message Passing Interface).

The 8.0 Managing Resources section covers configuration, utilization, and states of resources.

The 9.0 Accounting section explains how jobs are tracked by TORQUE for accounting purposes.

The 11.0 Troubleshooting section is a troubleshooting guide that offers help with general problems; it includes an FAQ (Frequently Asked Questions) list and instructions for how to set up and use compute node checks and how to debug TORQUE.

The numerous appendices provide tables of commands, parameters, configuration options, error codes, the Quick Start Guide, and so forth.

A. Commands Overview
B. Server Parameters
C. MOM Configuration
D. Error Codes and Diagnostics
E. Considerations Before Upgrading
F. Large Cluster Considerations
G. Prologue and Epilogue Scripts
H. Running Multiple TORQUE Servers and Moms