Presented by 北京福思营销顾问有限公司
How to Configure Hardware for Structural Mechanics Simulation
李占营 (Li Zhanying), Senior CAE Engineer

Today's Agenda
• ANSYS Mechanical Solutions for Efficiently Solving Large Models
• Accelerate ANSYS Mechanical with GPU Computing
• An Example for Additional Hardware Consideration
• Recommended Hardware Solutions for Implicit FEA

ANSYS Mechanical HPC Solutions for Efficiently Solving Large Structural Models

Need for Speed
• Implicit structural FEA codes
• Run times can be hours, days or even weeks
  – Lots of computations!!
• Mesh fidelity continues to increase
• More equations to solve
  – More computations!!
• More complex physics being analyzed
• More nonlinear solutions
  – More computations!!

A History of HPC Performance
1980's
► Vector Processing on Mainframes
1990
► Shared Memory Multiprocessing (SMP) available
1994
► Iterative PCG Solver introduced for large analyses
1999-2000
► 64-bit large memory addressing
2004
► 1st company to solve 100M structural DOF
2005-2007
► Distributed PCG solver
► Distributed ANSYS (DMP) released
► Distributed sparse solver
► Variational Technology
► Support for clusters using Windows HPC
2007-2009
► Optimized for multicore processors
► Teraflop performance at 512 cores
2010
► GPU acceleration (single GPU; SMP)
2012
► GPU acceleration (multiple GPUs; DMP)

HPC – A Software Development Imperative
• Clock Speed – Leveling off
• Core Counts – Growing
  – Exploding (GPUs)
• Future performance depends on highly scalable parallel software
Source:

Hardware + Software
                     Laptop/Desktop or Workstation/Server    Cluster
ANSYS                YES                                     --
Distributed ANSYS    YES                                     YES

Release    Milestone
R9.0       First release of Distributed ANSYS
R11.0      Support for 1st distributed eigensolver (LANPCG)
           Support for full harmonic analyses
R12.0      Support for unsymmetric matrices
           Achieved over 1 Tflops using 512 cores
R13.0      Achieved over 3 Tflops using over 1024 cores

Benefits of Distributed ANSYS
• The entire SOLVE phase is parallel
• More computations are performed in parallel, giving a faster solution time
• Better speedups than SMP
• Can achieve 4x on 8 cores (try getting that with SMP!)
• Can be used for jobs running on up to 1024 cores
• Can take advantage of resources on multiple machines
• Whole new class of problems can be solved!
• Memory usage and bandwidth scale
• Disk (I/O) usage scales (i.e. parallel I/O)

Distributed ANSYS
• Enhanced scalability
• Parallel equation ordering scheme is now the default for the sparse solver
[Figure: Sparse Solver Improved Scalability for R13, 4 MDOF. Core solver rating vs. number of cores (1 to 256). Series: R12.1, R13.0, Ideal. Annotation: "Super Scaling!"]
Each node has two 6-core Westmere processors, 24 GB RAM and an Infiniband DDR2 (~2500 MB/s) interconnect. Only 8 of the 12 cores are used on each node.

HPC Starts on the Desktop: ANR Project MOISE
• 4.3 Mdof, large displacement, plasticity, creep (100000 frictional contact elements)
• 116 cumulative iterations, 48 hours
• Dell T7500, 8 cores, W5580

ANSYS Speedup on the Desktop: ANR Project MOISE
• Multicore speedup on a PC desktop
• 1.36 Mdof, large displacement, plasticity, creep, contact with friction
• 46 cumulative iterations: 5x speedup with 8 cores (run with 12.1, 1/2010)
[Figure: MOISE: DELL T7500 Scalability. Speedup (0 to 7) vs. number of cores (1, 2, 4, 6, 8). Series: X Mflops SMP, X Mflops DMP, X tot SMP, X tot DMP.]

Distributed ANSYS Performance
• Minimum time to solution is more important than scaling
• Turbine model
• 2.1 million DOF
• Nonlinear static analysis
• 1 load step, 7 substeps, 25 equilibrium iterations
• Linux cluster (8 cores per node)
[Figure: Solution Scalability. Speedup (0 to 25) vs. number of cores (0 to 64).]
[Figure: Solution Scalability. Solution elapsed time (0 to 45000) vs. number of cores (0 to 64). Annotated times: 11 hrs, 48 mins; 1 hr, 20 mins; 30 mins.]

Accelerate ANSYS Mechanical with GPU Computing

GPU: Introduction
• Graphics processing units (GPUs)
• Widely used for gaming and graphics rendering
• Recently made available as general-purpose "accelerators"
  – Support for double precision arithmetic
  – Performance exceeding the latest multicore CPUs
• So how can ANSYS Mechanical make use of this new technology to reduce the overall time to solution?

GPU Accelerator Capability
• Targeted hardware
GPU                      Power (W)   Memory        Memory bandwidth (GB/s)   Peak speed SP/DP (GFlops)
NVIDIA Tesla C2075       225         6 GB          144                       1030/515
NVIDIA Tesla M2090       250         6 GB          177.4                     1331/665
NVIDIA Quadro 6000       225         6 GB          144                       1030/515
NVIDIA Quadro K5000†     122         4 GB          173                       2290/95
NVIDIA Tesla K10         250         8 GB          320                       4577/190
NVIDIA Tesla K20†        250         6 to 24 GB    288                       5184/1728
† These NVIDIA "Kepler" based products are not released yet, so specifications may be incorrect.

GPU Accelerator Cost and Performance Benefit
Factors: gain over base platform (higher is better)
Configuration                                          Speed-up    Solution cost
ANSYS Mechanical, 2 cores                              1.0         1.0
ANSYS Mechanical + ANSYS HPC Pack, 8 cores             2.4         1.35
ANSYS Mechanical + ANSYS HPC Pack, 8 cores + GPU       5.1         1.38
Results: invest 38% more over the base license for 5x! Go more parallel.
Performance basis: V14sp-5 model; turbine geometry; 2,100K DOF; SOLID187 FEs; static, nonlinear; one iteration; direct sparse solver
Solution cost basis: ANSYS Mechanical; ANSYS HPC Pack; $10K for workstation; $2K for Tesla C2075

Take Advantage of Adding 1 GPU to Your CPUs
• Modal analysis of a radial impeller
• Block Lanczos eigensolver
• Cyclic symmetry model with 2 million DOF:
  – 337916 nodes
  – 222725 elements
  – 10-node tetrahedral solid element
• Results (baseline is 1 core):
  – With GPU, ~6x speedup on 1 core
  – ~8.5x speedup on 4 cores
• If 2 cores are taken as the baseline instead, 2 cores with the GPU accelerator give a 3.7x speedup!
Windows workstation: two Intel Xeon 5530 processors (2.4 GHz, 8 cores total), 48 GB RAM, NVIDIA Quadro 6000
Cores    GPU    Speedup
1        no     1.00
2        no     1.99
4        no     3.61
1        yes    5.92
2        yes    7.43
4        ye
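
The speedup and cost figures quoted above can be cross-checked with a few lines of arithmetic. The following Python sketch is illustrative only: the function and variable names are hypothetical, the 4-core + GPU row is omitted because its value is cut off in the source table, and the turbine check assumes (the deck does not say so) that the longest annotated time, 11 hrs 48 mins, is the baseline run.

```python
# Impeller modal analysis: speedups relative to 1 core without GPU.
# Keys are (cores, gpu_used). The 4-core + GPU row is truncated in the
# source table, so it is deliberately left out.
impeller = {
    (1, False): 1.00,
    (2, False): 1.99,
    (4, False): 3.61,
    (1, True): 5.92,
    (2, True): 7.43,
}

# Cost/performance chart: (speed-up, relative solution cost).
platforms = {
    "2 cores": (1.0, 1.00),
    "8 cores + HPC Pack": (2.4, 1.35),
    "8 cores + HPC Pack + GPU": (5.1, 1.38),
}


def rebaseline(speedups, base):
    """Re-express every speedup relative to another baseline configuration."""
    ref = speedups[base]
    return {cfg: s / ref for cfg, s in speedups.items()}


def parallel_efficiency(speedup, cores):
    """Speedup divided by core count; 1.0 would be ideal linear scaling."""
    return speedup / cores


def speedup_per_cost(table):
    """Speed-up gained per unit of relative solution cost."""
    return {name: s / c for name, (s, c) in table.items()}


def to_minutes(hours, minutes):
    """Convert an hours/minutes pair to minutes."""
    return hours * 60 + minutes


if __name__ == "__main__":
    # 2 cores + GPU measured against 2 cores without GPU.
    vs_two_cores = rebaseline(impeller, (2, False))
    print("2 cores + GPU vs 2 cores:", round(vs_two_cores[(2, True)], 2))

    # CPU-only scaling efficiency for each core count.
    for (cores, gpu), s in impeller.items():
        if not gpu:
            print(cores, "cores, efficiency:", round(parallel_efficiency(s, cores), 3))

    # Speed-up per unit cost for each platform.
    for name, ratio in speedup_per_cost(platforms).items():
        print(name, "->", round(ratio, 2))

    # Turbine model, ASSUMING the 11 hrs 48 mins run is the baseline.
    base = to_minutes(11, 48)
    print("turbine speedup at 30 mins:", base / to_minutes(0, 30))
    print("turbine speedup at 1 hr 20 mins:", base / to_minutes(1, 20))
```

Re-baselining the impeller table to 2 cores gives about 3.73 for 2 cores with GPU, which agrees with the 3.7x quoted in the bullets. Under the stated baseline assumption, the 30-minute turbine run works out to a speedup of 23.6, which fits within the 0 to 25 range of the speedup plot.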
Document title: 2013-ANSYS-UGM: How to Configure Hardware for Structural Simulation
Link: https://www.777doc.com/doc-4699838 .html