2013 ANSYS UGM: How to Configure Hardware for Structural Simulation

Presented by 北京福思营销顾问有限公司

How to Configure Hardware for Structural Mechanics Simulation
Li Zhanying (李占营), Senior CAE Engineer

Today’s Agenda
• ANSYS Mechanical Solutions For Efficiently Solving Large Models
• Accelerate ANSYS Mechanical with GPU Computing
• An Example for Additional Hardware Consideration
• Recommended Hardware Solutions for Implicit FEA

ANSYS Mechanical HPC Solutions For Efficiently Solving Large Structural Models

Need for Speed
• Implicit structural FEA codes
• Run times can be hours, days or even weeks – Lots of computations!!
• Mesh fidelity continues to increase
• More equations to solve – More computations!!
• More complex physics being analyzed
• More nonlinear solutions – More computations!!

A History of HPC Performance
1980's ► Vector Processing on Mainframes
1990 ► Shared Memory Multiprocessing (SMP) available
1994 ► Iterative PCG Solver introduced for large analyses
1999-2000 ► 64-bit large memory addressing
2004 ► 1st company to solve 100M structural DOF
2005-2007 ► Distributed PCG solver
          ► Distributed ANSYS (DMP) released
          ► Distributed sparse solver
          ► Variational Technology
          ► Support for clusters using Windows HPC
2007-2009 ► Optimized for multicore processors
          ► Teraflop performance at 512 cores
2010 ► GPU acceleration (single GPU; SMP)
2012 ► GPU acceleration (multiple GPUs; DMP)

HPC – A Software Development Imperative
• Clock Speed – Leveling off
• Core Counts – Growing
• Exploding (GPUs)
• Future performance depends on highly scalable parallel software
Source:

Hardware + Software

                     Laptop/Desktop or Workstation/Server   Cluster
ANSYS                YES                                    --
Distributed ANSYS    YES                                    YES

R9.0   First release of Distributed ANSYS
R11.0  Support for 1st distributed eigensolver (LANPCG)
       Support for full harmonic analyses
R12.0  Support for unsymmetric matrices
       Achieved over 1 Tflops using 512 cores
R13.0  Achieved over 3 Tflops using over 1024 cores

Benefits of Distributed ANSYS
• The entire SOLVE phase is parallel
• More computations performed in parallel, so faster solution time
• Better speedups than SMP
• Can achieve 4x on 8 cores (Try getting that with SMP!!!!)
• Can be used for jobs running on up to 1024 cores
• Can take advantage of resources on multiple machines
• Whole new class of problems can be solved!
• Memory usage and bandwidth scales
• Disk (I/O) usage scales (i.e. parallel I/O)

Distributed ANSYS
• Enhanced scalability
• Parallel equation ordering scheme is now default for sparse solver

[Figure: Sparse Solver Improved Scalability for R13, 4 MDOF. Core solver rating (1 to 256) versus number of cores (1 to 256). Series: R12.1, R13.0, Ideal. Annotation: Super Scaling!]

Each node has two 6-core Westmere processors, 24 GB RAM and Infiniband DDR2 (~2500 MB/s) interconnect. Only 8 of 12 cores are used on each node.

HPC start on Desktop: ANR Project MOISE
• 4.3 Mdof, large displacement, plasticity, creep (100000 frictional contact elements)
• 116 cumulative iterations, 48 hours
• Dell T7500, 8 cores, W5580

Multicore Speedup on a PC Desktop
• ANSYS speedup on Desktop: ANR Project MOISE
• 1.36 Mdof, large displacement, plasticity, creep, contact with friction
• 46 cumulative iterations: 5x speedup with 8 cores (run with 12.1, 1/2010)

[Figure: MOISE: DELL T7500 Scalability. Speedup (0 to 7) versus cores (1, 2, 4, 6, 8). Series: X Mflops SMP, X Mflops DMP, X tot SMP, X tot DMP.]

Distributed ANSYS Performance
• Minimum time to solution more important than scaling
• Turbine model
• 2.1 million DOF
• Nonlinear static analysis
• 1 Loadstep, 7 substeps, 25 equilibrium iterations
• Linux cluster (8 cores per node)

[Figure: Solution Scalability. Speedup (0 to 25) versus number of cores (0 to 64).]

[Figure: Solution Scalability. Solution Elapsed Time (0 to 45000) versus number of cores (0 to 64). Annotations: 11 hrs, 48 mins; 1 hr, 20 mins; 30 mins.]

Accelerate ANSYS Mechanical with GPU Computing

GPU: Introduction
• Graphics processing units (GPUs)
• Widely used for gaming, graphics rendering
• Recently been made available as general-purpose “accelerators”
  – Support for double precision arithmetic
  – Performance exceeding the latest multicore CPUs
• So how can ANSYS Mechanical make use of this new technology to reduce the overall time to solution?

GPU Accelerator Capability
• Targeted hardware

GPU                     Power (W)  Memory       Memory Bandwidth (GB/s)  Peak Speed SP/DP (GFlops)
NVIDIA Tesla C2075      225        6 GB         144                      1030/515
NVIDIA Tesla M2090      250        6 GB         177.4                    1331/665
NVIDIA Quadro 6000      225        6 GB         144                      1030/515
NVIDIA Quadro K5000†    122        4 GB         173                      2290/95
NVIDIA Tesla K10        250        8 GB         320                      4577/190
NVIDIA Tesla K20†       250        6 to 24 GB   288                      5184/1728

† These NVIDIA “Kepler” based products are not released yet, so specifications may be incorrect

GPU Accelerator Cost and Performance Benefit

[Figure: Factors Gain Over Base Platform (0 to 6), higher is better. Series: CPU Speed-up, CPU/GPU Speed-up, Solution Cost. Annotation: Go More Parallel.]

Configuration                                        Speed-up   Solution Cost
ANSYS Mechanical, 2 Cores                            1.0        1.0
ANSYS Mechanical, ANSYS HPC Pack, 8 Cores            2.4        1.35
ANSYS Mechanical, ANSYS HPC Pack, 8 Cores + GPU      5.1        1.38

Results: Invest 38% more over base license for 5x!

Solution Cost Basis
• ANSYS Mechanical
• ANSYS HPC Pack
• $10K for workstation
• $2K for Tesla C2075

Performance Basis: V14sp-5 Model
• Turbine geometry
• 2,100K DOF
• SOLID187 FEs
• Static, nonlinear
• One iteration
• Direct sparse

Take Advantage of Adding 1 GPU to your CPUs
• Modal analysis of a radial impeller
• Block Lanczos Eigensolver
• Cyclic symmetry model with 2 million DOF:
  – 337916 nodes
  – 222725 elements
  – 10-node tetrahedral solid element
• Results (baseline is 1 core):
• With GPU, ~6x speedup on 1 core
• ~8.5x speedup on 4 cores
• If 2 cores is taken as baseline instead, 2 cores with GPU Accelerator results in 3.7x speedup!

Windows workstation: Two Intel Xeon 5530 processors (2.4 GHz, 8 cores total), 48 GB RAM, NVIDIA Quadro 6000

Cores  GPU  Speedup
1      no   1.00
2      no   1.99
4      no   3.61
1      yes  5.92
2      yes  7.43
4      ye
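The speedup figures quoted on the impeller and cost-benefit slides can be cross-checked with a few lines of arithmetic. The Python sketch below is illustrative only: the numbers are copied from the slides, the names `speedup`, `configs`, `parallel_efficiency`, `gpu_gain` and `speedup_per_cost` are made up for this example, and the pairing of the values 1.0/2.4/5.1 (speed-up) and 1.0/1.35/1.38 (solution cost) with the three configurations is an assumed reading of the chart. The "4 cores + GPU" row is left out because its value is cut off in the source.

```python
# Speedups relative to 1 core without GPU, copied from the impeller table.
# Key: (cores, gpu_used). The "4 cores + GPU" row is omitted because its
# value is truncated in the source.
speedup = {
    (1, False): 1.00,
    (2, False): 1.99,
    (4, False): 3.61,
    (1, True): 5.92,
    (2, True): 7.43,
}


def parallel_efficiency(cores, gpu=False):
    """Speedup divided by core count (1.0 would be ideal CPU scaling)."""
    return speedup[(cores, gpu)] / cores


def gpu_gain(cores):
    """Extra speedup from adding the GPU at a fixed core count."""
    return speedup[(cores, True)] / speedup[(cores, False)]


# Cost-benefit slide: (speed-up, relative solution cost) per configuration.
# The assignment of values to configurations is an assumption (see lead-in).
configs = {
    "2 cores": (1.0, 1.0),
    "8 cores + HPC Pack": (2.4, 1.35),
    "8 cores + HPC Pack + GPU": (5.1, 1.38),
}


def speedup_per_cost(name):
    """Speed-up obtained per unit of relative solution cost."""
    gain, cost = configs[name]
    return gain / cost


if __name__ == "__main__":
    for cores in (1, 2, 4):
        print(f"{cores} cores, no GPU: efficiency {parallel_efficiency(cores):.3f}")
    for cores in (1, 2):
        print(f"{cores} cores: GPU gain {gpu_gain(cores):.2f}x")
    for name in configs:
        print(f"{name}: speed-up per unit cost {speedup_per_cost(name):.2f}")
```

With these inputs the GPU gain at 2 cores is 7.43 / 1.99, roughly 3.7x, which agrees with the figure quoted on the slide.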
