Fluent-UDF-16.0-L05-Parallel

© 2015 ANSYS, Inc. June 7, 2015
16.0 Release

Lecture 5: UDFs for Calculations in Parallel
User Defined Functions in ANSYS Fluent

Introduction
• Under some conditions, a UDF that works in serial must be modified to ensure that it will also work correctly in parallel
• The use of parallel computing for Fluent simulations has become commonplace due to continual advances in HPC technology and decreasing computer hardware costs
• Simulations for which UDFs are required must be able to run in parallel
[Figure: parallel scaling, ~84% efficiency for a 96M cell case at 10240 cores]

Objectives
• "Parallelizing" a UDF means modifying a UDF that works in serial so that it works properly both in serial and in parallel
  - Some UDFs need to be parallelized, others do not
• The motivation for this session is to introduce a few basic concepts that illustrate how to parallelize a UDF
  - It is not intended to be a training session or to discuss every possible consideration that applies to UDFs in parallel
  - More advanced topics such as low-level message passing, file writing, GPU programming, … will not be discussed in this session
• The objectives are to explain
  - How to know whether a UDF needs parallelization
  - A few basic concepts that need to be understood to parallelize a UDF
  - How to troubleshoot a UDF that is not working correctly in parallel

Fluent Parallel Architecture
Imagine a Fluent parallel session using 4 CPUs
• The session has 6 processes (cortex, host, and 4 compute nodes), connected as shown in the figure
• The grid and solution data are distributed to and stored on the compute-node processes
• The cortex (GUI) and host processes do not have any data
• The host process communicates commands from the cortex to node-0, which passes the commands to the other nodes
• When solution information is required, it is collected by node-0 from the other nodes and transferred to cortex via the host
[Figure: Cortex, Host, Compute-Node-0, Compute-Node-1, Compute-Node-2, Compute-Node-3]

A Simple Example Problem
• The case shown here will be used as the basis for numerous examples in this session
• If it is read into a Fluent parallel session using 4 CPUs, the mesh and solution data will be distributed into grid partitions as shown on the next slide
[Figure: example domain with boundary zones w_left, w_right, w_top, w_bottom]

Mesh with 4 Partitions
In serial, there is just one UDF, but in parallel, there are multiple, identical instances of the UDF executing independently of one another.

    #include "udf.h"

    #define ZONE_ID 2

    DEFINE_ON_DEMAND(counting)
    {
      Domain *d = Get_Domain(1);
      Thread *t = Lookup_Thread(d, ZONE_ID);
      cell_t c;
      int ncount = 0;

      /* Count the cells of the zone that this process can see */
      begin_c_loop(c, t)
      {
        ncount += 1;
      }
      end_c_loop(c, t)

      Message("Number of cells %d\n", ncount);
    }

Here there are 4 node processes and also the host process, so one instance of the UDF executes independently on 5 different processes.

Executing the UDF
The text user interface (TUI) command is used in order to indicate the point at which the DEFINE_ON_DEMAND function was executed.
[Screenshots: console output for Serial and for Parallel with 4 CPUs]
What happened here?
65 + 68 + 67 + 68 = 268

Four Basic Components of Parallel UDFs
The output from the parallel session can be understood by introducing four basic components of parallel UDFs
• Compiler Directives (preprocessor commands)
• Looping (internal and external cells and faces)
• Global Reductions (synchronization)
• Node-to-Host and Host-to-Node Data Transfer

Compiler Directives
To restrict certain commands in a UDF to be executed only on a node process, or only on the host process, compiler directives are used:

    #if RP_HOST
      /* Coding here only performed on HOST process */
    #endif

    #if RP_NODE
      /* Coding here only performed on NODE processes */
    #endif

    #if PARALLEL
      /* Coding here only performed on HOST & NODE processes */
    #endif

Since many of the operations will also be required in the serial version, the negated versions are more commonly used:

    #if !RP_HOST
      /* Coding here only performed on NODE & SERIAL processes */
    #endif

    #if !RP_NODE
      /* Coding here only performed on HOST & SERIAL processes */
    #endif

    #if !PARALLEL
      /* Coding here only performed on SERIAL process */
    #endif

Partition Boundaries
• Domain Decomposition: splits the cells in the domain across Compute Nodes
• Because Fluent's algorithms expect a cell to be on both sides of an interior face, copies of the neighboring partition's cells are kept on each Node
• Compute Node 0 has copies of the cells on the other side of all partition faces and Compute Node 1 has corresponding cell copies from Node 0
[Figure: Domain Decomposition; Distribution across Compute Nodes (Compute Node 0, Compute Node 1)]

Interior and Exterior Cells and Faces
• The main cells of each partition are designated as "Interior" cells.
• The additional copied cells from other Compute Nodes
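
The arithmetic on the "Executing the UDF" slide (65 + 68 + 67 + 68 = 268) can be illustrated outside Fluent. The following is a minimal standalone sketch, not Fluent code: it assumes each compute node has already counted the cells of its own partition and shows the summation that a global reduction would have to perform to recover the serial total. The function name `global_sum` and the array of per-node counts are hypothetical and are not part of the Fluent UDF API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a global integer sum across compute nodes.
 * local_counts[i] is the cell count reported by compute node i;
 * the return value is the total a serial run would report. */
int global_sum(const int *local_counts, size_t n_nodes)
{
    int total = 0;
    size_t i;

    assert(local_counts != NULL);

    /* Add up the partial results, one per compute node */
    for (i = 0; i < n_nodes; ++i)
        total += local_counts[i];

    return total;
}
```

Called with the four per-node counts from the slide, `global_sum` returns the serial count of 268. Because the host process holds no mesh data, it would not be expected to contribute any cells to this total.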
