Lecture 5: UDFs for Calculations in Parallel
User Defined Functions in ANSYS Fluent
16.0 Release
©2015 ANSYS, Inc. June 7, 2015

Introduction
• Under some conditions, a UDF that works in serial must be modified to ensure that it will also work correctly in parallel
• The use of parallel computing for Fluent simulations has become commonplace due to continual advances in HPC technology and decreasing computer hardware costs.
• Simulations for which UDFs are required must be able to run in parallel
(Figure: ~84% efficiency for 96M cell case at 10240 cores)

Objectives
• "Parallelizing" a UDF means modifying a UDF that works in serial so that it works properly both in serial and parallel
  – Some UDFs need to be parallelized, others do not
• The motivation for this session is to introduce a few basic concepts that illustrate how to parallelize a UDF
  – It is not intended to be a training session or to discuss every possible consideration that applies to UDFs in parallel
  – More advanced topics such as low level message passing, file writing, GPU programming, … will not be discussed in this session
• The objectives are to explain
  – How to know whether a UDF needs parallelization
  – A few basic concepts that need to be understood to parallelize a UDF
  – How to troubleshoot a UDF that is not working correctly in parallel

Fluent Parallel Architecture
Imagine a Fluent parallel session using 4 CPUs
• The session has 6 processes (cortex, host, and 4 compute nodes), connected as shown in the figure
• The grid and solution data are distributed to and stored on the compute-node processes
• The cortex (GUI) and host processes do not have any data
• The host process communicates commands from the cortex to node-0, which passes the commands to the other nodes
• When solution information is required, it is collected by node-0 from the other nodes and transferred to cortex via the host
(Figure: Cortex, Host, Compute-Node-0, Compute-Node-1, Compute-Node-2, Compute-Node-3)

A Simple Example Problem
• The case shown here will be used as the basis for numerous examples in this session
• If it is read into a Fluent parallel session using 4 cpus, the mesh and solution data will be distributed into grid partitions as shown on the next slide
(Figure: domain with boundaries w_right, w_bottom, w_top, w_left)

Mesh with 4 Partitions
In serial, there is just one UDF, but in parallel, there are multiple, identical instances of the UDF executing independently of one another.

    #include "udf.h"
    #define ZONE_ID 2

    DEFINE_ON_DEMAND(counting)
    {
      Domain *d = Get_Domain(1);
      Thread *t = Lookup_Thread(d, ZONE_ID);
      cell_t c;
      int ncount = 0;

      /* Count the cells of the zone that this process can loop over */
      begin_c_loop(c, t)
      {
        ncount += 1;
      }
      end_c_loop(c, t)

      Message("Number of cells %d\n", ncount);
    }

Here there are 4 node processes and also the host process, so one instance of the UDF executes independently on 5 different processes.

Executing the UDF
The text user interface (TUI) command is used in order to indicate the point at which the DEFINE_ON_DEMAND function was executed.
(Figure: console output, Serial and Parallel – 4 cpus)
What happened here?
65 + 68 + 67 + 68 = 268

Four Basic Components of Parallel UDFs
The output from the parallel session can be understood by introducing four basic components of parallel UDFs
• Compiler Directives (preprocessor commands)
• Looping (internal and external cells and faces)
• Global Reductions (synchronization)
• Node-to-Host and Host-to-Node Data Transfer

Compiler Directives
To restrict certain commands in a UDF so that they are executed only on a node process, or only on a host process, compiler directives are used:

    #if RP_HOST
      /* Coding here only performed on HOST process */
    #endif

    #if RP_NODE
      /* Coding here only performed on NODE processes */
    #endif

    #if PARALLEL
      /* Coding here only performed on HOST & NODE processes */
    #endif

Since many of the operations will also be required in the serial version, the negated versions are more commonly used:

    #if !RP_HOST
      /* Coding here only performed on NODE & SERIAL processes */
    #endif

    #if !RP_NODE
      /* Coding here only performed on HOST & SERIAL processes */
    #endif

    #if !PARALLEL
      /* Coding here only performed on SERIAL process */
    #endif

Partition Boundaries
• Domain Decomposition: Splits the cells in the domain across Compute Nodes
• Because Fluent's algorithms expect a cell to be on both sides of an interior face, copies of the neighboring partition's cells are kept on each Node
• Compute Node 0 has copies of the cells on the other side of all partition faces and Compute Node 1 has corresponding cell copies from Node 0
(Figure: Domain Decomposition; Distribution across Compute Nodes, showing Compute Node 0 and Compute Node 1)

Interior and Exterior Cells and Faces
• The main cells of each partition are designated as "Interior" cells.
• The additional copied cells from other Compute Nodes
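
The "What happened here?" result of the cell-counting example can be reproduced outside Fluent. The sketch below is plain C, not Fluent UDF code: `count_local_cells`, `global_sum_int`, and `simulate_parallel_count` are hypothetical names chosen for illustration, and it assumes that the four terms 65, 68, 67, 68 in the sum are the cell counts held by the four compute nodes. It shows why each process sees only its own partition, and how a global sum reduction (the third of the four basic components) would recover the total.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one compute node's cell loop:
   it can only count the cells stored in its own partition. */
int count_local_cells(const int *partition_sizes, size_t node_id)
{
    int ncount = 0;
    for (int c = 0; c < partition_sizes[node_id]; ++c) {
        ncount += 1; /* mirrors "ncount += 1" inside the UDF's cell loop */
    }
    return ncount;
}

/* Hypothetical stand-in for a global integer sum reduction:
   every node contributes its local value and all nodes
   end up with the same total. */
int global_sum_int(const int *local_values, size_t n_nodes)
{
    assert(n_nodes > 0);
    int total = 0;
    for (size_t i = 0; i < n_nodes; ++i) {
        total += local_values[i];
    }
    return total;
}

/* Runs the counting example on four simulated compute nodes
   and returns the reduced total. */
int simulate_parallel_count(void)
{
    /* Assumed per-node cell counts, taken from the sum on the slide */
    const int partition_sizes[4] = {65, 68, 67, 68};
    int local[4];

    for (size_t n = 0; n < 4; ++n) {
        local[n] = count_local_cells(partition_sizes, n);
    }
    return global_sum_int(local, 4);
}
```

Without the reduction step, each simulated node would report only its own count (for example 67 on the third node); with it, `simulate_parallel_count()` returns 268, the total given on the slide.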