您好,欢迎访问三七文档
ANOUT-OF-CORESPARSESYMMETRICINDEFINITEFACTORIZATIONMETHODOMERMESHARANDSIVANTOLEDOAbstract.Wepresentanewout-of-coresparsesymmetric-indefinitefactor-izationalgorithm.Themostsignificantinnovationofthenewalgorithmisadynamicpartitioningmethodforthesparsefactor.Thispartitioningmethodresultsinverylowinput-outputtrafficandallowsthealgorithmtorunathighcomputationalrateseventhoughthefactorisstoredonaslowdisk.Ourimplementationofthenewcodecompareswellwithbothhigh-performancein-coresparsesymmetric-indefinitecodesandwithahigh-performanceout-of-coresparseCholeskycode.Morespecifically,thenewcodeprovidesanewcapa-bilitythatnoneoftheseexistingcodeshas:itcanfactorsymmetricindefinitematriceswhosefactorsarelargerthanmainmemory;itissomewhatslower,butnotbymuch.Forexample,itfactors,onaconventional32-bitworksta-tion,anindefinitefinite-elementmatrixwhosefactorsizeisabout10GBinlessthananhour.1.IntroductionWepresentamethodforfactoringalargesparsesymmetricindefinitematrixA.BystoringthetriangularfactorofAondiskthemethodcanhandlelargematriceswhosefactorsdonotfitwithinthemainmemoryofthecomputer.AdynamicI/O-awarepartitioningofthematrixensuresthatthemethodperformslittlediskI/Oevenwhenthefactorismuchlargerthanmainmemory.Ourexperimentsindicatethatthemethodcanfactorfinite-elementmatriceswithfactorslargerthan10GBonanordinary32-bitworkstation(a2.4GHzIntel-basedPC)inlessthananhour.ThismethodallowsustosolvelinearsystemsAx=bwithasingleright-hand-sideandlinearsystemsAX=Bwithmultipleright-handsideefficientlyandaccurately.Linearsystemswithasymmetricindefinitecoefficientmatrixariseinoptimization,infinite-elementanalysis,andinshift-inverteigensolvers(evenwhenthematrixwhoseeigendecompositionissoughtisdefinite).Linearsolversthatfactorthecoefficientmatrixintoaproductofpermuta-tion,triangular,diagonal,andorthogonalfactorsarecalleddirectmethods.OurKeywordsandphrases.symmetric-indefinitefactorization,sparsefactorizations,out-of-corefactorizations.12OMERMESHARANDSIVANTOLEDOmethodisdirect,anditdecomposesAintopermutation,triangular,andblock-diagonalfactors(theblock-diagonalfactorhas1-by-1and2-by-2blocks).Com-paredtoiterativelinearsolvers,directsolverstendtobemorereliableandaccu-rate,buttheysometimesrequiresignificantlymoretimeandmemory.Ingeneral,directsolversarepreferredeitherwhentheuserhaslittleexpertiseiniterativemethods,orwheniterativemethodsfailorconvergetooslowly,orwhenmanylinearsystemswiththesamecoefficientmatrixmustbesolved.Inmanyappli-cations,suchasfinite-elementanalysisandshift-inverteigensolvers,manylinearsystemswiththesamecoefficientmatrixareindeedsolved,andinsuchcasesadirectsolverisoftenthemostappropriate.Thesizeofthetriangularfactorofasymmetricmatrixandtheamountofworkrequiredtocomputethefactorizationaresensitivetotheorderingoftherowsandcolumnsofthematrix.Therefore,matricesarenormallypermutedintoaformwhosefactorsarerelativelysparsepriortothefactorizationitself.Althoughtheproblemoffindingaminimal-fillorderingisNP-complete,thereexistseffectiveheuristicsthatworkwellinpractice(aswellasprovableapproximationsthathavenotbeenshowntoworkwellinpractice).Evenwhenthematrixhasbeenprepermutedusingafill-reducingpermutation,thefactorisoftenmuchlarger(denser)thanthematrix,anditmaynotfitinmemoryevenwhenthematrixfitscomfortably.Whenthefactordoesnotfitwithinmainmemory,theuserhasthreechoices:eithertoresorttoaso-calledout-of-corealgorithm,whichstoresthefactorsondisk,toswitchtoaniterativealgorithm,ortoswitchtoamachinewithalargermemory.Sincemachineswithmorethanafewgigabytesofmainmemoryarestillbeyondthereachofmostusers,andsinceiterativesolversarenotalwaysappropriate,therearecaseswhenanout-of-coremethodisthebestsolution.Themainchallengeindesigninganout-of-corealgorithmisensuringthatitdoesnotperformtoomuchdiskinput/output(I/O).Thedisk-to-memorybandwidthisusuallyabouttwoordersofmagnitudelowerthanthememory-to-processorbandwidth.Therefore,toachieveahighcomputationalrate,anout-of-corealgorithmmustaccessdatastructuresondiskinfrequently;mostdataaccessesshouldbetodatathatisstored,perhapstemporarily,inmainmemory.Algorithmsinnumericallinearalgebraachievethisgoalbypartitioningmatricesintoblocksofrowsandcolumns.Whenthematricesaredense,relativelysimple1-and2-dimensionalpartitionsintoblocksofconsecutiverowsandcolumnsworkwell;whenthematricesaresparse,thepartitioningalgorithmmustconsiderthenonzerostructureofthematrices.EssentiallythesamepartitioningstrategiesareusedwhethertheI/Oisperformedautomaticallybythevirtual-memorysys-tem(orbycachepolicieshigherinthememoryhierarchy),orexplicitlyusingsystemcalls.Ingeneral,explicitI/Otendstoworkbetterthanvirtualmemorywhendatastructuresondiskaresignificantlylargerthanmemory.ExplicitI/OistheonlychoicewhenthedatastructuresondiskaretoolargetofitintotheANOUT-OF-CORESPARSESYMMETRICINDEFINITEFACTORIZATIONMETHOD3virtualaddressspaceoftheprogram(largerthan2–4GBon32-bitprocessors,dependingontheoperatingsystem).Tothebestofourknowledge,ouralgorithmisthefirstout-of-coresparsesymmetricindefinitefactorizationmethodtobeproposed.Severalout-of-coremethodshavebeenproposedforthesomewhateasierproblemoffactoringasym
本文标题:AN OUT-OF-CORE SPARSE SYMMETRIC INDEFINITE FACTORI
链接地址:https://www.777doc.com/doc-3326593 .html