Parallel direct methods for sparse linear systems

Michael T. Heath
Department of Computer Science and NCSA
University of Illinois
Urbana, Illinois 61801

ABSTRACT

We present an overview of parallel direct methods for solving sparse systems of linear equations, focusing on symmetric positive definite systems. We examine the performance implications of the important differences between dense and sparse systems. Our main emphasis is on parallel implementation of the numerically intensive factorization process, but we also briefly consider the other major components of direct methods, such as parallel ordering.

Introduction

In this paper we present a brief overview of parallel direct methods for solving sparse linear systems. Paradoxically, sparse matrix factorization offers additional opportunities for exploiting parallelism beyond those available with dense matrices, yet it is often more difficult to attain good efficiency in the sparse case. We examine both sides of this paradox: the additional parallelism induced by sparsity, and the difficulty in achieving high efficiency in spite of it. We focus on Cholesky factorization, primarily because this allows us to discuss parallelism in relative isolation, without the additional complications of pivoting for numerical stability. Most of the lessons learned are also applicable to other matrix factorizations, such as LU and QR. Our main point in the current discussion is to explain how the sparse case differs from the dense case, and examine the performance implications of those differences.

Consider a system of linear equations

    $Ax = b,$

where $A$ is an $n \times n$ symmetric positive definite (SPD) matrix, $b$ is a known vector, and $x$ is the unknown solution vector to be computed. One way to solve the linear system is first to compute the Cholesky factorization

    $A = LL^T,$

where the Cholesky factor $L$ is a lower triangular matrix with positive diagonal elements. Then the solution vector $x$ can be computed by successive forward and back substitutions to solve the triangular systems

    $Ly = b, \quad L^T x = y.$

Cholesky Factorization Algorithm

The algorithm for Cholesky factorization is a variant of Gaussian elimination that takes advantage of symmetry to reduce both work and storage by about half. Like Gaussian elimination, the algorithm consists of a triple nested loop. One of the 3! ways of arranging that loop is shown in Figure 1.

    for j = 1, n
        for k = 1, j-1
            for i = j, n
                a_ij = a_ij - a_ik * a_jk      {cmod(j,k)}
        a_jj = sqrt(a_jj)
        for k = j+1, n
            a_kj = a_kj / a_jj                 {cdiv(j)}

Figure 1: Serial Cholesky factorization algorithm

We make the following important observations about this algorithm:

- Since $A$ is SPD, the square roots are all of positive numbers, so the algorithm is well defined.
- Pivoting is not required for numerical stability.
- Only the lower triangular portion of $A$ is accessed.
- The factor $L$ is computed in place, overwriting the lower triangle of $A$.
- Each column $j$ is modified by a multiple of each prior column $k$. We denote this operation by cmod(j,k).
- If that multiple is zero (i.e., $a_{jk} = 0$), then the innermost loop has no effect and may as well be skipped.
- Elements of $A$ that were initially zero may become nonzero due to cmod operations by nonzero elements from previous columns. Such new nonzeros are called fill.
- When all modifications to column $j$ are complete, it is scaled by the square root of its diagonal element to produce column $j$ of the factor. We denote this operation by cdiv(j).

For further details on Cholesky factorization, see (Golub and Van Loan, 1989).

Three Forms of Cholesky Factorization

The three choices of index for the outer loop yield markedly different memory access patterns, as illustrated in Figure 2, and these have important performance implications in various architectural settings, such as effective cache utilization, vectorization, parallelization, or out-of-core solutions.

- Row-Cholesky: With $i$ in the outer loop, the inner loops solve a triangular system for each new row in terms of the previously computed rows.
- Column-Cholesky: With $j$ in the outer loop, the inner loops compute the matrix-vector product that gives the effect of previously computed columns on the column currently being computed.
- Submatrix-Cholesky: With $k$ in the outer loop, the inner loops apply the current column as a rank-1 update to the remaining unreduced submatrix.

Although row-oriented algorithms can be effective in some contexts, column-oriented algorithms tend to be much more effective in practice for sparse problems, so we will restrict our attention to the latter.

Figure 2: Three forms of Cholesky factorization (panels: row-Cholesky, column-Cholesky, submatrix-Cholesky; legend: modified, used for modification).

Column-Cholesky is sometimes said to be a "left-looking" algorithm, since at each stage it accesses needed columns to the left of the current column in the matrix. It can also be viewed as a "demand-driven" algorithm, since the inner products that affect a given column are not accumulated until actually needed to modify and complete that column. For this reason, column-Cholesky is called a "delayed-update" algorithm. It is also referred to as a "fan-in" algorithm, since the basic operation is to combine the effects of multiple previous columns on a single target column.

In submatrix-Cholesky, as soon as a column has been computed, its effects on all subsequent columns are computed immediately. Thus, submatrix-Cholesky is said to be a "right-looking" algorithm, since at each stage columns to the right of the current column are modified. It can also be viewed as a "data-driven" algorithm, since each new column is used as soon as it is completed to make all modifications to all the subsequent columns it affects. For this reason, submatrix-Cholesky is called an "immediate-update" algorithm. It is also referred to as a "fan-out" algorithm, since the basic operation is for a single column to affect multiple subsequent columns.

We will see that these characterizations of the column-Cho
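
To make the cmod/cdiv formulation and the left-looking versus right-looking distinction concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not code from the paper: the matrix is stored as a dense list of lists rather than in any sparse format, indices are zero-based, and the function names `column_cholesky`, `submatrix_cholesky`, and `lower` are hypothetical. The only concession to sparsity is that `cmod` returns early when the multiplier a_jk is zero, as the observations about the serial algorithm suggest.

```python
import math


def cmod(a, j, k):
    """Modify column j by a multiple of prior column k (rows j..n-1)."""
    ajk = a[j][k]
    if ajk == 0.0:
        # Zero multiplier: the update would have no effect, so skip it.
        return
    for i in range(j, len(a)):
        a[i][j] -= a[i][k] * ajk


def cdiv(a, j):
    """Scale column j by the square root of its diagonal element."""
    a[j][j] = math.sqrt(a[j][j])
    for i in range(j + 1, len(a)):
        a[i][j] /= a[j][j]


def lower(a):
    """Return the lower triangle of a, with zeros above the diagonal."""
    n = len(a)
    return [[a[i][j] if j <= i else 0.0 for j in range(n)] for i in range(n)]


def column_cholesky(a):
    """Left-looking form: gather all updates into column j, then finish it."""
    a = [row[:] for row in a]  # work on a copy; only the lower triangle is used
    for j in range(len(a)):
        for k in range(j):
            cmod(a, j, k)
        cdiv(a, j)
    return lower(a)


def submatrix_cholesky(a):
    """Right-looking form: finish column k, then push its updates rightward."""
    a = [row[:] for row in a]
    for k in range(len(a)):
        cdiv(a, k)
        for j in range(k + 1, len(a)):
            cmod(a, j, k)
    return lower(a)
```

Both routines perform the same cmod and cdiv operations and differ only in the order in which they are issued. For the small SPD matrix [[4, 2, 0], [2, 5, 2], [0, 2, 5]], each returns the factor [[2, 0, 0], [1, 2, 0], [0, 1, 2]], and the call cmod(2, 0) is skipped because that entry of the first column is zero.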
