Data management

Generate binary fileset

--make-bed

--make-bed creates a new PLINK 1 binary fileset, after applying sample/variant filters and other operations below. For example,

    plink --file text_fileset --maf 0.05 --make-bed --out binary_fileset

does the following:

1. Autogenerate binary_fileset-temporary.bed + .bim + .fam. (The MAF filter has not yet been applied at this stage. See the Order of operations page for more details.)
2. Read binary_fileset-temporary.bed + .bim + .fam. Calculate MAFs. Remove all variants with MAF < 0.05 from the current analysis.
3. Generate binary_fileset.bed + .bim + .fam. Any samples/variants removed from the current analysis are also not present in this fileset. (This is the --make-bed step.)
4. Delete binary_fileset-temporary.bed + .bim + .fam.

In contrast, the fileset left behind by --keep-autoconv is just the result of step 1.

--make-just-bim
--make-just-fam

--make-just-bim is a variant of --make-bed which only generates a .bim file, and --make-just-fam plays the same role for .fam files. Unlike most other PLINK commands, these do not require the main input to include a .bed file (though you won't have access to many filtering flags when using these in no-.bed mode).

Use these cautiously. It is very easy to desynchronize your binary genotype data and your .bim/.fam indexes if you use these commands improperly. If you have any doubt, stick with --make-bed.

Generate text fileset

--recode <01 | 12> <23 | A | A-transpose | AD | beagle | beagle-nomap | bimbam | bimbam-1chr | compound-genotypes | fastphase | fastphase-1chr | HV | HV-1chr | lgen | lgen-ref | list | oxford | rlist | structure | transpose | vcf | vcf-fid | vcf-iid> <tab | tabx | spacex | bgz | gen-gz> <include-alt> <omit-nonmale-y>
--recode-allele [filename]

--recode creates a new text fileset, after applying sample/variant filters and other operations. By default, the fileset includes a .ped and a .map file, readable with --file.

The '12' modifier causes A1 (usually minor) alleles to be coded as '1' and A2 alleles to be coded as '2', while '01' maps A1 → 0 and A2 → 1. (PLINK forces you to combine '01' with --{output-}missing-genotype when this is necessary to prevent missing genotypes from becoming indistinguishable from A1 calls.)

The '23' modifier causes a 23andMe-formatted file to be generated. This can only be used on a single sample's data (a one-line --keep file may come in handy here). There is currently no special handling of the XY pseudo-autosomal region.

The 'AD' modifier causes an additive (0/1/2) + dominant (het = 1, otherwise 0) component file, suitable for loading from R, to be generated. 'A' is the same, except without the dominance component.
  o By default, A1 alleles are counted; this can be customized with --recode-allele. --recode-allele's input file should have variant IDs in the first column and allele IDs in the second.
  o By default, the header line for .raw files only names the counted alleles. To include the alternate allele codes as well, add the 'include-alt' modifier.
  o Haploid additive components are 0/2-valued instead of 0/1-valued, to maintain a consistent scale on the X chromosome.
See also --R.

The 'A-transpose' modifier causes a variant-major additive component file to be generated. This can also be used with --recode-allele.

The 'beagle' modifier causes unphased per-autosome .dat and .map files, readable by BEAGLE 3.3 and earlier, to be generated, while 'beagle-nomap' generates a single .dat file (no chromosome splitting occurs in this case).

The 'bimbam' modifier causes a BIMBAM-formatted fileset to be generated. If your input data only contains one chromosome, you can use 'bimbam-1chr' instead to write a two-column .pos.txt file.

If all allele codes are single-character, you can use the 'compound-genotypes' modifier to omit the space between each pair of allele codes in a single genotype call when generating a .ped + .map fileset. You will need to use the --compound-genotypes flag to load this data in PLINK 1.07, but it's not needed for PLINK 1.9.

The 'fastphase' modifier causes per-chromosome fastPHASE files to be generated. If your input data only contains one chromosome, you can use 'fastphase-1chr' instead to exclude the chromosome number from the file extension.

The 'HV' modifier causes a Haploview-format .ped + .info fileset to be generated per chromosome. 'HV-1chr' is analogous to 'fastphase-1chr'.

The 'lgen' modifier causes a long-format fileset, loadable with --lfile, to be generated. 'lgen-ref' is equivalent to PLINK 1.07 --recode-lgen --with-reference.

The 'list' modifier causes a genotype-based list to be generated. This does not produce a .fam or .map file.

The 'oxford' modifier causes an Oxford-format .gen + .sample fileset to be generated. If you also include the 'gen-gz' modifier, the .gen file is gzipped.

The 'rlist' modifier causes a rare-genotype fileset to be generated (similar to --list's output, but with .fam and .map files, and without homozygous major genotypes).

With the 'list' and 'rlist' formats, the 'omit-nonmale-y' modifier causes nonmale genotypes to be omitted on the Y chromosome.

The 'structure' modifier causes a Structure-format file to be generated.

The 'transpose' modifier causes a transposed text fileset, loadable with --tfile, to be generated.

The 'vcf', 'vcf-fid', and 'vcf-iid' modifiers result in production of a VCFv4.2 file. 'vcf-fid' and 'vcf-iid' cause family IDs and within-family IDs respectively to be used for the sample IDs in the last header row, while 'vcf' merges both IDs and puts an underscore between them (in this case, a warning will be given if an ID already contains an underscore). If the 'bgz' modifier is added, the VCF file is block-gzipped. (Gzipping of other --recode output files is not currently supported.) The A2 allele is saved as the reference and normally flagged as not based on a real reference genome ('PR' INFO fie
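
To make step 2 of the --make-bed example more concrete, here is a minimal Python sketch of a MAF filter. It is not PLINK's implementation. The function names, the in-memory genotype layout, and the sample data are all hypothetical, and the sketch assumes biallelic variants, '0' as the missing allele code, and frequencies computed over all non-missing calls (it does no founder-only restriction). As in the text above, a variant is removed only when its MAF is strictly below the threshold.

```python
from collections import Counter


def minor_allele_freq(genotypes, missing="0"):
    """Minor allele frequency of one biallelic variant.

    genotypes: list of (allele, allele) pairs, one pair per sample.
    Missing allele calls are excluded from the denominator.
    """
    counts = Counter(a for pair in genotypes for a in pair if a != missing)
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0  # no calls at all, or monomorphic
    return min(counts.values()) / total


def maf_filter(variants, threshold=0.05):
    """Return IDs of variants that survive: MAF >= threshold is kept."""
    return [vid for vid, g in variants.items()
            if minor_allele_freq(g) >= threshold]


# Hypothetical toy data: 12 samples per variant.
variants = {
    "rs1": [("A", "G")] * 4 + [("G", "G")] * 8,                     # MAF 4/24
    "rs2": [("C", "C")] * 12,                                       # monomorphic
    "rs3": [("T", "T")] * 11 + [("T", "C")],                        # MAF 1/24
    "rs4": [("A", "A")] * 9 + [("A", "G")] + [("0", "0")] * 2,      # MAF 1/20
}

kept = maf_filter(variants, threshold=0.05)
print(kept)
```

In this toy data rs4 sits exactly at the threshold and is kept, while rs2 and rs3 fall below it and are dropped.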
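
The additive and dominant components described for the 'A' and 'AD' modifiers can be illustrated with a short Python sketch. This only mirrors the coding rule stated in the text (additive = number of copies of the counted allele, dominant = 1 for heterozygotes and 0 otherwise). The function names and the tuple representation of a genotype are hypothetical, '0' is assumed to be the missing allele code, and returning None for a missing call is a choice made for this sketch.

```python
def additive(genotype, counted, missing="0"):
    """Number of copies (0, 1 or 2) of the counted allele in one genotype.

    genotype: (allele, allele) pair; returns None if either call is missing.
    """
    a, b = genotype
    if missing in (a, b):
        return None
    return int(a == counted) + int(b == counted)


def dominant(genotype, counted, missing="0"):
    """Dominance component: 1 for a heterozygote, 0 otherwise."""
    add = additive(genotype, counted, missing)
    if add is None:
        return None
    return 1 if add == 1 else 0


# Hypothetical calls at one variant, counting allele 'A'.
calls = [("A", "A"), ("A", "G"), ("G", "G"), ("0", "0")]
rows = [(additive(g, "A"), dominant(g, "A")) for g in calls]
print(rows)
```

If a haploid call is stored as a pair of identical alleles, the same additive rule yields 0 or 2, which matches the 0/2 scale mentioned for haploid genotypes.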
Title: PLINK 1.9 GWAS data processing workflow
Link: https://www.777doc.com/doc-1872126.html