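Problems and Solutions in QTL Mapping
4th Workshop on QTL Mapping and Breeding Simulation, August 17-20, 2009, Tai'an, Shandong
Wang Jiankang, Institute of Crop Science, Chinese Academy of Agricultural Sciences
E-mail: wangjk@caas.net.cn

Genetic concepts in QTL mapping
Locus, allele, gene, chromosome
Genotype, phenotype, haplotype
Population stratification
Hardy-Weinberg equilibrium
Effective population size
Linkage phase
Linkage disequilibrium
Additive, dominance, and epistasis
Genotype by environment interaction

Statistical methods in QTL mapping
t-test
ANOVA: analysis of variance (estimate heritability)
Maximum-likelihood estimate
Likelihood ratio test
Binomial and multinomial distributions
Principal-components analysis
Regression models
Stepwise selection procedure
Shrinkage methods
Permutation test
Frequentist and Bayesian statistics

Basic principle of QTL mapping
Figure: trait means of the three genotypes at a marker locus

Two major categories of QTL mapping
Linkage mapping
Association mapping

Question 1: What are false positives and false negatives?
Type I error
Type II error

Statistical hypothesis testing
True positive
True negative
False positive (type I error)
False negative (type II error)
Type I error rate = P{reject the null hypothesis H0 | H0 is true}
Type II error rate = P{fail to reject the null hypothesis H0 | H0 is false}

Significance level α and false positives (type I error)
A type I error declares a QTL that does not exist; a type II error misses a QTL that does exist.
Overall type I error probability of N independent tests, each at level α: $1-(1-\alpha)^N$
Bonferroni correction for N independent tests: per-test level $\approx \alpha/N$
Problem: multiple tests are often not independent!
Permutation test (a nonparametric method): used to compute the critical value of the test statistic under multiple testing

Question 2: What is the LOD score?
General definition of the LRT (likelihood ratio test) statistic: $LRT = -2\ln(L_0/L_A)$
LOD (logarithm of odds) score: $LOD = \log_{10}(L_A/L_0) = \log_{10}(L_A) - \log_{10}(L_0)$
Relationship between the two: $LOD = LRT/(2\ln 10) \approx LRT/4.61$, or $LRT \approx 4.61\,LOD$

Question 3: What is the detection power for a QTL?
Definition: The power of a statistical test is the probability that the test will reject a false null hypothesis (i.e., it will not make a Type II error).
Power = 1.0 - type II error rate (that is, for a QTL that truly exists, the probability that the test declares it present)

Question 4: What is power analysis, and what is it for?
Quantify the effectiveness of a statistical method
Compare different statistical methods
Power analysis is generally carried out by Monte Carlo simulation

Figure: power analysis of inclusive composite interval mapping (ICIM, top) and simple interval mapping (SIM, bottom)
Figure: RIL population size and detection power
FDR: false discovery rate (the proportion of false QTL among all QTL declared significant)

Question 5: Which input parameters does ICIM need?
Probabilities for variables to enter and leave the regression model: PIN and POUT (= 2*PIN)
Empirical values of PIN: 0.01 to 0.05
LOD threshold
Figure: LOD profiles from ICIM one-dimensional scanning under three sets of mapping parameters

Question 6: How large should the LOD threshold be?
One test: α (for example 0.1, 0.05, 0.01)
Probability of at least one error in n independent tests: $1-(1-\alpha)^n$
Genome-wide QTL mapping involves many tests
These tests are not independent
Empirical LOD threshold: 2.0 to 3.0

Question 7: How can the precision of QTL mapping be improved?
Mapping method
Larger population
Smaller phenotypic measurement error
More markers?
CSSL (chromosome segment substitution lines)

Question 8: Where do favorable alleles come from?
Interpreting genetic effects
Marker coding by parental type: 2 (P1), 0 (P2), 1 (F1)
P1: m+a; F1: m+d; P2: m-a
Figure: genotypes A2A2, A1A2, A1A1 with genotypic values G22, G12, G11; mid-parent value m; additive effect a; dominance effect d

Question 9: What is PVE? Why should the PVE be stated when a QTL is reported?
PVE = phenotypic variation explained (%)
The percentage of phenotypic variation explained by a single QTL
$PVE_g = V_g/V_p \times 100\%$
In BC, DH and RIL populations, $V_g = a^2$ (a is the additive effect)
In F2 populations, $V_g = a^2/2 + d^2/4$ (d is the dominance effect)
Does a large genetic effect necessarily give a high PVE?
Under segregation distortion, in backcross, DH or RIL populations:
$V_g = (1-q)a^2 + qa^2 - [(1-2q)a]^2 = 4q(1-q)a^2$

Question 10: Can PVE values be added?
Z = X + Y, where X and Y are random variables
E(X+Y) = E(X) + E(Y)
V(X+Y) = V(X) + V(Y) + 2Cov(X,Y)
When QTL are inherited independently, the PVE of several QTL equals the sum of the individual PVE values; when QTL are linked, PVE values cannot simply be added

Question 11: Can PVE exceed 100%?
In an RIL population, genetic variance at locus A-a: $a_1^2$; genetic variance at locus B-b: $a_2^2$; total genetic variance: $a_1^2 + a_2^2 + 2(1-2r)a_1a_2$
When QTL are linked, the sum of the PVE values of several QTL may exceed 100%; the sum of the PVE values of several independent QTL cannot exceed 100%

Question 12: What is segregation distortion?
P1 (AA) x P2 (aa), expected ratios without selection:
P1BC1: AA:Aa = 1:1
P2BC1: Aa:aa = 1:1
F2: AA:Aa:aa = 1:2:1
DH, RIL: AA:aa = 1:1
Causes of segregation distortion:
Random drift
Differences in gamete or zygote viability (natural selection)
Figure: segregation distortion in a barley DH mapping population. Top: ratio of the two allele frequencies. Bottom: significance probability of the distortion.

Question 13: What are missing data?
Missing marker data: impute from linkage relationships
Missing phenotypic data: replace with the population mean, or delete
Figure: missing markers in a barley DH mapping population (blank cells are missing markers)

Question 14: How do missing markers affect QTL mapping?
(First simulated F2 population from QTL distribution model I and population size 500)
Figure panels: no missing markers, 5% missing, 10% missing, 15% missing

Question 15: How dense should the markers be?
Rule of thumb
Linkage mapping: one marker every 10-20 cM
Association mapping: the more the better
Figure: power and FDR for two marker densities, 10 cM (top) and 20 cM (bottom); support interval is the whole chromosome
Figure: power and FDR for two marker densities, 10 cM (top) and 20 cM (bottom); support interval is 10 cM, centered on the QTL
Are three markers per chromosome enough?
Figure: power and FDR for a marker density of 80 cM. Top: support interval is 10 cM, centered on the QTL. Bottom: support interval is the whole chromosome.

Question 16: What is QTL by environment (QTL by E) interaction?
Chromosome segments showing QTL for ACE

Chromosome        1      2      3      3      5      5      7      7      7      8      8      8      9     10     12     12
Segment          M4    M18    M21    M23    M35    M39    M49    M50    M51    M54    M56    M57    M59    M69    M77    M79
LOD
E1             0.00   0.26   0.01   0.00   1.87   0.00   0.12   0.03   0.03   0.01   0.02   6.06   0.47   0.01   0.17   0.61
E2             0.00   1.05   2.88   4.66   0.95   0.14   0.35   0.30   0.11   0.13   5.84   3.92   2.89   0.22   0.12   0.05
E3             2.06   4.03   0.00   0.20   0.84   2.60   0.05   9.61   4.64   0.04   0.04   3.98   3.36   4.35   0.89   2.17
E4             0.04   0.04   0.00   4.03   0.05   0.32   0.04   0.02   0.49   0.70   0.05  15.89   6.26   2.57   0.10   0.24
E5             0.35   0.66   0.41   3.13   0.40   0.00   4.09  12.51  16.53   2.04   3.18   5.56   9.04   0.31   0.42   0.15
E6             0.13   0.38   1.06   1.35   0.30   0.22   0.32   0.02   0.18   0.00   0.16   1.84   4.15   0.90   0.29   0.05
E7             0.08   5.07   0.22   0.21   0.04   0.26   1.47   0.33   0.07   0.03   5.52   6.33   4.07   1.66   4.04   0.01
E8             0.01   0.00   0.02   0.45   2.16   1.54   0.01   0.02   0.01   0.67   0.17  16.89  10.01   0.01   0.00   0.00
ADD
E1            -0.11  -0.72  -0.12  -0.08  -2.13   0.06  -0.36  -0.19   0.25  -0.13   0.17   4.13   1.27   0.08  -0.35  -1.99
E2            -0.07   2.21  -2.98   4.35  -2.11   0.78  -0.88  -0.89  -0.72   0.54   4.43   4.51   4.62  -0.72  -0.43  -0.78
E3             3.89   4.69   0.11  -0.84  -2.03  -3.66   0.32  -6.12   5.11  -0.32   0.35   4.62   5.16   3.57  -1.21   5.68
E4            -0.24  -0.23  -0.04   2.05  -0.25  -0.61  -0.16   0.12   0.76  -0.67  -0.18   5.84   3.71   1.32  -0.20  -0.91
E5             1.06   1.22  -0.75   2.42  -0.94   0.07   2.30  -5.19   8.28  -1.59   2.20   3.86   6.40   0.60  -0.55   0.98
E6            -1.22   1.73  -2.29   2.88  -1.53  -1.30  -1.11  -0.30  -1.14   0.13   0.86   3.79   7.38  -1.87  -0.85   1.04
E7            -0.52   3.74  -0.56   0.61  -0.31  -0.78  -1.37  -0.69  -0.41  -0.19   3.11   4.29   4.01   1.44  -1.91  -0.21
E8            -0.09   0.05  -0.11   0.63  -1.63  -1.31  -0.09  -0.12   0.09  -0.62  -0.33   5.94   4.95  -0.09   0.03   0.09
PVE
E1             0.01   0.48   0.02   0.01   4.15   0.00   0.23   0.06   0.06   0.03   0.04  15.60   1.00   0.01   0.34   1.24
E2             0.00   2.24   6.60  11.44   2.04   0.28   0.69   0.59   0.24   0.26  14.54   9.36   6.66   0.46   0.25   0.10
E3             3.61   7.76   0.01   0.33   1.45   4.73   0.07  21.28   9.19   0.07   0.07   7.54   6.35   8.53   1.53   3.91
E4             0.04   0.05   0.00   5.62   0.06   0.38   0.05   0.03   0.59   0.86   0.05  34.66   9.49   3.35   0.11   0.29
E5             0.35   0.68   0.41   3.51   0.41   0.00   4.61  19.86  31.26   2.18   3.57   6.79  12.65   0.31   0.41   0.15
E6             0.46   1.37   3.88   4.99   1.07   0.78   1.07   0.07   0.60   0.02   0.55   6.60  16.93   3.06   0.98   0.17
E7             0.11   8.28   0.29   0.29   0.06   0.36   2.13   0.46   0.10   0.04   9.25  10.89   6.44   2.33   6.37   0.01
E8             0.01   0.00   0.02   0.52   2.66   1.70   0.02   0.02   0.01   0.73   0.17  35.08  16.53   0.01   0.00   0.00

Question 17: Must the phenotypic data follow a normal distribution?
Figure: mixed inheritance model with one dominant major gene plus polygenes
The random effects are still required to follow a normal distribution

Question 18: Can composite traits be used for QTL mapping?
Effect  Trait I  Trait II  Additi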
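
The formulas in this talk (the LOD and LRT conversion, the error rate over many tests, the Bonferroni correction, PVE under segregation distortion) and the permutation test can be tried out numerically. The Python sketch below is only an illustration and is not taken from the slides or from any QTL mapping software. All function names are hypothetical. The PVE function assumes a single additive QTL with Vp = Vg + Ve. The permutation test uses the maximum squared marker-trait correlation as a stand-in test statistic rather than the LOD score of a full interval-mapping scan.

```python
import math
import random


def lrt_to_lod(lrt):
    """Convert LRT to LOD: LOD = LRT / (2 ln 10), about LRT / 4.61."""
    return lrt / (2.0 * math.log(10.0))


def lod_to_lrt(lod):
    """Convert LOD to LRT: LRT = 2 ln(10) * LOD, about 4.61 * LOD."""
    return lod * 2.0 * math.log(10.0)


def familywise_error(alpha, n_tests):
    """Probability of at least one false positive in n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level under the Bonferroni correction."""
    return alpha / n_tests


def pve_percent(a, q, v_error):
    """PVE (%) of one additive QTL in a BC, DH or RIL population.

    Vg = 4 q (1 - q) a^2, where q is the frequency of one genotype
    (q = 0.5 means no segregation distortion, giving Vg = a^2).
    Assumption: Vp = Vg + v_error, i.e. no other genetic variance.
    """
    vg = 4.0 * q * (1.0 - q) * a * a
    return 100.0 * vg / (vg + v_error)


def _r_squared(x, y):
    """Squared Pearson correlation; 0.0 if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def permutation_threshold(markers, phenotype, n_perm=1000, alpha=0.05, seed=0):
    """Empirical critical value of the genome-wide maximum statistic.

    markers: list of marker columns, each a list of genotype codes.
    Shuffling the phenotype breaks any marker-trait association while
    keeping the correlation among markers, so the tests need not be
    independent. Stand-in statistic: maximum r^2 over all markers.
    """
    rng = random.Random(seed)
    shuffled = list(phenotype)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        maxima.append(max(_r_squared(m, shuffled) for m in markers))
    maxima.sort()
    index = min(n_perm - 1, math.ceil((1.0 - alpha) * n_perm) - 1)
    return maxima[index]
```

As a usage note, `lod_to_lrt(2.5)` gives about 11.5, and `pve_percent(1.0, 0.2, 1.0)` is lower than `pve_percent(1.0, 0.5, 1.0)`, which shows that a QTL with the same additive effect explains less phenotypic variation when segregation is distorted.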