Factor Analysis: lecture slides by Zheng Yingdong (郑迎东), School of Public Health, Peking University. Course: Multivariate Statistics.
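Factor Analysis

Manifest variables and latent variables

Manifest variables: arriving late or leaving early, completing homework on time, reviewing lessons on one's own initiative, regularly reading texts aloud, composition, vocabulary, spoken language.
Latent variables: learning attitude, verbal expression ability.

Factor analysis

Factor analysis can be viewed as a generalization of principal component analysis. Its basic aim is to describe the relationships among many variables using a small number of factors F1, F2, …. The variables being described, X1, X2, …, are observable random variables (manifest variables), whereas the factors F1, F2, … are unobservable latent variables. Factor analysis requires that the variables be correlated to some degree.

Development of factor analysis

- Originated in psychology.
- Spearman's theory of the "structure of intelligence" (1904): general ability / specific ability.
- Pearson: principal factor method.
- Thurstone: number of common factors / simple structure.
- Confirmatory factor analysis.

The factor model

$$
\begin{aligned}
x_1 &= a_{11} f_1 + a_{12} f_2 + \cdots + a_{1k} f_k + v_1 u_1 \\
x_2 &= a_{21} f_1 + a_{22} f_2 + \cdots + a_{2k} f_k + v_2 u_2 \\
&\;\;\vdots \\
x_m &= a_{m1} f_1 + a_{m2} f_2 + \cdots + a_{mk} f_k + v_m u_m
\end{aligned}
$$

Matrix form of the factor model

$$
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{pmatrix}
=
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1k} \\
a_{21} & a_{22} & \cdots & a_{2k} \\
\vdots & \vdots &        & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mk}
\end{pmatrix}
\begin{pmatrix} f_1 \\ f_2 \\ \vdots \\ f_k \end{pmatrix}
+
\begin{pmatrix}
v_1 &        & 0 \\
    & \ddots &   \\
0   &        & v_m
\end{pmatrix}
\begin{pmatrix} u_1 \\ \vdots \\ u_m \end{pmatrix},
\qquad\text{i.e.}\qquad
x = Af + vu,\quad v = \operatorname{diag}(v_1, \dots, v_m).
$$

Assumptions of the factor model

$$
E(f) = 0,\qquad E(u) = 0,\qquad \operatorname{Var}(f) = I,\qquad \operatorname{Var}(u) = I,\qquad \operatorname{cov}(f, u) = 0 .
$$

Properties of the factor model

$$
H_i^2 = a_{i1}^2 + a_{i2}^2 + \cdots + a_{ik}^2,\qquad
H_i^2 + v_i^2 = 1,
$$

$$
R = R^* + V,\qquad R^* = AA',\qquad V = \operatorname{diag}(v_1^2, \dots, v_m^2).
$$

Statistical meaning of the factor loading matrix

- The element a_ij of A is the correlation coefficient between x_i and f_j.
- The sum of squares of the elements in a row of A reflects how strongly the original variable depends on the common factors (x_j, x_j*).
- The sum of squares of the elements in a column of A reflects the contribution of the common factor f_j to x.

Written out (this slide uses p for the number of variables, denoted m above):

$$
R =
\begin{pmatrix}
1      & r_{12} & \cdots & r_{1p} \\
r_{21} & 1      & \cdots & r_{2p} \\
\vdots & \vdots &        & \vdots \\
r_{p1} & r_{p2} & \cdots & 1
\end{pmatrix},
\qquad
R^* =
\begin{pmatrix}
r_{11} & r_{12} & \cdots & r_{1p} \\
r_{21} & r_{22} & \cdots & r_{2p} \\
\vdots & \vdots &        & \vdots \\
r_{p1} & r_{p2} & \cdots & r_{pp}
\end{pmatrix},
\qquad
V =
\begin{pmatrix}
v_1^2 &        & 0 \\
      & \ddots &   \\
0     &        & v_p^2
\end{pmatrix}.
$$

Parameter estimation for the factor model

- Principal factor method: decompose R* (SMC, squared multiple correlations).
- Principal component method: use R in place of R* (the default method).
- ML (maximum likelihood): a more accurate estimate, but computationally expensive. If the sample comes from a multivariate normal population, it also provides hypothesis tests on the model, for example to answer whether the common factors are "significant".

Principles for choosing the number of factors

- No greater than (in practice far smaller than) the number of original variables.
- Fixed in advance by the researcher.
- Based on the eigenvalues of R*:
  - look at the cumulative proportion;
  - look at the size of the eigenvalues (the greater-than-1 rule);
  - look at the rate of change of the eigenvalues (scree plot).
- Hypothesis test (applies to ML estimation only).

[Figure: scree plot]

Factor rotation

- The factor loading matrix is not unique.
- Factors are generally required to satisfy the "simple structure principle":
  - every row of the loading matrix contains at least one 0;
  - each original variable x has large loadings on a few common factors and loadings close to 0 on the rest.

Common orthogonal rotation methods

Orthogonal rotation searches for an orthogonal matrix.
- varimax (variance maximization): maximizes the sum of the variances of the squared (scaled) column elements of the loading matrix. It helps interpret the factors.
- quartimax (fourth-power maximization): maximizes the sum of the variances of the squared (scaled) row elements of the loading matrix. It helps interpret the original variables.
- equimax: a compromise between the two methods above.

Common oblique rotation methods

Oblique rotation gives up the independence of the factors in exchange for a better interpretation of them. It distinguishes the loading matrix (pattern matrix, "factor pattern") from the structure matrix ("factor structure"):

S = BW

- S: structure matrix (its elements are the correlation coefficients between factors and variables)
- B: pattern matrix, that is, the rotated loading matrix
- W: correlation matrix of the factors

Factor scores

In theory, the factors (latent variables) generally cannot be written as linear combinations of the original (observable) variables, but in practice factor scores can be estimated with a linear (regression) model.

The meaning of factor scores

Factor analysis example

Data come from the 2000 national physical fitness survey, Beijing, men aged 20-30.

Correlation Matrix

                   One-leg stand  Reaction time 1  Grip strength  Back strength  Height
One-leg stand          1.000          -.147            .019           .042        -.025
Reaction time 1        -.147          1.000           -.032          -.093        -.097
Grip strength           .019          -.032           1.000           .535         .264
Back strength           .042          -.093            .535          1.000         .207
Height                 -.025          -.097            .264           .207        1.000

Communalities (common factor variance H_i^2)

                   Initial  Extraction
One-leg stand       1.000      .596
Reaction time 1     1.000      .560
Grip strength       1.000      .707
Back strength       1.000      .663
Height              1.000      .324

Extraction Method: Principal Component Analysis.

R*: Reproduced Correlations

                   One-leg stand  Reaction time 1  Grip strength  Back strength  Height
One-leg stand          .596 b         -.567           -.050           .024        -.005
Reaction time 1        -.567           .560 b         -.074          -.141        -.078
Grip strength          -.050          -.074            .707 b         .680         .478
Back strength           .024          -.141            .680           .663 b       .463
Height                 -.005          -.078            .478           .463         .324 b

Extraction Method: Principal Component Analysis.
b. Reproduced communalities.

Eigenvalues of R*: Total Variance Explained

            Initial Eigenvalues                    Extraction Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %   Total   % of Variance   Cumulative %
1           1.720      34.390          34.390      1.720      34.390          34.390
2           1.131      22.618          57.009      1.131      22.618          57.009
3            .908      18.167          75.176
4            .785      15.691          90.867
5            .457       9.133         100.000

Extraction Method: Principal Component Analysis.

Initial factor solution: loading matrix (A)

Factor Pattern (a)

                   Factor 1  Factor 2
One-leg stand        .098     -.766
Reaction time 1     -.238      .710
Grip strength        .823      .171
Back strength        .811      .072
Height               .564      .079

Extraction Method: Principal Component Analysis.
a. 2 components extracted.

[Figure: factor plot]

Loading matrix after factor rotation

Rotated Component Matrix (a)

                   Component 1  Component 2
One-leg stand         -.044        .771
Reaction time 1       -.104       -.741
Grip strength          .840       -.017
Back strength          .811        .077
Height                 .569        .026

Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 3 iterations.

[Figure: factor plot after orthogonal rotation]

ML estimation results

Advantages of MLE

Goodness-of-fit Test

Chi-Square   df   Sig.
  2.106       1   .147

What does factor rotation change?

Total Variance Explained

            Extraction Sums of Squared Loadings    Rotation Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %   Total   % of Variance   Cumulative %
1           1.720      34.390          34.390      1.700      33.997          33.997
2           1.131      22.618          57.009      1.151      23.012          57.009

Extraction Method: Principal Component Analysis.

Example 2

Correlation Matrix

                       Grip strength  Height  Weight  Chest circumference  Vital capacity
Grip strength              1.000       .264    .343          .303               .264
Height                      .264      1.000    .391          .209               .438
Weight                      .343       .391   1.000          .844               .177
Chest circumference         .303       .209    .844         1.000               .204
Vital capacity              .264       .438    .177          .204              1.000

[Figures: unrotated factor plot; factor plot after Varimax orthogonal rotation; factor plot after Promax oblique rotation]

Factor loading matrix: initial solution compared with the result after oblique rotation

Factor Matrix (a)

                       Factor 1  Factor 2
Grip strength            .418      .158
Height                   .492      .450
Weight                   .954     -.288
Chest circumference      .799     -.284
Vital capacity           .393      .544

Extraction Method: Principal Axis Factoring.
a. Attempted to extract 2 factors. More than 25 iterations required. (Convergence = .005). Extraction was terminated.

Pattern Matrix (a)

                       Factor 1  Factor 2
Grip strength            .233      .305
Height                   .089      .628
Weight                   .984      .033
Chest circumference      .853     -.016
Vital capacity          -.057      .690

Extraction Method: Principal Axis Factoring.
Rotation Method: Promax with Kaiser Normalization.
a. Rotation converged in 3 iterations.

Factor structure and factor correlation matrix after oblique rotation

Structure Matrix

                       Factor 1  Factor 2
Grip strength            .346      .392
Height                   .323      .661
Weight                   .996      .399
Chest circumference      .847      .302
Vital capacity           .200      .669

Extraction Method: Principal Axis Factoring.
Rotation Method: Promax with Kaiser Normalization.

Factor Correlation Matrix

Factor     1       2
1        1.000    .373
2         .373   1.000

Extraction Method: Principal Axis Factoring.
Rotation Method: Promax with Kaiser Normalization.

On eigenvalues, cumulative variance, and the number of variables

Correlation Matrix

                       Pulse  Weight  Chest circumference  Step test index
Pulse                  1.000   .041          .019               .188
Weight                  .041  1.000          .844               .146
Chest circumference     .019   .844         1.000               .188
Step test index         .188   .146          .188              1.000

Total Variance Explained

         Initial Eigenvalues                    Extraction Sums of Squared Loadings
Factor   Total   % of Variance   Cumulative %   Total   % of Variance   Cumulative %
1        1.914      47.856          47.856      1.732      43.298          43.298
2        1.137      28.425          76.281       .358       8.952          52.250
3         .794      19.857          96.138
4         .154       3.862         100.000

Extraction Method: Principal Axis Factoring.

On eigenvalues, cumulative variance, and the number of variables (continued)

Correlation Matrix

                       Pulse  Weight  Chest circ.  Step test index  Waist circ.  Hip circ.
Pulse                  1.000   .041      .019           .188           .032        -.035
Weight                  .041  1.000      .844           .146           .867         .851
Chest circumference     .019   .844     1.000           .188           .810         .814
Step test index         .188   .146      .188          1.000           .161         .171
Waist circumference     .032   .867      .810           .161          1.000         .812
Hip circumference      -.035   .851      .814           .171           .812        1.000

Total Variance Explained

         Initial Eigenvalues                    Extraction Sums of Squared Loadings
Factor   Total   % of Variance   Cumulative %   Total   % of Variance   Cumulative %
1        3.543      59.055          59.055      3.370      56.164          56.164
2        1.163      19.381          78.437       .401       6.678          62.841
3         .799      13.316          91.752
4         .192       3.201          94.953
5         .183       3.054          98.007
6         .120       1.993         100.000

Extraction M…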
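
The two worked examples can be checked numerically. The sketch below is not part of the original slides. It assumes NumPy, and the names R1, pc_loadings, B, W and S_reported are illustrative choices. It redoes the principal component extraction for the first example (loadings taken as eigenvectors scaled by the square roots of the eigenvalues, communalities as row sums of squared loadings) and checks the relation S = BW for the Promax solution of Example 2, using only numbers copied from the tables.

```python
import numpy as np

# Example 1 correlation matrix, copied from the slides. Variable order:
# one-leg stand, reaction time 1, grip strength, back strength, height.
R1 = np.array([
    [ 1.000, -0.147,  0.019,  0.042, -0.025],
    [-0.147,  1.000, -0.032, -0.093, -0.097],
    [ 0.019, -0.032,  1.000,  0.535,  0.264],
    [ 0.042, -0.093,  0.535,  1.000,  0.207],
    [-0.025, -0.097,  0.264,  0.207,  1.000],
])


def pc_loadings(R, k):
    """Principal component extraction of k factors from a correlation matrix.

    Returns all eigenvalues (largest first) and the loading matrix
    A = (first k eigenvectors) * sqrt(first k eigenvalues).
    """
    eigvals, eigvecs = np.linalg.eigh(R)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # re-sort, largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    A = eigvecs[:, :k] * np.sqrt(eigvals[:k])  # scale each column
    return eigvals, A


eigvals, A = pc_loadings(R1, k=2)
communalities = (A ** 2).sum(axis=1)   # H_i^2: row sums of squared loadings
reproduced = A @ A.T                   # R* = AA'

print("eigenvalues:        ", np.round(eigvals, 3))
print("% of variance:      ", np.round(100 * eigvals / eigvals.sum(), 3))
print("communalities H_i^2:", np.round(communalities, 3))

# Example 2: pattern matrix B and factor correlation matrix W after Promax
# rotation, copied from the slides. The structure matrix should equal B @ W.
B = np.array([
    [ 0.233,  0.305],
    [ 0.089,  0.628],
    [ 0.984,  0.033],
    [ 0.853, -0.016],
    [-0.057,  0.690],
])
W = np.array([
    [1.000, 0.373],
    [0.373, 1.000],
])
S = B @ W

# Structure matrix as reported in the slides, for comparison.
S_reported = np.array([
    [0.346, 0.392],
    [0.323, 0.661],
    [0.996, 0.399],
    [0.847, 0.302],
    [0.200, 0.669],
])
print("max |S - S_reported|:", np.abs(S - S_reported).max())
```

The printed eigenvalues and communalities can be compared with the Total Variance Explained and Communalities tables of the first example. The sign of each loading column is arbitrary, so the columns of A may differ in sign from the Factor Pattern table. Any remaining differences are rounding, because the input correlations are given to three decimals only.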