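Applied Linear Regression: Homework

Name: xxx    Student ID: xxxxxxxxx    Year: Class of 2013    Instructor: xxx

Chapter 2

2.14 To investigate the effect of an advertisement on sales revenue, a store recorded its sales revenue y (10,000 yuan) and advertising cost x (10,000 yuan) over 5 months. The data are shown in Table 2-6.

Table 2-6
    Month    1    2    3    4    5
    x        1    2    3    4    5
    y       10   10   20   20   40

(1) Draw the scatter plot.
Solution:
    x <- c(1,2,3,4,5)
    y <- c(10,10,20,20,40)
    plot(x,y)

[Figure: scatter plot of y against x]

(2) Are x and y roughly linearly related?
Solution: In the scatter plot from part (1), the five points lie close to a straight line, so x and y are roughly linearly related.

(3) Find the regression equation by least squares.
Solution: The R program is as follows.
    mystat1 <- data.frame(x,y)
    mystat1
      x  y
    1 1 10
    2 2 10
    3 3 20
    4 4 20
    5 5 40
    regress1 <- lm(y~x,data=mystat1)
    summary(regress1)

    Call:
    lm(formula = y ~ x, data = mystat1)

    Residuals:
             1          2          3          4          5
     4.000e+00 -3.000e+00  5.004e-16 -7.000e+00  6.000e+00

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   -1.000      6.351  -0.157   0.8849
    x              7.000      1.915   3.656   0.0354 *
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 6.055 on 3 degrees of freedom
    Multiple R-squared:  0.8167,    Adjusted R-squared:  0.7556
    F-statistic: 13.36 on 1 and 3 DF,  p-value: 0.03535

The regression equation is therefore y = -1 + 7x.

(4) Find the regression standard error.
Solution: From the output above, the residual standard error is 6.055.

(5) Give 95% confidence intervals for the regression coefficients.
Solution:
    confint(regress1)
                      2.5 %   97.5 %
    (Intercept) -21.2112485 19.21125
    x             0.9060793 13.09392
The 95% confidence interval for the intercept is (-21.2112485, 19.21125).
The 95% confidence interval for the coefficient of x is (0.9060793, 13.09392).

(6) Compute the coefficient of determination of x and y.
Solution: From the output in part (3), R^2 = 0.8167, which is close to 1, so the equation fits the data fairly well.

(7) Carry out an analysis of variance for the regression equation.
Solution:
    anova(regress1)
    Analysis of Variance Table

    Response: y
              Df Sum Sq Mean Sq F value  Pr(>F)
    x          1    490  490.00  13.364 0.03535 *
    Residuals  3    110   36.67
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(8) Test the significance of the regression coefficient.
Solution: Again from the output in part (3), the p-value for the coefficient of x is 0.0354, so its effect is significant at the 0.05 level.

(9) Test the significance of the correlation coefficient.
Solution:
    sqrt(0.8167)
    [1] 0.9037146
The correlation coefficient is 0.9037146. Comparing it with the tabulated critical value shows that x and y have a significant linear relationship.

(10) Draw the residual plot for the regression and analyze it.
Solution:
    y2 <- regress1$residuals
    plot(x,y2,type='b',pch=15,lty=3)
    y3 <- c(0,0,0,0,0)
    lines(x,y3,type='b',pch=20,lty=1)

[Figure: residual plot of y2 against x, with a reference line at 0]

The residual plot shows the residuals varying randomly around 0 within a band of modest width.

(11) Predict the sales revenue when the advertising cost is 42,000 yuan (x = 4.2), and give the 95% confidence interval.
Solution:
    new2 <- data.frame(x=4.2)
    pred <- predict(regress1,new2,interval='prediction')
    pred
       fit      lwr      upr
    1 28.4 6.059318 50.74068
When x is 4.2, the predicted value is 28.4, and the 95% interval (a prediction interval, since interval='prediction' is used) is [6.059318, 50.74068].

2.15 An insurance company is very concerned about the amount of overtime worked in its head-office business department and decided to investigate the current situation. Over 10 weeks it collected data on weekly overtime and on the number of new policies issued. Here x is the number of new policies issued per week and y is the weekly overtime (hours). The data are shown in Table 2-7.

Table 2-7
    Week    1     2     3     4     5     6     7     8     9     10
    x     825   215  1070   550   480   920  1350   325   670   1215
    y     3.5   1.0   4.0   2.0   1.0   3.0   4.5   1.5   3.0    5.0

(1) Draw the scatter plot.
Solution: The R program is as follows.
    x <- c(825,215,1070,550,480,920,1350,325,670,1215)
    y <- c(3.5,1.0,4.0,2.0,1.0,3.0,4.5,1.5,3.0,5.0)
    plot(x,y)

[Figure: scatter plot of y against x]

(2) Are x and y roughly linearly related?
Solution: The plot shows that y and x are roughly linearly related.

(3) Find the regression equation by least squares.
Solution:
    mystat <- data.frame(x,y)
    mystat
          x   y
    1   825 3.5
    2   215 1.0
    3  1070 4.0
    4   550 2.0
    5   480 1.0
    6   920 3.0
    7  1350 4.5
    8   325 1.5
    9   670 3.0
    10 1215 5.0
    regress2 <- lm(y~x,data=mystat)
    summary(regress2)

    Call:
    lm(formula = y ~ x, data = mystat)

    Residuals:
         Min       1Q   Median       3Q      Max
    -0.83899 -0.33483  0.07842  0.37228  0.52594

    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
    (Intercept) 0.1181291  0.3551477   0.333    0.748
    x           0.0035851  0.0004214   8.509 2.79e-05 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 0.48 on 8 degrees of freedom
    Multiple R-squared:  0.9005,    Adjusted R-squared:  0.8881
    F-statistic:  72.4 on 1 and 8 DF,  p-value: 2.795e-05

Hand calculation by least squares: set up the simple linear regression equation and choose the parameters to satisfy the least-squares conditions [formulas and intermediate results not recoverable from the source; only the fragment ".004" survives]. The two results agree, so the regression equation is y = 0.1181291 + 0.0035851*x.

(4) Find the regression standard error.
Solution: From part (3), the regression standard error is 0.48.

(5) Give 95% confidence intervals for the regression coefficients.
Solution:
    confint(regress2)
                       2.5 %      97.5 %
    (Intercept) -0.700843004 0.937101152
    x            0.002613486 0.004556779
The 95% confidence interval for a0 is [-0.700843004, 0.937101152].
The 95% confidence interval for a1 is [0.002613486, 0.004556779].

(6) Compute the coefficient of determination of x and y.
Solution: The coefficient of determination is R^2 = 0.9005.

(7) Carry out an analysis of variance for the regression equation.
Solution:
    anova(regress2)
    Analysis of Variance Table

    Response: y
              Df  Sum Sq Mean Sq F value    Pr(>F)
    x          1 16.6816 16.6816  72.396 2.795e-05 ***
    Residuals  8  1.8434  0.2304
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
In this analysis of variance the F value is 72.396 and the p-value is 2.795e-05, so the regression equation is highly significant.

(8) Test the significance of the regression coefficient.
Solution: From the output in part (3), the t value for the coefficient of x is 8.509 and its p-value is 2.79e-05, far below the significance level. We therefore reject the null hypothesis and conclude that y and x have a significant linear relationship; the effect of x is significant.

(9) Test the significance of the correlation coefficient.
Solution:
    sqrt(0.9005)
    [1] 0.9489468
The correlation coefficient is 0.9489468. The table shows that it exceeds the critical value at the 0.01 significance level, so the linear relationship between x and y is highly significant.

(10) Draw the residual plot for the regression and analyze it.
Solution:
    y2 <- regress2$residuals
    plot(x,y2,type='b',pch=15,lty=3)
    y3 <- c(0,0,0,0,0,0,0,0,0,0)
    lines(x,y3,type='b',pch=20,lty=1)

[Figure: residual plot of y2 against x, with a reference line at 0]

The residual plot shows the residuals varying randomly around 0 within a band of modest width.

(11) The company expects to issue 1000 new policies next week. How much overtime will be needed?
Solution:
    new2 <- data.frame(x=1000)
    pred <- predict(regress2,new2,interval='prediction')
    pred
           fit     lwr      upr
    1 3.703262 2.51949 4.887033
By the regression equation, when x = 1000 the predicted overtime is about 3.7 hours.

(12) Give the exact and the approximate 95% prediction intervals for y0.
Solution:
    new3 <- data.frame(x=825)
    pred2 <- predict(regress2,new3,interval='prediction')
    pred2
           fit      lwr     upr
    1 3.075863 1.913287 4.23844
    sigma <- c(0.48)
    3.075863+2*sigma
    [1] 4.035863
    3.075863-2*sigma
    [1] 2.115863
The exact 95% prediction interval for y0 is [1.913287, 4.23844].
The approximate 95% prediction interval for y0 (fitted value plus or minus 2 times the residual standard error) is [2.115863, 4.035863].

2.16 Table 2-8 gives, for the 50 US states and the District of Columbia in 1985, the average annual salary of public-school teachers y (US dollars) and the per-pupil spending x (US dollars).

Table 2-8
    No.      y      x    No.      y      x    No.      y      x
     1   19583   3346    18   20816   3059    35   19538   2642
     2   20263   3114    19   18095   2967    36   20460   3124
     3   20325   3554    20   20939   3285    37   21419   2752
     4   26800   4542    21   22644   3914    38   25160   3429
     5   29470   4669    22   24624   4517    39   22482   3947
     6   26610   4888    23   27186   4349    40   20969   2509
     7   30678   5710    24   33990   5020    41   27224   5440
     8   27170   5536    25   23382   3594    42   25892   4042
     9   25853   4168    26   20627   2821    43   22644   3402
    10   24500   3547    27   22795   3366    44   24640   2829
    11   24274   3159    28   21570   2920    45   22341   2297
    12   27170   3621    29   22080   2980    46   25610   2932
    13   30168   3782    30   22250   3731    47   26015   3705
    14   26525   4247    31   20940   2853    48   25788   4123
    15   27360   3982    32   21800   2533    49   29132   3608
    16   21690   3568    33   22934   2729    50   41480   8349
    17   21974   3155    34   18443   2305    51   25845   3766

(1) Draw the scatter plot of y against x. Can a straight-line regression describe the relationship between the two?
Solution: The R code is as follows.
    mystat <- read.table('C:/Users/Administrator/Desktop/1.csv',header=T,sep=',')
    mystat
           y    x
    1  19583 3346
    2  20263 3114
    3  20325 3554
    4  26800 4542
    5  29470 4669
    6  26610 4888
    7  30678 5710
    8  27170 5536
    9  25853 4168
    10 24500 3547
    11 24274 3159
    12 27170 3621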
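
The hand calculation mentioned in Exercise 2.15 (3) can be sketched as follows. This is a reconstruction from the standard least-squares formulas applied to the Table 2-7 data, not the author's original working. The symbols a_0 and a_1 follow part (5) of that exercise, while \bar{x}, \bar{y}, S_{xx} and S_{xy} are notation assumed here for illustration.

```latex
\begin{align*}
\hat{y} &= a_0 + a_1 x, \qquad
(a_0, a_1) = \arg\min_{a_0,\,a_1} \sum_{i=1}^{10} \bigl(y_i - a_0 - a_1 x_i\bigr)^2 \\
\bar{x} &= \frac{7620}{10} = 762, \qquad
\bar{y} = \frac{28.5}{10} = 2.85 \\
S_{xx} &= \sum_{i=1}^{10} (x_i - \bar{x})^2
        = \sum_{i=1}^{10} x_i^2 - 10\,\bar{x}^2
        = 7104300 - 5806440 = 1297860 \\
S_{xy} &= \sum_{i=1}^{10} (x_i - \bar{x})(y_i - \bar{y})
        = \sum_{i=1}^{10} x_i y_i - 10\,\bar{x}\,\bar{y}
        = 26370 - 21717 = 4653 \\
a_1 &= \frac{S_{xy}}{S_{xx}} = \frac{4653}{1297860} \approx 0.0035851 \\
a_0 &= \bar{y} - a_1 \bar{x} \approx 2.85 - 0.0035851 \times 762 \approx 0.1181
\end{align*}
```

These values agree with the lm() estimates 0.1181291 and 0.0035851 reported in the summary output. As a further check, a_1 S_{xy} is approximately 16.68, which matches the regression sum of squares in the analysis-of-variance table.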
Document title: Regression Analysis Homework
Link: https://www.777doc.com/doc-7209425 .html