Machine Learning (10-701) Fall 2008 Final Exam
Professor: Eric Xing
Date: December 8, 2008

There are 9 questions in this exam (18 pages including this cover sheet). Questions are not equally difficult.
This exam is open to book and notes. Computers, PDAs, and cell phones are not allowed.
You have three hours. Good luck!

1 Assorted Questions [20 points]

1. (True or False, 2 pts) PCA and spectral clustering (such as Andrew Ng's) perform eigen-decomposition on two different matrices. However, the sizes of these two matrices are the same.
Solutions: F

2. (True or False, 2 pts) The dimensionality of the feature map generated by a polynomial kernel (e.g., K(x, y) = (1 + x · y)^d) is polynomial with respect to the power d of the polynomial kernel.
Solutions: T

3. (True or False, 2 pts) Since classification is a special case of regression, logistic regression is a special case of linear regression.
Solutions: F

4. (True or False, 2 pts) For any two variables x and y having joint distribution p(x, y), we always have H[x, y] ≥ H[x] + H[y], where H is the entropy function.
Solutions: F

5. (True or False, 2 pts) The Markov blanket of a node x in a graph with vertex set X is the smallest set Z such that x ⊥ X \ (Z ∪ {x}) | Z.
Solutions: T

6. (True or False, 2 pts) For some directed graphs, moralization decreases the number of edges present in the graph.
Solutions: F

7. (True or False, 2 pts) The L2 penalty in ridge regression is equivalent to a Laplace prior on the weights.
Solutions: F

8. (True or False, 2 pts) There is at least one set of 4 points in ℜ^3 that can be shattered by the hypothesis set of all 2D planes in ℜ^3.
Solutions: T

9. (True or False, 2 pts) The log-likelihood of the data will always increase through successive iterations of the expectation maximization algorithm.
Solutions: F

10. (True or False, 2 pts) One disadvantage of Q-learning is that it can only be used when the learner has prior knowledge of how its actions affect its environment.
Solutions: F

2 Support Vector Machine (SVM) [10 pts]

1. Properties of Kernels

1.1. (2 pts) Prove that the kernel K(x1, x2) is symmetric, where x1 and x2 are the feature vectors for two examples.
Hint: your proof will not be longer than 2 or 3 lines.
Solutions: Let Φ(x1) and Φ(x2) be the feature maps for x1 and x2, respectively. Then we have
K(x1, x2) = Φ(x1)'Φ(x2) = Φ(x2)'Φ(x1) = K(x2, x1).

1.2. (4 pts) Given n training examples x_i (i = 1, ..., n), the kernel matrix A is
an n × n square matrix, where A(i, j) = K(x_i, x_j). Prove that the kernel matrix A is positive semi-definite.
Hints: (1) Remember that an n × n matrix A is positive semi-definite iff for any n-dimensional vector f, we have f'Af ≥ 0. (2) For simplicity, you can prove this statement just for the following particular kernel function: K(x_i, x_j) = (1 + x_i · x_j)^2.
Solutions: Let Φ(x_i) be the feature map for the ith example and define the matrix B = [Φ(x_1), ..., Φ(x_n)]. It is easy to verify that A = B'B. Then we have
f'Af = (Bf)'(Bf) = ‖Bf‖^2 ≥ 0.

2. Soft-Margin Linear SVM. Given the following dataset in 1-d space (Figure 1), which consists of 4 positive data points {0, 1, 2, 3} and 3 negative data points {−3, −2, −1}, suppose that we want to learn a soft-margin linear SVM for this dataset. Remember that the soft-margin linear SVM can be formalized as the following constrained quadratic optimization problem. In this formulation, C is the regularization parameter, which balances the size of the margin (i.e., smaller w'w) vs. the violation of the margin (i.e., smaller Σ_{i=1}^m ε_i).

    argmin_{w,b}  (1/2) w'w + C Σ_{i=1}^m ε_i
    subject to:   y_i (w'x_i + b) ≥ 1 − ε_i,   ε_i ≥ 0  ∀i

Figure 1: Dataset

2.1 (2 pts) If C = 0, which means that we only care about the size of the margin, how many support vectors do we have?
Solutions: 7

2.2 (2 pts) If C → ∞, which means that we only care about the violation of the margin, how many support vectors do we have?
Solutions: 2

3 Principal Component Analysis (PCA) [10 pts]

1.1 (3 pts) Basic PCA. Given 3 data points in 2-d space, (1, 1), (2, 2) and (3, 3):

(a) (1 pt) What is the first principal component?
Solutions: pc = (1/√2, 1/√2)' = (0.707, 0.707)' (the negation is also correct).

(b) (1 pt) If we want to project the original data points into 1-d space by the principal component you chose, what is the variance of the projected data?
Solutions: 4/3 = 1.33

(c) (1 pt) For the projected data in (b), now if we represent them in the original 2-d space, what is the reconstruction error?
Solutions: 0

1.2 (7 pts) PCA and SVD. Given 6 data points in 5-d space, (1, 1, 1, 0, 0), (−3, −3, −3, 0, 0), (2, 2, 2, 0, 0), (0, 0, 0, −1, −1), (0, 0, 0, 2, 2), (0, 0, 0, −1, −1), we can represent these data points by a 6 × 5 matrix X, where each row corresponds to a data point:

    X = [  1   1   1   0   0
          −3  −3  −3   0   0
           2   2   2   0   0
           0   0   0  −1  −1
           0   0   0   2   2
           0   0   0  −1  −1 ]

(a) (1 pt) What is the sample mean of the dataset?
Solutions: [0, 0, 0, 0, 0]

(b) (3 pts) What is the SVD of the data matrix X?
Hint: the SVD for this matrix
must take the following form, where a, b, c, d, σ1, σ2 are the parameters you need to decide:

    X = [  a    0
          −3a   0
           2a   0
           0    b
           0  −2b
           0    b ]  ×  [ σ1   0
                           0  σ2 ]  ×  [ c  c  c  0  0
                                         0  0  0  d  d ]

Solutions: a = ±1/√14 = ±0.267, b = ±1/√6 = ±0.408, c = ±1/√3 = ±0.577, d = ±1/√2 = ±0.707, σ1 = 1/(a · c) = √42 = 6.48, σ2 = 1/(b · d) = √12 = 3.46.

(c) (1 pt) What is the first principal component for the original data points?
Solutions: pc = ±[c, c, c, 0, 0] = ±[0.577, 0.577, 0.577, 0, 0].
(Intuition: first, notice that the first three data points are co-linear, and so are the last three; moreover, the first three data points are orthogonal to the last three. Then notice that the norms of the first three are much bigger than those of the last three; therefore, the first pc has the same direction as the first three data points.)

(d) (1 pt) If we want to project the original data points into 1-d space by the principal component you chose, what is the variance of the projected data?
Solutions: var = σ1^2 / 6 = 7.
(Intuition: we keep the first three data points and set the last three to [0, 0, 0, 0, 0] (since they are orthogonal to the pc), and then compute the variance among them.)

(e) (1 pt) For the projected data in (d), now if we represent them in the original 5-d space, what is the reconstruction error?
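Question 4 in Section 1 is false because joint entropy is subadditive: H[x, y] ≤ H[x] + H[y], with equality iff x and y are independent. A minimal numerical check (an illustrative sketch, not part of the original exam; the joint distribution p_xy is an assumed example):

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in bits) of a probability array, ignoring zero entries."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# An assumed joint distribution over two correlated binary variables.
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])

H_xy = entropy(p_xy.ravel())      # joint entropy H[x, y]
H_x  = entropy(p_xy.sum(axis=1))  # marginal entropy H[x]
H_y  = entropy(p_xy.sum(axis=0))  # marginal entropy H[y]

# Subadditivity: H[x, y] <= H[x] + H[y], so the ">=" claim in the question fails
# whenever x and y are dependent.
print(H_xy, H_x + H_y)
assert H_xy <= H_x + H_y
```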
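The symmetry and positive semi-definiteness properties proved in Section 2 can be sanity-checked numerically for the particular kernel K(x_i, x_j) = (1 + x_i · x_j)^2 from the hint. A sketch (the random data and the numerical tolerance are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))  # 10 assumed random 3-d examples

# Gram matrix for the polynomial kernel K(xi, xj) = (1 + xi . xj)^2.
A = (1.0 + X @ X.T) ** 2

# Symmetry: K(xi, xj) = K(xj, xi).
assert np.allclose(A, A.T)

# Positive semi-definiteness: every eigenvalue of A is >= 0 (up to rounding error).
eigvals = np.linalg.eigvalsh(A)
assert eigvals.min() > -1e-8
```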
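The numbers in Section 3's PCA/SVD solution (singular values √42 and √12, the first principal component, and the projected variance of 7) can be verified directly with NumPy's SVD. A sketch assuming the 6 × 5 data matrix X defined above:

```python
import numpy as np

# The 6 x 5 data matrix from question 1.2, one data point per row.
X = np.array([[ 1,  1,  1,  0,  0],
              [-3, -3, -3,  0,  0],
              [ 2,  2,  2,  0,  0],
              [ 0,  0,  0, -1, -1],
              [ 0,  0,  0,  2,  2],
              [ 0,  0,  0, -1, -1]], dtype=float)

# (a) The sample mean is the zero vector, so X is already centered.
assert np.allclose(X.mean(axis=0), 0)

# (b) Singular values: sqrt(42) ~ 6.48 and sqrt(12) ~ 3.46, as in the solution.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
assert np.allclose(s[:2], [np.sqrt(42), np.sqrt(12)])

# (c) First principal component: the top right singular vector (sign is arbitrary).
pc = Vt[0]
assert np.allclose(np.abs(pc), [1 / np.sqrt(3)] * 3 + [0, 0])

# (d) Variance of the 1-d projection: sigma_1^2 / 6 = 42 / 6 = 7.
proj = X @ pc
assert np.isclose(proj.var(), 7.0)
```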