CS 189 Spring 2014
Introduction to Machine Learning
Final

• You have 3 hours for the exam.
• The exam is closed book, closed notes except your one-page crib sheet.
• Please use non-programmable calculators only.
• Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation.
• For true/false questions, fill in the True/False bubble.
• For multiple-choice questions, fill in the bubbles for ALL CORRECT CHOICES (in some cases, there may be more than one). We have introduced a negative penalty for false positives for the multiple-choice questions such that the expected value of randomly guessing is 0. Don't worry: for this section, your score will be the maximum of your score and 0, thus you cannot incur a negative score for this section.

First name: ________  Last name: ________  SID: ________
First and last name of student to your left: ________
First and last name of student to your right: ________

For staff use only:
Q1. True or False                  /9
Q2. Multiple Choice                /24
Q3. Softmax regression             /11
Q4. PCA and least squares          /10
Q5. Mixture of linear regressions  /10
Q6. Training set augmentation      /10
Q7. Kernel PCA                     /12
Q8. Autoencoder                    /14
Total                              /100

Q1. [9 pts] True or False

(a) [1 pt] The singular value decomposition of a real matrix is unique.  ○ True  ○ False
(b) [1 pt] A multiple-layer neural network with linear activation functions is equivalent to one single-layer perceptron that uses the same error function on the output layer and has the same number of inputs.  ○ True  ○ False
(c) [1 pt] The maximum likelihood estimator for the parameter θ of a uniform distribution over [0, θ] is unbiased.  ○ True  ○ False
(d) [1 pt] The k-means algorithm for clustering is guaranteed to converge to a local optimum.  ○ True  ○ False
(e) [1 pt] Increasing the depth of a decision tree cannot increase its training error.  ○ True  ○ False
(f) [1 pt] There exists a one-to-one feature mapping for every valid kernel k.  ○ True  ○ False
(g) [1 pt] For high-dimensional data, k-d trees can be slower than brute-force nearest neighbor search.  ○ True  ○ False
(h) [1 pt] If we had infinite data and infinitely fast computers, kNN would be the only algorithm we would study in CS 189.  ○ True  ○ False
(i) [1 pt] For datasets with high label noise (many data points with incorrect labels), random forests would generally perform better than boosted decision trees.  ○ True  ○ False

Q2. [24 pts] Multiple Choice

(a) [2 pts] In Homework 4, you fit a logistic regression model on spam and ham data for a Kaggle competition. Assume you had a very good score on the public test set, but when the GSIs ran your model on a private test set, your score dropped a lot. This is likely because you overfitted by submitting multiple times and changing the following between submissions:
○ Your penalty term
○ Your step size
○ Your convergence criterion
○ Fixing a random bug

(b) [2 pts] Given d-dimensional data x_i, i = 1, …, N, you run principal component analysis and pick P principal components. Can you always reconstruct any data point x_i, for i in {1, …, N}, from the P principal components with zero reconstruction error?
○ Yes, if P < d
○ Yes, if P < N
○ Yes, if P = d
○ No, always

(c) [2 pts] Putting a standard Gaussian prior on the weights for linear regression (w ∼ N(0, I)) will result in what type of posterior distribution on the weights?
○ Laplace  ○ Poisson  ○ Uniform  ○ None of the above

(d) [2 pts] Suppose we have N instances of d-dimensional data. Let h be the amount of data storage necessary for a histogram with a fixed number of ticks per axis, and let k be the amount of data storage necessary for kernel density estimation. Which of the following is true about h and k?
○ h and k grow linearly with N
○ h and k grow exponentially with d
○ h grows exponentially with d, and k grows linearly with N
○ h grows linearly with N, and k grows exponentially with d

(e) [2 pts] Which of these classifiers could have generated this decision boundary? [figure not reproduced]
○ Linear SVM  ○ Logistic regression  ○ 1-NN  ○ None of the above

(f) [2 pts] Which of these classifiers could have generated this decision boundary? [figure not reproduced]
○ Linear SVM  ○ Logistic regression  ○ 1-NN  ○ None of the above

(g) [2 pts] Which of these classifiers could have generated this decision boundary? [figure not reproduced]
○ Linear SVM  ○ Logistic regression  ○ 1-NN  ○ None of the above

(h) [2 pts] You want to cluster this data into 2 clusters. Which of these algorithms would work well? [figure not reproduced]
○ K-means  ○ GMM clustering  ○ Mean shift clustering

(i) [2 pts] You want to cluster this data into 2 clusters. Which of these algorithms would work well? [figure not reproduced]
○ K-means  ○ GMM clustering  ○ Mean shift clustering

(j) [2 pts] You want to cluster this data into 2 clusters. Which of these algorithms would work well? [figure not reproduced]
○ K-means  ○ GMM clustering  ○ Mean shift clustering

The following questions are about how to help CS 189 TA Jonathan Snow solve the homework.

(k) [2 pts] Jonathan just trained a decision tree for digit recognition. He notices an extremely low training error, but an abnormally large test error. He also notices that an SVM with a linear kernel performs much better than his tree. What could be the cause of his problem?
○ Decision tree is too deep
○ Learning rate too high
○ Decision tree is overfitting
○ There is too much training data

(l) [2 pts] Jonathan has now switched to multilayer neural networks and notices that the training error goes down and converges to a local minimum. Then when he tests on new data, the test error is abnormally high. What is probably going wrong, and what do you recommend he do?
○ The training data size is not large enough. Collect more training data and retrain.
○ Use a different initialization and train the network several times; use the average of the predictions from all nets to predict the test data.
○ Play with the learning rate and add a regularization term to the objective function.
○ Use the same training data but add two more hidden layers.

Q3. [11 pts] Softmax regression

Recall the setup of logistic regression: we assume that the posterior probability is of the form

    p(Y = 1 | x) = 1 / (1 + exp(−θᵀx))

This assumes that Y | X is a Bernoulli random variable. We now turn to the case where Y | X is a multinomial random variable over K outcomes.
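As a quick illustration of the setup above (this sketch is not part of the original exam), the Bernoulli posterior is the logistic sigmoid of the score θᵀx, and the K-outcome generalization used in softmax regression normalizes the exponentiated scores. A minimal pure-Python sketch, with hypothetical function names:

```python
import math

def sigmoid(z):
    # Logistic posterior p(Y = 1 | x) = 1 / (1 + exp(-theta^T x)), with z = theta^T x.
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    # K-outcome generalization: p(Y = k | x) proportional to exp(theta_k^T x).
    # Subtracting the max score before exponentiating is a standard
    # numerical-stability trick; it does not change the result.
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]
```

With K = 2 and the second class's score fixed at 0, softmax reduces to the logistic posterior: softmax([z, 0])[0] equals sigmoid(z).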