Chapter 10: Learning (2)

Supervised learning
• Supervised learning: the formal setting
• We are given a set of N observations {(xi, yi)}, i = 1..N
• Need to map x ∈ X to a label y ∈ Y
• Examples: Classification and Regression

Decision Trees (textbook Section 18.3)
Problem: decide whether to wait for a table at a restaurant, based on the following attributes:
1. Alternate: is there an alternative restaurant nearby?
2. Bar: is there a comfortable bar area to wait in?
3. Fri/Sat: is today Friday or Saturday?
4. Hungry: are we hungry?
5. Patrons: number of people in the restaurant (None, Some, Full)
6. Price: price range ($, $$, $$$)
7. Raining: is it raining outside?
8. Reservation: have we made a reservation?
9. Type: kind of restaurant (French, Italian, Thai, Burger)
10. WaitEstimate: estimated waiting time (0-10, 10-30, 30-60, >60 minutes)

Attribute-based representations
Below are 12 examples described by these 10 attributes; attribute values are Boolean, discrete, or continuous. E.g., situations where I will/won't wait for a table. The classification of each example is positive (T) or negative (F).

Decision trees
One possible hypothesis representation. E.g., here is the "true" tree for deciding whether to wait.

Expressiveness
Decision trees can express any function of the input attributes. E.g., for Boolean functions, each row of the truth table corresponds to a path from the root to a leaf. Trivially, there is a consistent decision tree for any training set, with one path to a leaf for each example (unless f is nondeterministic in x), but it probably won't generalize to new examples. We need to find a more compact decision tree.

Decision tree learning
Aim: find a small decision tree consistent with the training examples.
Idea: (recursively) choose the "best" attribute as the root of each (sub)tree.

Choosing an attribute
Idea: a good attribute splits the examples into subsets that are ideally "all positive" or "all negative". By this criterion, Patrons? is a better choice than Type?.

Using information theory
Information theory implements the Choose-Attribute function of the DTL algorithm.
Information content (entropy): for a training set containing p positive and n negative examples,
I(p/(p+n), n/(p+n)) = -(p/(p+n)) log2(p/(p+n)) - (n/(p+n)) log2(n/(p+n))

Information gain
Any attribute A divides the training set E into subsets E1, ..., Ev according to the values of A, where A can take v distinct values:
remainder(A) = Σ i=1..v ((pi + ni)/(p + n)) · I(pi/(pi+ni), ni/(pi+ni))
The information gain (IG) from the attribute test on A is the difference between the original information requirement and the new one:
IG(A) = I(p/(p+n), n/(p+n)) - remainder(A)
Choose the attribute with the largest IG.

Information gain (contd.)
For this training set, p = n = 6, so I(6/12, 6/12) = 1 bit. Considering the attributes Patrons and Type (and the others too): Patrons has the highest IG of all the attributes, and so it is chosen by the DTL algorithm as the root.

Example contd.
The decision tree learned from the 12 examples is noticeably simpler than the "true" tree shown earlier.
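To make the entropy and information-gain formulas above concrete, here is a minimal Python sketch (not part of the original slides) that reproduces the numbers for the restaurant example. The per-attribute positive/negative counts in the comments are taken from the standard AIMA restaurant data set that these slides follow; if your copy of the 12 examples differs, the counts would change accordingly.

```python
import math

def entropy(p, n):
    """Information content I(p/(p+n), n/(p+n)) in bits for a set
    with p positive and n negative examples."""
    total = p + n
    result = 0.0
    for count in (p, n):
        if count > 0:           # treat 0*log2(0) as 0
            q = count / total
            result -= q * math.log2(q)
    return result

def information_gain(p, n, subsets):
    """IG(A) = I(p/(p+n), n/(p+n)) - remainder(A), where `subsets`
    lists the (p_i, n_i) counts produced by splitting on attribute A."""
    remainder = sum((pi + ni) / (p + n) * entropy(pi, ni)
                    for pi, ni in subsets)
    return entropy(p, n) - remainder

# Counts assumed from the standard AIMA restaurant examples (p = n = 6):
# Patrons: None -> (0+, 2-), Some -> (4+, 0-), Full -> (2+, 4-)
# Type: French (1+, 1-), Italian (1+, 1-), Thai (2+, 2-), Burger (2+, 2-)
print(entropy(6, 6))                                             # 1.0 bit
print(information_gain(6, 6, [(0, 2), (4, 0), (2, 4)]))          # Patrons: ~0.541 bits
print(information_gain(6, 6, [(1, 1), (1, 1), (2, 2), (2, 2)]))  # Type: 0.0 bits
```

The output matches the slide's claim: Patrons has a strictly positive gain (about 0.541 bits) while Type has zero gain, so DTL picks Patrons as the root.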
Performance measurement
How do we know that h ≈ f?
1. Use theorems of computational/statistical learning theory
2. Try h on a new test set of examples (using the same distribution over the example space as the training set)
Learning curve = % correct on the test set as a function of training set size.

Comments on decision-tree-based classification
Advantages:
• Inexpensive to construct
• Extremely fast at classifying unseen records
• Easy to interpret, for small trees
• Classification accuracy comparable to other classification algorithms on simple data sets
Example: C4.5 — simple depth-first construction; uses information gain.

K nearest neighbor classifier
Nearest-neighbor models (textbook Section 20.4).

Linear predictions

Learning framework
Focus of this part:
• Binary classification (e.g., predicting spam or not spam)
• Regression (e.g., predicting housing prices)

Classification
Classification = learning from data with finite discrete labels. It is the dominant problem in machine learning.

Linear classifiers
Binary classification can be viewed as the task of separating classes in feature space:
h(x) = sign(wT x + b)
• We need to find a suitable w (direction) and b (location) of the decision boundary
• We want to minimize the expected zero/one loss of the classifier h: X → Y
• Ideally, the classes are perfectly separated

Linear classifiers → loss minimization
Ideally we would find a classifier h(x) = sign(wT x + b) that minimizes the 0/1 loss. Unfortunately, this is a hard problem, so we use surrogate loss functions instead: learning becomes optimization.

Least squares classification
Least squares loss function. Goal: learn a classifier h(x) = sign(wT x + b) that minimizes the least squares loss. The solution for w is available in closed form, and the approach extends to general linear classification.

Regression
Regression = learning from continuously labeled data. Linear regression, and general linear/polynomial regression.

Model complexity and overfitting
• Underfitting ↔ high bias
• Overfitting ↔ high variance

Prediction errors
• Training errors (apparent errors): errors committed on the training set
• Test errors: errors committed on the test set
• Generalization errors: the expected error of a model over a random selection of records from the same distribution (i.e., the error on unseen records)
Underfitting: when the model is too simple, both training and test errors are large.
Overfitting: when the model is too complex, the training error is small but the test error is large.

Incorporating model complexity
Rationale: Ockham's Razor.
• Given two models with similar generalization errors, one should prefer the simpler model over the more complex one
• A complex model has a greater chance of being fitted accidentally by errors in the data
• Therefore, model complexity should be taken into account when evaluating a model

Regularization
Intuition: small values for the parameters give a "simpler" hypothesis, which is less prone to overfitting.
L2 and L1 regularization:
• L2: easy to optimize; closed-form solution
• L1: induces sparsity
More than two classes: the same framework extends to multiclass problems.

Comments on least squares classification
• Not the best approach to classification
• But: easy to train, with a closed-form solution
• Can be combined with many classic learning principles

Cross-validation
• Basic idea: if a model overfits (is sensitive to the training data), then it is unstable; that is, removing part of the data changes the fit significantly
• Therefore, we hold out part of the data, fit the model on the remaining data, and then test it on the held-out data

Learning framework: the model/parameter learning paradigm
• Choose a model class: NB, kNN, decision tree, or a loss/regularization combination
• Model selection: cross-validation
• Training: optimization
• Testing

Summary: supervised learning
(1) Classification: Naïve Bayes model; decision tree; least squares classification
(2) Regression: least squares regression

Exercise
• Prove that for any training set containing no conflicting data (i.e., no two examples with identical feature vectors but different labels), there exists a decision tree consistent with the training set (i.e., with zero training error).
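As a companion to the least squares classification and L2 regularization material above, here is a minimal numpy sketch (my illustration, not the slides' code): it fits w via the standard regularized closed form w = (XᵀX + λI)⁻¹ Xᵀy, with labels encoded as ±1 and a bias column appended, then predicts with h(x) = sign(wᵀx + b). The toy data set and λ value are arbitrary choices for the demo.

```python
import numpy as np

def fit_least_squares_classifier(X, y, lam=0.1):
    """Closed-form regularized least squares: w = (X'X + lam*I)^-1 X'y.
    X: (N, d) feature matrix; y: (N,) labels in {-1, +1}.
    A bias column of ones is appended, so the returned w has d+1 entries."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias feature
    d = Xb.shape[1]
    # solve the normal equations directly (preferred over an explicit inverse)
    return np.linalg.solve(Xb.T @ Xb + lam * np.eye(d), Xb.T @ y)

def predict(w, X):
    """h(x) = sign(w'x + b); the bias b is the last entry of w."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return np.sign(Xb @ w)

# Toy usage: two roughly separable Gaussian clusters labeled -1 / +1.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (20, 2)), rng.normal(2, 1, (20, 2))])
y = np.hstack([-np.ones(20), np.ones(20)])
w = fit_least_squares_classifier(X, y)
print("training accuracy:", np.mean(predict(w, X) == y))
```

This illustrates the slides' point: least squares is not the ideal classification loss, but the closed-form solution makes it trivially easy to train, and the λ term is exactly the L2 regularization discussed above.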