Regression Shrinkage and Selection via the Lasso
Author(s): Robert Tibshirani
Source: Journal of the Royal Statistical Society. Series B (Methodological), Vol. 58, No. 1 (1996), pp. 267-288
Published by: Blackwell Publishing for the Royal Statistical Society

By ROBERT TIBSHIRANI†
University of Toronto, Canada
[Received January 1994. Revised January 1995]

†Address for correspondence: Department of Preventive Medicine and Biostatistics, and Department of Statistics, University of Toronto, 12 Queen's Park Crescent West, Toronto, Ontario, M5S 1A8, Canada. E-mail: tibs@utstat.toronto.edu
© 1996 Royal Statistical Society 0035-9246/96/58267

SUMMARY
We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.

Keywords: QUADRATIC PROGRAMMING; REGRESSION; SHRINKAGE; SUBSET SELECTION

1. INTRODUCTION
Consider the usual regression situation: we have data $(x_i, y_i)$, $i = 1, 2, \ldots, N$, where $x_i = (x_{i1}, \ldots, x_{ip})^T$ and $y_i$ are the regressors and response for the $i$th observation. The ordinary least squares (OLS) estimates are obtained by minimizing the residual squared error. There are two reasons why the data analyst is often not satisfied with the OLS estimates. The first is prediction accuracy: the OLS estimates often have low bias but large variance; prediction accuracy can sometimes be improved by shrinking or setting to 0 some coefficients. By doing so we sacrifice a little bias to reduce the variance of the predicted values and hence may improve the overall prediction accuracy. The second reason is interpretation. With a large number of predictors, we often would like to determine a smaller subset that exhibits the strongest effects.

The two standard techniques for improving the OLS estimates, subset selection and ridge regression, both have drawbacks. Subset selection provides interpretable models but can be extremely variable because it is a discrete process: regressors are either retained or dropped from the model. Small changes in the data can result in very different models being selected and this can reduce its prediction accuracy. Ridge regression is a continuous process that shrinks coefficients and hence is more stable: however, it does not set any coefficients to 0 and hence does not give an easily interpretable model.

We propose a new technique, called the lasso, for 'least absolute shrinkage and selection operator'. It shrinks some coefficients and sets others to 0, and hence tries to retain the good features of both subset selection and ridge regression.

In Section 2 we define the lasso and look at some special cases. A real data example is given in Section 3, while in Section 4 we discuss methods for estimation of prediction error and the lasso shrinkage parameter. A Bayes model for the lasso is briefly mentioned in Section 5. We describe the lasso algorithm in Section 6. Simulation studies are described in Section 7. Sections 8 and 9 discuss extensions to generalized regression models and other problems. Some results on soft thresholding and their relationship to the lasso are discussed in Section 10, while Section 11 contains a summary and some discussion.

2. THE LASSO
2.1. Definition
Suppose that we have data $(x_i, y_i)$, $i = 1, 2, \ldots, N$, where $x_i = (x_{i1}, \ldots, x_{ip})^T$ are the predictor variables and $y_i$ are the responses. As in the usual regression set-up, we assume either that the observations are independent or that the $y_i$s are conditionally independent given the $x_{ij}$s. We assume that the $x_{ij}$ are standardized so that $\sum_i x_{ij}/N = 0$, $\sum_i x_{ij}^2/N = 1$.

Letting $\hat\beta = (\hat\beta_1, \ldots, \hat\beta_p)^T$, the lasso estimate $(\hat\alpha, \hat\beta)$ is defined by $(\hat\alpha, \hat\beta) = \arg\min \ldots$
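The standardization of Section 2.1 and the behaviour claimed in the Summary (some coefficients shrunk, others set exactly to 0) can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the paper's method: it solves the equivalent penalized form of the problem, with a hypothetical penalty weight `lam` standing in for the constraint constant, and it uses cyclic coordinate descent rather than the algorithm the paper describes in Section 6. The function names (`standardize`, `soft_threshold`, `lasso_cd`) and the simulated data are illustrative and do not come from the paper.

```python
import numpy as np


def standardize(X):
    """Center each column and scale it so that
    sum_i x_ij / N = 0 and sum_i x_ij^2 / N = 1."""
    Xc = X - X.mean(axis=0)
    return Xc / np.sqrt((Xc ** 2).mean(axis=0))


def soft_threshold(z, gamma):
    """Shrink z towards 0 by gamma; if |z| <= gamma the result is 0."""
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Assumed penalized form (not the constrained form in the text):
    minimize (1/(2N)) * sum_i (y_i - alpha - x_i^T beta)^2
             + lam * sum_j |beta_j|
    by cyclic coordinate descent. X must already be standardized."""
    N, p = X.shape
    alpha = y.mean()          # columns are centered, so the intercept is mean(y)
    beta = np.zeros(p)
    resid = y - alpha         # residual for beta = 0
    for _ in range(n_iter):
        for j in range(p):
            resid = resid + X[:, j] * beta[j]   # drop coordinate j from the fit
            rho = X[:, j] @ resid / N           # univariate least squares coefficient
            beta[j] = soft_threshold(rho, lam)  # valid because sum_i x_ij^2 / N = 1
            resid = resid - X[:, j] * beta[j]   # put coordinate j back
    return alpha, beta


# Simulated data (hypothetical): only predictors 0 and 3 have an effect.
rng = np.random.default_rng(0)
N, p = 200, 5
X = standardize(rng.normal(size=(N, p)))
beta_true = np.array([3.0, 0.0, 0.0, 1.5, 0.0])
y = 1.0 + X @ beta_true + 0.5 * rng.normal(size=N)

alpha_hat, beta_lasso = lasso_cd(X, y, lam=0.5)
beta_ols = np.linalg.lstsq(X, y - y.mean(), rcond=None)[0]

print("OLS:  ", np.round(beta_ols, 3))
print("lasso:", np.round(beta_lasso, 3))
```

In this sketch a larger `lam` plays the role of a smaller bound on the sum of absolute coefficients: the fitted coefficients shrink further and more of them become exactly 0, whereas the OLS coefficients are all non-zero.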