Nucleic Acids Research, 2007, Vol. 35, Web Server issue W585–W587
doi:10.1093/nar/gkm259

WoLF PSORT: protein localization predictor

Paul Horton¹, Keun-Joon Park¹,², Takeshi Obayashi³, Naoya Fujita¹,³, Hajime Harada¹, C.J. Adams-Collier⁴ and Kenta Nakai³,*

¹Computational Biology Research Center, AIST, Tokyo, Japan, ²Center for Genome Science, National Institute of Health, Korea Center for Disease Control & Prevention, 5 Nokbeon-Dong, Eunpyung-Gu, Seoul 122-701 Korea, ³Human Genome Center, Institute of Medical Science, University of Tokyo, Tokyo, Japan and ⁴Collier Technologies, Everett, WA, USA

Received January 30, 2007; Revised March 26, 2007; Accepted April 8, 2007

*To whom correspondence should be addressed. Tel: +81-3-5449-5131; Fax: +81-3-5449-5133; Email: knakai@ims.u-tokyo.ac.jp

© 2007 The Author(s). This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

WoLF PSORT is an extension of the PSORT II program for protein subcellular location prediction. WoLF PSORT converts protein amino acid sequences into numerical localization features based on sorting signals, amino acid composition and functional motifs such as DNA-binding motifs. After conversion, a simple k-nearest neighbor classifier is used for prediction. Using HTML, the evidence for each prediction is shown in two ways: (i) a list of proteins of known localization with the most similar localization features to the query, and (ii) tables with detailed information about individual localization features. For convenience, sequence alignments of the query to similar proteins and links to UniProt and Gene Ontology are provided. Taken together, this information allows a user to understand the evidence (or lack thereof) behind the predictions made for particular proteins. WoLF PSORT is available at wolfpsort.org.

INTRODUCTION

Bilipid membranes divide eukaryotic cells into various types of organelles containing characteristic proteins and performing specialized functions. Thus, subcellular localization information gives an important clue to a protein's function. Although localization signals in mRNA appear to play some role (1), the main determinant of a protein's localization resides in the protein's amino acid sequence. (We recommend wikipedia.org/wiki/Protein_targeting for a brief overview and Alberts et al. (2) for a textbook description.)

Numerous experiments to determine protein localization have been performed to date. These can broadly be classified as: small-scale experiments, the results of which continue to accumulate in public databases such as UniProt (3) and Gene Ontology (4); and large-scale experiments using epitope (5) or green fluorescent protein (GFP) (6) tagging, or separation of organelles by centrifugation combined with protein identification by mass spectrometry (7,8). Although they provide invaluable information, the coverage of experimental data is only high for model organisms, particularly yeast. Moreover, the agreement amongst large-scale experimental data is only 75–80% (6–9). Thus, computational prediction of localization from amino acid sequence remains an important topic.

Numerous computational methods are available [reviewed in (10,11)]. Some (including WoLF PSORT) have recently been benchmarked by Sprenger et al. (12), who found the computational methods to be useful for sites, such as the nucleus, for which many training examples can be easily obtained from UniProt (which is the source of most or all of the training data for most prediction methods, including WoLF PSORT). The different methods they benchmarked were found to have different strengths. Here, we describe the public server for our WoLF PSORT method.

PREDICTION METHOD

WoLF PSORT is an extension of PSORT II (13,14) and also uses the PSORT (15) localization features for prediction. In addition, WoLF PSORT uses some features from iPSORT (16) and amino acid composition. Those features are used to convert amino acid sequences into numerical vectors, which are then classified with a weighted k-nearest neighbor classifier. WoLF PSORT uses a wrapper method to select and use only the most relevant features. This reduces the amount of information which needs to be considered (and displayed) for the user to interpret individual predictions and may also make the predictor less prone to overlearning. The prediction method has been described in more detail elsewhere (17).

Dataset

The WoLF PSORT dataset is divided into fungi, plant and animal subsets containing 2113, 2333 and 12771 proteins, respectively. The current data was primarily obtained from UniProt (3) version 45, but subcellular localization information from Gene Ontology (4) was also used. Entries with evidence codes {TAS, IDA, IMP} were included, with manual revisions in a few cases. We intend to update these datasets regularly in the future.

LOCALIZATION SITES AND PREDICTION ACCURACY

WoLF PSORT classifies proteins into more than 10 localization sites, including dual localization such as proteins which shuttle between the cytosol and nucleus. Based on our cross-validation studies (17), we estimate sensitivity and specificity of around 70% for: nucleus, mitochondria, cytosol, plasma membrane, extracellular and (in plants) chloroplast. For other sites, such as peroxisome and Golgi, the sensitivity is very low, but useful predictions are still made in some cases. For example, the Arabidopsis seed protein 12S1_ARATH is reasonably predicted to localize to the vacuole even though only one of its neighbors (see below) shares significant sequence similarity.

An independent test (12) on mouse protein
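
The two-step scheme described under PREDICTION METHOD (convert a sequence into a numerical vector, then classify it by its nearest neighbors of known localization) can be illustrated with a minimal Python sketch. It is not the WoLF PSORT implementation: it uses amino acid composition as the only feature, it interprets "weighted" as per-feature weights in a city-block distance, and the function names, the unit weights and the four toy training sequences with their site labels are all invented for illustration. The actual features, weights and distance are described in (17).

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the fraction of each standard amino acid in seq (20 numbers)."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS) or 1  # avoid division by zero
    return [counts[a] / total for a in AMINO_ACIDS]


def weighted_distance(x, y, weights):
    """Feature-weighted city-block distance (an assumed choice of metric)."""
    return sum(w * abs(a - b) for a, b, w in zip(x, y, weights))


def knn_predict(query, training, weights, k=3):
    """Vote over the k training proteins nearest to the query vector.

    training is a list of (name, feature_vector, site) tuples.
    Returns the sites ranked by vote count and the neighbors used,
    so that the evidence behind the prediction can be inspected.
    """
    nearest = sorted(
        training, key=lambda t: weighted_distance(query, t[1], weights)
    )[:k]
    votes = Counter(site for _, _, site in nearest)
    return votes.most_common(), [(name, site) for name, _, site in nearest]


# Toy training set: made-up sequences and labels, NOT real annotated proteins.
toy_proteins = [
    ("toy_nuc_1", "KRKRKKRRPKKAKR", "nucleus"),
    ("toy_nuc_2", "RKKRRKPKKRKA", "nucleus"),
    ("toy_pm_1", "LLLAAVVLLIIFFLLAV", "plasma membrane"),
    ("toy_pm_2", "IVLLFAVLLIAVLFLL", "plasma membrane"),
]
training = [(name, composition(seq), site) for name, seq, site in toy_proteins]

# Placeholder weights: every feature counts equally. A wrapper-style feature
# selection could be mimicked by setting the weight of a dropped feature to 0.
weights = [1.0] * len(AMINO_ACIDS)

ranked, neighbors = knn_predict(composition("KKRRKPRKAK"), training, weights)
print(ranked)
print(neighbors)
```

Returning the neighbor list alongside the ranked sites mirrors the idea of showing the user which proteins of known localization are most similar to the query, rather than reporting a bare label.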