arXiv:cond-mat/0309662v1 [cond-mat.soft] 29 Sep 2003

Design of a Protein Potential Energy Landscape by Parameter Optimization

Julian Lee¹,²,³, Seung-Yeon Kim³, and Jooyoung Lee³∗

¹Department of Bioinformatics and Life Sciences, Soongsil University, Seoul 156-743, Korea
²Bioinformatics and Molecular Design Technology Innovation Center, Soongsil University, Seoul 156-743, Korea
³School of Computational Sciences, Korea Institute for Advanced Study, Seoul 130-722, Korea

∗Corresponding author: jlee@kias.re.kr

We propose an automated protocol for designing the energy landscape of a protein energy function by optimizing its parameters. The parameters are optimized so that not only does the global minimum-energy conformation become native-like, but the conformations distinct from the native structure also have higher energies than those close to the native one. We classify low-energy conformations into three groups: super-native, native-like, and non-native. The super-native conformations have all backbone dihedral angles fixed to their native values, and only their side chains are minimized with respect to energy. The native-like and non-native conformations, on the other hand, all correspond to local minima of the energy function. These conformations are ranked according to the root-mean-square deviation (RMSD) of their backbone coordinates from the native structure; a fixed number of conformations with the lowest RMSD values are defined to be native-like, whereas the rest are defined as non-native. We define two energy gaps, $E_{\mathrm{gap}}^{(1)}$ and $E_{\mathrm{gap}}^{(2)}$. The energy gap $E_{\mathrm{gap}}^{(1)}$ ($E_{\mathrm{gap}}^{(2)}$) is the energy difference between the lowest energy of the non-native conformations and the highest energy of the native-like (super-native) ones. The parameters are modified to decrease both $E_{\mathrm{gap}}^{(1)}$ and $E_{\mathrm{gap}}^{(2)}$. In addition, the non-native conformations with larger RMSD values are made to have higher energies relative to those with smaller RMSD values. We successfully apply our protocol to the parameter optimization of the UNRES potential energy, using a training set of betanova, 1fsd, the 36-residue subdomain of chicken villin headpiece (PDB ID 1vii), and the 10-55 residue fragment of staphylococcal protein A (PDB ID 1bdd). The new parameter optimization protocol performs better than earlier methods, in which only the difference between the lowest energies of native-like and non-native conformations was adjusted, without considering the various degrees of native-likeness of the conformations. We also perform jackknife tests on other proteins not included in the training set and obtain promising results. The results suggest that the parameters we obtained using the training set of four proteins are transferable to other proteins to some extent.

I. INTRODUCTION

The prediction of the three-dimensional structure and the folding pathway of a protein solely from its amino acid sequence is one of the most challenging problems in biophysical chemistry. There are two major approaches to protein structure prediction, the so-called knowledge-based methods and energy-based methods. The knowledge-based methods,¹⁻⁴ which include comparative modeling and fold recognition, use statistical relationships between sequences and their three-dimensional structures in the Protein Data Bank (PDB), without a deep understanding of the interactions governing protein folding. Therefore, although these methods can be very powerful for predicting the structure of a protein sequence that has a certain degree of similarity to those in the PDB, they cannot provide a fundamental understanding of the protein folding mechanism. On the other hand, the energy-based methods,⁵⁻¹¹ which are also called physics-based methods, are based on the thermodynamic hypothesis that proteins adopt native structures that minimize their free energies.¹² Understanding the fundamental principles of protein folding by these methods will lead not only to successful structure prediction, especially for proteins having no similar sequences in the PDB, but also to the clarification of the protein folding mechanism. However, there have been several major obstacles to the successful application of energy-based methods to the protein folding problem. First, there are inherent inaccuracies in the potential energy functions which describe the energetics of proteins. Second, even if the global minimum-energy conformation is native-like, this does not guarantee that a protein will fold into its native structure on a reasonable time scale unless the energy landscape is properly designed, as summarized in the Levinthal paradox.
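The classification by RMSD and the two energy gaps described in the abstract can be illustrated with a minimal sketch. Everything named here (`Conformation`, `split_by_rmsd`, `energy_gaps`, the parameter `n_native_like`, and the numbers in the example) is hypothetical and not taken from the paper. The sign convention is also an assumption: each gap is computed as the highest native-like (or super-native) energy minus the lowest non-native energy, chosen so that decreasing a gap pushes the non-native conformations above the native-like ones, in line with the stated goal.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Conformation:
    """A local minimum of the energy function (hypothetical container)."""
    energy: float  # energy in arbitrary units
    rmsd: float    # backbone RMSD from the native structure


def split_by_rmsd(
    minima: Sequence[Conformation], n_native_like: int
) -> Tuple[List[Conformation], List[Conformation]]:
    """Rank local minima by RMSD; the n_native_like lowest-RMSD ones are
    labeled native-like, the rest non-native."""
    ranked = sorted(minima, key=lambda c: c.rmsd)
    return ranked[:n_native_like], ranked[n_native_like:]


def energy_gaps(
    minima: Sequence[Conformation],
    super_native_energies: Sequence[float],
    n_native_like: int,
) -> Tuple[float, float]:
    """Return (gap1, gap2).

    Assumed sign convention (not confirmed by the source text):
      gap1 = max E(native-like)  - min E(non-native)
      gap2 = max E(super-native) - min E(non-native)
    so a more negative gap means a better-separated landscape.
    """
    native_like, non_native = split_by_rmsd(minima, n_native_like)
    if not native_like or not non_native or not super_native_energies:
        raise ValueError("each conformation group must be non-empty")
    lowest_non_native = min(c.energy for c in non_native)
    gap1 = max(c.energy for c in native_like) - lowest_non_native
    gap2 = max(super_native_energies) - lowest_non_native
    return gap1, gap2


# Illustrative numbers only; they are not results from the paper.
minima = [
    Conformation(energy=-10.0, rmsd=1.5),
    Conformation(energy=-9.0, rmsd=6.0),
    Conformation(energy=-8.0, rmsd=2.0),
    Conformation(energy=-5.0, rmsd=8.0),
]
print(energy_gaps(minima, super_native_energies=[-11.0, -10.5], n_native_like=2))
```

In this toy example the first gap is positive, because one non-native minimum lies below the highest native-like one. Under the assumed convention, that is the situation a parameter update would aim to correct.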
Physics-based potentials are generally parameterized from quantum mechanical calculations and experimental data on model systems.¹³ However, such calculations and data do not determine the parameters with perfect accuracy. The residual errors in potential energy functions may have significant effects on simulations of macromolecules such as proteins, where the total energy is the sum of a large number of interaction terms. Moreover, these terms are known to cancel each other to a high degree, making their systematic errors even more significant. Thus it is crucial to refine the parameters of a potential energy function before it can be successfully applied to the protein folding problem.

An iterative procedure which systematically refines the parameters of a given potential energy function was recently proposed¹³ and successfully applied to the parameter optimization¹³⁻¹⁶ of a UNRES potential energy.¹⁷⁻¹⁹ The method exploits the high efficiency of the conformational space annealing (CSA) method²⁰⁻²⁴ in finding distinct low-energy conformations. For a given set of proteins, whose low-lying local minimum-energy conformations for a given energy function are found by the CSA m