20.6. urllib2 - extensible library for opening URLs

Note: The urllib2 module has been split across several modules in Python 3 named urllib.request and urllib.error. The 2to3 tool will automatically adapt imports when converting your sources to Python 3.

The urllib2 module defines functions and classes which help in opening URLs (mostly HTTP) in a complex world: basic and digest authentication, redirections, cookies and more.

The urllib2 module defines the following functions:

urllib2.urlopen(url[, data][, timeout])

    Open the URL url, which can be either a string or a Request object.

    Warning: HTTPS requests do not do any verification of the server's certificate.

    data may be a string specifying additional data to send to the server, or None if no such data is needed. Currently HTTP requests are the only ones that use data; the HTTP request will be a POST instead of a GET when the data parameter is provided. data should be a buffer in the standard application/x-www-form-urlencoded format. The urllib.urlencode() function takes a mapping or sequence of 2-tuples and returns a string in this format. The urllib2 module sends HTTP/1.1 requests with the Connection: close header included.

    The optional timeout parameter specifies a timeout in seconds for blocking operations like the connection attempt (if not specified, the global default timeout setting will be used). This actually only works for HTTP, HTTPS and FTP connections.

    This function returns a file-like object with three additional methods:

    - geturl(): return the URL of the resource retrieved, commonly used to determine if a redirect was followed
    - info(): return the meta-information of the page, such as headers, in the form of a mimetools.Message instance (see Quick Reference to HTTP Headers)
    - getcode(): return the HTTP status code of the response

    Raises URLError on errors.

    Note that None may be returned if no handler handles the request (though the default installed global OpenerDirector uses UnknownHandler to ensure this never happens).

    In addition, the default installed ProxyHandler makes sure the requests are handled through the proxy when they are set.

    Changed in version 2.6: timeout was added.

urllib2.install_opener(opener)

    Install an OpenerDirector instance as the default global opener. Installing an opener is only necessary if you want urlopen to use that opener; otherwise, simply call OpenerDirector.open() instead of urlopen(). The code does not check for a real OpenerDirector, and any class with the appropriate interface will work.

urllib2.build_opener([handler, ...])

    Return an OpenerDirector instance, which chains the handlers in the order given. handlers can be either instances of BaseHandler, or subclasses of BaseHandler (in which case it must be possible to call the constructor without any parameters). Instances of the following classes will be in front of the handlers, unless the handlers contain them, instances of them or subclasses of them: ProxyHandler, UnknownHandler, HTTPHandler, HTTPDefaultErrorHandler, HTTPRedirectHandler, FTPHandler, FileHandler, HTTPErrorProcessor.

    If the Python installation has SSL support (i.e., if the ssl module can be imported), HTTPSHandler will also be added.

    Beginning in Python 2.3, a BaseHandler subclass may also change its handler_order attribute to modify its position in the handlers list.

The following exceptions are raised as appropriate:

exception urllib2.URLError

    The handlers raise this exception (or derived exceptions) when they run into a problem. It is a subclass of IOError.

    reason
        The reason for this error. It can be a message string or another exception instance (socket.error for remote URLs, OSError for local URLs).

exception urllib2.HTTPError

    Though being an exception (a subclass of URLError), an HTTPError can also function as a non-exceptional file-like return value (the same thing that urlopen() returns). This is useful when handling exotic HTTP errors, such as requests for authentication.

    code
        An HTTP status code as defined in RFC 2616. This numeric value corresponds to a value found in the dictionary of codes as found in BaseHTTPServer.BaseHTTPRequestHandler.responses.

    reason
        The reason for this error. It can be a message string or another exception instance.

The following classes are provided:

class urllib2.Request(url[, data][, headers][, origin_req_host][, unverifiable])

    This class is an abstraction of a URL request.

    url should be a string containing a valid URL.

    data may be a string specifying additional data to send to the server, or None if no such data is needed. Currently HTTP requests are the only ones that use data; the HTTP request will be a POST instead of a GET when the data parameter is provided. data should be a buffer in the standard application/x-www-form-urlencoded format. The urllib.urlencode() function takes a mapping or sequence of 2-tuples and returns a string in this format.

    headers should be a dictionary, and will be treated as if add_header() was called with each key and value as arguments. This is often used to "spoof" the User-Agent header, which is used by a browser to identify itself; some HTTP servers only allow requests coming from common browsers as opposed to scripts. For example, Mozilla Firefox may identify itself as "Mozilla/5.0 (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11", while urllib2's default user agent string is "Python-urllib/2.6" (on Python 2.6).

    The final two arguments are only of interest for correct handling of third-party HTTP cookies:

    origin_req_host should be the request-host of the origin transaction, as defined by RFC 2965. It defaults to cookielib.request_host(self). This is the host name or IP address of the original request that was initiated by the user. For example, if the request is for an image in an HTML document, this should be the request-host of the request for the page containing the image.

    unverifiable shoul
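
As a rough illustration of how urlopen(), Request, build_opener() and the two exception classes described above fit together, the following sketch builds a form POST with a custom User-Agent header and wraps the call in error handling. The helper names build_form_request and fetch, the www.example.com URL and the form fields are assumptions made for this example and are not part of urllib2. The try/except import is also an addition, so that the same sketch runs on Python 3, where these names live in urllib.request and urllib.parse.

```python
try:
    # Python 2: the module this page documents.
    import urllib2
    from urllib import urlencode
except ImportError:
    # Python 3: the same names are found in urllib.request and urllib.parse.
    import urllib.request as urllib2
    from urllib.parse import urlencode


def build_form_request(url, fields, user_agent=None):
    """Hypothetical helper: return a Request that will be sent as a POST."""
    # urlencode() produces the application/x-www-form-urlencoded string;
    # encoding it to ASCII bytes keeps the body valid on Python 3 as well.
    body = urlencode(fields).encode("ascii")
    headers = {}
    if user_agent is not None:
        headers["User-Agent"] = user_agent
    return urllib2.Request(url, body, headers)


def fetch(request, opener=None, timeout=10):
    """Hypothetical helper: return (status, final_url, body),
    or (None, None, reason) when no HTTP response was obtained."""
    # Without an explicit opener, fall back to the global one behind urlopen().
    open_url = opener.open if opener is not None else urllib2.urlopen
    try:
        response = open_url(request, timeout=timeout)
    except urllib2.HTTPError as err:
        # HTTPError is also a file-like response, so the error body is readable.
        return err.code, err.geturl(), err.read()
    except urllib2.URLError as err:
        return None, None, err.reason
    try:
        return response.getcode(), response.geturl(), response.read()
    finally:
        response.close()


# Placeholder URL and fields; nothing is sent until fetch() is called.
req = build_form_request(
    "http://www.example.com/search",
    [("q", "urllib2"), ("page", "1")],
    user_agent="Mozilla/5.0 (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11",
)
print(req.get_method())  # POST, because a data argument was supplied

# An opener with the default handler chain; pass it to fetch() to bypass
# the global opener, or hand it to urllib2.install_opener() to replace it.
opener = urllib2.build_opener()
```

Calling fetch(req, opener) would perform the actual network request. HTTPError is caught before URLError because it is a subclass of URLError; with the order reversed, the HTTP status code and error body would never be reached.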