Python Quick Start (29): The Python requests Module in Detail

The Python 3 requests module in detail

Python's requests module is a popular library for sending HTTP requests. It is easier to use than the built-in urllib library and offers more: a simple API handles complex HTTP requests, including cookies, sessions, and redirects. This article covers basic usage, sending requests, handling responses, session management, exception handling, and practical examples.

1. Introduction to the requests module

requests is a third-party library for sending HTTP requests. It supports HTTP/1.1 and is well maintained and documented. Install it before use:

```bash
pip install requests
```

2. Basic usage

2.1 Sending a GET request

A GET request is the most basic operation in requests. Use the requests.get method:

```python
import requests

response = requests.get('https://api.github.com')
print(response.status_code)  # print the status code
print(response.text)         # print the response body
```

2.2 Sending a POST request

To send a POST request, use the requests.post method and pass the data along:

```python
import requests

data = {'key': 'value'}
response = requests.post('https://httpbin.org/post', data=data)
print(response.status_code)
print(response.json())  # print the JSON response body
```

3. Handling request parameters

3.1 URL parameters

Use the params argument to pass URL query parameters:

```python
import requests

params = {'q': 'python', 'sort': 'stars'}
response = requests.get('https://api.github.com/search/repositories', params=params)
print(response.url)  # inspect the full request URL
print(response.json())
```

3.2 Form data

Use the data argument to pass form data:

```python
import requests

data = {'username': 'user', 'password': 'pass'}
response = requests.post('https://httpbin.org/post', data=data)
print(response.json())
```

3.3 JSON data

Use the json argument to pass JSON data:

```python
import requests

json_data = {'key': 'value'}
response = requests.post('https://httpbin.org/post', json=json_data)
print(response.json())
```

4. Handling responses

4.1 Response status code

Use the status_code attribute to get the status code of a response:

```python
import requests

response = requests.get('https://api.github.com')
print(response.status_code)
```

4.2 Response content

Use the text attribute to get the response body as text, and the json() method to parse a JSON body:

```python
import requests

response = requests.get('https://api.github.com')
print(response.text)
print(response.json())
```

4.3 Response headers

Use the headers attribute to get the response headers:

```python
import requests

response = requests.get('https://api.github.com')
print(response.headers)
```

5. Handling request headers

Use the headers argument to set request headers:

```python
import requests

headers = {'User-Agent': 'my-app/0.0.1'}
response = requests.get('https://api.github.com', headers=headers)
print(response.status_code)
```
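To see what the params, data, json, and headers arguments from sections 3 and 5 actually put on the wire, it can help to build a request without sending it. The following is a minimal sketch using requests.Request and its prepare() method; the helper name prepare is introduced here for illustration only and is not part of the original article.

```python
import requests


def prepare(method, url, **kwargs):
    """Build a PreparedRequest without sending anything over the network.

    `prepare` is a hypothetical helper for this sketch; it only wraps
    requests.Request(...).prepare().
    """
    return requests.Request(method, url, **kwargs).prepare()


# URL parameters are encoded into the query string; headers are stored as given.
get_req = prepare(
    "GET",
    "https://api.github.com/search/repositories",
    params={"q": "python", "sort": "stars"},
    headers={"User-Agent": "my-app/0.0.1"},
)

# data= produces a URL-encoded form body; json= produces a JSON body.
form_req = prepare("POST", "https://httpbin.org/post",
                   data={"username": "user", "password": "pass"})
json_req = prepare("POST", "https://httpbin.org/post", json={"key": "value"})

print(get_req.url)
print(get_req.headers["User-Agent"])
print(form_req.headers["Content-Type"], form_req.body)
print(json_req.headers["Content-Type"], json_req.body)
```

Because nothing is sent, this runs offline and makes the difference between data and json visible: the two arguments set different Content-Type headers and encode the body differently.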
6. Session management

requests provides a session object that keeps certain parameters across multiple requests.

6.1 Creating a session

Use requests.Session to create a session object:

```python
import requests

session = requests.Session()
response = session.get('https://httpbin.org/cookies/set/sessioncookie/123456789')
print(response.text)
```

6.2 Sharing session parameters

A session object shares cookies, headers, and other parameters across its requests:

```python
import requests

session = requests.Session()
session.headers.update({'User-Agent': 'my-app/0.0.1'})

response = session.get('https://httpbin.org/headers')
print(response.json())
```

7. File upload

Use the files argument to upload a file:

```python
import requests

# Open the file in binary mode and close it once the upload finishes
with open('report.txt', 'rb') as f:
    files = {'file': f}
    response = requests.post('https://httpbin.org/post', files=files)

print(response.json())
```

8. File download

To download a file, call requests.get and write the response content to a file:

```python
import requests

url = 'https://httpbin.org/image/png'
response = requests.get(url)

with open('image.png', 'wb') as file:
    file.write(response.content)

print('File downloaded successfully')
```

9. Timeout settings

Use the timeout argument to set a request timeout in seconds:

```python
import requests

try:
    response = requests.get('https://httpbin.org/delay/10', timeout=5)
except requests.exceptions.Timeout:
    print('The request timed out')
```

10. Redirects and history

requests follows redirects automatically. Use the history attribute to inspect the redirect chain:

```python
import requests

response = requests.get('http://github.com')
print(response.url)
print(response.history)
```

11. Exception handling

requests defines a set of exception classes for errors that can occur while making a request:

```python
import requests

try:
    response = requests.get('https://httpbin.org/status/404')
    response.raise_for_status()  # raises HTTPError for 4xx and 5xx status codes
except requests.exceptions.HTTPError as err:
    print('HTTP error occurred:', err)
except requests.exceptions.ConnectionError as err:
    print('Connection error occurred:', err)
except requests.exceptions.Timeout as err:
    print('Timeout error occurred:', err)
except requests.exceptions.RequestException as err:
    print('An error occurred:', err)
```

12. Authentication

12.1 Basic authentication

Use the auth argument for HTTP basic authentication:

```python
import requests
from requests.auth import HTTPBasicAuth

response = requests.get('https://httpbin.org/basic-auth/user/pass', auth=HTTPBasicAuth('user', 'pass'))
print(response.json())
```

12.2 Other authentication methods

requests can also be used with other schemes such as OAuth and JWT, typically through third-party libraries such as requests-oauthlib.

13. Advanced usage

13.1 Proxies

Use the proxies argument to configure proxies:

```python
import requests

proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
}

response = requests.get('https://httpbin.org/get', proxies=proxies)
print(response.json())
```

13.2 Streaming requests

Use the stream argument to read a response incrementally, which suits large downloads:

```python
import requests

url = 'https://httpbin.org/stream/20'
response = requests.get(url, stream=True)

for line in response.iter_lines():
    if line:  # skip empty keep-alive lines
        print(line)
```

13.3 Custom adapters

A custom HTTP adapter controls the behavior of the underlying HTTP connection:

```python
import requests
from requests.adapters import HTTPAdapter

class MyAdapter(HTTPAdapter):
    def send(self, request, **kwargs):
        print('Custom send method')
        return super().send(request, **kwargs)

session = requests.Session()
session.mount('https://', MyAdapter())

response = session.get('https://httpbin.org/get')
print(response.status_code)
```

14. Practical examples

14.1 Calling an API

The following example calls the GitHub API with requests:

```python
import requests

url = 'https://api.github.com/repos/psf/requests'
response = requests.get(url)
data = response.json()

print(f"Repository: {data['name']}")
print(f"Description: {data['description']}")
print(f"Stars: {data['stargazers_count']}")
print(f"Forks: {data['forks_count']}")
```

14.2 Web scraping

The following example fetches a web page with requests and parses it with BeautifulSoup:

```python
import requests
from bs4 import BeautifulSoup

url = 'https://www.python.org'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

print('Page title:', soup.title.string)
```

14.3 Automated form submission

The following example submits a form automatically. The HTML form at https://httpbin.org/forms/post sends its fields to https://httpbin.org/post, so the request targets that URL:

```python
import requests

url = 'https://httpbin.org/post'
data = {'name': 'John', 'email': 'john@example.com'}
response = requests.post(url, data=data)

print(response.json())
```
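The timeout setting (section 9), exception handling (section 11), and adapter mounting (section 13.3) are often combined in practice. The following is a minimal sketch of one way to do that, assuming urllib3's Retry class is passed to HTTPAdapter through its max_retries argument. The helper names make_retrying_session and fetch_json, the retry count, the backoff factor, and the list of status codes are illustrative choices, not values from the original article.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_retrying_session(total_retries=3, backoff_factor=0.5):
    """Return a Session whose adapters retry transient server errors.

    Hypothetical helper: the defaults here are assumptions for illustration.
    """
    retry = Retry(
        total=total_retries,
        backoff_factor=backoff_factor,
        status_forcelist=[500, 502, 503, 504],  # retry on these status codes
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session


def fetch_json(session, url, timeout=5):
    """Return the parsed JSON body, or None if the request or parsing fails."""
    try:
        response = session.get(url, timeout=timeout)
        response.raise_for_status()  # raises HTTPError for 4xx and 5xx
        return response.json()
    except (requests.exceptions.RequestException, ValueError) as err:
        # ValueError covers JSON decoding errors on older requests versions
        print("Request failed:", err)
        return None


session = make_retrying_session()
```

A call such as fetch_json(session, 'https://httpbin.org/get') would then apply the retry policy, the timeout, and the error handling in one place. Mounting the adapter on the session keeps the retry policy out of individual call sites.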