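How to install the requests library (module) in Python

Requests is a very practical HTTP client library for Python, widely used in web crawlers and for testing server responses. This article covers how to install and use the requests library in Python; refer to it as needed.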

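Preface

Requests is a very practical HTTP client for Python that covers what most web crawlers need today.

1. Introduction to Requests

The standard library's urllib offers similar functionality, but requests has more features and is more convenient to use.

2. Installing the requests library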

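Install with pip (method 1)

Windows: pip install requests
macOS: pip3 install requests
Linux: sudo pip install requests

Install from source (method 2)

Download the requests source package from http://mirrors.aliyun.com/pypi/simple/requests/, extract it into the Python installation directory, open a command line in the extracted folder, and run python setup.py install.

Test the installation

Run import requests in Python; if no error is raised, the installation succeeded.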
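The import check above can also be scripted. The following is a minimal sketch; the helper name `requests_is_installed` is an assumption made for illustration and is not part of requests.

```python
import importlib.util


def requests_is_installed() -> bool:
    """Return True if the import system can locate the requests package."""
    return importlib.util.find_spec("requests") is not None


if requests_is_installed():
    import requests

    # __version__ holds the installed release string
    print("requests version:", requests.__version__)
else:
    print("requests is not installed; run: pip install requests")
```

Using `find_spec` avoids an ImportError when the package is missing, so the script can print an installation hint instead of a traceback.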

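3. Common methods of the requests library

| No. | Method | Description |
|-----|--------|-------------|
| 1 | requests.request(method, url) | Builds and sends a request; the methods below are built on it |
| 2 | requests.get() | Sends a GET request |
| 3 | requests.post() | Sends a POST request |
| 4 | requests.head() | Sends a HEAD request (retrieves only the response headers) |
| 5 | requests.put() | Sends a PUT request |
| 6 | requests.patch() | Sends a PATCH request (partial modification) |
| 7 | requests.delete() | Sends a DELETE request |

The most commonly used methods are get() and post(), which send GET and POST requests respectively.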
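Each helper listed above is shorthand for requests.request() with the matching HTTP method, so requests.get(url) behaves like requests.request("GET", url). As a rough illustration that sends nothing over the network, the sketch below builds one prepared request per method. The `requests.Request(...).prepare()` calls and the httpbin.org URL are choices made for this example, not steps from the original walkthrough.

```python
import requests

URL = "http://httpbin.org/get"  # placeholder URL; nothing is sent in this sketch

# Build (but do not send) one request per HTTP method.
prepared = {
    name: requests.Request(name, URL).prepare()
    for name in ("GET", "POST", "HEAD", "PUT", "PATCH", "DELETE")
}

for name, req in prepared.items():
    print(name, "->", req.method, req.url)
```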

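4. Common attributes of the response object

| No. | Attribute or method | Description |
|-----|---------------------|-------------|
| 1 | response.status_code | HTTP status code of the response |
| 2 | response.content | Response body as binary data (bytes) |
| 3 | response.text | Response body decoded as a string |
| 4 | response.encoding | Encoding used to decode response.text (can be set) |
| 5 | response.cookies | Cookies returned with the response |
| 6 | response.url | URL of the request |
| 7 | response.json() | Built-in JSON decoder |
| 8 | response.headers | Server response headers stored in a dictionary-like object; keys are case-insensitive |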
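The case-insensitive behaviour of response.headers can be tried without making a request. In requests the headers object is a `CaseInsensitiveDict`. The sketch below builds one by hand, so the header values are sample data rather than a real server response.

```python
from requests.structures import CaseInsensitiveDict

# Stand-in for resp.headers; the values are sample data, not a real response.
headers = CaseInsensitiveDict({"Content-Type": "text/html", "Server": "bfe/1.0.8.18"})

print(headers["content-type"])          # same entry as headers["Content-Type"]
print("SERVER" in headers)              # membership test ignores case too
print(headers.get("x-missing", "n/a"))  # .get() works as on a normal dict
```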

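5. Sending GET requests with requests

GET request without parameters
- Example: fetching the Baidu home page
GET request with parameters
- Example: Baidu Tieba
Retrieving JSON data
- Example: Baidu image search
Retrieving binary data
- Example: downloading the Baidu logo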

5.1 GET request without parameters

    # GET request without parameters
    import requests

    url = "http://www.baidu.com"
    resp = requests.get(url)
    # Set the encoding used to decode the response body
    resp.encoding = "utf-8"
    cookie = resp.cookies  # cookies returned with the response
    headers = resp.headers
    print("Status code:", resp.status_code)
    print("Cookies:", cookie)
    print("Requested URL:", resp.url)
    print("Response headers:", headers)
    print("Response body:", resp.text)
    ----------------------------------Output----------------------------------
    Status code: 200
    Cookies: <RequestsCookieJar[<Cookie BDORZ=27315 for .baidu.com/>]>
    Requested URL: http://www.baidu.com/
    Response headers: {'Cache-Control': 'private, no-cache, no-store, proxy-revalidate, no-transform', 'Connection': 'keep-alive', 'Content-Encoding': 'gzip', 'Content-Type': 'text/html', 'Date': 'Fri, 23 Apr 2021 00:10:35 GMT', 'Last-Modified': 'Mon, 23 Jan 2017 13:28:16 GMT', 'Pragma': 'no-cache', 'Server': 'bfe/1.0.8.18', 'Set-Cookie': 'BDORZ=27315; max-age=86400; domain=.baidu.com; path=/', 'Transfer-Encoding': 'chunked'}
    Response body: <!DOCTYPE html>
    <!--STATUS OK--><html> <head><meta http-equiv=content-type.........

5.2 GET request with parameters

5.2.1 Query parameters: params

params is a dictionary.
Purpose: automatically encodes the query parameters and appends them to the URL.
Usage: resp = requests.get(url=baseurl, params=params, headers=headers)

    # GET request with parameters
    import requests

    url = "https://tieba.baidu.com/f?"
    params = {"kw": "大学吧", "pn": "3"}
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64)"}
    # Send the request
    html = requests.get(url=url, params=params, headers=headers).text
    print(html)
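To see what the automatic encoding produces, the request can be prepared without being sent. This is a minimal sketch; using `requests.Request(...).prepare()` for inspection is an addition for illustration and is not part of the original example.

```python
import requests

url = "https://tieba.baidu.com/f"
params = {"kw": "大学吧", "pn": 3}

# Prepare the request without sending it, to inspect the final URL.
prepared = requests.Request("GET", url, params=params).prepare()
print(prepared.url)
```

The non-ASCII value is percent-encoded as UTF-8 and the pairs are joined with `&`, which is the step that would otherwise have to be done by hand.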

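5.2.2 SSL certificate verification parameter: verify

Values: True (default) or False
Applies to: https sites whose certificate was not issued by a recognized certificate authority
When to use: consider this parameter when the program raises an SSLError
Usage: requests.get(url=url, headers=headers, verify=False)
When verify is set to False, requests no longer verifies the site's SSL certificate, so use it only for sites you trust.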
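One way to apply the rule above is to try the verified request first and fall back to verify=False only when an SSLError is raised. The sketch below assumes that behaviour is acceptable for the site in question; the function name `get_allowing_unverified` and the optional `session` argument are hypothetical, introduced only for this example.

```python
import requests


def get_allowing_unverified(url, session=None, timeout=5):
    """Try a normal verified GET; retry without verification on SSLError."""
    session = session or requests.Session()
    try:
        return session.get(url, timeout=timeout)
    except requests.exceptions.SSLError:
        # Certificate could not be verified: retry with verification disabled.
        return session.get(url, timeout=timeout, verify=False)
```

When verification is disabled, urllib3 emits an InsecureRequestWarning for each request, which is a useful reminder that the connection is not being authenticated.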

5.2.3 Setting a timeout: timeout

The timeout parameter sets how long to wait for a response; if no response arrives within that time, an exception is raised.

    import requests

    requests.get("http://github.com", timeout=0.001)
    ---------------------Output (error)---------------------
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    requests.exceptions.Timeout: HTTPConnectionPool(host='github.com', port=80): Request timed out. (timeout=0.001)
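In a crawler the timeout is usually caught rather than left to crash the program. The following is a minimal sketch of that pattern; the function name `fetch_status`, the optional `session` argument, and the default timeout values are assumptions for illustration.

```python
import requests


def fetch_status(url, session=None, timeout=(3.05, 10)):
    """Return the HTTP status code, or None if the request timed out.

    timeout may be a single number or a (connect, read) tuple, in seconds.
    """
    session = session or requests.Session()
    try:
        return session.get(url, timeout=timeout).status_code
    except requests.exceptions.Timeout:
        # Covers both connect timeouts and read timeouts.
        return None
```

A call such as `fetch_status("http://github.com", timeout=0.001)` would be expected to return None instead of raising the exception shown above.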

5.2.4 Proxy IP parameter: proxies

5.2.4.1 Free proxy IPs

Syntax: proxies = {'protocol': 'protocol://IP:port'}
Example:
- When the target URL uses http, the http entry in proxies is used; when it uses https, the https entry is used.

    import requests

    url = "http://httpbin.org/get"
    headers = {"User-Agent": "Mozilla/5.0"}
    # Define the proxies; free proxy IPs can be found on proxy-listing websites
    proxies = {
        "http": "http://112.85.164.220:9999",
        "https": "https://112.85.164.220:9999",
    }
    html = requests.get(url=url, proxies=proxies, headers=headers, timeout=5).text
    print(html)

5.2.4.2 Private and dedicated proxies

Syntax: proxies = {'protocol': 'protocol://username:password@IP:port'}
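A minimal sketch of building such a dictionary follows. The helper name `build_auth_proxies` and all credential, host, and port values are placeholders invented for this example; substitute the details from your proxy provider.

```python
from urllib.parse import quote


def build_auth_proxies(username, password, host, port):
    """Build a proxies dict in the 'protocol://username:password@IP:port' format."""
    # Percent-encode credentials so characters such as '@' or ':' stay valid in the URL.
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return {
        scheme: f"{scheme}://{user}:{pwd}@{host}:{port}"
        for scheme in ("http", "https")
    }


# Placeholder values only.
proxies = build_auth_proxies("user", "p@ss", "127.0.0.1", 8888)
print(proxies["http"])
```

The resulting dictionary is passed the same way as in the free-proxy example, for instance `requests.get(url, proxies=proxies, timeout=5)`. Some providers expect an `http://` proxy URL for both keys, so check their documentation.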

5.3 Retrieving JSON data

    # Retrieve JSON data
    # Example: fetching Hayao Miyazaki anime images from Baidu
    # Scrolling the page loads more entries in the F12 network panel while the URL stays the same, so the page is dynamic
    # Pick one of the XHR entries, copy its Request URL, and assign it to url
    import requests

    url = "https://image.baidu.com/search/acjson?tn=resultjson_com&logid=10167214135414424439&ipn=rj&ct=201326592&is=&fp=result&queryWord=%E5%AE%AB%E5%B4%8E%E9%AA%8F%E5%8A%A8%E6%BC%AB%E5%9B%BE%E7%89%87&cl=2&lm=-1&ie=utf-8&oe=utf-8&adpicid=&st=&z=&ic=&hd=&latest=&copyright=&word=%E5%AE%AB%E5%B4%8E%E9%AA%8F%E5%8A%A8%E6%BC%AB%E5%9B%BE%E7%89%87&s=&se=&tab=&width=&height=&face=&istype=&qc=&nc=&fr=&expermode=&force=&pn=30&rn=30&gsm=1e&1619134335166="
    resp = requests.get(url)
    json_data = resp.json()
    print(json_data)
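resp.json() raises an exception when the body is not valid JSON, for example when the server returns an HTML error page instead. A small sketch of guarding against that is shown here; the helper name `json_or_none` is hypothetical.

```python
def json_or_none(resp):
    """Return the decoded JSON body, or None if the body is not valid JSON."""
    try:
        return resp.json()
    except ValueError:
        # The decoding error raised by resp.json() is a ValueError subclass.
        return None
```

It could be used as `json_data = json_or_none(requests.get(url, timeout=5))`, followed by a check for None before reading any keys.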

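5.4 Retrieving binary data

For non-text responses, the response body can be accessed as bytes.

    # Retrieve binary data
    # Example: saving the Baidu logo
    import requests

    url = "https://www.baidu.com/img/bd_logo1.png"
    resp = requests.get(url)
    # Save to disk
    with open("logo.png", "wb") as file:
        # resp.content: the response body as binary data
        file.write(resp.content)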
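resp.content loads the whole body into memory, which is fine for a logo but less so for large files. As a hedged alternative, the sketch below streams the body in chunks with `stream=True` and `iter_content`. The function name `download_file`, its parameters, and the chunk size are assumptions for illustration.

```python
import requests


def download_file(url, path, session=None, chunk_size=8192, timeout=10):
    """Stream the response body to a file and return the number of bytes written."""
    session = session or requests.Session()
    written = 0
    with session.get(url, stream=True, timeout=timeout) as resp:
        resp.raise_for_status()  # raise on 4xx/5xx instead of saving an error page
        with open(path, "wb") as fh:
            for chunk in resp.iter_content(chunk_size=chunk_size):
                written += fh.write(chunk)
    return written
```

For the example above this would be called as `download_file("https://www.baidu.com/img/bd_logo1.png", "logo.png")`.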

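6. Sending POST requests with requests

Syntax
- requests.post(url, data=None, json=None)
Parameters
- url: the URL to send the request to
- data: the request data (sent form-encoded)
- json: data to send as JSON
Example: logging in to 小说楼 (xslou.com)
- https://www.xslou.com/login.php

    import requests

    url = "https://www.xslou.com/login.php"
    # Placeholder credentials; substitute your own account
    data = {"username": "your_username", "password": "your_password", "action": "login"}
    resp = requests.post(url, data=data)
    resp.encoding = "gb2312"
    print("Status code:", resp.status_code)  # 200
    print("Response body:", resp.text)  # <html>......</html>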
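The difference between the data and json parameters is easiest to see by preparing both requests without sending them. This is a minimal sketch; the payload values are placeholders, and `requests.Request(...).prepare()` is used here only for inspection.

```python
import json

import requests

url = "https://www.xslou.com/login.php"
payload = {"username": "your_username", "action": "login"}  # placeholder values

# Prepare both variants without sending them.
form_req = requests.Request("POST", url, data=payload).prepare()
json_req = requests.Request("POST", url, json=payload).prepare()

# data= produces a form-encoded body; json= produces a JSON document.
print(form_req.headers["Content-Type"], form_req.body)
print(json_req.headers["Content-Type"], json.loads(json_req.body))
```

A traditional HTML login form such as the one in this example normally expects the form-encoded variant, which is why the original code passes data.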

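7. Sending requests with a requests session

    import requests

    url = "https://www.xslou.com/login.php"
    # Placeholder credentials; substitute your own account
    data = {"username": "your_username", "password": "your_password", "action": "login"}
    # Send the request through a session so cookies persist across requests
    session = requests.Session()
    resp = session.post(url, data=data)  # POST request sent through the session
    resp.encoding = "gb2312"
    # print(resp.text)  # <html>..<title>登录成功</title>....</html> (the title means "login successful")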
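The point of a session is that headers and cookies set once are attached to every later request. The sketch below illustrates this offline by preparing a follow-up request through the session. The cookie name and value are placeholders standing in for what the server would set after a successful login.

```python
import requests

session = requests.Session()
session.headers.update({"User-Agent": "Mozilla/5.0"})
# Placeholder cookie; after a real login the server's Set-Cookie header fills this in.
session.cookies.set("sessionid", "abc123")

# Prepare (but do not send) a follow-up request through the session.
req = requests.Request("GET", "https://www.xslou.com/")
prepared = session.prepare_request(req)

print(prepared.headers["User-Agent"])
print(prepared.headers.get("Cookie"))
```

Calling `session.get(...)` after the login POST therefore sends the login cookie automatically, which plain `requests.get(...)` calls would not do.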