[Python crawler] Spoofing the UA string

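There is only one rule for writing a good crawler:
make your crawling behavior match the real behavior of a user visiting the site as closely as possible.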

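1. Spoof the UA (User-Agent) string, using a randomly generated UA for every request.
To keep things simple, random UA generation is handled by the third-party library fake-useragent: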

pip install fake-useragent

2. Generating a UA string takes only the following code:

Core code:

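from fake_useragent import UserAgent

ua = UserAgent()   # load the browser UA data once and reuse the object
print(ua.random)   # each access returns a randomly chosen UA string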

In an IPython session:

In [1]: from fake_useragent import UserAgent

In [2]: ua=UserAgent()
No handlers could be found for logger "root"

In [3]: ua.random
Out[3]: u'Mozilla/5.0 (Windows; U; Windows NT 6.0; tr-TR) AppleWebKit/533.18.1 (KHTML, like Gecko) Version/5.0.2 Safari/533.18.5'

In [4]: ua.random
Out[4]: u'Mozilla/5.0 (Windows NT 6.2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1467.0 Safari/537.36'

In [5]: ua.random
Out[5]: u'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2225.0 Safari/537.36'

In [6]: ua.random
Out[6]: u'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2227.1 Safari/537.36'

In [7]: ua.random
Out[7]: u'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.93 Safari/537.36'

In [8]: ua.random
Out[8]: u'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1468.0 Safari/537.36'

In [9]:
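
The session above only prints UA strings. To use one for every request, as step 1 describes, it has to be sent in the User-Agent header. Below is a minimal sketch of one way to do that with the standard library's urllib.request. The helper names random_ua and build_request and the FALLBACK_UAS pool are hypothetical and not part of fake-useragent. The fallback is an added assumption for the case where the library is not installed or cannot load its data.

```python
import random
import urllib.request

# Hypothetical fallback pool (taken from the sample output above), used only
# when fake-useragent is unavailable or fails to initialise.
FALLBACK_UAS = [
    "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/41.0.2225.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/41.0.2227.1 Safari/537.36",
]

try:
    from fake_useragent import UserAgent
    _ua = UserAgent()  # create once and reuse for all requests
except Exception:
    # Library missing or its UA data could not be loaded.
    _ua = None


def random_ua():
    """Return a random UA string, falling back to the local pool."""
    if _ua is not None:
        try:
            return _ua.random
        except Exception:
            pass
    return random.choice(FALLBACK_UAS)


def build_request(url):
    """Build a Request that carries a fresh random User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": random_ua()})
```

Calling urllib.request.urlopen(build_request(url)) would then send the request. Because build_request calls random_ua() each time, consecutive requests will usually carry different UA strings.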