Fixing a Python crawler problem: urllib.error.HTTPError: HTTP Error 301: The HTTP server returned a redirect error that


The original method that raised the error:

1) Using request.Request raises the error above, and the HTML cannot be fetched.

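import random
from urllib import request

# ua_list (a list of User-Agent strings) and filename are assumed to be
# defined elsewhere in the crawler; this is a method of the crawler class.
def get_html(self, url):
    print(url)
    req = request.Request(url=url, headers={'User-Agent': random.choice(ua_list)})
    res = request.urlopen(req)  # the HTTP Error 301 is raised here
    # Read from the response (res), not from the Request object (req)
    html = res.read().decode("gbk", 'ignore')
    with open(filename, 'w') as f:
        f.write(html)
    self.parse_html(html)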
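Before switching libraries, it can help to see where the server is redirecting. urllib raises this HTTPError when its redirect handler gives up on a repeated or overly long redirect chain, and the exception still carries the status code and the headers of the last response. The following is a minimal sketch, not part of the original crawler; `describe_http_error` and its default User-Agent are hypothetical names chosen for illustration.

```python
from urllib import error, request


def describe_http_error(url, user_agent="Mozilla/5.0", timeout=10):
    """Return (status, location); location is None when the request succeeds.

    Hypothetical diagnostic helper, not part of the original crawler.
    """
    req = request.Request(url, headers={"User-Agent": user_agent})
    try:
        with request.urlopen(req, timeout=timeout) as res:
            return res.status, None
    except error.HTTPError as exc:
        # HTTPError exposes the status code and the last response headers,
        # so the Location header shows where the server keeps pointing.
        return exc.code, exc.headers.get("Location")
```

If the returned location is the same URL that was requested, the server is redirecting in a loop, which points at something the request is missing (for example a cookie or a header) rather than at a wrong address.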

Solution:

1) Replace urllib.request with the requests library, a third-party package that has to be installed separately.

2) I do not know the exact cause.

import random
import requests

# ua_list (a list of User-Agent strings) is assumed to be defined elsewhere
# in the crawler; this is a method of the crawler class.
def get_html(self, url):
    print(url)
    res = requests.get(url=url, headers={'User-Agent': random.choice(ua_list)})
    res.encoding = 'utf-8'
    # Call the parse function directly
    self.parse_html(res.text)
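The exact cause is left open here. One common trigger, offered as an assumption rather than a diagnosis of this particular site, is a server that sets a cookie in its redirect response and keeps redirecting until that cookie is sent back. urllib.request.urlopen does not store cookies by default, while requests carries cookies across the redirects of a single call. Under that assumption, a standard-library sketch might look like this; `fetch_with_cookies` and its default arguments are hypothetical.

```python
from http.cookiejar import CookieJar
from urllib import request


def fetch_with_cookies(url, user_agent="Mozilla/5.0", encoding="utf-8", timeout=10):
    """Fetch url with an opener that remembers cookies across redirects.

    Hypothetical helper; it only helps if the redirect loop is cookie-related.
    """
    # HTTPCookieProcessor stores Set-Cookie values from every response,
    # including 301/302 responses, and sends them back on the follow-up request.
    opener = request.build_opener(request.HTTPCookieProcessor(CookieJar()))
    req = request.Request(url, headers={"User-Agent": user_agent})
    with opener.open(req, timeout=timeout) as res:
        return res.read().decode(encoding, "ignore")
```

If this still raises HTTP Error 301, the loop is probably not cookie-related, and staying with requests as described above is the simpler route.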
