Splitting strings
webString = 'www.baidu.com'
print(webString.split('.'))  # ['www', 'baidu', 'com']
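As a small extension of the split call above, the sketch below limits the number of splits. The variable names (host, labels, and so on) are illustrative only and not part of the original notes.

```python
# Split a host name on dots, with and without a limit on the number of splits.
host = 'www.baidu.com'

labels = host.split('.')           # split on every dot
first, rest = host.split('.', 1)   # split once, counting from the left
head, tld = host.rsplit('.', 1)    # split once, counting from the right

print(labels)       # ['www', 'baidu', 'com']
print(first, rest)  # www baidu.com
print(head, tld)    # www.baidu com
```

The second argument (maxsplit) is handy when only the first or last piece matters and the remainder should stay intact.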
Stripping leading and trailing whitespace or special characters
webString = ' www.baidu.com '
print(webString.strip())  # www.baidu.com

webString = '!*www.baidu.com*!'
print(webString.strip('!*'))  # www.baidu.com
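One detail worth illustrating: the argument to strip is treated as a set of characters, not as a prefix or suffix string. The sketch below, with illustrative variable names, also shows the one-sided variants lstrip and rstrip.

```python
raw = '!*www.baidu.com*!'

both_ends = raw.strip('*!')    # order of the characters does not matter
left_only = raw.lstrip('!*')   # strip from the left end only
right_only = raw.rstrip('!*')  # strip from the right end only

print(both_ends)   # www.baidu.com
print(left_only)   # www.baidu.com*!
print(right_only)  # !*www.baidu.com

# Pitfall: every listed character is removed from both ends,
# so this strips far more than a literal "w.com" would suggest.
over_stripped = 'www.baidu.com'.strip('w.com')
print(over_stripped)  # baidu
```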
String formatting
webString = '{}www.baidu.com'.format('https://')
print(webString)  # https://www.baidu.com
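For comparison, here is a minimal sketch of three equivalent ways to build the same URL: positional fields, named fields, and an f-string (which assumes Python 3.6 or later). The names scheme and host are chosen for illustration.

```python
scheme = 'https://'
host = 'www.baidu.com'

by_position = '{}{}'.format(scheme, host)                    # positional fields
by_name = '{scheme}{host}'.format(scheme=scheme, host=host)  # named fields
by_fstring = f'{scheme}{host}'                               # f-string, Python 3.6+

print(by_position)  # https://www.baidu.com
print(by_name)      # https://www.baidu.com
print(by_fstring)   # https://www.baidu.com
```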
Custom functions
webString = input("Please input url = ")
print(webString)

def change_number(number):
    # Mask digits 4 to 7 by position; str.replace would also hit an earlier match of the same digits
    return number[:3] + '*' * 4 + number[7:]

print(change_number("15916881234"))  # 159****1234
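When masking part of a number, slicing by position is generally safer than str.replace, because replace substitutes wherever the substring matches rather than at a fixed index. The sketch below contrasts the two; both function names and the sample number are made up for illustration.

```python
def mask_with_replace(number):
    # Replaces every non-overlapping occurrence of the 4-digit substring.
    return number.replace(number[3:7], '*' * 4)

def mask_with_slices(number):
    # Always masks exactly the characters at indexes 3 to 6.
    return number[:3] + '*' * 4 + number[7:]

sample = '15915911234'  # hypothetical number whose digits 4-7 also appear at the start

print(mask_with_replace(sample))  # ****5911234
print(mask_with_slices(sample))   # 159****1234
```

For a number such as "15916881234" both versions give the same result; they differ only when the masked digits also occur elsewhere in the string.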
First, install the third-party requests library
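GuessedAtParserWarning: No parser was explicitly specified. BeautifulSoup issues this warning when it is created without a parser argument; pass one explicitly, for example "html.parser".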
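A minimal sketch of naming the parser explicitly, assuming the beautifulsoup4 package is installed. The HTML string is a made-up sample, and "html.parser" is the parser bundled with Python's standard library.

```python
from bs4 import BeautifulSoup

html = "<p>Hello</p>"  # made-up sample markup

# The second argument names the parser, so BeautifulSoup does not have to guess.
soup = BeautifulSoup(html, "html.parser")
greeting = soup.p.text

print(greeting)  # Hello
```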
A basic request example
import requests

link = "http://www.santostang.com/"
# Send a browser-like User-Agent header with the request
headers = {'User-Agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 13_2_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.3 Mobile/15E148 Safari/604.1'}

data = requests.get(link, headers=headers)
print(data.text)
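The request above has no timeout and does not check the HTTP status. A slightly more defensive sketch might look like the following; the helper name fetch_html and the 10-second default are assumptions for illustration, not part of the original notes.

```python
import requests

LINK = "http://www.santostang.com/"
HEADERS = {'User-Agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 13_2_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.3 Mobile/15E148 Safari/604.1'}

def fetch_html(url, headers=None, timeout=10):
    """Return the response body as text; raise on network or HTTP errors."""
    response = requests.get(url, headers=headers, timeout=timeout)
    response.raise_for_status()  # raises requests.HTTPError for 4xx/5xx responses
    return response.text
```

Calling fetch_html(LINK, HEADERS) inside a try block that catches requests.RequestException would cover timeouts, connection failures, and HTTP errors in one place.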
Complete code
import requests
from bs4 import BeautifulSoup

link = "http://www.santostang.com/"
headers = {'User-Agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 13_2_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.3 Mobile/15E148 Safari/604.1'}

data = requests.get(link, headers=headers)
# Parse the HTML with the parser named explicitly
soup = BeautifulSoup(data.text, "html.parser")
# Print the link text inside the first <h1 class="post-title">
print(soup.find("h1", class_="post-title").a.text)
# 第四章 – 4.3 通过selenium 模拟浏览器抓取 (Chapter 4 – 4.3: Scraping by simulating a browser with selenium)
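If the page layout changes, soup.find returns None and the chained .a.text lookup raises AttributeError. The sketch below guards against that, using a made-up inline HTML sample instead of a live request so it can run offline. The helper name first_post_title is hypothetical, and beautifulsoup4 is assumed to be installed.

```python
from bs4 import BeautifulSoup

# Made-up markup that mimics the structure the scraper expects.
SAMPLE_HTML = """
<article>
  <h1 class="post-title"><a href="/post/1">Hello, scraper</a></h1>
</article>
"""

def first_post_title(html):
    """Return the first post title, or None when the markup has no match."""
    soup = BeautifulSoup(html, "html.parser")
    heading = soup.find("h1", class_="post-title")
    if heading is None or heading.a is None:
        return None
    return heading.a.get_text(strip=True)

print(first_post_title(SAMPLE_HTML))        # Hello, scraper
print(first_post_title("<p>no title</p>"))  # None
```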