Python's built-in re library -- regular expressions


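What is a regular expression

A regular expression is a piece of code that describes a rule for text.
It can be used to find and manipulate strings that match a complex rule.

Use cases

Processing strings
Processing logs

Using regular expressions in Python

A regular expression is passed to Python as a pattern string.
The pattern can be written as a raw string.
A raw string is written by putting r in front of the string literal: r'string'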

# Check whether a string starts with hogwarts_ (when the pattern is used with match())

r'hogwarts_\w+'
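
To see why the raw-string prefix matters, here is a minimal sketch. The sample text and variable names are made up for illustration and are not part of the original article.

```python
import re

# In a normal string literal "\b" is the backspace character (one character).
# In a raw string r"\b" is a backslash followed by "b" (two characters),
# which the regex engine reads as a word boundary.
normal = "\bhog\b"
raw = r"\bhog\b"
print(len(normal), len(raw))  # 5 7

# Hypothetical sample text.
text = "a hog in hogwarts_school"
found_normal = re.findall(normal, text)  # looks for literal backspace characters
found_raw = re.findall(raw, text)        # looks for "hog" as a whole word
print(found_normal)  # []
print(found_raw)     # ['hog']

# The pattern from the text, applied with match(), tests the start of the string.
starts = re.match(r"hogwarts_\w+", "hogwarts_school") is not None
print(starts)  # True
```

Writing patterns as raw strings avoids having to double every backslash by hand.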

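Using the re module for regular expression operations

Compiling a regular expression object

compile(): converts a pattern string into a regular expression object
Useful when the same regular expression is used many times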

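import re

''' prog: the compiled regex object; its match, replace, and split methods can be called directly without passing the pattern again
pattern: the regular expression '''

pattern = r'hogwarts_\w+'
prog = re.compile(pattern)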
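
As a rough illustration of reusing one compiled object for matching, replacing, and splitting, consider the sketch below. The pattern and the sample log line are assumptions chosen for the example.

```python
import re

# Compile once, then reuse the same object for several operations.
prog = re.compile(r"\d+")

# Hypothetical sample data.
line = "error 404 at line 12, retry 3"

numbers = prog.findall(line)  # every run of digits
masked = prog.sub("#", line)  # replace every run of digits
parts = prog.split(line)      # split on every run of digits

print(numbers)  # ['404', '12', '3']
print(masked)   # error # at line #, retry #
print(parts)    # ['error ', ' at line ', ', retry ', '']
```

Note the trailing empty string in the split result: the line ends with a match, so nothing is left after the last delimiter.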

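Matching strings

match(): matches only at the beginning of the string
search(): scans the whole string and returns the first match
findall(): scans the whole string and returns all matching substrings as a list

''' pattern: the regular expression
string: the string to match against
flags: optional, controls how matching is done
- A: ASCII-only matching
- I: case-insensitive matching
- M: make ^ and $ match at the start and end of every line, in addition to the start and end of the whole string
- S: make . match any character, including newline
- X: ignore unescaped whitespace and comments in the pattern string '''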

pattern = r'hog'
s1 = "hog 1111"
match1 = re.match(pattern,s1,re.I)
print(match1)
print(f"Match start position: {match1.start()}")
print(f"Match end position: {match1.end()}")
print(f"Match span tuple: {match1.span()}")
print(f"String that was searched: {match1.string}")
print(f"Matched text: {match1.group()}")

#search

match1 = re.search(pattern,s1,re.I)
print(match1)
s2 = " i like hogere in hogwarts"
match2 = re.search(pattern,s2,re.I)
print(match2)
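
The difference between match() and search() shows up when the pattern does not occur at position 0. The sketch below reuses the s2 sample string from the code above; the None check is added here because both functions return None when nothing matches.

```python
import re

pattern = r"hog"
s2 = " i like hogere in hogwarts"

# match() only tries at position 0; s2 starts with a space, so it returns None.
m = re.match(pattern, s2, re.I)
# search() scans forward and returns the first match it finds.
s = re.search(pattern, s2, re.I)

print(m)  # None
if s is not None:    # check before calling .span(), .group(), ...
    print(s.span())   # (8, 11)
    print(s.group())  # hog
```

Calling .start() or .group() on a None result raises AttributeError, so the check is worth keeping whenever the input is not guaranteed to match.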

#findall

match1 = re.findall(pattern,s1,re.I)
print(match1)
s2 = " i like hogere in hogwarts"
match2 = re.findall(pattern,s2,re.I)
print(match2)

findall output:

['hog']
['hog', 'hog']
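
The flags listed earlier can be combined with the | operator. The following sketch shows the effect of I, M, S, and X on a made-up two-line string; the sample text and variable names are assumptions for illustration.

```python
import re

text = "Hog one\nhog two"  # hypothetical two-line sample

# re.I: case-insensitive matching.
ci = re.findall(r"hog", text, re.I)
print(ci)  # ['Hog', 'hog']

# re.M: ^ also matches right after each newline.
multi = re.findall(r"^hog", text, re.I | re.M)
single = re.findall(r"^hog", text, re.I)
print(multi)   # ['Hog', 'hog']
print(single)  # ['Hog']

# re.S: . also matches the newline character.
no_dot = re.search(r"one.hog", text)
with_dot = re.search(r"one.hog", text, re.S)
print(no_dot)                  # None
print(repr(with_dot.group()))  # 'one\nhog'

# re.X: whitespace and # comments inside the pattern are ignored.
verbose = re.compile(r"""
    hog     # the literal word
    \s+     # one or more whitespace characters
    (\w+)   # capture the following word
""", re.X | re.I)
words = verbose.findall(text)
print(words)  # ['one', 'two']
```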

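Replacing strings

sub(): performs string replacement

import re

''' pattern: the regular expression
repl: the replacement string
string: the original string to search and replace in
count: optional, the maximum number of replacements; the default 0 replaces all matches
flags: optional, controls how matching is done '''

# Signature: re.sub(pattern, repl, string, count=0, flags=0)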

pattern = r"1[34578]\d{9}"

s1 = "Winning number 123, contact phone: 13444444311"
res = re.sub(pattern,'1xxxxxxxx11',s1)
print(res)
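
Instead of a fixed replacement string, the replacement can refer to captured groups, which keeps part of the original text. The sketch below is one possible variation on the phone pattern above; the grouping, the second phone number, and the masking format are assumptions for illustration.

```python
import re

# Same phone pattern as above, with groups around the parts to keep:
# the first three digits and the last two digits.
pattern = r"(1[34578]\d)\d{6}(\d{2})"

# Hypothetical sample text with two phone numbers.
s = "tel: 13444444311, backup: 15812345678"

# \1 and \2 in the replacement refer to the captured groups.
masked_all = re.sub(pattern, r"\1******\2", s)
# count=1 replaces only the first match.
masked_first = re.sub(pattern, r"\1******\2", s, count=1)

print(masked_all)    # tel: 134******11, backup: 158******78
print(masked_first)  # tel: 134******11, backup: 15812345678
```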

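Splitting strings

split(): splits a string by a regular expression and returns a list

import re

''' pattern: the regular expression
string: the string to split
maxsplit: optional, the maximum number of splits; the default 0 means no limit
flags: optional, controls how matching is done '''

# Signature: re.split(pattern, string, maxsplit=0, flags=0)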

# Split the URL on ? and &
# Inside [...] the | character is a literal, not "or", so it is left out here
p = r"[?&]"
url = "https://vip.ceshiren.com/#/layout/course_ppt?url=https%3A%2F%2Fpdf.ceshiren.com%2Fbook%2Fpython_programming%2Fppt%2F%5B%25E5%25BD%2595%25E6%2592%25AD%5D%25E5%2586%2585%25E7%25BD%25AE%25E5%25BA%2593re.html&path=%2Flayout%2Fsection&name=%E5%AD%A6%E4%B9%A0%E8%BF%9B%E5%BA%A6"
r = re.split(p, url)
print(r)
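
A shorter sketch of the same idea, also showing maxsplit. The URL here is a made-up placeholder, and turning the pieces into a dict is an extra step not covered in the original text.

```python
import re

url = "https://example.com/page?url=a&path=b&name=c"  # hypothetical short URL

# Split on every ? or &.
parts = re.split(r"[?&]", url)
print(parts)
# ['https://example.com/page', 'url=a', 'path=b', 'name=c']

# maxsplit=1 splits only at the first delimiter.
head_and_query = re.split(r"[?&]", url, maxsplit=1)
print(head_and_query)
# ['https://example.com/page', 'url=a&path=b&name=c']

# Turn the key=value pieces into a dict.
params = dict(piece.split("=", 1) for piece in parts[1:])
print(params)
# {'url': 'a', 'path': 'b', 'name': 'c'}
```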
