finditer returns match objects, whereas findall returns only the matched text.
For example, take the following content:
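> **Theorem**
> Strategy game and stackelberg game in zero-sum are essentially identical.

- Competitive state
- Stay at a suboptimal Nash equilibrium
- Cooperative state
- Ensure that the cooperative state can continue

> **Folk Theorem**
> ...
> In an infinitely repeated game, suppose there is a single-stage NE $a^{*}$ and a better group strategy $\hat{a}$
> Then there exists some value of $\delta$ that makes $(\hat{a},\hat{a},\dots,\hat{a})$ an SPNE

- There exists a strategy that gives every player a better payoff than the competitive NE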
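findall returned only the results, and when the pattern contains groups, it returned only the text captured by those groups rather than the full match.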
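A minimal sketch of this findall behaviour. The sample text and patterns below are hypothetical stand-ins, not the ones used on the content above; they only illustrate how the return value changes with the number of groups.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "> **Theorem**\n> first body\n> **Folk Theorem**\n> second body\n"

# No groups: findall returns the full matched strings.
no_group = re.findall(r"> \*\*.+?\*\*", text)

# One group: findall returns only what the group captured.
one_group = re.findall(r"> \*\*(.+?)\*\*", text)

# Several groups: findall returns one tuple of captures per match.
two_groups = re.findall(r"> \*\*(.+?)\*\*\n> (.+)\n", text)

print(no_group)
print(one_group)
print(two_groups)
```

In the grouped cases the surrounding text (the "> " prefix and the asterisks) is missing from the result, which is why the full match cannot be recovered from findall alone.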
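finditer returned re.Match objects, which carry important attributes such as span=(0, 92), match='\n> Theorem\n> Strategy game and stackelberg, so the complete match information, including the groups, is available.
Calling match.group() recovers the full matched text, including the parts captured by groups.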
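A short sketch of what a match object from finditer exposes. The text and pattern are again hypothetical; only group(), groups() and span() are the standard re.Match methods.

```python
import re

# Hypothetical sample text and pattern, chosen only for illustration.
text = "> **Theorem**\n> first body\n"
pattern = re.compile(r"> \*\*(.+?)\*\*\n> (.+)\n")

matches = list(pattern.finditer(text))
m = matches[0]

full = m.group()     # whole matched text, same as m.group(0)
title = m.group(1)   # text captured by the first group
groups = m.groups()  # tuple of all captured groups
span = m.span()      # (start, end) offsets of the whole match in text

print(repr(full))
print(title, groups, span)
```

Because each match keeps both the full text and the captures, one pass with finditer covers what would otherwise need separate grouped and ungrouped findall calls.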
https://stackoverflow.com/questions/3765024/different-behavior-between-re-finditer-and-re-findall
    import re

    CARRIS_REGEX = r'<th>(\d+)</th><th>([\s\w\.\-]+)</th><th>(\d+:\d+)</th><th>(\d+m)</th>'
    pattern = re.compile(CARRIS_REGEX, re.UNICODE)
    mailbody = open("test.txt").read()

    for match in pattern.finditer(mailbody):
        print(match)
    print()
    for match in pattern.findall(mailbody):
        print(match)
prints
    <_sre.SRE_Match object at 0x00A63758>
    <_sre.SRE_Match object at 0x00A63F98>
    <_sre.SRE_Match object at 0x00A63758>
    <_sre.SRE_Match object at 0x00A63F98>
    <_sre.SRE_Match object at 0x00A63758>
    <_sre.SRE_Match object at 0x00A63F98>
    <_sre.SRE_Match object at 0x00A63758>
    <_sre.SRE_Match object at 0x00A63F98>

    ('790', 'PR. REAL', '21:06', '04m')
    ('758', 'PORTAS BENFICA', '21:10', '09m')
    ('790', 'PR. REAL', '21:14', '13m')
    ('758', 'PORTAS BENFICA', '21:21', '19m')
    ('790', 'PR. REAL', '21:29', '28m')
    ('758', 'PORTAS BENFICA', '21:38', '36m')
    ('758', 'SETE RIOS', '21:49', '47m')
    ('758', 'SETE RIOS', '22:09', '68m')
If you want the same output from finditer as you're getting from findall, you need
    for match in pattern.finditer(mailbody):
        print(tuple(match.groups()))
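The equivalence can be checked with a small sketch. The html string below is a hypothetical stand-in for the contents of test.txt, which is not shown in the question; the regex is the one quoted above. Note that match.groups() already returns a tuple, so the tuple() wrapper is harmless but not required.

```python
import re

# Hypothetical input imitating the markup the pattern expects.
html = (
    "<th>790</th><th>PR. REAL</th><th>21:06</th><th>04m</th>"
    "<th>758</th><th>SETE RIOS</th><th>21:49</th><th>47m</th>"
)
CARRIS_REGEX = r'<th>(\d+)</th><th>([\s\w\.\-]+)</th><th>(\d+:\d+)</th><th>(\d+m)</th>'
pattern = re.compile(CARRIS_REGEX)

# With several groups, findall and finditer + groups() give the same tuples.
from_findall = pattern.findall(html)
from_finditer = [m.groups() for m in pattern.finditer(html)]

# With exactly one group, findall returns plain strings, while groups()
# still returns 1-tuples; group(1) gives the matching string form.
single = re.compile(r'<th>(\d+:\d+)</th>')
single_findall = single.findall(html)
single_groups = [m.groups() for m in single.finditer(html)]
single_group1 = [m.group(1) for m in single.finditer(html)]

print(from_findall == from_finditer)
print(single_findall, single_groups)
```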