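A regular expression is a pattern for matching text. It is made up of ordinary characters (for example, the letters a to z) and special characters (called "metacharacters"). It can be used to check whether a string contains a given substring, to replace matched substrings, or to extract the substrings that satisfy a condition.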
Test data:
    # cat file
    ac
    ab
    abbc
    abcc
    aabbcc
    abbbc
    abbbbbc
    acc
    abc
    asb
    aa
    bb
    a_c
    aZc
    aAAAAc
    a c
    ABC
    ccc
    dddd
    http://www
    abababab
    c c d
    123
    a3c
    e*f
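Anchor | Description |
^ | Anchors the match to the start of the line (a position, not a character); ^a matches lines that begin with a |
$ | Anchors the match to the end of the line (a position, not a character); a$ matches lines that end with a |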
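1) Exact match: the line that starts with a and ends with c with nothing in between, that is, exactly "ac". Note that egrep is equivalent to grep -E.
    [root@www ~]# egrep "^ac$" file
    ac
2) Loose match: lines that start with a
    [root@www ~]# egrep "^a" file
    ac
    ab
    abbc
    abcc
    aabbcc
    abbbc
    abbbbbc
    acc
    abc
    asb
    aa
    a_c
    aZc
    aAAAAc
    a c
    abababab
    a3c
3) Loose match: lines that end with c
    [root@www ~]# egrep "c$" file
    ac
    abbc
    abcc
    aabbcc
    abbbc
    abbbbbc
    acc
    abc
    a_c
    aZc
    aAAAAc
    a c
    ccc
    a3c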
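Metacharacter | Description |
. | Matches any single character except a newline |
() | Groups a string into a single unit |
[] | Defines a character class; matches any one character inside the brackets |
[^] | Negated character class; matches any one character NOT listed inside the brackets |
\ | Escape character |
| | Alternation (OR) |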
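1) Exact match: three-character lines that start with a and end with c, with any character in the middle
    [root@www ~]# egrep "^a.c$" file
    acc
    abc
    a_c
    aZc
    a c
    a3c
2) Loose match: lines that end with cc. The parentheses group cc into one unit. They are optional here: $ anchors the end of the line rather than a single character, so "cc$" gives the same result.
    [root@www ~]# egrep "(cc)$" file
    abcc
    aabbcc
    acc
    ccc
3) Exact match: three-character lines that start with a and end with c, with a-z or 0-9 in the middle
    [root@www ~]# egrep "^a[a-z0-9]c$" file
    acc
    abc
    a3c
4) Exact match: three-character lines that start with a and end with c, with anything other than a-z or 0-9 in the middle
    [root@www ~]# egrep "^a[^a-z0-9]c$" file
    a_c
    aZc
    a c
5) Exact match: the three-character line e*f (starts with e, ends with f, literal * in the middle, so * must be escaped)
    [root@www ~]# egrep "^e\*f$" file
    e*f
6) Exact match: three-character lines that start with a and end with b or c, with any character in the middle
    [root@www ~]# egrep "^a.(b|c)$" file
    acc
    abc
    asb
    a_c
    aZc
    a c
    a3c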
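The role of the parentheses in example 2 can be checked with a short sketch. The sample lines below are an illustrative subset of the test data, and grep -E is used in place of egrep (the two are equivalent, as noted earlier). It compares "(cc)$" with "cc$", then shows a case where the group is actually needed.

```shell
# Hypothetical sample: a handful of lines taken from the test data above.
sample=$(printf '%s\n' abcc aabbcc acc ccc abc abababab)

# With and without the group: both select the lines that end in cc.
grouped=$(printf '%s\n' "$sample" | grep -E '(cc)$')
plain=$(printf '%s\n' "$sample" | grep -E 'cc$')

# Here the group is required: + has to repeat the whole string "ab".
repeated=$(printf '%s\n' "$sample" | grep -E '^(ab)+$')

printf 'grouped:\n%s\nplain:\n%s\nrepeated:\n%s\n' "$grouped" "$plain" "$repeated"
```

Grouping starts to matter once a quantifier or an alternation has to apply to more than one character.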
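Quantifier | Description |
* | The preceding character occurs zero or more times |
? | Like *, but the preceding character occurs zero times or once |
+ | Like *, but the preceding character occurs one or more times (at least once) |
{n,m} | The preceding character occurs at least n and at most m times, i.e. the closed interval [n,m] |
{m} | The preceding character occurs exactly m times |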
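1) Exact match: lines that start with a and end with c, with any number of b (including none) in between
    [root@www ~]# egrep "^ab*c$" file
    ac
    abbc
    abbbc
    abbbbbc
    abc
2) Exact match: lines that start with a and end with c, with one b or no b in between
    [root@www ~]# egrep "^ab?c$" file
    ac
    abc
3) Exact match: lines that start with a and end with c, with one or more b in between
    [root@www ~]# egrep "^ab+c$" file
    abbc
    abbbc
    abbbbbc
    abc
4) Exact match: lines that start with a and end with c, with at least two and at most four b in between
    [root@www ~]# egrep "^ab{2,4}c$" file
    abbc
    abbbc
5) Exact match: lines that start with a and end with c, with exactly three b in between
    [root@www ~]# egrep "^ab{3}c$" file
    abbbc
6) Exact match: lines that start with a and end with c, with at least one b in between ({1,} is equivalent to +)
    [root@www ~]# egrep "^ab{1,}c$" file
    abbc
    abbbc
    abbbbbc
    abc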
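The short quantifiers are shorthand for interval forms: * is {0,}, + is {1,} and ? is {0,1}. The following sketch checks this on a small illustrative subset of the test data; the variable names are made up for the example.

```shell
# Hypothetical sample: a subset of the test data above.
sample=$(printf '%s\n' ac abc abbc abbbc abbbbbc acc)

all_equal=yes
for pair in 'b*:b{0,}' 'b+:b{1,}' 'b?:b{0,1}'; do
  short=${pair%%:*}   # text before the colon, e.g. b*
  long=${pair#*:}     # text after the colon, e.g. b{0,}
  with_short=$(printf '%s\n' "$sample" | grep -E "^a${short}c\$")
  with_long=$(printf '%s\n' "$sample" | grep -E "^a${long}c\$")
  if [ "$with_short" = "$with_long" ]; then
    echo "$short and $long select the same lines"
  else
    all_equal=no
  fi
done
```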
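POSIX character class | Description |
[:alnum:] | Any alphanumeric character: 0-9, a-z, A-Z |
[:alpha:] | Any letter, uppercase or lowercase |
[:digit:] | Digits 0-9 |
[:graph:] | Visible characters (excludes space and control characters) |
[:lower:] | Lowercase letters a-z |
[:upper:] | Uppercase letters A-Z |
[:cntrl:] | Control characters |
[:print:] | Printable characters (visible characters plus space) |
[:punct:] | Punctuation characters |
[:blank:] | Space and TAB |
[:xdigit:] | Hexadecimal digits |
[:space:] | All whitespace characters (newline, space, tab) |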
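Note the meaning of the double brackets [[ ]]: the outer pair is the bracket expression [], which matches any one of the characters inside it; the inner pair is part of the class name, for example [:digit:].
1) Exact match: three-character lines that start with a and end with c, with any of a-z, A-Z, 0-9 in the middle
    [root@www ~]# egrep "^a[[:alnum:]]c$" file
    acc
    abc
    aZc
    a3c
2) Exact match: three-character lines that start with a and end with c, with any letter a-z or A-Z in the middle
    [root@www ~]# egrep "^a[[:alpha:]]c$" file
    acc
    abc
    aZc
3) Exact match: three-character lines that start with a and end with c, with a digit 0-9 in the middle
    [root@www ~]# egrep "^a[[:digit:]]c$" file
    a3c
4) Exact match: three-character lines that start with a and end with c, with a lowercase letter in the middle
    [root@www ~]# egrep "^a[[:lower:]]c$" file
    acc
    abc
5) Exact match: three-character lines that start with a and end with c, with an uppercase letter in the middle
    [root@www ~]# egrep "^a[[:upper:]]c$" file
    aZc
6) Exact match: three-character lines that start with a and end with c, with any printable character (space included) in the middle
    [root@www ~]# egrep "^a[[:print:]]c$" file
    acc
    abc
    a_c
    aZc
    a c
    a3c
7) Exact match: three-character lines that start with a and end with c, with a punctuation character in the middle
    [root@www ~]# egrep "^a[[:punct:]]c$" file
    a_c
8) Exact match: three-character lines that start with a and end with c, with a space or TAB in the middle
    [root@www ~]# egrep "^a[[:blank:]]c$" file
    a c
   [[:space:]] gives the same result here:
    [root@www ~]# egrep "^a[[:space:]]c$" file
    a c
9) Exact match: three-character lines that start with a and end with c, with a hexadecimal digit in the middle
    [root@www ~]# egrep "^a[[:xdigit:]]c$" file
    acc
    abc
    a3c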
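As a rough illustration of how the classes partition characters, the sketch below reports which class the middle character of an a?c string falls into. The function name middle_class is hypothetical, only five of the classes are tried, and LC_ALL=C is set on the assumption that plain ASCII behaviour is wanted.

```shell
# Print the first class (from a fixed list) that matches the middle
# character of a three-character string of the form a?c.
middle_class() {
  for class in digit upper lower blank punct; do
    if printf '%s\n' "$1" | LC_ALL=C grep -Eq "^a[[:${class}:]]c\$"; then
      echo "$class"
      return
    fi
  done
  echo none
}

for word in a3c aZc abc 'a c' a_c ac; do
  printf '%s -> %s\n' "$word" "$(middle_class "$word")"
done
```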
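Case 1: match valid IP addresses. The pattern uses grouping, alternation and intervals, so it needs extended regular expressions (egrep or grep -E), and the dot between octets must be escaped as \. so that it matches a literal dot.
    egrep --color '^((25[0-5]|2[0-4][[:digit:]]|[01]?[[:digit:]][[:digit:]]?)\.){3}(25[0-5]|2[0-4][[:digit:]]|[01]?[[:digit:]][[:digit:]]?)$' ip_base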
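The pattern can be wrapped in a small helper to try it on individual strings. This is a minimal sketch: the function name is_ipv4 and the variable octet are made up for the example, and the dot between octets is written as \. on the assumption that a literal dot is intended.

```shell
# One octet: 250-255, 200-249, or 0-199 (with an optional leading 0 or 1).
octet='(25[0-5]|2[0-4][[:digit:]]|[01]?[[:digit:]][[:digit:]]?)'

# Succeeds (exit status 0) when the argument is a dotted-quad IPv4 address.
is_ipv4() {
  printf '%s\n' "$1" | grep -Eq "^(${octet}\.){3}${octet}\$"
}

for candidate in 192.168.1.1 255.255.255.0 256.1.1.1 1.2.3 10x0x0x1; do
  if is_ipv4 "$candidate"; then
    echo "$candidate: valid"
  else
    echo "$candidate: invalid"
  fi
done
```

The last candidate, 10x0x0x1, is there to show why the escape matters: with an unescaped dot, any character would be accepted between the octets.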
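Case 2: match landline phone numbers. The first egrep keeps only lines of exactly 12 visible characters; the second checks the format: an area code of three or four digits starting with 0, a hyphen, then a seven- or eight-digit number that does not start with 0.
    egrep "^[[:graph:]]{12}$" number | egrep "^(0[1-9][0-9][0-9]?)-[1-9][0-9]{6,7}$"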
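To see what the two-stage filter accepts, the pipeline can be wrapped in a helper and fed a few candidates. The function name is_landline and the sample numbers are invented for illustration.

```shell
# Succeeds when the argument is 12 visible characters long AND has the
# form <3-4 digit area code>-<7-8 digit number>.
is_landline() {
  printf '%s\n' "$1" \
    | grep -E '^[[:graph:]]{12}$' \
    | grep -Eq '^(0[1-9][0-9][0-9]?)-[1-9][0-9]{6,7}$'
}

for number in 010-12345678 0755-1234567 0755-12345678 010-1234567; do
  if is_landline "$number"; then
    echo "$number: accepted"
  else
    echo "$number: rejected"
  fi
done
```

Because of the length check, only the combinations 3+8 and 4+7 digits pass; 0755-12345678 (13 characters) and 010-1234567 (11 characters) are rejected even though the second pattern alone would accept them.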
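The difference between the two kinds of editor: a text editor operates on a file as a whole, whereas a line editor such as sed operates on the lines within a file.
The sed command
sed [options] '{command}[flags]' [filename]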
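Command options
    -e script   Add the commands given in script to the commands run on the input; used to apply several operations in one invocation
    -f file     Add the commands read from file to the commands run on the input
    -n          Suppress automatic output
    -i          Edit the file in place
    -i.bak      Edit the file in place and keep a .bak backup of the original
    -r          Use extended regular expressions
    !           Negation (written after the address, unlike in the shell)
Common sed commands
    a   Append after the matched line
    i   Insert before the matched line
    p   Print
    d   Delete
    s   Substitute (find and replace)
    c   Change (replace the whole line)
    y   Transliterate characters
    N D P   Multi-line variants (append the next line, delete up to the first newline, print up to the first newline)
flags (for the s command)
    number       Replace only the match at that position on the line (e.g. 2 for the second match)
    g            Replace every match on the line
    p            Print the line when a substitution was made
    w filename   Write the lines on which a substitution was made to filename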
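The ! negation is the one item in the list above that has no example of its own, so here is a minimal sketch. The three input lines are an invented, shortened version of the sample data.

```shell
# Hypothetical three-line input.
data=$(printf '%s\n' '1 the quick brown fox' '2 the quick brown fox' '3 the quick brown fox')

# "2!d": delete every line that is NOT line 2.
only_second=$(printf '%s\n' "$data" | sed '2!d')

# "/^3 the/!p" with -n: print every line that does NOT start with "3 the".
not_three=$(printf '%s\n' "$data" | sed -n '/^3 the/!p')

printf 'only_second:\n%s\nnot_three:\n%s\n' "$only_second" "$not_three"
```

The ! goes between the address and the command, which is the difference from the shell that the option list mentions.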
Sample file:
    [root@www ~]# cat data1
    1 the quick brown fox jumps over the lazy dog.
    2 the quick brown fox jumps over the lazy dog.
    3 the quick brown fox jumps over the lazy dog.
    4 the quick brown fox jumps over the lazy dog.
    5 the quick brown fox jumps over the lazy dog.
Append a new line after every line of data1: append data "haha"
    [root@www ~]# sed 'a\append data "haha"' data1
    1 the quick brown fox jumps over the lazy dog.
    append data "haha"
    2 the quick brown fox jumps over the lazy dog.
    append data "haha"
    3 the quick brown fox jumps over the lazy dog.
    append data "haha"
    4 the quick brown fox jumps over the lazy dog.
    append data "haha"
    5 the quick brown fox jumps over the lazy dog.
    append data "haha"
Append a new line after line 2: append data "haha"
    [root@www ~]# sed '2a\append data "haha"' data1
    1 the quick brown fox jumps over the lazy dog.
    2 the quick brown fox jumps over the lazy dog.
    append data "haha"
    3 the quick brown fox jumps over the lazy dog.
    4 the quick brown fox jumps over the lazy dog.
    5 the quick brown fox jumps over the lazy dog.
Append a new line after each of lines 2 to 4: append data "haha"
    [root@www ~]# sed '2,4a\append data "haha"' data1
    1 the quick brown fox jumps over the lazy dog.
    2 the quick brown fox jumps over the lazy dog.
    append data "haha"
    3 the quick brown fox jumps over the lazy dog.
    append data "haha"
    4 the quick brown fox jumps over the lazy dog.
    append data "haha"
    5 the quick brown fox jumps over the lazy dog.
Append by pattern: find the line containing "3 the" and append a new line after it: append data "haha"
    [root@www ~]# sed '/3 the/a\append data "haha"' data1
    1 the quick brown fox jumps over the lazy dog.
    2 the quick brown fox jumps over the lazy dog.
    3 the quick brown fox jumps over the lazy dog.
    append data "haha"
    4 the quick brown fox jumps over the lazy dog.
    5 the quick brown fox jumps over the lazy dog.
A pair of slashes turns on pattern matching: /string to match/
Insert a new line before every line of data1: insert data "haha"
    [root@www ~]# sed 'i\insert data "haha"' data1
    insert data "haha"
    1 the quick brown fox jumps over the lazy dog.
    insert data "haha"
    2 the quick brown fox jumps over the lazy dog.
    insert data "haha"
    3 the quick brown fox jumps over the lazy dog.
    insert data "haha"
    4 the quick brown fox jumps over the lazy dog.
    insert data "haha"
    5 the quick brown fox jumps over the lazy dog.
Insert a new line before line 2: insert data "haha"
    [root@www ~]# sed '2i\insert data "haha"' data1
    1 the quick brown fox jumps over the lazy dog.
    insert data "haha"
    2 the quick brown fox jumps over the lazy dog.
    3 the quick brown fox jumps over the lazy dog.
    4 the quick brown fox jumps over the lazy dog.
    5 the quick brown fox jumps over the lazy dog.
Insert a new line before each of lines 2 to 4: insert data "haha"
    [root@www ~]# sed '2,4i\insert data "haha"' data1
    1 the quick brown fox jumps over the lazy dog.
    insert data "haha"
    2 the quick brown fox jumps over the lazy dog.
    insert data "haha"
    3 the quick brown fox jumps over the lazy dog.
    insert data "haha"
    4 the quick brown fox jumps over the lazy dog.
    5 the quick brown fox jumps over the lazy dog.
Insert by pattern: find the line containing "3 the" and insert a new line before it: insert data "haha"
    [root@www ~]# sed '/3 the/i\insert data "haha"' data1
    1 the quick brown fox jumps over the lazy dog.
    2 the quick brown fox jumps over the lazy dog.
    insert data "haha"
    3 the quick brown fox jumps over the lazy dog.
    4 the quick brown fox jumps over the lazy dog.
    5 the quick brown fox jumps over the lazy dog.
Substitute in text piped in on standard input: replace test with text
    [root@www ~]# echo "this is a test" |sed 's/test/text/'
    this is a text
Replace dog with cat on every line of data1
    [root@www ~]# sed 's/dog/cat/' data1
    1 the quick brown fox jumps over the lazy cat.
    2 the quick brown fox jumps over the lazy cat.
    3 the quick brown fox jumps over the lazy cat.
    4 the quick brown fox jumps over the lazy cat.
    5 the quick brown fox jumps over the lazy cat.
Replace dog with cat on line 2 of data1
    [root@www ~]# sed '2s/dog/cat/' data1
    1 the quick brown fox jumps over the lazy dog.
    2 the quick brown fox jumps over the lazy cat.
    3 the quick brown fox jumps over the lazy dog.
    4 the quick brown fox jumps over the lazy dog.
    5 the quick brown fox jumps over the lazy dog.
Replace dog with cat on lines 2 to 4 of data1
    [root@www ~]# sed '2,4s/dog/cat/' data1
    1 the quick brown fox jumps over the lazy dog.
    2 the quick brown fox jumps over the lazy cat.
    3 the quick brown fox jumps over the lazy cat.
    4 the quick brown fox jumps over the lazy cat.
    5 the quick brown fox jumps over the lazy dog.
Substitute by pattern: replace dog with cat on the line containing "3 the"
    [root@www ~]# sed '/3 the/s/dog/cat/' data1
    1 the quick brown fox jumps over the lazy dog.
    2 the quick brown fox jumps over the lazy dog.
    3 the quick brown fox jumps over the lazy cat.
    4 the quick brown fox jumps over the lazy dog.
    5 the quick brown fox jumps over the lazy dog.
Change every line of data1 to: change data "haha"
    [root@www ~]# sed 'c\change data "haha"' data1
    change data "haha"
    change data "haha"
    change data "haha"
    change data "haha"
    change data "haha"
Change line 2 of data1 to: change data "haha"
    [root@www ~]# sed '2c\change data "haha"' data1
    1 the quick brown fox jumps over the lazy dog.
    change data "haha"
    3 the quick brown fox jumps over the lazy dog.
    4 the quick brown fox jumps over the lazy dog.
    5 the quick brown fox jumps over the lazy dog.
Change lines 2, 3 and 4 of data1 to: change data "haha" (the whole range is replaced by a single line)
    [root@www ~]# sed '2,4c\change data "haha"' data1
    1 the quick brown fox jumps over the lazy dog.
    change data "haha"
    5 the quick brown fox jumps over the lazy dog.
Change the line of data1 containing "3 the" to: change data "data"
    [root@www ~]# sed '/3 the/c\change data "data"' data1
    1 the quick brown fox jumps over the lazy dog.
    2 the quick brown fox jumps over the lazy dog.
    change data "data"
    4 the quick brown fox jumps over the lazy dog.
    5 the quick brown fox jumps over the lazy dog.
Transliterate the characters a, b, c in data1 into the corresponding A, B, C
    [root@www ~]# sed 'y/abc/ABC/' data1
    1 the quiCk Brown fox jumps over the lAzy dog.
    2 the quiCk Brown fox jumps over the lAzy dog.
    3 the quiCk Brown fox jumps over the lAzy dog.
    4 the quiCk Brown fox jumps over the lAzy dog.
    5 the quiCk Brown fox jumps over the lAzy dog.
Delete every line of data1 from the output (nothing is printed; the file itself is unchanged unless -i is used)
    [root@www ~]# sed 'd' data1
Delete line 3 of data1
    [root@www ~]# sed '3d' data1
    1 the quick brown fox jumps over the lazy dog.
    2 the quick brown fox jumps over the lazy dog.
    4 the quick brown fox jumps over the lazy dog.
    5 the quick brown fox jumps over the lazy dog.
Delete lines 3 to 4 of data1
    [root@www ~]# sed '3,4d' data1
    1 the quick brown fox jumps over the lazy dog.
    2 the quick brown fox jumps over the lazy dog.
    5 the quick brown fox jumps over the lazy dog.
Delete the line of data1 containing the string "3 the"
    [root@www ~]# sed '/3 the/d' data1
    1 the quick brown fox jumps over the lazy dog.
    2 the quick brown fox jumps over the lazy dog.
    4 the quick brown fox jumps over the lazy dog.
    5 the quick brown fox jumps over the lazy dog.
Print the contents of data1
    [root@www ~]# sed 'p' data1
    1 the quick brown fox jumps over the lazy dog.
    1 the quick brown fox jumps over the lazy dog.
    2 the quick brown fox jumps over the lazy dog.
    2 the quick brown fox jumps over the lazy dog.
    3 the quick brown fox jumps over the lazy dog.
    3 the quick brown fox jumps over the lazy dog.
    4 the quick brown fox jumps over the lazy dog.
    4 the quick brown fox jumps over the lazy dog.
    5 the quick brown fox jumps over the lazy dog.
    5 the quick brown fox jumps over the lazy dog.
Print line 3 of data1
    [root@www ~]# sed '3p' data1
    1 the quick brown fox jumps over the lazy dog.
    2 the quick brown fox jumps over the lazy dog.
    3 the quick brown fox jumps over the lazy dog.
    3 the quick brown fox jumps over the lazy dog.
    4 the quick brown fox jumps over the lazy dog.
    5 the quick brown fox jumps over the lazy dog.
Print lines 2, 3 and 4 of data1
    [root@www ~]# sed '2,4p' data1
    1 the quick brown fox jumps over the lazy dog.
    2 the quick brown fox jumps over the lazy dog.
    2 the quick brown fox jumps over the lazy dog.
    3 the quick brown fox jumps over the lazy dog.
    3 the quick brown fox jumps over the lazy dog.
    4 the quick brown fox jumps over the lazy dog.
    4 the quick brown fox jumps over the lazy dog.
    5 the quick brown fox jumps over the lazy dog.
Print the line of data1 containing the string "3 the"
    [root@www ~]# sed '/3 the/p' data1
    1 the quick brown fox jumps over the lazy dog.
    2 the quick brown fox jumps over the lazy dog.
    3 the quick brown fox jumps over the lazy dog.
    3 the quick brown fox jumps over the lazy dog.
    4 the quick brown fox jumps over the lazy dog.
    5 the quick brown fox jumps over the lazy dog.
As the output shows, the selected lines appear twice: once because of the p command, and once more because sed automatically prints every line after processing it. To avoid this, add the -n option, which suppresses the automatic output.
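The effect of -n on the p command can be seen side by side in a small sketch. The three input lines are an invented, shortened version of data1.

```shell
# Hypothetical three-line input.
data=$(printf '%s\n' '1 the quick' '2 the quick' '3 the quick')

# Without -n: every line is auto-printed, and line 3 is printed again by p.
with_autoprint=$(printf '%s\n' "$data" | sed '3p')

# With -n: auto-print is off, so only the line selected by p appears.
without_autoprint=$(printf '%s\n' "$data" | sed -n '3p')

printf 'sed 3p:\n%s\nsed -n 3p:\n%s\n' "$with_autoprint" "$without_autoprint"
```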
Using multiple commands on the command line: -e
Replace brown with green and dog with cat
    [root@www ~]# sed -e 's/brown/green/;s/dog/cat/' data1
    1 the quick green fox jumps over the lazy cat.
    2 the quick green fox jumps over the lazy cat.
    3 the quick green fox jumps over the lazy cat.
    4 the quick green fox jumps over the lazy cat.
    5 the quick green fox jumps over the lazy cat.
Reading editor commands from a file: -f (suited to operations that are repeated routinely)
1) Write the commands to a file
    [root@www ~]# vim abc
    s/brown/green/
    s/dog/cat/
    s/fox/elephant/
2) Use the -f option to run the command file
    [root@www ~]# sed -f abc data1
    1 the quick green elephant jumps over the lazy cat.
    2 the quick green elephant jumps over the lazy cat.
    3 the quick green elephant jumps over the lazy cat.
    4 the quick green elephant jumps over the lazy cat.
    5 the quick green elephant jumps over the lazy cat.
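For scripted use, the command file can be created without an editor. The sketch below writes it with a here-document into a temporary directory; the file name rules.sed is an arbitrary choice for the example, and the # comment lines show that a sed script file may carry comments.

```shell
workdir=$(mktemp -d)

# Write the sed script; the quoted 'EOF' keeps the shell from expanding anything.
cat > "$workdir/rules.sed" <<'EOF'
# colour first, then the animals
s/brown/green/
s/dog/cat/
s/fox/elephant/
EOF

result=$(echo '1 the quick brown fox jumps over the lazy dog.' | sed -f "$workdir/rules.sed")
echo "$result"

rm -r "$workdir"
```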
Suppressing automatic output: -n
Print data1 from line 2 to the last line ($ means the last line)
    [root@www ~]# sed -n '2,$p' data1
    2 the quick brown fox jumps over the lazy dog.
    3 the quick brown fox jumps over the lazy dog.
    4 the quick brown fox jumps over the lazy dog.
    5 the quick brown fox jumps over the lazy dog.
Using extended regular expressions: -r
Print the lines of data1 that start with the string "3 the"
    [root@www ~]# sed -n -r '/^(3 the)/p' data1
    3 the quick brown fox jumps over the lazy dog.
1) List the files; there is no data1.bak yet
    [root@www ~]# ls
    abc apache data1 Dobby file node-v10.14.1 Python-3.7.1 soft1 vimset
2) Run the substitution and modify the file in place. The suffix is arbitrary: -i.bak could just as well be -i.txt.
    [root@www ~]# sed -i.bak 's/brown/green/' data1
3) The directory now contains an additional file, data1.bak
    [root@www ~]# ls
    abc     data1      Dobby  node-v10.14.1  soft1
    apache  data1.bak  file   Python-3.7.1   vimset
4) Print both files to compare: data1 has been modified, and data1.bak is a backup of the original.
    [root@www ~]# cat data1
    1 the quick green fox jumps over the lazy dog.
    2 the quick green fox jumps over the lazy dog.
    3 the quick green fox jumps over the lazy dog.
    4 the quick green fox jumps over the lazy dog.
    5 the quick green fox jumps over the lazy dog.
    [root@www ~]# cat data1.bak
    1 the quick brown fox jumps over the lazy dog.
    2 the quick brown fox jumps over the lazy dog.
    3 the quick brown fox jumps over the lazy dog.
    4 the quick brown fox jumps over the lazy dog.
    5 the quick brown fox jumps over the lazy dog.
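The backup is what makes an in-place edit reversible. The following sketch repeats the steps above in a temporary directory and then rolls the change back by moving the backup over the edited file; the two-line file is an invented, shortened data1.

```shell
workdir=$(mktemp -d)
printf '%s\n' '1 the quick brown fox' '2 the quick brown fox' > "$workdir/data1"

# Edit in place, keeping the original as data1.bak.
sed -i.bak 's/brown/green/' "$workdir/data1"
edited=$(cat "$workdir/data1")
backup=$(cat "$workdir/data1.bak")

# Roll back: put the backup in place of the edited file.
mv "$workdir/data1.bak" "$workdir/data1"
restored=$(cat "$workdir/data1")

printf 'edited:\n%s\nrestored:\n%s\n' "$edited" "$restored"
rm -r "$workdir"
```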
Demo file
    [root@www ~]# cat data2
    1 the quick brown fox jumps over the lazy dog . dog
    2 the quick brown fox jumps over the lazy dog . dog
    3 the quick brown fox jumps over the lazy dog . dog
    4 the quick brown fox jumps over the lazy dog . dog
    5 the quick brown fox jumps over the lazy dog . dog
Replace the second dog on each line with cat
    [root@www ~]# sed 's/dog/cat/2' data2
    1 the quick brown fox jumps over the lazy dog . cat
    2 the quick brown fox jumps over the lazy dog . cat
    3 the quick brown fox jumps over the lazy dog . cat
    4 the quick brown fox jumps over the lazy dog . cat
    5 the quick brown fox jumps over the lazy dog . cat
Replace every dog in data2 with cat
    [root@www ~]# sed 's/dog/cat/g' data2
    1 the quick brown fox jumps over the lazy cat . cat
    2 the quick brown fox jumps over the lazy cat . cat
    3 the quick brown fox jumps over the lazy cat . cat
    4 the quick brown fox jumps over the lazy cat . cat
    5 the quick brown fox jumps over the lazy cat . cat
Replace the first dog on line 3 and print the changed line (p flag); without -n the changed line appears twice
    [root@www ~]# sed '3s/dog/cat/p' data2
    1 the quick brown fox jumps over the lazy dog . dog
    2 the quick brown fox jumps over the lazy dog . dog
    3 the quick brown fox jumps over the lazy cat . dog
    3 the quick brown fox jumps over the lazy cat . dog
    4 the quick brown fox jumps over the lazy dog . dog
    5 the quick brown fox jumps over the lazy dog . dog
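The p flag is most useful together with -n, because the output then contains only the lines on which a substitution actually happened. A minimal sketch, using three invented lines of which only two contain dog:

```shell
# Hypothetical input: only lines 1 and 3 contain "dog".
data=$(printf '%s\n' '1 the lazy dog' '2 the lazy fox' '3 the lazy dog')

# -n turns off auto-print; the p flag prints a line only if s changed it.
changed_only=$(printf '%s\n' "$data" | sed -n 's/dog/cat/p')

printf '%s\n' "$changed_only"
```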
The w filename flag
    [root@www ~]# sed '3s/dog/cat/w text' data2
    1 the quick brown fox jumps over the lazy dog . dog
    2 the quick brown fox jumps over the lazy dog . dog
    3 the quick brown fox jumps over the lazy cat . dog
    4 the quick brown fox jumps over the lazy dog . dog
    5 the quick brown fox jumps over the lazy dog . dog
As shown below, the modified third line has been saved to the file text
    [root@www ~]# cat text
    3 the quick brown fox jumps over the lazy cat . dog
$= counts the number of lines in the text
Count the lines in data2
    [root@www ~]# sed -n '$=' data2
    5
Print the contents of data2 with line numbers (each number is printed on its own line before the line it refers to)
    [root@www ~]# sed '=' data2
    1
    1 the quick brown fox jumps over the lazy dog . dog
    2
    2 the quick brown fox jumps over the lazy dog . dog
    3
    3 the quick brown fox jumps over the lazy dog . dog
    4
    4 the quick brown fox jumps over the lazy dog . dog
    5
    5 the quick brown fox jumps over the lazy dog . dog
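The = command accepts any address, not just $. As a small sketch on invented input, the following prints the total line count and then the numbers of the lines that contain dog.

```shell
# Hypothetical input: "dog" appears on lines 1 and 3.
data=$(printf '%s\n' 'a dog' 'b cat' 'c dog')

# $= : print the number of the last line, i.e. the line count.
line_count=$(printf '%s\n' "$data" | sed -n '$=')

# /dog/= : print the line number of every line matching the pattern.
dog_lines=$(printf '%s\n' "$data" | sed -n '/dog/=')

printf 'lines: %s\nmatches on lines:\n%s\n' "$line_count" "$dog_lines"
```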