Notes from studying Python Cookbook, third edition.
To search for and replace a text pattern in a string: for simple literal patterns, use the str.replace() method directly. For example:
text = 'Yea, you are my god'
print(text.replace('god', 'darling'))  # Yea, you are my darling
print(text)  # Yea, you are my god (the original string is unchanged)
For more complex patterns, use the sub() function in the re module. To illustrate, suppose you want to rewrite date strings of the form 1/30/2021 as 2021-1-30. For example:
import re

date_text = 'Today is 1/30/2021. Tomorrow is 1/31/2021'
# \3, \1 and \2 refer to the year, month and day groups respectively
print(re.sub(r'(\d+)/(\d+)/(\d+)', r'\3-\1-\2', date_text))  # Today is 2021-1-30. Tomorrow is 2021-1-31
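The first argument to sub() is the pattern to match, and the second is the replacement pattern. Backslashed digits such as \3 refer to the capture group numbers in the pattern. If the same pattern is used for many substitutions, you can compile it first and call the compiled pattern's sub() method: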
date_text = 'Today is 1/30/2021. Tomorrow is 1/31/2021'
date_pattern = re.compile(r'(\d+)/(\d+)/(\d+)')
print(date_pattern.sub(r'\3-\1-\2', date_text))  # Today is 2021-1-30. Tomorrow is 2021-1-31
For more complex substitutions, you can pass a callback function instead of a replacement string:

import re
from calendar import month_abbr

date_text = 'Today is 1/30/2021. Tomorrow is 1/31/2021'
date_pattern = re.compile(r'(\d+)/(\d+)/(\d+)')

def change_date(matched):
    # group(1) is the month number; month_abbr maps 1 -> 'Jan', 2 -> 'Feb', ...
    mon_name = month_abbr[int(matched.group(1))]
    return '{} {} {}'.format(matched.group(2), mon_name, matched.group(3))

print(date_pattern.sub(change_date, date_text))  # Today is 30 Jan 2021. Tomorrow is 31 Jan 2021

def dashrepl(matchobj):
    # A single dash becomes a space; a double dash becomes a single dash
    if matchobj.group(0) == '-':
        return ' '
    else:
        return '-'

print(re.sub(r'-{1,2}', dashrepl, 'pro---gram-files'))  # pro- gram files
print(re.search(r'-{1,2}', 'pro---gram-files').group(0))  # --
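The argument to a substitution callback is a match object, the same kind of object returned by match() or search(). Use its group() method to extract specific parts of the match. The callback returns the replacement string.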
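A callback is useful when the replacement needs computation that a plain replacement string cannot express. The following is a minimal sketch, not taken from the book: the function name to_iso and the named groups month, day and year are chosen here purely for illustration. It zero-pads the month and day while reordering the date.

```python
import re

# Named groups (hypothetical names, chosen for this sketch)
date_pattern = re.compile(r'(?P<month>\d+)/(?P<day>\d+)/(?P<year>\d+)')

def to_iso(matched):
    # group() accepts a group name as well as a group number
    month = int(matched.group('month'))
    day = int(matched.group('day'))
    year = int(matched.group('year'))
    # Zero-pad month and day, which a plain r'\3-\1-\2' string cannot do
    return '{:04d}-{:02d}-{:02d}'.format(year, month, day)

iso_text = date_pattern.sub(to_iso, 'Today is 1/30/2021. Tomorrow is 1/31/2021')
print(iso_text)  # Today is 2021-01-30. Tomorrow is 2021-01-31
```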
If you also want to know how many substitutions were made, in addition to getting the substituted text, use re.subn() instead. For example:
date_text = 'Today is 1/30/2021. Tomorrow is 1/31/2021'
date_pattern = re.compile(r'(\d+)/(\d+)/(\d+)')
# subn() returns a tuple: (new_string, number_of_substitutions)
print(date_pattern.subn(r'\3-\1-\2', date_text))  # ('Today is 2021-1-30. Tomorrow is 2021-1-31', 2)
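The sub() function shown above covers most of what there is to regular-expression search and replace. The hardest part is writing the regular expression pattern itself, which is best left as an exercise for the reader.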
re.sub(pattern, repl, string, count=0, flags=0)
Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl. If the pattern isn’t found, string is returned unchanged. repl can be a string or a function; if it is a string, any backslash escapes in it are processed. That is, \n is converted to a single newline character, \r is converted to a carriage return, and so forth. Unknown escapes such as \& are left alone.
If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.
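One practical consequence of the two forms of repl: a string replacement has its backslash escapes processed, while the return value of a function is inserted as-is. The sketch below illustrates the difference; the variable names as_string and as_function are made up for this example.

```python
import re

# String repl: the two-character sequence backslash + n is processed
# into a real newline character.
as_string = re.sub(r', ', r'\n', 'a, b')
print(repr(as_string))  # 'a\nb'

# Function repl: the returned string is used literally, so the
# backslash and the letter n both survive.
as_function = re.sub(r', ', lambda m: r'\n', 'a, b')
print(repr(as_function))  # 'a\\nb'
```

Returning the text from a function is therefore one way to insert replacement text that contains backslashes without having to escape them.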
The pattern may be a string or a pattern object.
The optional argument count is the maximum number of pattern occurrences to be replaced; count must be a non-negative integer. If omitted or zero, all occurrences will be replaced. Empty matches for the pattern are replaced only when not adjacent to a previous empty match, so sub('x*', '-', 'abxd') returns '-a-b--d-'.
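As a small illustration of the count argument, the sketch below reuses the date example from earlier in these notes and replaces only the first occurrence. The variable name first_only is made up for this example, and count is passed by keyword.

```python
import re

date_text = 'Today is 1/30/2021. Tomorrow is 1/31/2021'

# count=1 stops after the first (leftmost) match
first_only = re.sub(r'(\d+)/(\d+)/(\d+)', r'\3-\1-\2', date_text, count=1)
print(first_only)  # Today is 2021-1-30. Tomorrow is 1/31/2021
```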
In string-type repl arguments, in addition to the character escapes and backreferences described above, \g<name> will use the substring matched by the group named name, as defined by the (?P<name>...) syntax. \g<number> uses the corresponding group number; \g<2> is therefore equivalent to \2, but isn’t ambiguous in a replacement such as \g<2>0. \20 would be interpreted as a reference to group 20, not a reference to group 2 followed by the literal character '0'. The backreference \g<0> substitutes in the entire substring matched by the RE.
print(re.sub(r'x*', '-', 'abxc')) # -a-b--c-
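The \g<name>, \g<number> and \g<0> forms described in the excerpt can be sketched as follows. The group names month, day and year and the sample strings are assumptions made for this example only.

```python
import re

# \g<name> refers to a group defined with (?P<name>...)
named = re.sub(r'(?P<month>\d+)/(?P<day>\d+)/(?P<year>\d+)',
               r'\g<year>-\g<month>-\g<day>',
               'Today is 1/30/2021')
print(named)  # Today is 2021-1-30

# \g<2>0 means "group 2 followed by a literal 0";
# \20 would instead be read as a reference to group 20
padded = re.sub(r'(\w)(\d)', r'\1\g<2>0', 'a1 b2')
print(padded)  # a10 b20

# \g<0> stands for the entire match
bracketed = re.sub(r'\d+', r'[\g<0>]', 'room 101')
print(bracketed)  # room [101]
```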