Q: Python: How can I find all strings matching any of several given patterns?

I am new to Python and have been trying to find all the strings that match any of several patterns. I did some Google searching and found posts that suggest combining all the patterns into one. However, they used the re.search() function, which returns only the first match. I need to find every string that matches any of the patterns, in order of occurrence. Any direction or suggestion is welcome.

More specifically, I am looking for something similar to this grep command:

grep -i "[0-9' '-:A-Za-z].* ERROR.*Job failed\|Caused by:\|^      *at\|ERROR\|more$" <file-name>

answer1:

re.findall(combined_regex, your_string) is probably what you're looking for.

If you plan on doing this many times in the same program, consider compiling the regex as follows for better performance:

compiled = re.compile(combined_regex)
results = compiled.findall(your_string)
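
Two details are easy to miss when running findall over a whole multi-line string: the `^` and `$` anchors only apply per line if re.MULTILINE is set, and grep's -i corresponds to re.IGNORECASE. The sketch below illustrates this under assumptions: the sample text and the simplified pattern list are made up for the example and are not taken from the question's log file.

```python
import re

# Illustrative sample text (an assumption, not from the original post)
text = """job started
2015-01-01 ERROR Job failed
Caused by: timeout
    at module.func
... 3 more
"""

# Simplified alternatives, joined into one combined regex
patterns = [r"Caused by:", r"^ +at", r"ERROR", r"more$"]
combined = re.compile("|".join(patterns), re.MULTILINE | re.IGNORECASE)

# findall returns the matched substrings in order of occurrence
# (with no capturing groups in the pattern, each item is the whole match)
matches = combined.findall(text)

# finditer yields match objects, which also carry the match position
spans = [(m.group(0), m.start()) for m in combined.finditer(text)]

print(matches)  # ['ERROR', 'Caused by:', '    at', 'more']
```

Note that findall returns only the matched fragments, not the whole lines; if the combined pattern contained capturing groups, it would return the groups instead.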

answer2:

If I understood you correctly, you are looking for something like this:

import re

l_string="""gfrwrfr
gerqgeq
ERROR gferagqe
hello ERROR
smth more
more smth
Caused by: gferg
"""

pattern_strings =['[0-9" "-:A-Za-z].*ERROR.*Job run failed', 'Caused by:','^ *at','ERROR','more$']
pattern_string = '|'.join(pattern_strings)
pattern = re.compile(pattern_string)

# Test each line against the combined pattern and print the lines that match
for line in l_string.split("\n"):
    result = pattern.search(line)
    if result:
        print(result.string)

Of course, since we are already looping over the lines, we could print line instead of result.string.

Program output:

$  ./multiple_re.py
ERROR gferagqe
hello ERROR
smth more
Caused by: gferg
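
The same line-by-line approach can be wrapped in a small helper that also mimics grep's -i flag. This is only a sketch: grep_lines, its parameters, and the sample list are hypothetical names introduced for illustration, and the pattern list is a shortened version of the one above.

```python
import re

def grep_lines(lines, pattern_strings, ignore_case=True):
    """Return the lines that match any of the patterns, in input order."""
    flags = re.IGNORECASE if ignore_case else 0
    # Join the alternatives into a single regex and compile it once
    pattern = re.compile("|".join(pattern_strings), flags)
    return [line for line in lines if pattern.search(line)]

# Illustrative input (an assumption, not the asker's real log)
sample = [
    "gfrwrfr",
    "error in lower case",
    "hello ERROR",
    "smth more",
    "more smth",
    "Caused by: gferg",
]

found = grep_lines(sample, ["Caused by:", "^ *at", "ERROR", "more$"])
print(found)
```

To search a file, an open file object can be passed as lines; `$` still matches just before a trailing newline, so the lines do not need to be stripped first.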

answer3:

Use multi_pattern_search. Example follows.

from multi_pattern_search import MultiPatternSearch

search = MultiPatternSearch()

search.add_keyword("foo")
search.add_keyword("bar")

print(search.exist("apple tree foo john doe"))

for k, v in search.count("apple tree foo bar foobar john doe").items():
    print(k, v)

python  search  grep