Q: Why do my regular expressions work individually but fail when I combine them?

I'm trying to write a regular expression that includes three matching groups. The string/text that I am trying to match follows:

<td class="no-wrap past-rating" style="background-color: rgb(228, 254, 199);">
                    <div>
                        <b class="place">2</b><sup> 1</sup><sup class="remaining"> 1/2</sup>
                    </div>
                    <div>
                        46.96
                    </div>
                </td>

I'm trying to match the values 2, 1 and 1/2.

I have written the following regular expressions, each of which matches the desired text on its own, but when I combine any two or all three I get no matches.

/(?<one>(?<=<b class="place">).*(?=<\/b>))/ matches=> 2 

/(?<two>(?<=<\/b><sup>).*?(?=<\/sup><sup class=))/ matches=> 1

 /(?<three>(?<=="remaining">).*(?=<\/sup>))/ matches => 1/2

Unfortunately,

/(?<one>(?<=<b class="place">).*(?=<\/b>))(?<two>(?<=<\/b><sup>).*?(?=<\/sup><sup class=))(?<three>(?<=="remaining">).*(?=<\/sup>))/ 

fails to match anything. Can anyone tell me where I'm going wrong, and why the combined regular expression fails when the individual expressions match successfully?

Answer 1:

To "combine" the regexes, you need to use the alternation operator |:

(?<one>(?<=<b class="place">).*(?=<\/b>))|(?<two>(?<=<\/b><sup>).*?(?=<\/sup><sup class=))|(?<three>(?<=="remaining">).*(?=<\/sup>))

See demo
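
As a rough illustration of why plain concatenation fails, the sketch below runs the question's three patterns in Python. Lookbehinds and lookaheads are zero-width, so after group one matches "2" the regex position is still just before </b>, and group two's lookbehind (which needs </b><sup> immediately behind it) cannot hold at that same position. Alternation instead lets each pattern match at its own position. The names HTML, ONE, TWO and THREE are made up for this example, and the named groups use Python's (?P<name>...) spelling rather than the (?<name>...) form in the question.

```python
import re

# The relevant line of the markup from the question.
HTML = '<b class="place">2</b><sup> 1</sup><sup class="remaining"> 1/2</sup>'

# The question's three patterns, with Python-style named groups.
ONE = r'(?P<one>(?<=<b class="place">).*(?=</b>))'
TWO = r'(?P<two>(?<=</b><sup>).*?(?=</sup><sup class=))'
THREE = r'(?P<three>(?<=="remaining">).*(?=</sup>))'

# Concatenation: all three groups would have to match back to back,
# which the zero-width lookarounds make impossible on this input.
concatenated = re.search(ONE + TWO + THREE, HTML)

# Alternation: each alternative is tried independently at every position.
alternated = [
    (m.lastgroup, m.group().strip())
    for m in re.finditer("|".join([ONE, TWO, THREE]), HTML)
]

print(concatenated)  # None
print(alternated)    # [('one', '2'), ('two', '1'), ('three', '1/2')]
```

The .strip() call is only there because the captured text for groups two and three includes the leading space from the markup.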

However, since you are matching parts of HTML, I'd use a regex that can cope with multiple attributes in the tags and with input that spans multiple lines, like this:

<b\b[^<]*class="place"[^<]*>(?<one>[^<]*)|<\/b><sup[^<]*>(?<two>[^<]*)|="remaining"[^<]*>(?<three>[^<]*(?=<\/sup>))

See another demo
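
A minimal sketch of how that second pattern might be used from Python, assuming the (?P<name>...) spelling for named groups. The helper extract, the constant names, and the VARIANT string with extra attributes are hypothetical and only serve to exercise the "multiple attributes" case; they do not come from the original post.

```python
import re

# The attribute-tolerant pattern from the answer, split over three lines.
PATTERN = re.compile(
    r'<b\b[^<]*class="place"[^<]*>(?P<one>[^<]*)'
    r'|</b><sup[^<]*>(?P<two>[^<]*)'
    r'|="remaining"[^<]*>(?P<three>[^<]*(?=</sup>))'
)

def extract(html):
    """Map each named group to its captured text (whitespace trimmed)."""
    return {
        m.lastgroup: m.group(m.lastgroup).strip()
        for m in PATTERN.finditer(html)
    }

# Markup from the question, kept on several lines.
SAMPLE = """<div>
    <b class="place">2</b><sup> 1</sup><sup class="remaining"> 1/2</sup>
</div>"""

# Hypothetical variant with extra attributes on every tag.
VARIANT = (
    '<b id="p1" class="place" title="t">2</b>'
    '<sup lang="en"> 1</sup>'
    '<sup class="remaining" id="r"> 1/2</sup>'
)

print(extract(SAMPLE))   # {'one': '2', 'two': '1', 'three': '1/2'}
print(extract(VARIANT))  # {'one': '2', 'two': '1', 'three': '1/2'}
```

Because only one alternative matches at a time, m.lastgroup names the single group that took part in each match.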

Answer 2:

Maybe you should try something like this:

/<b class="place">(.*)<\/b><sup>\s*(.*)<\/sup><sup class="remaining">\s*(.*)<\/sup>/

Demo online
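
For comparison, here is a small Python sketch of this single-pattern approach. It consumes the tags instead of looking around them, so the three groups can sit in one expression. The variable names are illustrative only, and the pattern assumes the three tags appear on one line in exactly this order.

```python
import re

# The relevant line of the markup from the question.
HTML = '<b class="place">2</b><sup> 1</sup><sup class="remaining"> 1/2</sup>'

# The answer's pattern: the tags are matched literally, the values captured.
PATTERN = re.compile(
    r'<b class="place">(.*)</b><sup>\s*(.*)</sup>'
    r'<sup class="remaining">\s*(.*)</sup>'
)

match = PATTERN.search(HTML)
values = match.groups() if match else ()

print(values)  # ('2', '1', '1/2')
```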

Answer 3:

I think you can use a simpler regex, e.g.:

/>\s*?([\d\/]+)\s*?<\//

Output:

MATCH 1
`2`
MATCH 2
`1`
MATCH 3
`1/2`

Demo:

https://regex101.com/r/dC7zR5/1


Explanation:

/>\s*?([\d\/]+)\s*?<\//gm

    > matches the characters > literally
    \s*? match any white space character [\r\n\t\f ]
        Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
    1st Capturing group ([\d\/]+)
        [\d\/]+ match a single character present in the list below
            Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
            \d match a digit [0-9]
            \/ matches the character / literally
    \s*? match any white space character [\r\n\t\f ]
        Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
    < matches the characters < literally
    \/ matches the character / literally
    g modifier: global. All matches (don't return on first match)
    m modifier: multi-line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
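
The breakdown above can be tried out with a short Python sketch. re.findall plays the role of the g modifier, and the m modifier is not needed because the pattern contains no ^ or $. The names HTML and values are made up for this example. Note that 46.96 is not returned, because the character class only allows digits and slashes, so the match cannot get past the dot.

```python
import re

# Full markup from the question, including the second <div>.
HTML = """<td class="no-wrap past-rating" style="background-color: rgb(228, 254, 199);">
    <div>
        <b class="place">2</b><sup> 1</sup><sup class="remaining"> 1/2</sup>
    </div>
    <div>
        46.96
    </div>
</td>"""

# '>' then optional whitespace, digits and slashes, optional whitespace, '</'.
values = re.findall(r'>\s*?([\d/]+)\s*?</', HTML)

print(values)  # ['2', '1', '1/2']
```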
