Q: How can I use regex in Python to return these numbers?

I have the following code stored as a string variable in Python. How can I use regex, with re.findall('', text), to extract the five 9-digit numbers (all starting with "305") listed under "attributeLookup" in the code below?

var PRO_META_JSON = {
    "attributeDefinition":{
        "defaultSku":305557121,
        "attributeListing":[{ 
            "label":"Finish",
                    "defaultIndex":0,
                    "options":[
                        "White::f33b4086",
                        "Beige::8e0900fa",
                        "Blue::3c3a4707",
                        "Orange::1d8cb503",
                        "Spring Green::dd5e599a"
                     ]
            }],
            "attributeLookup":[
            [0,305557121],
            [1,305557187],
            [2,305557696],
            [3,305557344],
            [4,305696435]
            ]
        }
    };

Answer 1:

Here is one way to do it. First, extract the JSON object from the string (everything inside the outermost braces). Then decode it with the json module and access the values you need.

astr = '''var PRO_META_JSON = {
    "attributeDefinition":{
        "defaultSku":305557121,
        "attributeListing":[{ 
            "label":"Finish",
                    "defaultIndex":0,
                    "options":[
                        "White::f33b4086",
                        "Beige::8e0900fa",
                        "Blue::3c3a4707",
                        "Orange::1d8cb503",
                        "Spring Green::dd5e599a"
                     ]
            }],
            "attributeLookup":[
            [0,305557121],
            [1,305557187],
            [2,305557696],
            [3,305557344],
            [4,305696435]
            ]
        }
    };'''

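import re
import json

# Capture everything from the first "{" up to the last "}" that is followed by ";".
# A raw string avoids invalid-escape warnings for "\{" in Python 3.
pat = re.compile(r'^[^{]*(\{.*\});.*$', re.MULTILINE|re.DOTALL)
json_str = pat.match(astr).group(1)
d = json.loads(json_str)

# Each entry is an [index, number] pair; print the number.
for x in d['attributeDefinition']['attributeLookup']:
    print(x[1])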
# 305557121
# 305557187
# 305557696
# 305557344
# 305696435
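
If the regex in this approach feels heavy, the same extraction step can be sketched with plain string indexing: take everything from the first "{" to the last "}". This is an alternative to the pattern above, not the answer's own method. The helper name extract_json_object is hypothetical, the sample string is shortened for illustration, and the sketch assumes the string holds exactly one top-level object.

```python
import json


def extract_json_object(js_source):
    """Decode the text between the first '{' and the last '}' as JSON."""
    start = js_source.find('{')
    end = js_source.rfind('}')
    if start == -1 or end < start:
        raise ValueError('no JSON object found')
    return json.loads(js_source[start:end + 1])


# Shortened stand-in for the string in the question (assumption).
sample = ('var PRO_META_JSON = {"attributeDefinition": '
          '{"attributeLookup": [[0, 305557121], [1, 305557187]]}};')

d = extract_json_object(sample)
skus = [pair[1] for pair in d['attributeDefinition']['attributeLookup']]
print(skus)
```

Unlike the regex, this does not require a trailing semicolon after the closing brace.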

Answer 2:

You can just use the built-in json library to parse it. I've assumed you have already stripped off the JavaScript (the "var PRO_META_JSON =" prefix and the trailing semicolon):

import json

# Named raw_json rather than input, so the built-in input() is not shadowed.
raw_json = """{
"attributeDefinition":{
    "defaultSku":305557121,
    "attributeListing":[{ 
        "label":"Finish",
                "defaultIndex":0,
                "options":[
                    "White::f33b4086",
                    "Beige::8e0900fa",
                    "Blue::3c3a4707",
                    "Orange::1d8cb503",
                    "Spring Green::dd5e599a"
                 ]
        }],
        "attributeLookup":[
        [0,305557121],
        [1,305557187],
        [2,305557696],
        [3,305557344],
        [4,305696435]
        ]
    }
}"""

data = json.loads(raw_json)

# Get a list you can do stuff with. This gives you:
# [[0, 305557121], [1, 305557187], [2, 305557696], [3, 305557344], [4, 305696435]]
els = data['attributeDefinition']['attributeLookup']

for el in els:
    # Each el looks like: [0, 305557121]
    print(el[1])
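
If you want to keep the numbers rather than print them, the loop above can be sketched as a list comprehension, or as a dict keyed by the first element of each pair. The JSON string here is shortened for illustration, and the names skus and sku_by_index are hypothetical.

```python
import json

# Shortened stand-in for the full JSON in the answer (assumption).
raw = ('{"attributeDefinition": {"attributeLookup": '
       '[[0, 305557121], [1, 305557187], [2, 305557696]]}}')
els = json.loads(raw)['attributeDefinition']['attributeLookup']

# Unpack each [index, number] pair and keep only the number.
skus = [sku for _index, sku in els]

# Or keep both, mapping each index to its number.
sku_by_index = {index: sku for index, sku in els}

print(skus)
print(sku_by_index)
```

Note that json.loads returns the numbers as Python ints, not strings.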

Answer 3:
string = '''var PRO_META_JSON = {
    "attributeDefinition":{
        "defaultSku":305557121,
        "attributeListing":[{ 
            "label":"Finish",
                    "defaultIndex":0,
                    "options":[
                        "White::f33b4086",
                        "Beige::8e0900fa",
                        "Blue::3c3a4707",
                        "Orange::1d8cb503",
                        "Spring Green::dd5e599a"
                     ]
            }],
            "attributeLookup":[
            [0,305557121],
            [1,305557187],
            [2,305557696],
            [3,305557344],
            [4,305696435]
            ]
        }
    };'''

import json
data = json.loads(string.split('=', 1)[1].strip(';'))
for d in data['attributeDefinition']['attributeLookup']:
    print(d[1])

Don't know why you want to use regex. Do you also take your car to visit your neighbour?


Answer 4:

In the findall call, match nine consecutive digits (0 to 9), as in the pattern below. That said, parsing the data with the json module would still be better than running a regex over the string.

A really useful tester for Python regexes can be found here:

http://pythex.org/

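# PRO_META_JSON is assumed to be the Python string holding the snippet.
# Splitting on 'attributeLookup' skips the "defaultSku" number that appears earlier.
re.findall(r'[0-9]{9}', PRO_META_JSON.split('attributeLookup')[1])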
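
For completeness, here is a self-contained sketch of the regex-only approach the question asked for. The variable name text follows the question, the sample string is shortened (the "attributeListing" part is left out), and the stricter pattern \b305\d{6}\b is an assumption based on the question's statement that every number starts with "305".

```python
import re

# Shortened stand-in for the string in the question (assumption).
text = '''var PRO_META_JSON = {
    "attributeDefinition":{
        "defaultSku":305557121,
        "attributeLookup":[
            [0,305557121],
            [1,305557187],
            [2,305557696],
            [3,305557344],
            [4,305696435]
        ]
    }
};'''

# Only search what follows the "attributeLookup" key, so "defaultSku" is skipped.
after_key = text.split('"attributeLookup"', 1)[1]

# \b on both sides stops the pattern from matching part of a longer digit run.
numbers = re.findall(r'\b305\d{6}\b', after_key)
print(numbers)
```

re.findall returns strings here, so convert with int() if you need numbers.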

python  regex  findall