Q:Regex woes with Subprocess output

I'm trying to match IP and MAC addresses in subprocess output, but I seem to have two issues (that I can see). My regex isn't very good, as it misses some items, and for some reason OS X does not produce correctly formatted MAC addresses in the output of the arp -a command.

I hate regex :( I started out using socket.inet_aton() to validate the IP addresses, but iterating over each line, matching the MAC with a regex, and validating the IP with socket.inet_aton(addr) was not particularly usable, so I decided to use regex for both.
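For reference, the socket.inet_aton() check mentioned above could look roughly like the following. This is only a sketch: the helper name is_ipv4 is hypothetical, and the dot-count test is an added assumption, needed because inet_aton() also accepts shorthand forms with fewer than three dots.

```python
import socket


def is_ipv4(text):
    """Return True if text looks like a dotted-quad IPv4 address."""
    # inet_aton() tolerates shorthand such as "192.168", so insist on
    # exactly three dots before asking it to validate the value.
    if text.count(".") != 3:
        return False
    try:
        socket.inet_aton(text)
    except (socket.error, TypeError, ValueError):
        return False
    return True


# Tokens from the arp output still carry parentheses, so strip them first.
print(is_ipv4("(192.168.1.74)".strip("()")))
print(is_ipv4("fc:75:16:3:d0:2a"))
```

This only covers the IP side; the MAC token would still need its own check.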

I understand why the incorrectly formatted MAC output is not being matched, and I will try to address that issue elsewhere, but I cannot work out why the correctly formatted output is not being matched. Have I mentioned I hate regex? :)

Update

I did not initially notice the single digit in the following line:

? (192.168.1.74) at fc:75:16:3:d0:2a on en0 ifscope [ethernet]

I had marked it as "not missing anything but does not match", so it seems my problem is really OS X not printing the MACs correctly. It appears to drop the leading digit of a segment when that digit is 0. So I will need to prepend a 0 to any single-digit segment to resolve my problem (until I work out why it does this in the first place). Testing on other systems does not produce single-digit segments in the MAC address.
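The zero-padding described in this update can be sketched without any regex. The function name normalize_mac is hypothetical; the sample addresses are taken from the output shown in this question.

```python
def normalize_mac(mac):
    """Zero-pad each colon-separated segment to two hex digits."""
    return ":".join(part.zfill(2) for part in mac.split(":"))


print(normalize_mac("fc:75:16:3:d0:2a"))  # fc:75:16:03:d0:2a
print(normalize_mac("0:c:29:30:a1:c9"))   # 00:0c:29:30:a1:c9
```

Addresses that are already fully padded pass through unchanged, so the function can be applied to every MAC before matching.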

Output of script

? (192.168.1.74) at fc:75:16:3:d0:2a on en0 ifscope [ethernet] # Not missing anything but does not match
192.168.1.74
? (192.168.1.115) at 28:32:c5:f1:eb:9e on en0 ifscope [ethernet]
192.168.1.115
28:32:c5:f1:eb:9e
? (192.168.1.126) at 0:c:29:30:a1:c9 on en0 ifscope [ethernet] # Notice the missing 0?
192.168.1.126
gateway.home (192.168.1.254) at f4:55:9c:62:8a:cc on en0 ifscope [ethernet]
192.168.1.254
f4:55:9c:62:8a:cc
? (192.168.1.255) at ff:ff:ff:ff:ff:ff on en0 ifscope [ethernet]
192.168.1.255
ff:ff:ff:ff:ff:ff
? (192.168.7.1) at 0:50:56:c0:0:8 on vmnet8 ifscope permanent [ethernet] # Notice the missing 0?
192.168.7.1
? (192.168.194.1) at 0:50:56:c0:0:1 on vmnet1 ifscope permanent [ethernet] # Notice the missing 0?
192.168.194.1

Script

import re
import subprocess

cmd = "arp -a"
process = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
output, errors = process.communicate()

for line in output.split("\n"):
    print line
    for data in line.split(' '):
        # Strip the parentheses that surround the IP address
        data = data.translate(None, '()')
        # Requires exactly two hex digits per segment, so "3" or "0" never matches
        mac = re.match(r"^([0-9A-Fa-f]{2}[:]){5}([0-9A-Fa-f]{2})$", data)
        if mac:
            print mac.group()
        ip = re.match(r"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$", data)
        if ip:
            print ip.group()

Answer 1:

The MAC address regex (as pointed out by @nhahtdh) does not allow for single-digit (or single-letter) segments. I've also used a single regex for both the IP and the MAC (to avoid the inner loop and reduce the code).

#!/usr/bin/python

import subprocess
import re

cmd = "arp -a"
process = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
output, errors = process.communicate()

for line in output.split("\n"):
    if line and not line.isspace():
        print "line ->", line
        # One regex captures both the IP and a MAC with 1- or 2-digit segments
        match = re.match(r"(?i).*?(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}).*?((?:[0-9A-F]{1,2}[:]){5}(?:[0-9A-F]{1,2}))", line)
        if match is None:
            # No IP/MAC pair on this line; skip it instead of raising AttributeError
            continue
        print "ip  ->", match.group(1)
        # Prepend 0 to any single-character MAC segment
        print "mac ->", re.sub(r'(^[^:](?=:)|(?<=:)[^:](?=:)|(?<=:)[^:]$)', r'0\1', match.group(2))
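To try the same idea without running arp, here is a small sketch that parses one sample line from the question. The names ARP_LINE and parse_arp_line are hypothetical, and the padding uses str.zfill as an alternative to the re.sub call above. The "(incomplete)" line is an assumed example of an entry with no MAC, not taken from the question's output.

```python
import re

# Capture an IPv4 address, then a MAC whose segments may be one or two
# hex digits (as in the OS X output shown in the question).
ARP_LINE = re.compile(
    r"(\d{1,3}(?:\.\d{1,3}){3}).*?((?:[0-9a-f]{1,2}:){5}[0-9a-f]{1,2})",
    re.IGNORECASE,
)


def parse_arp_line(line):
    """Return (ip, zero-padded mac), or None if the line has no such pair."""
    match = ARP_LINE.search(line)
    if match is None:
        return None
    ip, mac = match.groups()
    # Pad every segment to two characters: "0:c" becomes "00:0c".
    return ip, ":".join(seg.zfill(2) for seg in mac.split(":"))


sample = "? (192.168.1.126) at 0:c:29:30:a1:c9 on en0 ifscope [ethernet]"
print(parse_arp_line(sample))

# Assumed example of a line without a MAC address.
print(parse_arp_line("? (192.168.1.5) at (incomplete) on en0 ifscope [ethernet]"))
```

Returning None for lines without a pair lets the caller skip them explicitly rather than hitting an exception on a failed match.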

python  regex