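Tags: python parsing python-2-7 pyparsing
I'm trying to use pyparsing for the first time.
My parser isn't doing what I expected. Could someone take a look and tell me what's wrong? I'm trying to nest one OneOrMore inside another OneOrMore, which I thought should work, but it doesn't.
Here is the full code: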
import pyparsing
status = """
sale number : 11/7
NAME ID PAWN PRICE TIME %C STATE START/STOP
cross-cu-1 1055 1 106284K 07:49:36.19 25.05% run 1d01h
cross-cu-2 918 1 104708K 07:38:19.08 24.02% run 1d01h
sale number : 11/8
NAME ID PAWN PRICE TIME %C STATE START/STOP
cross-cu-3 1055 1 106284K 07:49:36.19 25.05% run 1d01h
cross-cu-4 918 1 104708K 07:38:19.08 24.02% run 1d01h
"""
integer = pyparsing.Word(pyparsing.nums).setParseAction(lambda toks: int(toks[0]))
decimal = pyparsing.Word(pyparsing.nums + ".").setParseAction(lambda toks: float(toks[0]))
wordSuppress = pyparsing.Suppress(pyparsing.Word(pyparsing.alphas))
endOfLine = pyparsing.LineEnd().suppress()
colon = pyparsing.Suppress(":")
saleNumber = pyparsing.Regex("\d{2}\/\d{1}").setResultsName("saleNumber")
lineSuppress = pyparsing.Regex("NAME.*STOP") + endOfLine
saleRow = wordSuppress + wordSuppress + colon + saleNumber + endOfLine
name = pyparsing.Regex("cross-cu-\d").setResultsName("name")
id = integer.setResultsName("id")
pawn = integer.setResultsName("pawn")
price = integer.setResultsName("price") + "K"
time = pyparsing.Regex("\d{2}:\d{2}:\d{2}.\d{2}").setResultsName("time")
c = decimal.setResultsName("c") + "%"
state = pyparsing.Word(pyparsing.alphas).setResultsName("state")
startStop = pyparsing.Word(pyparsing.alphanums).setResultsName("startStop")
row = name + id + pawn + price + time + c + state + startStop + endOfLine
table = pyparsing.OneOrMore(pyparsing.Group(saleRow + lineSuppress.suppress() + (pyparsing.OneOrMore(pyparsing.Group(row) | pyparsing.SkipTo(row).suppress())) ) | pyparsing.SkipTo(saleRow).suppress())
resultDic = [x.asDict() for x in table.parseString(status)]
print resultDic
It only returns [{'saleNumber': '11/7'}]
I was hoping to get a list of dictionaries like this:
[{ {'saleNumber': '11/7'},{ elements in cross-cu-1 line, elements in cross-cu-2 line } },
{ {'saleNumber': '11/8'},{ elements in cross-cu-3 line, elements in cross-cu-4 line } }]
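Any help is appreciated!
Please don't suggest other ways of producing this output! I also want to learn pyparsing!
Solution:
In this case pyparsing may be overkill. Why not just read the text line by line and parse each line directly?
The code looks like this:
Edit: I've updated the code to follow your example more closely.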
from collections import defaultdict
status = """
sale number : 11/7
NAME ID PAWN PRICE TIME %C STATE START/STOP
cross-cu-1 1055 1 106284K 07:49:36.19 25.05% run 1d01h
cross-cu-2 918 1 104708K 07:38:19.08 24.02% run 1d01h
sale number : 11/8
NAME ID PAWN PRICE TIME %C STATE START/STOP
cross-cu-3 1055 1 106284K 07:49:36.19 25.05% run 1d01h
cross-cu-4 918 1 104708K 07:38:19.08 24.02% run 1d01h
"""
sale_number = ''
sales = defaultdict(list)
for line in status.split('\n'):
    line = line.strip()
    if line.startswith("NAME"):
        # skip the column header line
        continue
    elif line.startswith("sale number"):
        # remember which sale the following rows belong to
        sale_number = line.split(':')[1].strip()
    elif not line or line.isspace():
        # skip blank lines
        continue
    else:
        # you can also use a regular expression here
        sales[sale_number].append(line.split())
for sale in sales:
    print sale, sales[sale]
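The comment in the loop mentions that a regular expression could be used in place of line.split(). A minimal sketch of that variant follows, assuming every data row has exactly the eight columns shown in the sample. The names SALE_RE, ROW_RE and parse_status are made up for this illustration and do not come from the original answer, and the nested "rows" key is just one possible way to approximate the list of dictionaries the question asks for.

```python
import re

status = """
sale number : 11/7
NAME ID PAWN PRICE TIME %C STATE START/STOP
cross-cu-1 1055 1 106284K 07:49:36.19 25.05% run 1d01h
cross-cu-2 918 1 104708K 07:38:19.08 24.02% run 1d01h
sale number : 11/8
NAME ID PAWN PRICE TIME %C STATE START/STOP
cross-cu-3 1055 1 106284K 07:49:36.19 25.05% run 1d01h
cross-cu-4 918 1 104708K 07:38:19.08 24.02% run 1d01h
"""

# Hypothetical patterns, written against the sample data only.
SALE_RE = re.compile(r"sale number\s*:\s*(\d+/\d+)")
ROW_RE = re.compile(
    r"(?P<name>cross-cu-\d+)\s+"
    r"(?P<id>\d+)\s+"
    r"(?P<pawn>\d+)\s+"
    r"(?P<price>\d+)K\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{2})\s+"
    r"(?P<c>\d+\.\d+)%\s+"
    r"(?P<state>[A-Za-z]+)\s+"
    r"(?P<startStop>\w+)$"
)


def parse_status(text):
    """Return one dict per sale, each holding a list of row dicts."""
    sales = []
    for line in text.splitlines():
        line = line.strip()
        sale_match = SALE_RE.match(line)
        if sale_match:
            # A new "sale number" line starts a new group of rows.
            sales.append({"saleNumber": sale_match.group(1), "rows": []})
            continue
        row_match = ROW_RE.match(line)
        if row_match and sales:
            row = row_match.groupdict()
            # Convert the numeric columns, as the question's parse actions do.
            for key in ("id", "pawn", "price"):
                row[key] = int(row[key])
            row["c"] = float(row["c"])
            sales[-1]["rows"].append(row)
        # Header lines and blank lines match neither pattern and are skipped.
    return sales


parsed = parse_status(status)
for sale in parsed:
    print("%s: %d rows" % (sale["saleNumber"], len(sale["rows"])))
```

Using named groups keeps the same field names as the setResultsName calls in the question, so each row dict can be read by key rather than by position.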
Source: https://codeday.me/bug/20190709/1413730.html