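I have a text file that I want to parse, putting the questions and the options into a question list and an options list.
Sample text: [sample text updated to include all the variations in question types and options]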
- 26 yrs Man Hbsag +ve ,hbeag +ve on routine screening ..what is next ; IM
A. observe
B. HBV DNA study\
C. Interferon
D. take liver biopsy
- Trauma è skin erythema and Partiel skin loss ,ttt: surgery
A. H2o irrigation
B. Bicarb. Irrigation
C. Surgical debridment\
- Old female, obese on diet control ,polydipsia , invest. Hba1c 7.5 ,all (random,
Fasting, post prandial ) sugar are mild elevated urine ketone (+) ttt: IM
A. Insulin “ ketonuria “\
B. pioglitazone
C. Thiazolidinediones
D. fourth i forgot (not Metformin nor sulfonylurea)
- Day to day variation of this not suitable for patients under warfarin therapy: IM
A. retinols
B. Fresh fruits and vegitables
C. Meet and paultry\
D. Old cheese
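I'm new to Python and especially new to regular expressions. I'm trying to find a regex that finds each sentence starting with "-", cuts it off before the line that starts with 'A.', and puts the question into a list. Note: some questions are two lines long.
I also need a regex that extracts each group of options into a list. So the final result would be: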
question list = ['- 26 yrs Man Hbsag +ve ,hbeag +ve on routine screening ..what is next ; IM','- Old female, obese on diet control ,polydipsia , invest. Hba1c 7.5 ,all (random,Fasting, post prandial ) sugar are mild elevated urine ketone (+) ttt:IM ','etc','and so on']
options list = [['A. observe','B. HBV DNA study\','C. Interferon','D. take liver biopsy'],['A. H2o irrigation','B. Bicarb. Irrigation','C. Surgical debridment\'],['A. Something Else','B. Something Else',......,'D. '],[etc]]
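I guess this will be a bit complicated, but any help with the regex part, or even just a starting point, would be great. I have a text file containing 1000 questions and options like these, and ideally I'd like to extract all of them.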
import re

with open("julysmalltext.txt") as file:
    content = file.read()

# Both patterns are still empty placeholders.
question_list = re.findall(r'', content)
options_list = re.findall(r'', content)
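One possible way to fill in the two patterns is sketched below. It is only a sketch under a few assumptions that are not confirmed for the full file: every question starts with "- " at the beginning of a line, every option starts with a capital letter followed by ". ", options never wrap onto a second line, and every question is followed by at least one option. The names `QUESTION_RE`, `OPTIONS_RE`, `parse_questions` and `sample` are made up for illustration.

```python
import re

# A question starts with "- " and may wrap onto following lines. It ends
# before the next line that starts an option ("A. ") or a new question ("- ").
QUESTION_RE = re.compile(r"^- .*(?:\n(?!(?:[A-Z]\.|-) ).*)*", re.MULTILINE)

# A run of consecutive lines that each start with a capital letter, a dot
# and a space (assumption: options are always single lines).
OPTIONS_RE = re.compile(r"(?:^[A-Z]\. .*\n?)+", re.MULTILINE)


def parse_questions(content):
    """Return (question_list, options_list) extracted from the raw text."""
    question_list = [
        # Join wrapped question lines with a single space.
        " ".join(part.strip() for part in q.splitlines() if part.strip())
        for q in QUESTION_RE.findall(content)
    ]
    options_list = [
        [line.strip() for line in block.splitlines()]
        for block in OPTIONS_RE.findall(content)
    ]
    return question_list, options_list


# Two entries copied from the sample text (one of them with a two-line question).
sample = (
    "- 26 yrs Man Hbsag +ve ,hbeag +ve on routine screening ..what is next ; IM\n"
    "A. observe\n"
    "B. HBV DNA study\\\n"
    "C. Interferon\n"
    "D. take liver biopsy\n"
    "- Old female, obese on diet control ,polydipsia , invest. Hba1c 7.5 ,all (random,\n"
    "Fasting, post prandial ) sugar are mild elevated urine ketone (+) ttt: IM\n"
    "A. Insulin “ ketonuria “\\\n"
    "B. pioglitazone\n"
    "C. Thiazolidinediones\n"
    "D. fourth i forgot (not Metformin nor sulfonylurea)\n"
)

question_list, options_list = parse_questions(sample)

for question, options in zip(question_list, options_list):
    print(question)
    print(options)
```

For the real file, `sample` would be replaced by the `content` read from `julysmalltext.txt`. Because the two lists come from separate `findall` calls, they only line up index by index if the last assumption holds, so comparing `len(question_list)` with `len(options_list)` is a cheap sanity check. The trailing backslashes on some options are kept as they appear in the text.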