My XML looks like this:
xml = """
<portfolio>
<assets>600000</assets>
<assetClassDetails>
<assetClassName>Bonds</assetClassName>
<assetAmount>100000</assetAmount>
</assetClassDetails>
<assetClassDetails>
<assetClassName>Equities</assetClassName>
<assetAmount>500000</assetAmount>
</assetClassDetails>
<rateOfReturn>6.3</rateOfReturn>
</portfolio>
"""
I parse every element into a table like this:
from lxml import etree  # getparent() is an lxml feature
import pandas as pd

root = etree.fromstring(xml)
tag = []
text = []
parent = []
double_parent = []
for element in root.iter():
    # The root has no parent, so getparent() returns None and .tag raises AttributeError
    try:
        element_parent = element.getparent().tag
    except AttributeError:
        element_parent = 'none'
    # Same idea for the grandparent: missing for the root and its direct children
    try:
        element_double_parent = element.getparent().getparent().tag
    except AttributeError:
        element_double_parent = 'none'
    tag.append(element.tag)
    text.append(element.text)
    parent.append(element_parent)
    double_parent.append(element_double_parent)
df = pd.DataFrame({'tag': tag, 'text': text, 'parent': parent, 'double_parent': double_parent})
The result looks like this:
tag                text      parent             double_parent
portfolio          \n        none               none
assets             600000    portfolio          none
assetClassDetails  \n        portfolio          none
assetClassName     Bonds     assetClassDetails  portfolio
assetAmount        100000    assetClassDetails  portfolio
assetClassDetails  \n        portfolio          none
assetClassName     Equities  assetClassDetails  portfolio
assetAmount        500000    assetClassDetails  portfolio
rateOfReturn       6.3       portfolio          none
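I'm struggling to work out how to transform the data so that each asset class name is paired with its amount and tied to the portfolio tag (and its direct children). How can I pair sibling tags in the result?
The result I want looks like this: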
type       assets  rateOfReturn  assetClassName  assetAmount
portfolio  600000  6.3           Bonds           100000
portfolio  600000  6.3           Equities        500000
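One possible way to get that shape, offered as a rough sketch rather than a definitive answer: instead of flattening every element into its own row, build one row per assetClassDetails group and copy the portfolio's leaf children (assets, rateOfReturn) into each row. The sketch below uses the standard-library xml.etree.ElementTree in place of lxml so that it runs without extra dependencies; the function name flatten_portfolio and its repeated_tag parameter are made up for illustration, and all values stay as strings.

```python
import xml.etree.ElementTree as ET

xml = """
<portfolio>
<assets>600000</assets>
<assetClassDetails>
<assetClassName>Bonds</assetClassName>
<assetAmount>100000</assetAmount>
</assetClassDetails>
<assetClassDetails>
<assetClassName>Equities</assetClassName>
<assetAmount>500000</assetAmount>
</assetClassDetails>
<rateOfReturn>6.3</rateOfReturn>
</portfolio>
"""


def flatten_portfolio(xml_text, repeated_tag="assetClassDetails"):
    """Return one dict per repeated group, merged with the root's leaf values.

    Hypothetical helper: the name and the repeated_tag parameter are
    assumptions made for this sketch, not part of any library.
    """
    root = ET.fromstring(xml_text)
    # Direct children without sub-elements (assets, rateOfReturn) are shared by every row.
    shared = {child.tag: child.text for child in root if len(child) == 0}
    rows = []
    for group in root.findall(repeated_tag):
        row = {"type": root.tag, **shared}
        # Sibling leaves inside one group end up in the same row, which pairs them.
        row.update({leaf.tag: leaf.text for leaf in group})
        rows.append(row)
    return rows


rows = flatten_portfolio(xml)
for row in rows:
    print(row)
```

If pandas is available, pd.DataFrame(rows) should turn these dicts into a table with the columns in the order shown above; numeric columns would still need converting, for example with pd.to_numeric.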