Creating a pandas DataFrame from XML data

I used the xml.etree.ElementTree module to walk the XML and extract the relevant data. The comments in the code below explain the process; read through it, then experiment with the code. Hopefully it fits your use case.

    import xml.etree.ElementTree as ET
    from collections import defaultdict

    import pandas as pd

    d = defaultdict(list)

    # Since you are reading from a file, root should be:
    #   root = ET.parse('filename.xml').getroot()
    # Mine is wrapped in a string, hence:
    root = ET.fromstring(data)

    # The required data is in the Frame section
    for ent in root.findall('./Match//Frame'):
        # This gets us the timestamp
        Frame = ent.attrib['utc']
        for entry in ent.findall('Objs/Obj'):
            # Append the objects to the relevant timestamp
            d[Frame].append(entry.attrib)

    df = (pd.concat((pd.DataFrame(value)  # create a dataframe of the values
                     .assign(Frame=key)   # assign the key (timestamp) to the dataframe
                     .filter(['id', 'Frame', 'x', 'y', 'z'])  # keep only the required columns
                     for key, value in d.items()),
                    axis=1)  # concatenate on the columns axis
          )

    df.head()

Output:

              id                    Frame     x      y    z         id                    Frame     x      y    z
    0          0  2016-09-13T18:45:35.272   -46  -2562    0          0  2016-09-13T18:45:35.319   -46  -2558    0
    1     105823  2016-09-13T18:45:35.272   939    113  NaN     105823  2016-09-13T18:45:35.319   938    113  NaN
    2  250086090  2016-09-13T18:45:35.272  1194   1425  NaN  250086090  2016-09-13T18:45:35.319  1198   1426  NaN
    3  250080473  2016-09-13T18:45:35.272    37   2875  NaN  250080473  2016-09-13T18:45:35.319    36   2874  NaN
    4  250054760  2016-09-13T18:45:35.272   329    833  NaN  250054760  2016-09-13T18:45:35.319   330    833  NaN
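The original XML file is not shown here, so the grouping step is hard to try as written. The sketch below repeats it on a small made-up document using only the standard library. The `SAMPLE` string, its `Root` and `Frames` wrapper elements, and the helper names `group_objects_by_frame` and `to_long_rows` are assumptions for illustration; only the `Match`, `Frame`, `Objs`, `Obj` names and the `utc` attribute come from the answer's XPath expressions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample data: the nesting is guessed from the XPath
# './Match//Frame' and 'Objs/Obj' used in the answer.
SAMPLE = """
<Root>
  <Match>
    <Frames>
      <Frame utc="2016-09-13T18:45:35.272">
        <Objs>
          <Obj id="0" x="-46" y="-2562" z="0"/>
          <Obj id="105823" x="939" y="113"/>
        </Objs>
      </Frame>
      <Frame utc="2016-09-13T18:45:35.319">
        <Objs>
          <Obj id="0" x="-46" y="-2558" z="0"/>
          <Obj id="105823" x="938" y="113"/>
        </Objs>
      </Frame>
    </Frames>
  </Match>
</Root>
"""


def group_objects_by_frame(xml_text):
    """Map each Frame's utc timestamp to a list of its Obj attribute dicts."""
    root = ET.fromstring(xml_text)
    grouped = defaultdict(list)
    for frame in root.findall('./Match//Frame'):
        utc = frame.attrib['utc']
        for obj in frame.findall('Objs/Obj'):
            # Copy the attributes so later edits do not touch the XML tree
            grouped[utc].append(dict(obj.attrib))
    return grouped


def to_long_rows(grouped):
    """Flatten the mapping into one dict per object, tagged with its Frame."""
    return [{'Frame': utc, **attrs}
            for utc, objs in grouped.items()
            for attrs in objs]


grouped = group_objects_by_frame(SAMPLE)
rows = to_long_rows(grouped)
```

If one row per object per timestamp suits your use case better than the side-by-side layout, `rows` can be passed straight to `pd.DataFrame(rows)`. Note that every attribute value is still a string at this point, and a missing attribute such as `z` shows up as NaN in the resulting frame.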
