Create a nested dictionary from similar values in a column, using each value as the key for all rows that contain it

4 Answers

跃然一笑

How about this?

    new_dict = df.set_index('image').stack().groupby('image').apply(list).to_dict()
    print(new_dict)

Output:

    {'bookstore_video0_40.jpg': [763, 899, 806, 940, 'pedestrian',
                                 1026, 754, 1075, 797, 'pedestrian',
                                 868, 770, 927, 822, 'biker',
                                 413, 1010, 433, 1040, 'pedestrian'],
     'bookstore_video0_80.jpg': [866, 278, 917, 328, 'pedestrian',
                                 761, 825, 820, 865, 'biker']}
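The one-liner above flattens every row of an image into a single list, so the column names are lost. If you would rather keep one record per row, a possible variant (a sketch only, rebuilding the sample data from the output shown above; `rows_by_image` is a name made up for this illustration) is:

```python
import pandas as pd

# Sample data rebuilt from the output printed in this answer
df = pd.DataFrame({
    "image": ["bookstore_video0_40.jpg"] * 4 + ["bookstore_video0_80.jpg"] * 2,
    "xmin": [763, 1026, 868, 413, 866, 761],
    "ymin": [899, 754, 770, 1010, 278, 825],
    "xmax": [806, 1075, 927, 433, 917, 820],
    "ymax": [940, 797, 822, 1040, 328, 865],
    "label": ["pedestrian", "pedestrian", "biker", "pedestrian", "pedestrian", "biker"],
})

# One key per image; each value is a list of per-row dicts,
# so every box keeps its column names.
rows_by_image = {
    img: group.drop(columns="image").to_dict("records")
    for img, group in df.groupby("image")
}

for box in rows_by_image["bookstore_video0_80.jpg"]:
    print(box)
```

Each inner dict can then be passed straight to whatever writes the annotation, without slicing the flat list into groups of five.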


开心每一天1111

Here is a working example based on yours, except that it does not read the actual XML files. Thank you very much. I suspect your answer will be useful to others, because this is a problem people in machine vision run into when doing things like cutting up 4K images that have already been annotated.

    from pathlib import Path
    from xml.etree import ElementTree as ET

    import pandas as pd

    df = pd.DataFrame(dict(
        image = '40.jpg 40.jpg 40.jpg 40.jpg 80.jpg 80.jpg'.split(),
        xmin = [763, 1026, 868, 413, 866, 761],
        ymin = [899, 754, 770, 1010, 278, 825],
        xmax = [806, 1075, 927, 433, 917, 820],
        ymax = [940, 797, 822, 1040, 328, 865],
        label = 'pedestrian pedestrian biker pedestrian pedestrian biker'.split(),
    ))

    # tree.write() does not create missing folders, so create the output folder first
    out_dir = Path('./mergedxml')
    out_dir.mkdir(parents=True, exist_ok=True)

    for img in df['image'].unique():
        # Rows for this image only, renumbered from 0 so .loc[box, ...] works below
        img_df = df[df['image'] == img].drop(columns='image').reset_index(drop=True)
        print(img, '\n', img_df)

        # Ideally your custom VOC writer can be inited here with something like:
        # v_writer = VocWriter(f'path/{img[:-4]}.xml')
        print('New custom VOC Writer instance inited here!')
        annotation = ET.Element('annotation')
        ET.SubElement(annotation, 'folder').text = str(img)
        ET.SubElement(annotation, 'filename').text = str(img)
        ET.SubElement(annotation, 'segmented').text = '0'
        size = ET.SubElement(annotation, 'size')
        # The image size is not part of the sample data, so 0 is written for width and height
        ET.SubElement(size, 'width').text = '0'
        ET.SubElement(size, 'height').text = '0'
        ET.SubElement(size, 'depth').text = '3'

        for box in range(len(img_df)):
            xmin = img_df.loc[box, 'xmin']
            ymin = img_df.loc[box, 'ymin']
            xmax = img_df.loc[box, 'xmax']
            ymax = img_df.loc[box, 'ymax']
            label = img_df.loc[box, 'label']
            print(xmin, ymin, xmax, ymax)

            # Inside of this loop, you can add each box to your VocWriter object
            ob = ET.SubElement(annotation, 'object')
            ET.SubElement(ob, 'name').text = str(label)
            ET.SubElement(ob, 'pose').text = 'Unspecified'
            ET.SubElement(ob, 'truncated').text = '0'
            ET.SubElement(ob, 'difficult').text = '0'
            bbox = ET.SubElement(ob, 'bndbox')
            ET.SubElement(bbox, 'xmin').text = str(xmin)
            ET.SubElement(bbox, 'ymin').text = str(ymin)
            ET.SubElement(bbox, 'xmax').text = str(xmax)
            ET.SubElement(bbox, 'ymax').text = str(ymax)

        # Once you exit that inner loop, you can save your data to your .xml file
        # (with a custom writer: v_writer.save(f'path/{img[:-4]}.xml'))
        tree = ET.ElementTree(annotation)
        # Drop the image extension so 40.jpg is saved as 40.xml
        tree.write(str(out_dir / f'{Path(img).stem}.xml'), encoding='utf8')
        print('.xml file saved here!')
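The example above leaves out reading the XML files. As a rough sketch of that missing direction, the snippet below parses one VOC-style annotation back into row dicts using only the standard library. The `SAMPLE_XML` string and the `voc_to_rows` helper are hypothetical names invented for this illustration, and the layout assumed is the one written by the code above (`filename`, then `object` elements with `name` and `bndbox`).

```python
from xml.etree import ElementTree as ET

# Hypothetical annotation in the same layout the answer writes out
SAMPLE_XML = """<annotation>
  <filename>80.jpg</filename>
  <object>
    <name>pedestrian</name>
    <bndbox><xmin>866</xmin><ymin>278</ymin><xmax>917</xmax><ymax>328</ymax></bndbox>
  </object>
  <object>
    <name>biker</name>
    <bndbox><xmin>761</xmin><ymin>825</ymin><xmax>820</xmax><ymax>865</ymax></bndbox>
  </object>
</annotation>"""


def voc_to_rows(xml_text):
    """Return one dict per <object>, ready for pd.DataFrame(rows)."""
    root = ET.fromstring(xml_text)
    image = root.findtext("filename")
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        row = {"image": image}
        # Coordinates are stored as text in the XML, so convert them to int
        for tag in ("xmin", "ymin", "xmax", "ymax"):
            row[tag] = int(box.findtext(tag))
        row["label"] = obj.findtext("name")
        rows.append(row)
    return rows


rows = voc_to_rows(SAMPLE_XML)
for row in rows:
    print(row)
```

Collecting the rows from several files into one list and passing it to `pd.DataFrame` would give a table with the same columns as the sample `df`.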


繁花如伊

Maybe you need to use dict and tuple/list on the groupby:

    images_dict = dict(tuple(df.groupby('image')))
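With this approach each dictionary value is a DataFrame containing only the rows for that image. A short sketch of how the result could be iterated, assuming a sample `df` with the columns `image`, `xmin`, `ymin`, `xmax`, `ymax` and `label`:

```python
import pandas as pd

# Assumed sample data with the columns used in this thread
df = pd.DataFrame({
    "image": "40.jpg 40.jpg 40.jpg 40.jpg 80.jpg 80.jpg".split(),
    "xmin": [763, 1026, 868, 413, 866, 761],
    "ymin": [899, 754, 770, 1010, 278, 825],
    "xmax": [806, 1075, 927, 433, 917, 820],
    "ymax": [940, 797, 822, 1040, 328, 865],
    "label": "pedestrian pedestrian biker pedestrian pedestrian biker".split(),
})

# Iterating a groupby yields (key, sub-DataFrame) pairs, which dict() accepts
images_dict = dict(tuple(df.groupby("image")))

for image_name, image_df in images_dict.items():
    # itertuples gives attribute access to each row's columns
    for row in image_df.itertuples(index=False):
        print(image_name, row.label, row.xmin, row.ymin, row.xmax, row.ymax)
```

Note that each sub-DataFrame keeps the `image` column and the original row index; drop or reset them if they get in the way.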


侃侃尔雅

I wanted to post this as a comment rather than an answer, but it was too long with the link:

I wrote a VOC writer. I just need to be able to pass the data in such a way that I can iterate over it. I have a different dataset where I do something similar, but there the data is already in an easy-to-use form. For my project I have spent a lot of time editing, cleaning, converting, etc. the data. Not fun for me 😁 – Robi Sen

How does your VOC writer work? Is it similar to the one I linked to (i.e. does it use OOP, with a class method for adding bbox data to an XML writer instance and another method for saving that instance to an XML file)?

My comment was poorly written, so here is a better example of what I mean:

    import pandas as pd

    df = pd.DataFrame(dict(
        image = '40.jpg 40.jpg 40.jpg 40.jpg 80.jpg 80.jpg'.split(),
        xmin = [763, 1026, 868, 413, 866, 761],
        ymin = [899, 754, 770, 1010, 278, 825],
        xmax = [806, 1075, 927, 433, 917, 820],
        ymax = [940, 797, 822, 1040, 328, 865],
        label = 'pedestrian pedestrian biker pedestrian pedestrian biker'.split(),
    ))

    for img in df['image'].unique():
        img_df = df[df['image'] == img].drop(columns='image').reset_index()
        boxes = range(img_df.shape[0])
        print(img, '\n', img_df)

        # Ideally your custom voc writer can be inited here
        # with something like:
        # v_writer = VocWriter(f'path/{img[:-4]}.xml')
        print('New custom VOC XML Writer instance inited here!')

        for box in boxes:
            xmin = img_df.loc[box, 'xmin']
            ymin = img_df.loc[box, 'ymin']
            xmax = img_df.loc[box, 'xmax']
            ymax = img_df.loc[box, 'ymax']
            label = img_df.loc[box, 'label']
            print(xmin, ymin, xmax, ymax)

            # Inside of this loop,
            # you can add each box to your VocWriter object
            # something like:
            # v_writer.addObject(label, xmin, ymin, xmax, ymax)
            print('New bbox object added to writer instance here!')

        # Once you exit that inner loop,
        # you can save your data to your .xml file
        # with something like:
        # v_writer.save(f'path/{img[:-4]}.xml')
        print(f'path/{img[:-4]}.xml file saved here!')

Step through the example in Python Tutor to get a better idea of what I mean.
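To make the commented-out calls concrete, here is a minimal, hypothetical `VocWriter` built on the standard library. It is neither the asker's writer nor the linked one; only the method names `addObject` and `save` are taken from the comments above, while the constructor arguments and the `tostring` helper are assumptions made for this sketch.

```python
from xml.etree import ElementTree as ET


class VocWriter:
    """Hypothetical minimal writer, for illustration only."""

    def __init__(self, filename, width=0, height=0, depth=3):
        self.root = ET.Element("annotation")
        ET.SubElement(self.root, "filename").text = filename
        size = ET.SubElement(self.root, "size")
        for tag, value in (("width", width), ("height", height), ("depth", depth)):
            ET.SubElement(size, tag).text = str(value)

    def addObject(self, label, xmin, ymin, xmax, ymax):
        # One <object> element per bounding box
        obj = ET.SubElement(self.root, "object")
        ET.SubElement(obj, "name").text = str(label)
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in (("xmin", xmin), ("ymin", ymin), ("xmax", xmax), ("ymax", ymax)):
            ET.SubElement(bndbox, tag).text = str(value)

    def tostring(self):
        # Serialize to a str without touching the file system
        return ET.tostring(self.root, encoding="unicode")

    def save(self, path):
        ET.ElementTree(self.root).write(path, encoding="utf-8")


writer = VocWriter("40.jpg")
writer.addObject("pedestrian", 763, 899, 806, 940)
writer.addObject("biker", 868, 770, 927, 822)
xml_text = writer.tostring()
print(xml_text)
```

In the loop above, the writer would be created once per image, `addObject` called once per box, and `save` called after the inner loop finishes.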
