How do I find the lists with the higher value in a nested list and return those lists?

I have this nested list, which contains duplicate entries:


[['Coloring book moana', 'ART_AND_DESIGN', '3.9', 967, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'],

 ['Coloring book moana', 'FAMILY', '3.9', 974, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'],

 ['Gmail', 'COMMUNICATION', '4.3', 4604324, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],

 ['Gmail', 'COMMUNICATION', '4.3', 4604483, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],

 ['Instagram', 'SOCIAL', '4.5', 66577313, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'],

 ['Instagram', 'SOCIAL', '4.5', 66577446, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'],

 ['Instagram', 'SOCIAL', '4.5', 66509917, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']]

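I want to filter the nested list by i[3], keeping only the entry with the highest value for each name, so that the final output looks like this: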


[['Gmail', 'COMMUNICATION', '4.3', 4604483, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],

 ['Coloring book moana', 'FAMILY', '3.9', 974, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'],

 ['Instagram', 'SOCIAL', '4.5', 66577446, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']]

I tried a for loop, but I can't figure out how to get the highest value among the duplicate lists.


jeck猫
3 answers

ITMISS

This is the most Pythonic way I could think of.

My approach is to first sort the list of lists by sublist[3], in descending order. That way, when we iterate over the list, we meet the sublist with the highest review count before any of its duplicates. This trick is what builds the final list.

meta_list = [
    ['Coloring book moana', 'ART_AND_DESIGN', '3.9', 967, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'],
    ['Coloring book moana', 'FAMILY', '3.9', 974, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'],
    ['Gmail', 'COMMUNICATION', '4.3', 4604324, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],
    ['Gmail', 'COMMUNICATION', '4.3', 4604483, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],
    ['Instagram', 'SOCIAL', '4.5', 66577313, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'],
    ['Instagram', 'SOCIAL', '4.5', 66577446, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'],
    ['Instagram', 'SOCIAL', '4.5', 66509917, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'],
]

# Sort the list by review count and name - make sure the highest review count is first
meta_list.sort(key=lambda x: (int(x[3]), x[0]), reverse=True)

# This is the list we'll use to store the final data in
final_list = []

# Go through all the items in the meta_list
for meta in meta_list:
    # If another meta with the same name (0th index)
    # doesn't already exist in final_list, add it
    if meta[0] not in [item[0] for item in final_list]:
        final_list.append(meta)

Output:

[['Instagram', 'SOCIAL', '4.5', 66577446, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'],
 ['Gmail', 'COMMUNICATION', '4.3', 4604483, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],
 ['Coloring book moana', 'FAMILY', '3.9', 974, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']]

Basically, it adds every meta whose name is not yet present to final_list. Why does this work? Because the first meta you meet for each name while looping is the one with the highest review count. Once that one has been added, its duplicates can't be added, and we're done.

Note: this does not preserve the original order of the entries. It only ensures that, when several entries share the same name, only the one with the highest review count is kept.
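The membership test in the loop above rebuilds a list of names on every iteration. A minimal sketch of the same sort-then-skip idea, assuming the names are hashable so they can be tracked in a set, might look like this. The function name keep_highest and the shortened sample rows are made up for illustration and are not part of the original answer.

```python
def keep_highest(rows, name_index=0, value_index=3):
    """Return one row per name, keeping the row with the largest value."""
    # Sort a copy (the input is left untouched) so that the largest
    # value for each name comes first.
    ordered = sorted(rows, key=lambda row: row[value_index], reverse=True)
    seen = set()   # names that already have a row in the result
    result = []
    for row in ordered:
        name = row[name_index]
        if name not in seen:
            seen.add(name)
            result.append(row)
    return result


# Shortened rows (name, category, rating, review count), for illustration only.
sample = [
    ['Gmail', 'COMMUNICATION', '4.3', 4604324],
    ['Gmail', 'COMMUNICATION', '4.3', 4604483],
    ['Instagram', 'SOCIAL', '4.5', 66577313],
    ['Instagram', 'SOCIAL', '4.5', 66577446],
    ['Instagram', 'SOCIAL', '4.5', 66509917],
]
result = keep_highest(sample)
print(result)
```

As with the loop above, the result comes out ordered by review count, not in the original input order.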

MMTTMM

There may be a more elegant/Pythonic solution to this problem, but here is one possible approach:

my_list = [...]  # Nested list here

def compare_duplicates(nested_list, name_index=0, compare_index=3):
    max_values = dict()  # Used two dictionaries for readability
    final_indexes = dict()
    for i, item in enumerate(nested_list):
        name, value = item[name_index], item[compare_index]
        # Accept the first row seen for a name as well, so that
        # values of 0 or below are not silently dropped
        if name not in max_values or value > max_values[name]:
            max_values[name] = value
            final_indexes[name] = i
    return [nested_list[i] for i in final_indexes.values()]

print(compare_duplicates(my_list))
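One more compact variant, offered only as a sketch, groups the rows by name with itertools.groupby and takes the max of each group. It assumes the name is at index 0 and the value to compare is at index 3, as in the question; the function name highest_per_name and the shortened sample rows are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def highest_per_name(rows, name_index=0, value_index=3):
    """Return, for each name, the row with the largest value."""
    by_name = itemgetter(name_index)
    by_value = itemgetter(value_index)
    # groupby only merges adjacent rows, so sort by name first.
    ordered = sorted(rows, key=by_name)
    return [max(group, key=by_value) for _, group in groupby(ordered, key=by_name)]


# Shortened rows (name, category, rating, review count), for illustration only.
sample = [
    ['Coloring book moana', 'ART_AND_DESIGN', '3.9', 967],
    ['Gmail', 'COMMUNICATION', '4.3', 4604324],
    ['Coloring book moana', 'FAMILY', '3.9', 974],
    ['Gmail', 'COMMUNICATION', '4.3', 4604483],
]
result = highest_per_name(sample)
print(result)
```

Because of the sort, the rows come back ordered by name, so this trades the input order for brevity.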

忽然笑

Like this:

_DATA = [
    ['Coloring book moana', 'ART_AND_DESIGN', '3.9', 967, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'],
    ['Coloring book moana', 'ART_AND_DESIGN', '3.9', 974, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'],
    ['Gmail', 'COMMUNICATION', '4.3', 4604324, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],
    ['Gmail', 'COMMUNICATION', '4.3', 4604483, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device'],
    ['Instagram', 'SOCIAL', '4.5', 66577313, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'],
    ['Instagram', 'SOCIAL', '4.5', 66577446, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'],
    ['Instagram', 'SOCIAL', '4.5', 66509917, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
]

def print_highest(data):
    list_map = {}
    for d in data:
        # Key on every field except the review count (index 3)
        key = str(d[0:3] + d[4:])
        if key not in list_map:
            list_map[key] = d
            continue
        # Same key seen before: keep the row with the higher review count
        if d[3] > list_map[key][3]:
            list_map[key] = d
    for row in list_map.values():
        print(row)

print_highest(_DATA)

Output:

['Coloring book moana', 'ART_AND_DESIGN', '3.9', 974, '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']
['Gmail', 'COMMUNICATION', '4.3', 4604483, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Everyone', 'Communication', 'August 2, 2018', 'Varies with device', 'Varies with device']
['Instagram', 'SOCIAL', '4.5', 66577446, 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
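The key used above is built from every field except the review count, so two rows only count as duplicates when all their other fields match. In the question's data the two 'Coloring book moana' rows differ in category (ART_AND_DESIGN vs FAMILY), whereas the _DATA above uses ART_AND_DESIGN for both. The sketch below, with hypothetical helper names (highest_by_key, name_only, all_but_reviews) and shortened rows, shows how the choice of key changes the result; it uses a tuple as the key in place of str(...), which is an assumption of this sketch and not what the answer does.

```python
def highest_by_key(rows, key_func, value_index=3):
    """Keep, for each key, the row with the largest value at value_index."""
    best = {}
    for row in rows:
        key = key_func(row)
        if key not in best or row[value_index] > best[key][value_index]:
            best[key] = row
    return list(best.values())


def name_only(row):
    # Rows are duplicates when they share a name.
    return row[0]


def all_but_reviews(row):
    # Rows are duplicates only when every field except index 3 matches.
    return tuple(row[:3] + row[4:])


# Shortened rows taken from the question, for illustration only.
rows = [
    ['Coloring book moana', 'ART_AND_DESIGN', '3.9', 967, '14M'],
    ['Coloring book moana', 'FAMILY', '3.9', 974, '14M'],
]
by_name = highest_by_key(rows, name_only)
by_fields = highest_by_key(rows, all_but_reviews)
print(len(by_name), len(by_fields))
```

Keyed by name, only the FAMILY row with 974 reviews survives; keyed by all other fields, both rows are kept because their categories differ.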