Count unique users per category in a nested list in Python

I have a list of two-element sublists, each holding a user and a category. It looks like this:


a = [['user1', 'referral'], ['user2', 'referral'], ['user1', 'referral'], ['user1', 'affiliate'], ['user7', 'affiliate'], ['user1', 'affiliate'], ['user9', 'affiliate'], ['user4', 'cpc'], ['user4', 'referral'], ['user2', 'referral'], ['user7', 'affiliate'], ['user14', 'cpc'], ['user3', 'orgainic'], ['user2', 'orgainic'], ['user4', 'cpc'], ['user2', 'cpc'], ['user8', 'cpc'], ['user2', 'orgainic']]

I want to count the unique users in each category.


Expected:


required = [['referral',3],['affiliate',3],['cpc',4],['orgainic',2]]

The output I get:


{'referral': 3, 'affiliate': 2, 'cpc': 4, 'orgainic': 3}

The counts are wrong.


Here is the code I tried:


a = [['user1', 'referral'], ['user2', 'referral'], ['user1', 'referral'], ['user1', 'affiliate'], ['user7', 'affiliate'], ['user1', 'affiliate'], ['user9', 'affiliate'], ['user4', 'cpc'], ['user4', 'referral'], ['user2', 'referral'], ['user7', 'affiliate'], ['user14', 'cpc'], ['user3', 'orgainic'], ['user2', 'orgainic'], ['user4', 'cpc'], ['user2', 'cpc'], ['user8', 'cpc'], ['user2', 'orgainic']]

required = [['referral',3],['affiliate',3],['cpc',4],['orgainic',2]]

c = {}
visits = []
for i in a:
    # print(i)
    for j in i[1:]:
        if j not in c and i[0] not in visits:
            c[j] = 1
            visits.append(i[0])
        elif j in c and i[0] not in visits:
            c[j] = c[j]+1
print(c)

Please help me fix this.


千巷猫影
219 views
4 answers

牧羊人nacy

Here is one approach using collections.defaultdict.

Ex:

from collections import defaultdict

a = [['user1', 'referral'], ['user2', 'referral'], ['user1', 'referral'], ['user1', 'affiliate'], ['user7', 'affiliate'], ['user1', 'affiliate'], ['user9', 'affiliate'], ['user4', 'cpc'], ['user4', 'referral'], ['user2', 'referral'], ['user7', 'affiliate'], ['user14', 'cpc'], ['user3', 'orgainic'], ['user2', 'orgainic'], ['user4', 'cpc'], ['user2', 'cpc'], ['user8', 'cpc'], ['user2', 'orgainic']]

result = defaultdict(int)  # category -> number of unique users
seen = set()               # user/category combinations already counted
for k, v in a:
    key = "{}_{}".format(k, v)
    if key not in seen:
        result[v] += 1
        seen.add(key)
print(list(map(list, result.items())))

Output:

[['referral', 3], ['affiliate', 3], ['cpc', 4], ['orgainic', 2]]
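For comparison with the loop in the question: there, visits holds bare user names and is only appended to when a category is seen for the first time. As a result user1 (recorded under referral) is never counted under affiliate, while user2 (never recorded at all) is counted twice under orgainic. Remembering user/category combinations, as seen does above, avoids both problems. The sketch below is a small variation that stores each combination as a tuple instead of a joined string, so names that happen to contain an underscore cannot collide. The function name count_unique_users is made up for illustration.

```python
def count_unique_users(pairs):
    """Return [[category, unique_user_count], ...] in first-seen category order."""
    counts = {}   # category -> number of distinct users (dicts keep insertion order on 3.7+)
    seen = set()  # (user, category) tuples that were already counted
    for user, category in pairs:
        if (user, category) not in seen:
            seen.add((user, category))
            counts[category] = counts.get(category, 0) + 1
    return [[category, n] for category, n in counts.items()]


a = [['user1', 'referral'], ['user2', 'referral'], ['user1', 'referral'],
     ['user1', 'affiliate'], ['user7', 'affiliate'], ['user1', 'affiliate'],
     ['user9', 'affiliate'], ['user4', 'cpc'], ['user4', 'referral'],
     ['user2', 'referral'], ['user7', 'affiliate'], ['user14', 'cpc'],
     ['user3', 'orgainic'], ['user2', 'orgainic'], ['user4', 'cpc'],
     ['user2', 'cpc'], ['user8', 'cpc'], ['user2', 'orgainic']]

print(count_unique_users(a))
# [['referral', 3], ['affiliate', 3], ['cpc', 4], ['orgainic', 2]]
```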

白衣非少年

First, let's make the entries unique:

c = {tuple(sublist) for sublist in a}

Now we have unique (user, type) pairs. We don't need the user for counting, so let's turn this into a list containing only the second element:

c = [elem[1] for elem in c]

Now we can count it easily:

from collections import Counter
c = Counter(c)

Result:

Counter({'cpc': 4, 'affiliate': 3, 'referral': 3, 'orgainic': 2})

Now putting it all together:

from collections import Counter
c = Counter(elem[1] for elem in {tuple(sublist) for sublist in a})
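As a runnable script, the steps above might look like the sketch below. A set has no defined iteration order, so the order of entries inside the Counter can differ from the order in the question. The last step is an addition, not part of the answer above: it rebuilds the [category, count] list in order of first appearance.

```python
from collections import Counter

a = [['user1', 'referral'], ['user2', 'referral'], ['user1', 'referral'],
     ['user1', 'affiliate'], ['user7', 'affiliate'], ['user1', 'affiliate'],
     ['user9', 'affiliate'], ['user4', 'cpc'], ['user4', 'referral'],
     ['user2', 'referral'], ['user7', 'affiliate'], ['user14', 'cpc'],
     ['user3', 'orgainic'], ['user2', 'orgainic'], ['user4', 'cpc'],
     ['user2', 'cpc'], ['user8', 'cpc'], ['user2', 'orgainic']]

# 1. Deduplicate (user, category) pairs; lists are unhashable, so convert to tuples.
unique_pairs = {tuple(sublist) for sublist in a}

# 2. Count how many unique pairs fall into each category.
counts = Counter(category for _user, category in unique_pairs)

# 3. (Addition) Emit [category, count] lists in order of first appearance in `a`.
categories_in_order = dict.fromkeys(category for _user, category in a)
result = [[category, counts[category]] for category in categories_in_order]

print(result)
# [['referral', 3], ['affiliate', 3], ['cpc', 4], ['orgainic', 2]]
```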

繁星coding

defaultdict and loop-based solution

This can be done using defaultdict:

from collections import defaultdict

d = defaultdict(set)  # category -> set of users seen in that category
for user, category in a:
    d[category].add(user)
res = [[category, len(users)] for category, users in d.items()]

Output (entry order follows the dict's order; on Python 3.7+ that is first-seen order, i.e. referral, affiliate, cpc, orgainic):

# [['affiliate', 3], ['cpc', 4], ['orgainic', 2], ['referral', 3]]

groupby-based solution

Alternatively, this can be done using groupby from itertools:

from itertools import groupby
from operator import itemgetter

a = [['user1', 'referral'], ['user2', 'referral'], ['user1', 'referral'], ...]

# Sort the items according to the category so groupby will collect the pairs accordingly
res = {category: len({user for user, _ in pairs}) for category, pairs in
       groupby(sorted(a, key=itemgetter(1)), key=itemgetter(1))}
res = [list(pair) for pair in res.items()]

Output:

# [['affiliate', 3], ['cpc', 4], ['orgainic', 2], ['referral', 3]]
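The sort in the groupby version matters because itertools.groupby only merges consecutive items that share a key. A small sketch on a shortened sample (the four-row sample list is made up for illustration) shows what happens without it:

```python
from itertools import groupby
from operator import itemgetter

# Made-up sample: 'referral' appears in two separate runs.
sample = [['user1', 'referral'], ['user2', 'referral'],
          ['user7', 'affiliate'], ['user1', 'referral']]

category_of = itemgetter(1)

# Without sorting, every run of equal categories becomes its own group,
# so 'referral' shows up twice with partial counts.
unsorted_groups = [(category, len({user for user, _ in pairs}))
                   for category, pairs in groupby(sample, key=category_of)]
print(unsorted_groups)
# [('referral', 2), ('affiliate', 1), ('referral', 1)]

# After sorting by category, each category forms exactly one group.
sorted_groups = [(category, len({user for user, _ in pairs}))
                 for category, pairs in groupby(sorted(sample, key=category_of),
                                                key=category_of)]
print(sorted_groups)
# [('affiliate', 1), ('referral', 2)]
```

In a dict comprehension the second 'referral' group would silently overwrite the first, which is why the unsorted version can produce a wrong count instead of an error.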

撒科打诨

This sounds like a case for pandas; your list is already in the right shape:

import pandas as pd

a = [['user1', 'referral'], ['user2', 'referral'], ['user1', 'referral'], ['user1', 'affiliate'], ['user7', 'affiliate'], ['user1', 'affiliate'], ['user9', 'affiliate'], ['user4', 'cpc'], ['user4', 'referral'], ['user2', 'referral'], ['user7', 'affiliate'], ['user14', 'cpc'], ['user3', 'orgainic'], ['user2', 'orgainic'], ['user4', 'cpc'], ['user2', 'cpc'], ['user8', 'cpc'], ['user2', 'orgainic']]

df = pd.DataFrame(a)
df.columns = ["user", "type"]
unique_per_type = df.groupby("type")["user"].unique()

Now unique_per_type is:

type
affiliate            [user1, user7, user9]
cpc          [user4, user14, user2, user8]
orgainic                    [user3, user2]
referral             [user1, user2, user4]
Name: user, dtype: object

You can then do the following:

# access length by key
len(unique_per_type["affiliate"])

# or use it like a dict
for key, val in unique_per_type.items():
    print(key, len(val))

This solution adds pandas, which is a huge dependency. However, once your data is in a DataFrame, you can do a lot with it:

df["user"].unique()  # shows all unique users
df.query("user=='user1'")  # shows all observations involving user1