Convert and sum the elements of nested lists in a pandas DataFrame column

I have a df column like this:


col1

[[0.73, 0.43, 0.5, 0.0], [0.39, 0.5], [0.37], [0.38, 0.51, 0.0, 0.2]]

[[0.53, 0.33, 0.2, 0.0], [0.79, 0.5], [0.96], [0.88, 0.21, 0.0, 0.0]]

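The sublists can be any size. I am trying to convert the numbers in the sublists to floats (they are strings), then create a column that sums each sublist and divides the sum by the number of items in that sublist.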


So the sums for row 1:


(.73 + .43 + .5 + 0) / 4 = .415

(.39 + .5) / 2 = .445

(.37) / 1 = .37

(.38 + .51 + 0.0 + .2) / 4 = .272

For row 2:


(.53 + .33 + .2 + 0) / 4 = .265

(.79 + .5) / 2 = .645

(.96) / 1 = .96

(.88 + .21 + 0.0 + 0.0) / 4 = .272

Result:


new_col

[[.415],[.445],[.37],[.272]]

[[.265],[.645],[.96],[.272]]

I have tried many things:


# Something like this: create a column with the number of elements in each sublist, then use that to divide the sum of each sublist.

# This didn't work; it just grabbed the size of the first list.
df1['words_in_company_name'] = df1['children_org_name_sublists'].str.len()

# This doesn't really work. It shows the numbers per list, but I'm not sure where to go from here.
for i in df1.func_scores:
    length = []
    for j in i:
        print(j)



慕雪6442864
1 Answer

幕布斯6054654

Just use apply with np.mean:

df['new_col'] = df.col.apply(lambda x: [[np.mean(y)] for y in x])
df
Out[17]:
                                                 col                               new_col
0  [[0.73, 0.43, 0.5, 0.0], [0.39, 0.5], [0.37], ...  [[0.415], [0.445], [0.37], [0.2725]]
1  [[0.53, 0.33, 0.2, 0.0], [0.79, 0.5], [0.96], ...  [[0.265], [0.645], [0.96], [0.2725]]
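The question mentions that the numbers are stored as strings, and np.mean would likely fail on string input, so a conversion step may be needed first. Below is a minimal sketch under that assumption. The sample DataFrame and the helper name sublist_means are hypothetical and only mirror the data shown in the question, with the column called col1 as in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the question, with numbers stored as strings.
df = pd.DataFrame({
    "col1": [
        [["0.73", "0.43", "0.5", "0.0"], ["0.39", "0.5"], ["0.37"], ["0.38", "0.51", "0.0", "0.2"]],
        [["0.53", "0.33", "0.2", "0.0"], ["0.79", "0.5"], ["0.96"], ["0.88", "0.21", "0.0", "0.0"]],
    ]
})


def sublist_means(nested):
    """Return [[mean], ...] for one cell that holds a list of sublists."""
    # float(v) converts each string to a number before averaging;
    # the outer float(...) turns the NumPy scalar into a plain Python float.
    return [[float(np.mean([float(v) for v in sub]))] for sub in nested]


# Apply the helper to every row of the column.
df["new_col"] = df["col1"].apply(sublist_means)
print(df["new_col"].tolist())
```

Because np.mean already divides the sum by the number of items, no separate length column is needed. If the values are already floats, the inner float(v) call is harmless.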