
Iterate over DataFrame rows and sum two columns separately until one of them meets a condition

I'm definitely still learning Python. I've tried countless approaches but can't figure this one out.


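I have a DataFrame with two columns, call them A and B. I need to return a DataFrame that sums the row values of each column independently until the running sum of A reaches a threshold, say 10 in this example. So far I've been trying iterrows() and can segment on A >= 10, but I can't work out how to keep summing rows until the threshold is met. The resulting df must be exhaustive, even if the final A values don't reach the threshold - see the last row of the desired output.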


    df1 = pd.DataFrame(data=[[20,16],[10,5],[3,2],[1,1],[12,10],[9,7],[6,6],[5,2]], columns=['A','B'])
    df1

        A   B
    0   20  16
    1   10  5
    2   3   2
    3   1   1
    4   12  10
    5   9   7
    6   6   6
    7   5   2

Desired result:


        A   B
    0   20  16
    1   10  5
    2   16  13
    3   15  13
    4   5   2

Thanks in advance. I've spent a lot of time on this and would really appreciate the help! Cheers


慕尼黑5688855
2 answers

ibeautiful

I rarely write long loops for pandas, but I couldn't find a way to do this with pandas methods. Try this horrible loop (:

The variable t that I created essentially checks whether the cumulative sum has reached n (which we set to 10). Then we decide whether to use t, the cumulative value, or i, the value of the given row in the DataFrame (with u and j doing the same in parallel for column B). There are a few conditions, so there are some elif statements. The way I set it up, the last row behaves differently, so it needs some separate logic; otherwise the last value would not get appended:

    import pandas as pd

    df1 = pd.DataFrame(data=[[20,16],[10,5],[3,2],[1,1],[12,10],[9,7],[6,6],[5,2]], columns=['A','B'])
    df1

    a, b = [], []            # output values for columns A and B
    t, u, count = 0, 0, 0    # running sums for A and B, plus a row counter
    n = 10                   # threshold on the running sum of A
    for (i, j) in zip(df1['A'], df1['B']):
        count += 1
        if i < n and t >= n:
            # The running sum reached n on an earlier row: emit it, restart from this row
            a.append(t)
            b.append(u)
            t = i
            u = j
        elif 0 < t < n:
            # A sum is in progress and still below n: keep accumulating
            t += i
            u += j
        elif i < n and t == 0:
            # No sum in progress: start one with this row
            t += i
            u += j
        else:
            # Otherwise (here: A >= n with no partial sum in progress) emit the row as is
            t = 0
            u = 0
            a.append(i)
            b.append(j)
        # Separate logic for the last row, so the final value gets appended
        if count == len(df1['A']):
            if t == i or t == 0:
                a.append(i)
                b.append(j)
            elif t > 0 and t != i:
                t += i
                u += j
                a.append(t)
                b.append(u)
    df2 = pd.DataFrame({'A': a, 'B': b})
    df2
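
For comparison, the same bookkeeping can be sketched in a single pass. This is only an illustration under one reading of the question: rows are added up until the running sum of A reaches the threshold, the sums are emitted, and any trailing rows that never reach the threshold are emitted as a final partial group. The helper name `sum_until_threshold` and its parameters `key` and `threshold` are made up for this sketch.

```python
import pandas as pd


def sum_until_threshold(df, key="A", threshold=10):
    """Hypothetical helper: sum consecutive rows until `key` reaches `threshold`."""
    totals = dict.fromkeys(df.columns, 0)  # running sum for every column
    pending = 0                            # rows accumulated but not yet emitted
    out = []
    for record in df.to_dict("records"):
        for col, value in record.items():
            totals[col] += value
        pending += 1
        if totals[key] >= threshold:
            # Threshold reached: emit the group and start a new one
            out.append(totals)
            totals = dict.fromkeys(df.columns, 0)
            pending = 0
    if pending:
        # Trailing rows that never reached the threshold are still emitted
        out.append(totals)
    return pd.DataFrame(out, columns=df.columns)


df1 = pd.DataFrame(
    data=[[20, 16], [10, 5], [3, 2], [1, 1], [12, 10], [9, 7], [6, 6], [5, 2]],
    columns=["A", "B"],
)
result = sum_until_threshold(df1)
print(result)
```

On the sample data this yields the five rows listed under the desired result in the question.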

RISEBY

Here is a shorter approach that works:

    import pandas as pd

    df1 = pd.DataFrame(data=[[20,16],[10,5],[3,2],[1,1],[12,10],[9,7],[6,6],[5,2]], columns=['A','B'])
    rows = []               # output rows, collected in a list (DataFrame.append is gone in pandas 2.0+)
    n_rows = df1.size // 2  # size counts elements (rows x columns), so divide by the 2 columns
    index = 0
    while index < n_rows:
        if df1.iloc[index]['A'] >= 10:
            # This row meets the threshold by itself: copy it as is
            rows.append([df1.iloc[index]['A'], df1.iloc[index]['B']])
            index += 1
        else:
            a_sum = 0
            b_sum = 0
            # Keep summing until A reaches 10 or the rows run out
            while a_sum < 10 and index < n_rows:
                a_sum += df1.iloc[index]['A']
                b_sum += df1.iloc[index]['B']
                index += 1
            # Emit the group; a trailing group may end below 10 when the rows run out
            rows.append([a_sum, b_sum])
    df2 = pd.DataFrame(rows, columns=['A','B'])

The key is to keep track of your position in the DataFrame and of the running sums. Don't be afraid to use variables. In pandas, use iloc to access each row by position. Check against the row count so that you never run past the end of the DataFrame. df.size returns the number of elements, that is, rows times columns, which is why I divide the size by the number of columns to get the actual number of rows.
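
A possible variation, sketched here under the same reading of the question (sum until the running total of A reaches 10, and keep any leftover rows as a last group): use the loop only to assign a group label to each row, then let `groupby` do the summing. The names `threshold`, `labels`, `group_id` and `running` are illustrative and do not come from the answers above.

```python
import pandas as pd

df1 = pd.DataFrame(
    data=[[20, 16], [10, 5], [3, 2], [1, 1], [12, 10], [9, 7], [6, 6], [5, 2]],
    columns=["A", "B"],
)

threshold = 10  # assumed threshold on the running sum of A
labels = []
group_id, running = 0, 0
for value in df1["A"]:
    labels.append(group_id)   # the current row belongs to the current group
    running += value
    if running >= threshold:  # group is complete: the next row starts a new one
        group_id += 1
        running = 0

# Align the labels with df1's index, then sum both columns within each group
groups = pd.Series(labels, index=df1.index)
df2 = df1.groupby(groups).sum().reset_index(drop=True)
print(df2)
```

The loop still walks the rows once, but it only touches column A; the per-column sums are delegated to pandas, so extra columns would be summed without further changes.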