Manipulating a DataFrame using iterrows with subsets


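I'm trying to manipulate this DataFrame based on each row's ID, initial amount, and balance. This is the DataFrame I'm working with; desired_output is a column I filled in by hand to show the result I want: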

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"ID": [1, 1, 1, 2, 3, 3, 3],
     "Initial amount": [7650, 25500, 56395, 13000, 10700, 12000, 27000],
     "Balance": [43388, 43388, 43388, 2617, 19250, 19250, 19250],
     "desired_output": [7650, 25500, 10238, 2617, 10720, 8530, 0]})

This is my current code:

unique_ids = list(df["ID"].unique())
new_output = []

for i, row in df.iterrows():
    this_adv = row["ID"]
    subset = df.loc[df["ID"] == this_adv, :]
    if len(subset) == 1:
        this_output = np.where(row["Balance"] >= row["Initial amount"], row["Initial amount"], row["Balance"])
        new_output.append(this_output)
    else:
        if len(subset) >= 1:
            if len(subset) == 1:
                this_output = np.where(row["Balance"] >= row["Initial amount"], row["Initial amount"], row["Balance"])
                new_output.append(this_output)
            elif row["Balance"] - sum(new_output) >= row["Initial amount"]:
                this_output = row["Initial amount"]
                new_output.append(this_output)
            else:
                this_output = row["Balance"] - sum(new_output)
                new_output.append(this_output)

new_df = pd.DataFrame({"new_output": new_output})
final_df = pd.concat([df, new_df], axis=1)

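Basically, what I want is this: if an ID appears only once (len(subset) == 1), use the first if statement. Anything where the ID appears more than once (len(subset) >= 1) should go through the other if statements. I'm not getting the output I want. How would you approach this problem?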
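One possible reading of the desired_output column is that each ID's Balance is handed out to its rows in order, with each row taking at most its Initial amount until the balance runs out. Under that assumption, the loop above seems to go wrong for later IDs because sum(new_output) adds up the outputs of every earlier row, not only the rows that share the current ID. A minimal sketch of a grouped alternative follows; the column name new_output is taken from the question, while the intermediate names prior and remaining are illustrative choices.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 3, 3, 3],
    "Initial amount": [7650, 25500, 56395, 13000, 10700, 12000, 27000],
    "Balance": [43388, 43388, 43388, 2617, 19250, 19250, 19250],
})

# Total "Initial amount" of the earlier rows within the same ID
# (per-group running sum minus the current row's own amount).
prior = df.groupby("ID")["Initial amount"].cumsum() - df["Initial amount"]

# Balance still available when this row is reached, floored at zero.
remaining = (df["Balance"] - prior).clip(lower=0)

# Assumed rule: each row receives its initial amount, capped by what is left.
df["new_output"] = np.minimum(df["Initial amount"], remaining)

print(df)
```

For the sample data this gives 7650, 25500, 10238, 2617, 10700, 8550, 0. The first two ID 3 rows differ from desired_output (10720 and 8530); this sketch assumes that is a typo in the hand-made column, since 10720 is larger than that row's Initial amount of 10700.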

MM们
