I have a reproducible example, a toy dataframe:
import pandas as pd

df = pd.DataFrame({'my_customers':['John','Foo'],'email':['email@gmail.com','othermail@yahoo.com'],'other_column':['yes','no']})
print(df)
   my_customers                email  other_column
0          John      email@gmail.com           yes
1           Foo  othermail@yahoo.com            no
I wrote a function to apply() to the rows, and inside the function I create a new column:
def func(row):
    # if this column is 'yes'
    if row['other_column'] == 'yes':
        # create a new column with 'Hello' in it
        row['new_column'] = 'Hello'
        # return to df
        return row
    # otherwise
    else:
        # just return the row
        return row
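Then I apply the function to df, and we can see that the column order has changed: the columns are now sorted alphabetically. Is there a way to avoid this? I want to keep the original order.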
df = df.apply(func, axis = 1)
print(df)
                 email  my_customers  new_column  other_column
0      email@gmail.com          John       Hello           yes
1  othermail@yahoo.com           Foo         NaN            no
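One possible way to keep the original order, sketched here rather than offered as the definitive fix: remember the column list before calling apply(), then reselect the columns afterwards with the original ones first and any newly created ones after them. The names original_columns, new_columns and result are introduced only for this illustration, and the row.copy() call is an added precaution so the function does not modify the row object that apply() passes in.

```python
import pandas as pd

df = pd.DataFrame({
    'my_customers': ['John', 'Foo'],
    'email': ['email@gmail.com', 'othermail@yahoo.com'],
    'other_column': ['yes', 'no'],
})

def func(row):
    # Work on a copy so the row passed in by apply() is left untouched.
    row = row.copy()
    if row['other_column'] == 'yes':
        row['new_column'] = 'Hello'
    return row

# Remember the order before apply() has a chance to change it.
original_columns = list(df.columns)

result = df.apply(func, axis=1)

# Original columns first, then whatever columns the function added.
new_columns = [c for c in result.columns if c not in original_columns]
result = result[original_columns + new_columns]
print(result)
```

This does not depend on how a given pandas version orders the columns returned by apply(), because the final order is set explicitly.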
Edit for clarification: the code above was too simple.
Input:
df = pd.DataFrame({'my_customers':['John','Foo'],
                   'email':['email@gmail.com','othermail@yahoo.com'],
                   'api_status':['data found','no data found'],
                   'api_response':['huge json','huge json']})
   my_customers                email     api_status  api_response
0          John      email@gmail.com     data found     huge json
1           Foo  othermail@yahoo.com  no data found     huge json
Expected output:
   my_customers                email     api_status  api_response  job_1  job_2  \
0          John      email@gmail.com     data found     huge json    xyz   xyz2
1           Foo  othermail@yahoo.com  no data found     huge json    nan    nan

   education_1  facebook  other api info
0          foo  profile1             etc
1          nan       nan             nan
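To make the expected output concrete, here is a rough sketch of one way such a result could be built without letting apply() reorder anything: the function returns only the new fields as a Series with a fixed index, and those are joined onto the untouched original frame. The JSON strings, the FIELDS list and the helper name extract_fields are all assumptions made up for this illustration; the real 'huge json' payloads are not shown in the post.

```python
import json
import pandas as pd

# Hypothetical stand-ins for the real 'huge json' payloads.
df = pd.DataFrame({
    'my_customers': ['John', 'Foo'],
    'email': ['email@gmail.com', 'othermail@yahoo.com'],
    'api_status': ['data found', 'no data found'],
    'api_response': [
        '{"job_1": "xyz", "job_2": "xyz2", "education_1": "foo"}',
        '{}',
    ],
})

# Assumed list of fields to extract; it also fixes the order of the new columns.
FIELDS = ['job_1', 'job_2', 'education_1']

def extract_fields(row):
    # Only parse the response when the API reported data; otherwise use an empty dict.
    data = json.loads(row['api_response']) if row['api_status'] == 'data found' else {}
    # Every row returns the same index, so the resulting columns follow FIELDS.
    return pd.Series({field: data.get(field) for field in FIELDS})

extra = df.apply(extract_fields, axis=1)

# The original columns stay in place; the extracted ones are appended on the right.
result = df.join(extra)
print(result.columns.tolist())
```

Because the original frame is never rebuilt from the rows, its column order is preserved by construction, and rows without data simply get missing values in the new columns.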