Subsetting a df with an if statement - Pandas


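I want to create and return a subset of a df using an if statement. Specifically, in the code below I have two different sets of values, and the df I return should depend on which set a given value belongs to.

In the code below, the candidate values are held in the lists normal and different. The value in place determines how the df is subsetted.

My attempt is below. place only ever holds a single value, so it will never equal either list as a whole. Is it possible to return the df when place equals one of the values inside these lists?

I want to return df1 so it can be used in subsequent tasks.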

import pandas as pd

df = pd.DataFrame({
    'period' : [1.0, 1.0, 2.0, 2.0, 3.0, 4.0, 5.0, 7.0, 7.0, 8.0, 9.0],
})

place = 'a'

normal = ['a','b']
different = ['v','w','x','y','z']

different_subset_start = 2
normal_subset_start = 4
subset_end = 8

for val in df:
    if place in different:
        print('place is different')
        df1 = df[(df['period'] >= different_subset_start) & (df['period'] <= subset_end)].drop_duplicates(subset = 'period')
        return df1
    elif place in normal:
        print('place is normal')
        df1 = df[(df['period'] >= normal_subset_start) & (df['period'] <= subset_end)].drop_duplicates(subset = 'period')
        return df1
    else:
        print('Incorrect input for Day. Day Floater could not be scheduled. Please check input value')
        return

print(df1)

The expected output is df1, returned for later use:

   period
2     2.0
4     3.0
5     4.0
6     5.0
7     7.0
9     8.0

动漫人物


2 Answers

素胚勾勒不出你

To check whether an object is contained in something, rather than equal to it, use in:

    if place in different:

and likewise:

    elif place in normal:

Edit: here is what it should look like if you turn it into a function. Basically, you just write def my_function_name(arguments): and then indent the rest of your code so that it belongs to that function. Like this:

    import pandas as pd

    def get_subset(df, place):
        normal = ['a','b']
        different = ['v','w','x','y','z']
        different_subset_start = 2
        normal_subset_start = 4
        subset_end = 8

        if place in different:
            df1 = df[(df['period'] >= different_subset_start) & (df['period'] <= subset_end)].drop_duplicates(subset = 'period')
        elif place in normal:
            df1 = df[(df['period'] >= normal_subset_start) & (df['period'] <= subset_end)].drop_duplicates(subset = 'period')
        else:
            # place is in neither list
            df1 = None
        return df1

    df = pd.DataFrame({
        'period' : [1.0, 1.0, 2.0, 2.0, 3.0, 4.0, 5.0, 7.0, 7.0, 8.0, 9.0],
    })

    place = 'a'

    print(get_subset(df, place))
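Since the two branches of the function above differ only in the start value, one possible variation is to look the start value up first and filter once. This is only a sketch, not part of the original answer: the names start_by_place and get_subset_once are made up for illustration, and it assumes the same lists and bounds as in the question.

```python
import pandas as pd

normal = ['a', 'b']
different = ['v', 'w', 'x', 'y', 'z']
subset_end = 8

# Hypothetical lookup table: each place maps to the start of its range
# (2 for "different" places, 4 for "normal" places, as in the question).
start_by_place = {**dict.fromkeys(different, 2), **dict.fromkeys(normal, 4)}

def get_subset_once(df, place):
    """Return the de-duplicated subset for place, or None if place is unknown."""
    start = start_by_place.get(place)
    if start is None:
        return None
    mask = (df['period'] >= start) & (df['period'] <= subset_end)
    return df[mask].drop_duplicates(subset='period')

df = pd.DataFrame({
    'period': [1.0, 1.0, 2.0, 2.0, 3.0, 4.0, 5.0, 7.0, 7.0, 8.0, 9.0],
})

print(get_subset_once(df, 'a'))
print(get_subset_once(df, 'v'))
print(get_subset_once(df, 'q'))  # unknown place, so None
```

Whichever version you use, the caller should check for None before working with the result, because an unknown place does not produce a DataFrame.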


呼如林

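Look at for val in df: in your code. That construct is odd, because you never use the val variable. Change the last fragment of your code to something like this:

    def fn():
        if place in different:
            print('place is different')
            return df[df.period.between(different_subset_start, subset_end)]\
                .drop_duplicates(subset='period')
        elif place in normal:
            print('place is normal')
            return df[df.period.between(normal_subset_start, subset_end)]\
                .drop_duplicates(subset = 'period')
        else:
            print('Incorrect input for place. Please check value')

In your case subset = 'period' is redundant, because period is the only column in the DataFrame. The final return is not needed either: if execution reaches the end of a function, the function returns None implicitly.

One more detail: if your DataFrame has only one column, perhaps a Series would be enough?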
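To make the last two points concrete, here is a small sketch using the question's data as a plain Series. It assumes the bounds 4 and 8 for a "normal" place, and the variable names are only illustrative. between includes both endpoints by default, so it produces the same mask as the explicit >= and <= comparison.

```python
import pandas as pd

# The question's data held as a Series instead of a one-column DataFrame
period = pd.Series(
    [1.0, 1.0, 2.0, 2.0, 3.0, 4.0, 5.0, 7.0, 7.0, 8.0, 9.0],
    name='period',
)

# between() is inclusive on both ends by default
mask_between = period.between(4, 8)
mask_manual = (period >= 4) & (period <= 8)
print(mask_between.equals(mask_manual))

# On a Series, drop_duplicates() needs no subset argument
subset = period[mask_between].drop_duplicates()
print(subset)
```

The result keeps the original index labels, so it lines up with the rows of the original data if you need to join it back later.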

