Pandas: setting a column's value based on whether another column falls between two numbers

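I am currently using pandas and NumPy. I have a DataFrame named 'df'. Given the data below, how can I assign a value to a third column based on a between clause? If possible, I would like a vectorised approach, to keep the speed I already have.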


I have tried lambda functions, but frankly I don't understand what I am doing, and I run into errors such as: object has no attribute 'between'.


General approach, written as non-vectorised pseudocode:


NOTE: I am looking for a way to make this vectorised.


If df['Col2'] is between 0 and 10
   df['Col3'] = 1
Else if df['Col2'] is between 10.01 and 20
   df['Col3'] = 2
Else if df['Col2'] is between 20.1 and 30
   df['Col3'] = 3

Sample set


+------+------+------+
| Col1 | Col2 | Col3 |
+------+------+------+
| a    |    5 |    1 |
| b    |   10 |    1 |
| c    |   15 |    2 |
| d    |   20 |    2 |
| e    |   25 |    3 |
| f    |   30 |    3 |
| g    |    1 |    1 |
| h    |   11 |    2 |
| i    |   21 |    3 |
| j    |    7 |    1 |
+------+------+------+



Many thanks


紫衣仙女
3 Answers

交互式爱情

A solution that reuses your current code:

    import pandas as pd

    def cust_func(row):
        r = row['Col2']
        val = None  # returned when r falls outside every range
        if 0 <= r <= 10:
            val = 1
        elif 10.01 <= r <= 20:
            val = 2
        elif 20.01 <= r <= 30:
            val = 3
        return val

    # apply the function row by row (not vectorised)
    df['Col3'] = df.apply(cust_func, axis=1)

Best solution:

    cut_labels = [1, 2, 3]
    cut_bins = [0, 10, 20, 30]
    # bins are right-closed by default: (0, 10], (10, 20], (20, 30]
    df['Col3'] = pd.cut(df['Col2'], bins=cut_bins, labels=cut_labels)
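A minimal sketch of how the pd.cut bin edges behave on the sample data from the question. The frame is rebuilt here by hand. The include_lowest=True argument and the astype(int) conversion are assumptions about what is wanted (closing the first bin at 0, and getting plain integers rather than a categorical column); they are not part of the answer above.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand.
df = pd.DataFrame({
    "Col1": list("abcdefghij"),
    "Col2": [5, 10, 15, 20, 25, 30, 1, 11, 21, 7],
})

# Bins are right-closed by default: (0, 10], (10, 20], (20, 30].
# include_lowest=True also closes the first bin on the left, so 0 gets label 1.
binned = pd.cut(
    df["Col2"],
    bins=[0, 10, 20, 30],
    labels=[1, 2, 3],
    include_lowest=True,
)

# pd.cut returns a categorical Series; convert it if plain integers are needed.
# This conversion fails if a value lies outside the bins and becomes NaN.
df["Col3"] = binned.astype(int)
print(df)
```

Values outside 0 to 30 come back as NaN from pd.cut, so it is worth checking for missing values before converting when the real data is not known to stay in range.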

千万里不及你

There are a couple of ways: NumPy's select and NumPy's searchsorted. I prefer the latter because I don't have to list the conditions; it uses a bisection algorithm, which works as long as your data is sorted, and yes, I think it is the fastest of the bunch. It would be cool if you ran some timings and shared the results:

      Col1  Col2
    0    a     5
    1    b    10
    2    c    15
    3    d    20
    4    e    25
    5    f    30
    6    g     1
    7    h    11
    8    i    21
    9    j     7

    import numpy as np

    # step 1: create your 'conditions'
    # sort the dataframe on Col2
    df = df.sort_values('Col2')

    # benchmarks are the ranges within which you set your scores/grades
    benchmarks = np.array([10, 20, 30])

    # the grades to be assigned for Col2
    score = np.array([1, 2, 3])

    # and use searchsorted
    # it will generate the indices for where the values should be
    # e.g. if you have [1, 4, 5] then the position of 3 will be 1, since it is between 1 and 4
    # and Python uses zero-based indexing
    indices = np.searchsorted(benchmarks, df.Col2)

    # create your new column by indexing the score array with the indices
    df['Col3'] = score[indices]
    df = df.sort_index()
    df

      Col1  Col2  Col3
    0    a     5     1
    1    b    10     1
    2    c    15     2
    3    d    20     2
    4    e    25     3
    5    f    30     3
    6    g     1     1
    7    h    11     2
    8    i    21     3
    9    j     7     1
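A sketch of the same np.searchsorted idea applied without reordering the frame, on the understanding that it is the array being searched (benchmarks) that has to be in ascending order. The sample frame is rebuilt by hand, and the range check is an added guard that is not part of the answer.

```python
import numpy as np
import pandas as pd

# Sample data from the question, rebuilt by hand and left unsorted.
df = pd.DataFrame({
    "Col1": list("abcdefghij"),
    "Col2": [5, 10, 15, 20, 25, 30, 1, 11, 21, 7],
})

benchmarks = np.array([10, 20, 30])  # upper edge of each range, ascending
score = np.array([1, 2, 3])          # grade assigned to each range

# side="left" is the default, so a value equal to an edge lands in that
# edge's range: 10 -> index 0, 20 -> index 1, 30 -> index 2.
indices = np.searchsorted(benchmarks, df["Col2"].to_numpy())

# Guard (an addition): a value above the last edge would give index 3,
# and score[indices] would then raise IndexError.
if (indices >= len(score)).any():
    raise ValueError("Col2 contains values above the last benchmark")

df["Col3"] = score[indices]
print(df)
```

Note that this variant has no lower bound: any value at or below 10, including negative ones, is assigned grade 1.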

慕无忌1623718

You can do this nice and cleanly with np.select(). I added some <= comparisons, since I guess you want every value to be updated, but it is easy to edit if needed.

    import numpy as np

    conditions = [(df['Col2'] > 0) & (df['Col2'] <= 10),
                  (df['Col2'] > 10) & (df['Col2'] <= 20),
                  (df['Col2'] > 20) & (df['Col2'] <= 30)]
    updates = [1, 2, 3]
    df["Col3"] = np.select(conditions, updates, default=999)

Using the original ranges with strict comparisons leads to the result below, where the values equal to 10, 20 and 30 match no condition and get the default value 999 from np.select().

    conditions = [(df['Col2'] > 0) & (df['Col2'] < 10),
                  (df['Col2'] > 10.01) & (df['Col2'] < 20),
                  (df['Col2'] > 20.1) & (df['Col2'] < 30)]
    updates = [1, 2, 3]
    df["Col3"] = np.select(conditions, updates, default=999)
    print(df)

      Col1  Col2  Col3
    0    a     5     1
    1    b    10   999
    2    c    15     2
    3    d    20   999
    4    e    25     3
    5    f    30   999
    6    g     1     1
    7    h    11     2
    8    i    21     3
    9    j     7     1
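Since the question mentions an error about 'between', here is a hedged sketch of the same np.select approach written with Series.between, which includes both endpoints by default. The method belongs to a pandas Series, not to the scalar values that a row-wise lambda receives, which is one likely source of that error. The bounds follow the question's pseudocode, with 20.01 assumed for the start of the third range, and the sample frame is rebuilt by hand.

```python
import numpy as np
import pandas as pd

# Sample data from the question, rebuilt by hand.
df = pd.DataFrame({
    "Col1": list("abcdefghij"),
    "Col2": [5, 10, 15, 20, 25, 30, 1, 11, 21, 7],
})

col = df["Col2"]

# between() is inclusive on both ends by default, so 10, 20 and 30 match.
# A value inside a gap, such as 10.005, matches nothing and gets the default.
conditions = [
    col.between(0, 10),
    col.between(10.01, 20),
    col.between(20.01, 30),  # 20.01 is an assumption; the question wrote 20.1
]
updates = [1, 2, 3]

df["Col3"] = np.select(conditions, updates, default=999)
print(df)
```

Keeping default=999, as in the answer, makes unmatched rows easy to spot afterwards.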