Pandas: filtering a DataFrame with a condition on the first n rows

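I have a DataFrame of shape [600 000, 19]. I want to filter the first 100 000 rows by one condition, the next 300 000 rows by a second condition, and the last rows by a third condition. How can I do this?

At the moment I split the DataFrame into 3 segments, apply each segment's condition, and then concatenate the pieces again. Is there a better way?

Example: in the first 100 000 rows, filter out any value less than 5; in the next 300 000 rows, I don't want any value greater than 40; and so on.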



眼眸繁星

湖上湖

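You can try the following. Each tuple in `conditions` is a group of terms joined with `&`, and the groups are then joined with `|`, so a row is kept if it satisfies every term of at least one group. Inside `query`, `&` and `|` take the precedence of `and` and `or`, so the comparisons do not need extra parentheses.

```python
import numpy as np
import pandas as pd

# The original used pd.np, which is deprecated and no longer available
# in recent pandas versions, so NumPy is imported directly.
sample = pd.DataFrame({'x': np.arange(100),
                       'colname': np.arange(100)})

# In a query string, `index` refers to the DataFrame's index.
conditions = [('index < 5', 'colname < 3'),
              ('index > 50', 'index < 100', 'colname < 55')]

# Builds 'index < 5&colname < 3|index > 50&index < 100&colname < 55'
sample.query('|'.join(map(lambda x: '&'.join(x), conditions)))
```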
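To map this onto the three row ranges from the question, each group can carry its own range bounds. The following is only a sketch: the bounds and thresholds are scaled-down, illustrative values rather than anything from the question's data, and it assumes the default RangeIndex so that `index` equals the row position.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the default RangeIndex makes `index` equal to the row position.
sample = pd.DataFrame({'x': np.arange(100), 'colname': np.arange(100)})

# One tuple per row range: the range bounds plus the condition for that range.
# The bounds and thresholds are illustrative only.
conditions = [
    ('index < 10', 'colname >= 5'),
    ('index >= 10', 'index < 40', 'colname <= 30'),
    ('index >= 40', 'colname < 90'),
]

# Parenthesise each group so the and/or structure of the expression is explicit.
expr = ' or '.join('(' + ' and '.join(terms) + ')' for terms in conditions)
result = sample.query(expr)
```

If the index is not a plain 0..n-1 range, calling `reset_index(drop=True)` first is one way to make the `index` comparisons positional.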

蓝山帝景

One approach is to slice the DataFrame by row position and use `pd.concat` to build a complete boolean mask:

```python
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(np.random.randint(0, 50, 60))

# Each slice gets its own condition; concat stitches the masks back together.
# Note: as written, df[11:40] and df[41:] leave rows 10 and 40 out of the mask.
df[pd.concat([df.iloc[:10] > 10, df[11:40] < 30, df[41:] % 2 == 0])]
```

In the first 10 records, values less than 10 are filtered out; in the next 30, values greater than 30 are filtered out; the last records are checked for even values. You can then use `dropna` to remove all the NaN values.

Because the slices start at 11 and 41, rows 10 and 40 are not covered by any condition and therefore appear as NaN in the output regardless of their value. Using `df[10:40]` and `df[40:]` instead covers every row.

Output:

```
       0
0   44.0
1   47.0
2    NaN
3    NaN
4    NaN
5   39.0
6    NaN
7   19.0
8   21.0
9   36.0
10   NaN
11   6.0
12  24.0
13  24.0
14  12.0
15   1.0
16   NaN
17   NaN
18  23.0
19   NaN
20  24.0
21  17.0
22   NaN
23  25.0
24  13.0
25   8.0
26   9.0
27  20.0
28  16.0
29   5.0
30  15.0
31   NaN
32   0.0
33  18.0
34   NaN
35  24.0
36   NaN
37  29.0
38  19.0
39  19.0
40   NaN
41   NaN
42  32.0
43   NaN
44   NaN
45  32.0
46   NaN
47  10.0
48   NaN
49   NaN
50   NaN
51  28.0
52  34.0
53   0.0
54   0.0
55  36.0
56   NaN
57  38.0
58  40.0
59   NaN
```
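As a variation that avoids concatenating slices, the per-range conditions can be combined into a single row-level boolean mask based on row position. This is a minimal sketch, assuming a single column (named `value` here purely for illustration) and contiguous ranges 0-9, 10-39, and 40-59. Unlike masking the whole frame, it drops non-matching rows directly instead of replacing them with NaN.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
# `value` is a hypothetical column name used only for this sketch.
df = pd.DataFrame({'value': np.random.randint(0, 50, 60)})

pos = np.arange(len(df))        # row positions, independent of the index labels
v = df['value'].to_numpy()

# For each row, pick the condition that belongs to its range:
# rows 0-9 keep values > 10, rows 10-39 keep values < 30, the rest keep even values.
keep = np.where(pos < 10, v > 10,
                np.where(pos < 40, v < 30, v % 2 == 0))

filtered = df[keep]             # boolean row mask, so non-matching rows are dropped
```

For a frame the size of the one in the question, only the cut points (100 000 and 400 000) and the three conditions would change; the mask is still built in one vectorized pass.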