How do I split a DataFrame into sub-DataFrames by grouping rows on a specific value?

I have a DataFrame like the one below:


date,value

2/10/19,34

2/11/19,34

2/12/19,34

2/13/19,34

2/14/19,34

2/15/19,34

2/16/19,34

2/17/19,0

2/18/19,0

2/19/19,0

2/20/19,22

2/21/19,22

2/22/19,22

2/23/19,22

2/24/19,0

2/25/19,0

2/26/19,0

2/27/19,0

2/28/19,1

3/1/19,2

3/2/19,2

3/3/19,1

3/4/19,0

3/5/19,0

3/6/19,0

3/7/19,3

3/8/19,3

3/9/19,3

3/10/19,0

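At intervals, the DataFrame contains runs of zero values. I want to group the rows so that whenever zero appears consecutively (twice or more in a row), a sub-DataFrame is created from that run and saved to a file.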


Output:



df1 

    2/17/19,0

    2/18/19,0

    2/19/19,0

df2

    2/24/19,0

    2/25/19,0

    2/26/19,0

    2/27/19,0

df3

    3/4/19,0

    3/5/19,0

    3/6/19,0

I tried many ways to do this, but none of them worked.


Thanks.


ITMISS
2 Answers

SMILET

You can try using rolling:

    def merge_intervals(intervals):
        # Sort by start so that overlapping intervals end up adjacent.
        sorted_intervals = sorted(intervals, key=lambda x: x[0])
        interval_index = 0
        for i in sorted_intervals:
            if i[0] > sorted_intervals[interval_index][1]:
                # No overlap: start a new merged interval.
                interval_index += 1
                sorted_intervals[interval_index] = i
            else:
                # Overlap: extend the current merged interval.
                sorted_intervals[interval_index] = [sorted_intervals[interval_index][0], i[1]]
        return sorted_intervals[:interval_index + 1]

    # Index labels where this row and the two rows before it are all zero.
    # This relies on the default 0-based integer index.
    end_ids = df[df['value'].rolling(3).apply(lambda x: (x == 0).all()) == 1].index
    start_ids = end_ids - 3
    intervals = merge_intervals([*zip(start_ids, end_ids)])

    for i, interval in enumerate(intervals):
        df[interval[0] + 1:interval[1] + 1].to_csv('df_' + str(i) + '.csv')

Not the prettiest code, but it works. The merge function was found here: Merging overlapping intervals in Python
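To see why the merge step matters: a run of four zeros produces two overlapping three-row windows, and these have to be fused into one interval before slicing. The following is a minimal sketch rather than the answerer's code. The helper name merge_overlapping is hypothetical, and the (start, end) pairs are the ones the rolling step would be expected to give for the question's data, assuming a default 0-based integer index.

```python
def merge_overlapping(intervals):
    """Merge (start, end) pairs that overlap; returns a new list of lists."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend its end.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


# Windows ending at rows 9, 16, 17 and 24 (start = end - 3).
windows = [(6, 9), (13, 16), (14, 17), (21, 24)]
merged = merge_overlapping(windows)
print(merged)  # [[6, 9], [13, 17], [21, 24]]
```

Unlike the function above, this version builds a separate result list instead of overwriting the sorted input while iterating over it, which makes the control flow a little easier to follow.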

慕虎7371278

Find where the values are equal to zero and take a rolling sum of length 3. Find where the rolling sum equals 3. The result lags by 2 positions, so we take the logical or of the result with its -1 shifted and -2 shifted versions.

    mask = df['value'].eq(0).rolling(3).sum().eq(3)
    mask |= mask.shift(-2) | mask.shift(-1)

To get the groups, I take the cumulative sum of the logical negation. This increments on every row outside the mask and stays flat across each masked run of zeros, so each run of zeros gets a different value. By the time I use groupby, the increments elsewhere won't matter, because I will have used the initial mask to look only at the rows that satisfied the condition in the first place.

However, the resulting groups will be a non-contiguous set of integers. Because I don't like that, I used factorize to give these groups unique integer values starting from zero.

    grp_masked = (~mask).cumsum()[mask].factorize()[0]
    g = df[mask].groupby(grp_masked)

Save files

    for grp, d in g:
        d.to_csv(f'df_{grp}.csv', index=False)

Create a dictionary

    df_dict = {grp: d for grp, d in g}

Details

This shows the original DataFrame along with additional columns that show some of what we calculated. (In this table the values for 2/11/19 and 2/12/19 are 0, unlike in the question's data; that run of only two zeros is not selected.)

    import pandas as pd

    group_series = pd.Series(
        grp_masked, df.index[mask], pd.Int64Dtype())

    df_ = df.assign(
        EqZero=df['value'].eq(0),
        Roll2=df['value'].eq(0).rolling(3).sum(),
        Is3=df['value'].eq(0).rolling(3).sum().eq(3),
        Shift=lambda d: d.Is3.shift(-2) | d.Is3.shift(-1),
        Mask=mask,
        PreGrp=(~mask).cumsum(),
        Grp=group_series
    )

    df_

           date  value  EqZero  Roll2    Is3  Shift   Mask  PreGrp   Grp
    0   2/10/19     34   False    NaN  False  False  False       1  <NA>
    1   2/11/19      0    True    NaN  False  False  False       2  <NA>
    2   2/12/19      0    True    2.0  False  False  False       3  <NA>
    3   2/13/19     34   False    2.0  False  False  False       4  <NA>
    4   2/14/19     34   False    1.0  False  False  False       5  <NA>
    5   2/15/19     34   False    0.0  False  False  False       6  <NA>
    6   2/16/19     34   False    0.0  False  False  False       7  <NA>
    7   2/17/19      0    True    1.0  False   True   True       7     0
    8   2/18/19      0    True    2.0  False   True   True       7     0
    9   2/19/19      0    True    3.0   True  False   True       7     0
    10  2/20/19     22   False    2.0  False  False  False       8  <NA>
    11  2/21/19     22   False    1.0  False  False  False       9  <NA>
    12  2/22/19     22   False    0.0  False  False  False      10  <NA>
    13  2/23/19     22   False    0.0  False  False  False      11  <NA>
    14  2/24/19      0    True    1.0  False   True   True      11     1
    15  2/25/19      0    True    2.0  False   True   True      11     1
    16  2/26/19      0    True    3.0   True   True   True      11     1
    17  2/27/19      0    True    3.0   True  False   True      11     1
    18  2/28/19      1   False    2.0  False  False  False      12  <NA>
    19   3/1/19      2   False    1.0  False  False  False      13  <NA>
    20   3/2/19      2   False    0.0  False  False  False      14  <NA>
    21   3/3/19      1   False    0.0  False  False  False      15  <NA>
    22   3/4/19      0    True    1.0  False   True   True      15     2
    23   3/5/19      0    True    2.0  False   True   True      15     2
    24   3/6/19      0    True    3.0   True  False   True      15     2
    25   3/7/19      3   False    2.0  False  False  False      16  <NA>
    26   3/8/19      3   False    1.0  False  False  False      17  <NA>
    27   3/9/19      3   False    0.0  False  False  False      18  <NA>
    28  3/10/19      0    True    1.0  False  False  False      19  <NA>
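For comparison, the same selection can also be sketched by labelling each run of equal zero/non-zero flags and keeping only the zero runs of length 3 or more. This is a different technique from the rolling-sum mask in this answer, offered only as a rough illustration. The names run_id, run_len and sub_frames are hypothetical, the data is a shortened excerpt of the question's sample, and the result is kept in a dictionary instead of being written to files.

```python
import io

import pandas as pd

# A shortened excerpt of the sample data from the question.
csv_text = """date,value
2/16/19,34
2/17/19,0
2/18/19,0
2/19/19,0
2/20/19,22
2/24/19,0
2/25/19,0
2/26/19,0
2/27/19,0
2/28/19,1
3/10/19,0
"""
df = pd.read_csv(io.StringIO(csv_text))

is_zero = df["value"].eq(0)
# A new run starts wherever is_zero differs from the previous row.
run_id = is_zero.ne(is_zero.shift()).cumsum()
# Length of the run that each row belongs to.
run_len = is_zero.groupby(run_id).transform("size")
# Keep only rows that are zero and sit in a run of at least 3 rows.
keep = is_zero & run_len.ge(3)

# One sub-DataFrame per qualifying run, numbered from 1.
sub_frames = {
    f"df{n}": part
    for n, (_, part) in enumerate(df[keep].groupby(run_id[keep]), start=1)
}

for name, part in sub_frames.items():
    print(name, part["date"].tolist())
```

With this excerpt, the lone zero on 3/10/19 belongs to a run of length 1 and is therefore left out. Each entry of sub_frames could then be saved with to_csv in the same way as the loop in this answer.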