Python Pandas: resampling specific hours across different days and day ranges

My solution is based on an auxiliary DataFrame that defines the ranges for which the means are to be computed: day_in_week, time_in_day, and the CustomBusinessHour object corresponding to those two attributes. Creating this DataFrame (I called it calendars) starts with the day_in_week and time_in_day columns:

    import pandas as pd

    calendars = pd.DataFrame([
        ['sun',     'morning'],
        ['sun-thu', 'morning'],
        ['sun-thu', 'noon'],
        ['fri-sat', 'noon'],
        ['fri',     'eve']],
        columns=['day_in_week', 'time_in_day'])

If you need more such definitions, add them here.

Then add the corresponding CustomBusinessHour objects.

Define a function to get the hour limits (start and end time):

    def getHourLimits(name):
        if name == 'morning':
            return '06:00', '10:00'
        elif name == 'noon':
            return '11:00', '13:00'
        elif name == 'eve':
            return '18:00', '21:00'
        else:
            return '8:00', '16:00'

Define a function to get the week mask:

    def getWeekMask(name):
        parts = name.split('-')
        if len(parts) > 1:
            # A range such as 'sun-thu': expand it to every day in between
            fullWeek = ['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']
            ind1 = fullWeek.index(parts[0].capitalize())
            ind2 = fullWeek.index(parts[1].capitalize())
            return ' '.join(fullWeek[ind1 : ind2 + 1])
        else:
            # A single day such as 'sun'
            return parts[0].capitalize()

Define a function that generates the CustomBusinessHour object for a row:

    def getCBH(row):
        wkMask = getWeekMask(row.day_in_week)
        hStart, hEnd = getHourLimits(row.time_in_day)
        return pd.offsets.CustomBusinessHour(weekmask=wkMask, start=hStart, end=hEnd)

Add the CustomBusinessHour objects to calendars:

    calendars['CBH'] = calendars.apply(getCBH, axis=1)

Then define a function that computes all required means for a given entity Id (df is the source DataFrame, with entity_id, time and counts columns):

    def getSums(entId):
        outRows = []
        wrk = df[df.entity_id.eq(entId)]    # Filter for entity Id
        for _, row in calendars.iterrows():
            dd = row.day_in_week
            hh = row.time_in_day
            cbh = row.CBH
            # Filter for the current calendar
            cnts = wrk[wrk.time.apply(lambda val: cbh.is_on_offset(val))]
            cnt = cnts.counts.mean()
            if pd.notnull(cnt):
                outRows.append(pd.Series([entId, dd, hh, cnt],
                    index=['entity_id', 'day_in_week', 'time_in_day', 'counts_mean']))
        return pd.DataFrame(outRows)

As you can see, the result contains only non-null means.

To generate the result, run:

    pd.concat([getSums(entId) for entId in df.entity_id.unique()],
        ignore_index=True)

For your data sample (which contains only morning readings), the result is:

       entity_id day_in_week time_in_day  counts_mean
    0        175         sun     morning     6.333333
    1        175     sun-thu     morning     6.333333
    2        178         sun     morning     5.000000
    3        178     sun-thu     morning     5.000000
    4        200         sun     morning     5.000000
    5        200     sun-thu     morning     5.000000
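To see how the is_on_offset filter inside getSums picks rows, here is a minimal sketch on made-up data. The DataFrame below is hypothetical (the question's real df is not reproduced here), and the dates are chosen only because 2023-01-01 was a Sunday. The offset is built by hand with the same values getCBH would produce for the ('sun-thu', 'morning') calendar row.

```python
import pandas as pd

# Hypothetical readings, assumed to have the same columns as the question's df.
df = pd.DataFrame({
    'entity_id': [175, 175, 175, 178],
    'time': pd.to_datetime([
        '2023-01-01 07:00',   # Sunday, inside 06:00-10:00
        '2023-01-01 12:00',   # Sunday, but after the morning window
        '2023-01-02 08:30',   # Monday, inside 06:00-10:00
        '2023-01-06 09:00',   # Friday, right hours but not in Sun-Thu
    ]),
    'counts': [4, 10, 6, 5],
})

# Equivalent to what getCBH returns for day_in_week='sun-thu', time_in_day='morning'.
cbh = pd.offsets.CustomBusinessHour(
    weekmask='Sun Mon Tue Wed Thu', start='06:00', end='10:00')

# True only for timestamps on a masked weekday and within the start/end hours.
mask = df['time'].apply(cbh.is_on_offset)

selected = df[mask]
morning_mean = selected['counts'].mean()
print(selected)
print(morning_mean)
```

Only the Sunday 07:00 and Monday 08:30 rows survive the filter, so the mean is taken over counts 4 and 6. Note that the check runs once per row in Python, so on large frames this step is likely to dominate the run time.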
