How to iterate over a Pandas DataFrame based on some conditions to create a new DataFrame

You can try using a list comprehension together with pd.date_range and explode:

    df['Weighted_revenue'] = (df['Dealsize'].astype(float) / df['Duration'].astype(float)) * df['Probability'].astype(float)
    df['Period'] = [pd.date_range(x, periods=y, freq="M").strftime('%Y-%m') for x, y in zip(df["Start_period"], df["Duration"])]
    df = df.explode('Period')

Output:

    df
      Client     Stage Probability Dealsize  Duration Start_period  Weighted_revenue   Period
    0      A   suspect        0.25     1200         6      2020-08              50.0  2020-08
    0      A   suspect        0.25     1200         6      2020-08              50.0  2020-09
    0      A   suspect        0.25     1200         6      2020-08              50.0  2020-10
    0      A   suspect        0.25     1200         6      2020-08              50.0  2020-11
    0      A   suspect        0.25     1200         6      2020-08              50.0  2020-12
    0      A   suspect        0.25     1200         6      2020-08              50.0  2021-01
    1      B  prospect        0.60     1000         4      2020-10             150.0  2020-10
    1      B  prospect        0.60     1000         4      2020-10             150.0  2020-11
    1      B  prospect        0.60     1000         4      2020-10             150.0  2020-12
    1      B  prospect        0.60     1000         4      2020-10             150.0  2021-01

Details:

First, we create the 'Weighted_revenue' column using the formula you described:

    df['Weighted_revenue'] = (df['Dealsize'].astype(float) / df['Duration'].astype(float)) * df['Probability'].astype(float)

    df
      Client     Stage Probability Dealsize  Duration Start_period  Weighted_revenue
    0      A   suspect        0.25     1200         6      2020-08              50.0
    1      B  prospect        0.60     1000         4      2020-10             150.0

Then we use a list comprehension with zip to create a date range for each row, based on the 'Start_period' and 'Duration' columns:

    df['Period'] = [pd.date_range(x, periods=y, freq="M").strftime('%Y-%m') for x, y in zip(df["Start_period"], df["Duration"])]

    df
      Client     Stage Probability Dealsize  Duration Start_period  Weighted_revenue                                                  Period
    0      A   suspect        0.25     1200         6      2020-08              50.0  [2020-08, 2020-09, 2020-10, 2020-11, 2020-12, 2021-01]
    1      B  prospect        0.60     1000         4      2020-10             150.0                    [2020-10, 2020-11, 2020-12, 2021-01]

Finally, we use explode to expand each list into one row per period:

    df = df.explode('Period')

    df
      Client     Stage Probability Dealsize  Duration Start_period  Weighted_revenue   Period
    0      A   suspect        0.25     1200         6      2020-08              50.0  2020-08
    0      A   suspect        0.25     1200         6      2020-08              50.0  2020-09
    0      A   suspect        0.25     1200         6      2020-08              50.0  2020-10
    0      A   suspect        0.25     1200         6      2020-08              50.0  2020-11
    0      A   suspect        0.25     1200         6      2020-08              50.0  2020-12
    0      A   suspect        0.25     1200         6      2020-08              50.0  2021-01
    1      B  prospect        0.60     1000         4      2020-10             150.0  2020-10
    1      B  prospect        0.60     1000         4      2020-10             150.0  2020-11
    1      B  prospect        0.60     1000         4      2020-10             150.0  2020-12
    1      B  prospect        0.60     1000         4      2020-10             150.0  2021-01
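
If it helps, the steps above can be combined into one self-contained sketch that you can run as-is. A few things here are my own assumptions rather than part of the original answer: the sample DataFrame is reconstructed from the printed output (with the numeric columns stored as strings, which would explain the astype(float) calls), expand_periods is just a hypothetical helper name, freq="MS" (month start) is swapped in for "M" because newer pandas releases (2.2 and later) deprecate the "M" alias in favor of "ME", and the final reset_index is an optional extra to drop the repeated index values that explode leaves behind.

```python
import pandas as pd

# Sample data reconstructed from the output shown in the answer (an assumption).
# Probability and Dealsize are kept as strings to mimic object-dtype columns.
df = pd.DataFrame({
    "Client": ["A", "B"],
    "Stage": ["suspect", "prospect"],
    "Probability": ["0.25", "0.60"],
    "Dealsize": ["1200", "1000"],
    "Duration": [6, 4],
    "Start_period": ["2020-08", "2020-10"],
})


def expand_periods(frame):
    """Hypothetical helper: one output row per client per month of the deal."""
    out = frame.copy()

    # Revenue per month, weighted by the probability of closing the deal.
    out["Weighted_revenue"] = (
        out["Dealsize"].astype(float) / out["Duration"].astype(float)
    ) * out["Probability"].astype(float)

    # For each row, build the list of 'YYYY-MM' labels covering the deal.
    # freq="MS" (month start) is used here instead of the original "M".
    out["Period"] = [
        pd.date_range(start, periods=int(n), freq="MS").strftime("%Y-%m").tolist()
        for start, n in zip(out["Start_period"], out["Duration"])
    ]

    # explode turns each list element into its own row; reset_index is optional.
    return out.explode("Period").reset_index(drop=True)


result = expand_periods(df)
print(result)
```

Both "M" and "MS" produce the same 'YYYY-MM' labels here, since only the year and month survive strftime. If you want to keep the original row index (0, 0, 0, ..., 1, 1, ...) so each row can be traced back to its source deal, just leave out the reset_index call.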
