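I'm looking for a way to insert duplicate rows based on a condition on column values.
The input dataset contains customer prices and each price's validity period in weeks, given by 'price_start_week' and 'price_end_week'.
The idea is to expand the DataFrame by adding a new column with the actual week number, repeating each row once for every week in which its price is valid.
Input: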
╔═══════════════╦══════════════════╦════════════════╦═════════════╗
║ customer_name ║ price_start_week ║ price_end_week ║ price_value ║
╠═══════════════╬══════════════════╬════════════════╬═════════════╣
║ A             ║ 4                ║ 7              ║ 500         ║
║ B             ║ 3                ║ 6              ║ 600         ║
║ C             ║ 2                ║ 4              ║ 700         ║
╚═══════════════╩══════════════════╩════════════════╩═════════════╝
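For anyone who wants to reproduce this, the input above could be built as a pandas DataFrame roughly as follows. This is only a sketch: integer dtypes for the week and price columns are an assumption, since the table does not state them.

```python
import pandas as pd

# Sample input from the table above (integer dtypes are assumed).
df = pd.DataFrame({
    "customer_name": ["A", "B", "C"],
    "price_start_week": [4, 3, 2],
    "price_end_week": [7, 6, 4],
    "price_value": [500, 600, 700],
})

print(df)
```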
Output:
+---------------+------------------+----------------+-------------+-------------+
| customer_name | price_start_week | price_end_week | actual week | price_value |
+---------------+------------------+----------------+-------------+-------------+
| A             | 4                | 7              | 4           | 500         |
| A             | 4                | 7              | 5           | 500         |
| A             | 4                | 7              | 6           | 500         |
| A             | 4                | 7              | 7           | 500         |
| B             | 3                | 6              | 3           | 600         |
| B             | 3                | 6              | 4           | 600         |
| B             | 3                | 6              | 5           | 600         |
| B             | 3                | 6              | 6           | 600         |
| C             | 2                | 4              | 2           | 700         |
| C             | 2                | 4              | 3           | 700         |
| C             | 2                | 4              | 4           | 700         |
+---------------+------------------+----------------+-------------+-------------+
What is the best way to do this?
I was thinking of using apply with a function, something like this:
def repeat(a):
    # a is a single row of df
    if (a['price_start_week']>a['price_end_week']):
        return a['price_start_week']-a['price_end_week']
    ...
# axis=1 passes each row to repeat; axis=0 would pass whole columns
df['actual_week']=df.apply(repeat, axis=1)
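One possible direction, sketched here under some assumptions, is to build the list of valid weeks for each row and then let `DataFrame.explode` turn each list element into its own row. This swaps the `apply` idea for a list comprehension plus `explode`. It assumes pandas 0.25 or newer (where `explode` is available), integer week columns, and an inclusive 'price_end_week'. The name `expanded` is made up for illustration.

```python
import pandas as pd

# Sample input, rebuilt here so the sketch runs on its own.
df = pd.DataFrame({
    "customer_name": ["A", "B", "C"],
    "price_start_week": [4, 3, 2],
    "price_end_week": [7, 6, 4],
    "price_value": [500, 600, 700],
})

# One list of week numbers per row; the end week is treated as inclusive.
weeks = [
    list(range(start, end + 1))
    for start, end in zip(df["price_start_week"], df["price_end_week"])
]

expanded = (
    df.assign(actual_week=weeks)   # attach the per-row lists
      .explode("actual_week")      # one output row per list element
      .reset_index(drop=True)
)

# explode leaves an object column, so cast back to integers.
expanded["actual_week"] = expanded["actual_week"].astype(int)

# Match the column order of the desired output.
expanded = expanded[
    ["customer_name", "price_start_week", "price_end_week",
     "actual_week", "price_value"]
]

print(expanded)
```

A row whose start week is greater than its end week would get an empty list here, which `explode` turns into a single row with a missing week, so the integer cast would fail. Such rows would need to be filtered out or handled separately first.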
梦里花落0921