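I have the following code, which is meant to create two separate tables from a single DataFrame. A different filter is applied to each table. What I'm seeing is that once the first filter is applied, the original DataFrame appears to be "changed".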
import pandas as pd

df_orig = pd.read_excel('JRMaster.xlsm')
# Normalise column names and the two text columns to upper case
df_orig.columns = map(str.upper, df_orig.columns)
df_orig['SYSTEM'] = df_orig['SYSTEM'].str.upper()
df_orig['STATUS'] = df_orig['STATUS'].str.upper()
# Two independent copies, one per table
df = df_orig.copy(deep=True)
df_copy_all = df_orig.copy(deep=True)
# Table 1: paid in October 2020; table 2: sent in October 2020
df = df[(df['DATE PAID'].dt.month.between(10,10)) & (df['DATE PAID'].dt.year == 2020)]
df2 = df_copy_all[(df_copy_all['DATE SENT'].dt.month.between(10,10)) & (df['DATE SENT'].dt.year == 2020)]
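df and df2 should give two different results, but the output is identical. I have tried both df.copy() and df.copy(deep=True).
I'm using Pandas 1.0.5 and Python 3.6.
Some forum posts suggest this is a bug, but I wanted to check whether there is a workaround or a fix.
The other approach I thought of is to read the original Excel file into several DataFrames, but that seems unsustainable and resource-heavy.
Edit:
Sample data: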
    System   DATE SENT  STATUS   DATE PAID
0      One  2020-10-01    OPEN         NaT
1      One  2020-10-01    OPEN         NaT
2    THREE  2020-10-01      SR  2020-10-07
3      One  2020-10-01     DUP         NaT
4      One  2020-10-01    OPEN         NaT
5      One  2020-10-01    OPEN         NaT
6    THREE  2020-10-01    OPEN         NaT
7      One  2020-10-01     DUP         NaT
8    THREE  2020-10-01      AR  2020-07-31
9    THREE  2020-10-01    OPEN         NaT
10     One  2020-10-01      AR  2020-08-21
11     One  2020-10-01     DUP         NaT
12     One  2020-10-01    OPEN         NaT
13     One  2020-10-01     DUP         NaT
14     One  2020-10-01     DUP         NaT
15     One  2020-10-01     DUP         NaT
16     One  2020-10-01     DUP         NaT
17   THREE  2020-10-01    OPEN         NaT
18     One  2020-10-01    OPEN         NaT
19     One  2020-10-01    OPEN         NaT
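
One possible explanation, offered as a guess rather than a confirmed diagnosis: the second half of the df2 mask in the posted code reads df['DATE SENT'], and by that point df has already been reduced to the paid-in-October rows. pandas aligns the two boolean Series on their index, and rows missing from the shorter one end up False, so df2 can only contain rows that are already in df. The sketch below uses a small hypothetical three-row frame in place of the Excel file (the names paid_mask, mixed_mask, sent_mask and df2_as_posted are made up for illustration) and compares the mask as posted with one built entirely from the unfiltered frame.

```python
import pandas as pd

# Hypothetical stand-in for JRMaster.xlsm: three rows, all sent in October 2020.
df_orig = pd.DataFrame({
    "DATE SENT": pd.to_datetime(["2020-10-01", "2020-10-01", "2020-10-01"]),
    "DATE PAID": pd.to_datetime(["2020-10-07", None, "2020-07-31"]),
})

# Table 1: rows paid in October 2020. Indexing returns a new frame;
# df_orig itself is not modified.
paid_mask = (df_orig["DATE PAID"].dt.month == 10) & (df_orig["DATE PAID"].dt.year == 2020)
df = df_orig[paid_mask]

# As posted: the year test is taken from the already-filtered `df`,
# so it only carries labels for the rows that survived the first filter.
mixed_mask = (df_orig["DATE SENT"].dt.month == 10) & (df["DATE SENT"].dt.year == 2020)
df2_as_posted = df_orig[mixed_mask]

# Sketch of a fix: build both halves of the mask from the unfiltered frame.
sent_mask = (df_orig["DATE SENT"].dt.month == 10) & (df_orig["DATE SENT"].dt.year == 2020)
df2 = df_orig[sent_mask]

print("original rows:     ", len(df_orig))
print("paid in Oct 2020:  ", len(df))
print("df2 as posted:     ", len(df2_as_posted))
print("df2 with fixed mask:", len(df2))
```

If this is what is happening, no copies are needed at all: both filters can be applied directly to df_orig, since boolean indexing never writes back to the source frame.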