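My data contains two different kinds of missing values (np.nan and None), and I am trying to impute them with SimpleImputer. I can do this in two steps, but I would like to know whether the two steps can be combined into one. My code is below: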
import pandas as pd
import numpy as np
from sklearn.impute import SimpleImputer
train = pd.DataFrame({
'users':[None,'John Johnson',np.nan,'John Smith','Mary Williams','ted bundy'],
})
test = pd.DataFrame({
'users':[None,np.nan,'John Smith','Mary Williams','Andy Rollins'],
})
si1 = SimpleImputer(strategy='constant',fill_value='NAN')
si2 = SimpleImputer(strategy='constant',missing_values = None, fill_value='MISSING')
train_imputed_interim1 = si1.fit_transform(train)
train_imputed = si2.fit_transform(train_imputed_interim1)
# Reuse the imputers fitted on train; only transform the test set
test_imputed_interim1 = si1.transform(test)
test_imputed = si2.transform(test_imputed_interim1)
print('\ntrain_imputed:')
print(train_imputed)
print('\ntest_imputed:')
print(test_imputed)
Is there a way to combine si1 and si2 into a single imputer? I tried:
si = SimpleImputer(strategy='constant',missing_values = [None,np.nan], fill_value='MISSING')
but this does not seem to work.
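One possible workaround, offered as a sketch rather than a definitive answer: as far as I know, missing_values expects a single placeholder value rather than a list, so one option is to normalize both markers to np.nan before imputing and then use one SimpleImputer. The helper name unify_missing is hypothetical (it is not part of pandas or scikit-learn), and the sketch assumes both kinds of missing value should receive the same fill value, which differs from the two-step code above where they get 'NAN' and 'MISSING' respectively.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

train = pd.DataFrame({
    'users': [None, 'John Johnson', np.nan, 'John Smith', 'Mary Williams', 'ted bundy'],
})
test = pd.DataFrame({
    'users': [None, np.nan, 'John Smith', 'Mary Williams', 'Andy Rollins'],
})


def unify_missing(df):
    """Hypothetical helper: return an object array where None and np.nan are both np.nan."""
    values = df.to_numpy(dtype=object, copy=True)
    # pd.isna flags None as well as np.nan, element by element
    values[pd.isna(values)] = np.nan
    return values


# A single imputer; missing_values defaults to np.nan
si = SimpleImputer(strategy='constant', fill_value='MISSING')
train_imputed = si.fit_transform(unify_missing(train))
test_imputed = si.transform(unify_missing(test))

print('\ntrain_imputed:')
print(train_imputed)
print('\ntest_imputed:')
print(test_imputed)
```

If the two markers need to stay distinguishable after imputation, this approach does not apply and the two-step version is still needed.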