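I have data for machine learning research, but I'm stuck on the string features. I want to map them from object to numbers (int64).
For example, for the feature workclass, build a map (dict) such as {'private':0,'State-gov':1, etc}.
So how should I handle this in a DataFrame? Should I write a for loop that finds the n distinct classes in each feature and builds an n-key map for every object feature?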
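The loop described above could be sketched roughly as follows. This is only an illustration under assumptions: the toy frame stands in for trainFeatures.csv, the `mappings` dict is a hypothetical name, and `pd.factorize` is used to assign integer codes in order of first appearance rather than hand-writing each dict.

```python
import pandas as pd

# Hypothetical toy data standing in for trainFeatures.csv
df = pd.DataFrame({
    "age": [39, 50, 38, 53],
    "workclass": ["State-gov", "Private", "Private", "Self-emp"],
    "sex": ["Male", "Male", "Female", "Male"],
})

# String-valued columns to encode (assumed known in advance)
object_features = ["workclass", "sex"]

mappings = {}  # hypothetical name: one label -> code dict per feature
for col in object_features:
    # factorize returns integer codes plus the distinct labels,
    # numbered in order of first appearance (missing values become -1)
    codes, uniques = pd.factorize(df[col])
    mappings[col] = {label: code for code, label in enumerate(uniques)}
    df[col] = codes

print(mappings["workclass"])
print(df.dtypes)
```

Keeping the per-feature dicts around makes it possible to apply the same encoding to a test set later, for example with `Series.map`.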
# Code that reads the data
import pandas as pd
df_trainFeatures = pd.read_csv('data/trainFeatures.csv')
# Names of the string-valued (object) columns
object_features = ['workclass', 'education', 'Marital-status',
                   'occupation', 'relationship', 'race', 'sex', 'native-country']
# List the data type of every column
for i in df_trainFeatures:
    print(df_trainFeatures[i].dtype, i)
Output:
int64 age
object workclass
int64 fnlwgt
object education
int64 education-num
object Marital-status
object occupation
object relationship
object race
object sex
int64 capital-gain
int64 capital-loss
int64 hours-per-week
object native-country
The sub-DataFrame looks like this: