I am trying to extract some features from a data frame that looks like this:
feature1:float feature2:float feature3:string succeeded:boolean
I am far from an expert on this topic, but I tried the following:
from sklearn.feature_extraction.text import CountVectorizer
import scipy as sp
vectorizer = CountVectorizer()
vectorizer.fit(small_df.feature3)
X = sp.sparse.hstack((vectorizer.transform(small_df.feature3),
                      small_df[['feature1', 'feature2']]),
                     format='csr')
X_columns = vectorizer.get_feature_names() + df[cols].columns.tolist()
However, I end up with the following error: TypeError: no supported conversion for types: (dtype('int64'), dtype('O'))
Any help would be greatly appreciated!
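One possible cause, offered as a guess rather than a confirmed diagnosis: dtype('O') in the message suggests that the feature1/feature2 slice reaches hstack as an object array rather than as floats, so SciPy cannot find a common type with the int64 count matrix. A minimal sketch of a workaround, assuming those two columns really hold numeric values, is to cast them to float64 before stacking. The toy data and the name numeric_cols below are made up for illustration, and the column names are read from vectorizer.vocabulary_ only to avoid depending on a particular scikit-learn version.

```python
import numpy as np
import pandas as pd
import scipy.sparse as sp
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy data. feature1/feature2 are deliberately stored with
# object dtype to mimic the situation the error message hints at.
small_df = pd.DataFrame({
    "feature1": pd.Series([0.5, 1.5, 2.5], dtype=object),
    "feature2": pd.Series([10.0, 20.0, 30.0], dtype=object),
    "feature3": ["red apple", "green apple", "red pear"],
    "succeeded": [True, False, True],
})
numeric_cols = ["feature1", "feature2"]  # illustrative name, not from the post

# Bag-of-words counts for the text column (a sparse integer matrix).
vectorizer = CountVectorizer()
text_counts = vectorizer.fit_transform(small_df["feature3"])

# Cast the numeric columns to float64 so hstack sees a numeric dtype
# instead of an object array.
numeric = small_df[numeric_cols].astype(np.float64).to_numpy()

# Stack the text counts and the numeric columns side by side.
X = sp.hstack((text_counts, sp.csr_matrix(numeric)), format="csr")

# vocabulary_ maps each term to its column index; sorting by that index
# gives the terms in column order.
vocab = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)
X_columns = vocab + numeric_cols
```

If the cast itself raises an error, the columns probably contain non-numeric entries, and those would need cleaning first (for example with pd.to_numeric and errors='coerce').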