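I am writing an artificial intelligence application that uses data from the Contraceptive Method Choice data set in the UCI Machine Learning Repository to determine which factors lead to a couple having 0 children. Citation: Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science. I am having trouble writing a lambda expression with the pandas apply function.
I am not sure what to try.
Here is a sample of the file: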
wife's age, wife's education, husband's education, number of children, wife's religion, wife now working, husband's occupation, standard-of-living index, media exposure, contraceptive method used
24,2,3,3,1,1,2,3,0,1
45,1,3,10,1,1,3,4,0,1
43,2,3,7,1,1,3,4,0,1
42,3,2,9,1,1,3,3,0,1
36,3,3,8,1,1,3,2,0,1
19,4,4,0,1,1,3,3,0,1
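One thing that may be worth checking is how pandas parses this header, since every comma in it is followed by a space. The sketch below is only a diagnostic under assumptions: it reads an in-memory copy of two of the sample rows through `io.StringIO` (a stand-in for the real `cmc.data.txt`, which is assumed to have the same header), and the variable names `sample`, `raw`, `has_exact_name` and `has_padded_name` are made up for illustration.

```python
import io

import pandas as pd

# In-memory stand-in for the first rows of cmc.data.txt.
# Assumption: the real file has this same header, with a space after each comma.
sample = io.StringIO(
    "wife's age, wife's education, husband's education, number of children, "
    "wife's religion, wife now working, husband's occupation, "
    "standard-of-living index, media exposure, contraceptive method used\n"
    "24,2,3,3,1,1,2,3,0,1\n"
    "19,4,4,0,1,1,3,3,0,1\n"
)

# Same call as in a plain read: no whitespace handling is requested.
raw = pd.read_csv(sample, sep=',')

# Show the column names exactly as pandas stored them.
print([repr(name) for name in raw.columns])

# Compare the name used in the lookup with the name that carries a leading space.
has_exact_name = 'number of children' in raw.columns
has_padded_name = ' number of children' in raw.columns
print(has_exact_name, has_padded_name)
```

If the padded name is the one that exists, a lookup such as `row['number of children']` has no matching label, which could be where the failed label lookup comes from.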
Here is my code:
# import modules
import pandas as pd

# define functions
def read_datafile():
    d = pd.read_csv('cmc.data.txt', sep=',')
    return d

def create_bin_label(data):
    data['numchildren'] = data.apply(lambda row: 1 if (row['number of children']) <= 0 else 0, axis=1)
    data = data.drop(['number of children'], axis=1)

# read in datafile
data = read_datafile()
print(len(data))

# create a binary label column and delete the old column
bl = create_bin_label(data)
print(data.head())
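I expect create_bin_label(data) to pick out one value from a numeric attribute: the number of children can be any number, but I only care about 0. I also expect it to add the column 'numchildren' as a binary label, and to delete the old column (called 'number of children'). What create_bin_label(data) actually does is raise an error that looks like this (I think the important part is that some str is being treated as an int, but I am not sure where that happens):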
Traceback (most recent call last):
  File "C:\Users\Hezekiah\PycharmProjects\Artificial Intelligence 0\venv\lib\site-packages\pandas\core\indexes\base.py", line 4381, in get_value
    return libindex.get_value_box(s, key)
  File "pandas\_libs\index.pyx", line 52, in pandas._libs.index.get_value_box
  File "pandas\_libs\index.pyx", line 48, in pandas._libs.index.get_value_at
  File "pandas\_libs\util.pxd", line 113, in pandas._libs.util.get_value_at
  File "pandas\_libs\util.pxd", line 98, in pandas._libs.util.validate_indexer
TypeError: 'str' object cannot be interpreted as an integer

During handling of the above exception, another exception occurred:
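
For reference, here is a minimal sketch of the behavior described above, not a confirmed fix. It rests on two assumptions: that the error comes from the space after each comma in the header (so the stored column name would be ' number of children'), and that it is acceptable for `read_datafile` to take a `source` argument, which is a hypothetical change to the original signature made so the example can run on an in-memory sample instead of `cmc.data.txt`. `skipinitialspace` is a standard `pandas.read_csv` parameter; the vectorized comparison is swapped in for the `apply`/lambda approach.

```python
import io

import pandas as pd


def read_datafile(source):
    # skipinitialspace=True drops the blank that follows each delimiter,
    # so the header yields 'number of children' without a leading space.
    return pd.read_csv(source, sep=',', skipinitialspace=True)


def create_bin_label(data):
    # Work on a copy so the caller's DataFrame is left unchanged.
    labeled = data.copy()
    # 1 where there are no children, 0 otherwise (vectorized, no apply needed).
    labeled['numchildren'] = (labeled['number of children'] <= 0).astype(int)
    # Drop the original count column and return the result explicitly.
    return labeled.drop(columns=['number of children'])


# In-memory stand-in for cmc.data.txt (assumption: same header as the real file).
sample = io.StringIO(
    "wife's age, wife's education, husband's education, number of children, "
    "wife's religion, wife now working, husband's occupation, "
    "standard-of-living index, media exposure, contraceptive method used\n"
    "24,2,3,3,1,1,2,3,0,1\n"
    "19,4,4,0,1,1,3,3,0,1\n"
)

data = read_datafile(sample)
bl = create_bin_label(data)
print(bl.head())
```

With the real file, the call would be `read_datafile('cmc.data.txt')`. Note that the function returns the new DataFrame, so the result has to be read from `bl` rather than from `data`.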