How do I fix AttributeError: 'float' object has no attribute 'split' in Python?

When I run the code below, it raises an error: AttributeError: 'float' object has no attribute 'split'.


I would like to know why this error occurs.


def text_processing(df):
    """=== Lower case ==="""
    # First step is to transform comments into lower case
    df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() if x not in stop_words))
    return df


df = text_processing(df)

Full traceback of the error:

Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\pydevd.py", line 1664, in <module>
    main()
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\pydevd.py", line 1658, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\pydevd.py", line 1068, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/L31307/Documents/FYP P3_Lynn_161015H/FYP 10.10.18 (Wed) still working on it/FYP/dataanalysis/category_analysis.py", line 53, in <module>
    df = text_processing(df)
  File "C:/Users/L31307/Documents/FYP P3_Lynn_161015H/FYP 10.10.18 (Wed) still working on it/FYP/dataanalysis/category_analysis.py", line 30, in text_processing
    df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() if x not in stop_words))
  File "C:\Users\L31307\AppData\Roaming\Python\Python37\site-packages\pandas\core\series.py", line 3194, in apply



至尊宝的传说
2 answers

喵喵时光机

The error points to this line:

    df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split()
                                                           if x not in stop_words))

Here split is used as a method of Python's built-in str class. The error tells you that one or more values in df['content'] are of type float. This could be because there is a null value, i.e. NaN, or a non-null float value.

One workaround, which stringifies the floats, is to apply str to x before calling split:

    df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in str(x).split()
                                                           if x not in stop_words))

Alternatively, and possibly a better solution, be explicit and use a named function with a try / except clause. The function calls x.split() directly, so a float value raises AttributeError and falls into the except branch:

    def converter(x):
        try:
            return ' '.join([word.lower() for word in x.split() if word not in stop_words])
        except AttributeError:
            return None  # or some other value

    df['content'] = df['content'].apply(converter)

Since pd.Series.apply is just a loop with overhead, you may find a list comprehension or map more efficient:

    df['content'] = [converter(x) for x in df['content']]
    df['content'] = list(map(converter, df['content']))
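To make the diagnosis above concrete, here is a minimal sketch that reproduces the error, lists the types of the offending values, and cleans the column with an explicit type check. The sample data, the stop_words set, and the clean helper are all hypothetical and only stand in for the asker's real data.

```python
import numpy as np
import pandas as pd

# Hypothetical stop-word list and sample data, for illustration only.
stop_words = {"the", "a", "is"}
df = pd.DataFrame({"content": ["the Cat is here", np.nan, "a Dog", 3.5]})

# NaN is an ordinary Python float, so calling .split() on it fails.
try:
    np.nan.split()
except AttributeError as exc:
    message = str(exc)

# Select the rows whose value is not a string; these are the ones that break .split().
is_string = df["content"].apply(lambda v: isinstance(v, str))
non_strings = df[~is_string]
bad_types = [type(v).__name__ for v in non_strings["content"]]


def clean(value):
    """Drop stop words and lower-case the rest; return None for non-strings."""
    if not isinstance(value, str):
        return None
    return " ".join(word.lower() for word in value.split() if word not in stop_words)


cleaned = [clean(v) for v in df["content"]]
print(message)
print(bad_types)
print(cleaned)
```

Checking isinstance up front is one alternative to try / except; which of the two reads better is a matter of taste, and returning None for non-strings is only one possible choice.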

GCT1015

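split() is a Python method that only works on strings. It seems that your 'content' column contains not only strings but also other values, such as floats, to which the .split() method cannot be applied.

Try converting each value to a string with str(x).split(), or convert the whole column to strings first, which is more efficient. You can do that as follows:

    df['column_name'] = df['column_name'].astype(str)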
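One caveat with converting the whole column: depending on the pandas version, astype(str) may turn a missing value into the literal text "nan", which would then survive as a word. The sketch below, which assumes hypothetical sample data and a hypothetical stop_words set, fills missing values with an empty string before converting, so that every value is a real string by the time split is called.

```python
import numpy as np
import pandas as pd

# Hypothetical stop-word list and sample data, for illustration only.
stop_words = {"the", "a", "is"}
df = pd.DataFrame({"content": ["the Cat sat", np.nan, 3.5]})

# Replace missing values with "" first, then convert everything else to text.
filled = df["content"].fillna("").astype(str)


def remove_stop_words(text):
    """Drop stop words and lower-case the remaining words."""
    return " ".join(word.lower() for word in text.split() if word not in stop_words)


df["content"] = [remove_stop_words(text) for text in filled]
print(df["content"].tolist())
```

Whether a non-null float such as 3.5 should be kept as the text "3.5" or dropped depends on the data; this sketch keeps it.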