Converting a column of object values to float or int. ValueError: invalid literal for int()

dfproduction = pd.read_csv('https://raw.githubusercontent.com/chessybo/Oil-Spill-map/master/Oil%20Spill%20Data%20-%20Crude%20Oil%2C%20Gas%20Well%20Liquids%20or%20Associated%20Products%20(H-8)/production%20data/Crude%20Oil%20Production%20and%20Well%20Counts%20(since%201935).csv', encoding='utf-8')

I want to convert this data (specifically the column "Crude Oil Production (Mbbl)") to a numeric type such as float or int.

Currently the dtype is object:


    print(dfproduction.dtypes)

    MasterYear                                  int64
    Crude Oil Production (Mbbl)                object
    Daily Avg. Production (Mbbl/day)           object
    Number of Producing Wells                  object
    Percent Change in Production               object
    Avg. Per Well Production (bbl/day)        float64
    Crude Oil Reserves as of Jan. 1 (Mbbl)     object
    info                                       object
    dtype: object

However, every attempt to do so raises some kind of error.


dfproduction['Crude Oil Production (Mbbl)'].astype('int')

ValueError: invalid literal for int() with base 10: '1,026,765'


dfproduction['Crude Oil Production (Mbbl)'].astype('float')

ValueError: could not convert string to float: '375,617'

Update:

The problem was the commas in the numbers. I removed the commas and re-uploaded the data, but now I get the following error:


UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 83: invalid start byte
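
For what it's worth, byte 0xa0 is the non-breaking space in Latin-1 / cp1252, and on its own it is not valid UTF-8. That suggests (but does not prove) that the re-uploaded file was saved in a single-byte encoding. Below is a minimal sketch under that assumption; the bytes in `raw`, the column names, and the values are made up for illustration and do not come from the actual file.

```python
import io
import pandas as pd

# Hypothetical stand-in for the re-uploaded file: the header contains a
# non-breaking space, stored as the single byte 0xa0 (Latin-1 / cp1252).
raw = 'Year,Production\xa0(Mbbl)\n1935,1026765\n'.encode('latin-1')

# Decoding those bytes as UTF-8 fails, because 0xa0 cannot start a
# UTF-8 sequence.
try:
    raw.decode('utf-8')
except UnicodeDecodeError as exc:
    print('utf-8 failed:', exc.reason)

# Assumption: the file really is Latin-1 (cp1252 treats 0xa0 the same way).
df = pd.read_csv(io.BytesIO(raw), encoding='latin-1')

# Swap the non-breaking space for an ordinary space in the column names
# so that they can be typed normally.
df.columns = df.columns.str.replace('\xa0', ' ', regex=False)
print(df.columns.tolist())
```

If the real file uses a different encoding, `encoding='latin-1'` will still load it without raising, but some characters may come out wrong, so it is worth checking the result by eye.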


陪伴而非守候
1 Answer

斯蒂芬大帝

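Use str.replace() to remove the commas before converting, and assign the result back to the column:

    dfproduction['Crude Oil Production (Mbbl)'] = dfproduction['Crude Oil Production (Mbbl)'].str.replace(',', '').astype('int')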
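
Here is a self-contained sketch of that idea, plus one alternative. It does not download the real CSV; `sample_csv` is a hypothetical two-row sample that reuses the two values from the error messages, and the years in it are made up. The alternative assumes the commas are only ever thousands separators, in which case `read_csv` can parse them itself via its `thousands` parameter.

```python
import io
import pandas as pd

# Hypothetical sample mimicking the comma-formatted column.
# The years are invented; the two values come from the error messages.
sample_csv = io.StringIO(
    'MasterYear,Crude Oil Production (Mbbl)\n'
    '1972,"1,026,765"\n'
    '2010,"375,617"\n'
)
col = 'Crude Oil Production (Mbbl)'

# Option 1: strip the commas from the strings, then convert.
df = pd.read_csv(sample_csv)
df[col] = pd.to_numeric(df[col].str.replace(',', '', regex=False))

# Option 2: let read_csv treat ',' as a thousands separator while parsing.
sample_csv.seek(0)
df_parsed = pd.read_csv(sample_csv, thousands=',')

print(df[col].tolist())
print(df_parsed[col].dtype)
```

If the real column also holds blanks or other non-numeric text, `pd.to_numeric(..., errors='coerce')` turns those entries into NaN instead of raising, at the cost of the column becoming float.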