How to remove the decimal point from a string using pandas

Your problem has nothing to do with Spark or PySpark; it comes from Pandas. Pandas automatically infers each column's data type, and because every value in your column is numeric, it treats the column as float. To avoid this, the pandas.ExcelFile.parse method accepts an argument named converters, which you can use to tell Pandas how to convert specific columns:

    import pandas as pd

    # if you want one specific column as string
    df = pd.concat([filepath_pd.parse(name, converters={'column_name': str}) for name in names])

Or:

    # if you want all columns as string
    # and you have multiple sheets that do not share the same columns;
    # this merges all sheets into one dataframe
    def get_converters(excel_file, sheet_name, dt_cols):
        cols = excel_file.parse(sheet_name).columns
        converters = {col: str for col in cols if col not in dt_cols}
        for col in dt_cols:
            converters[col] = pd.to_datetime
        return converters

    df = pd.concat([filepath_pd.parse(name, converters=get_converters(filepath_pd, name, ['date_column'])) for name in names]).reset_index(drop=True)

Or:

    # if you want all columns as string
    # and all your sheets have the same columns
    cols = filepath_pd.parse().columns
    dt_cols = ['date_column']
    converters = {col: str for col in cols if col not in dt_cols}
    for col in dt_cols:
        converters[col] = pd.to_datetime

    df = pd.concat([filepath_pd.parse(name, converters=converters) for name in names]).reset_index(drop=True)
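To see the effect of converters in isolation, here is a minimal sketch. It is not the code from the question: it assumes made-up sample data and uses pd.read_csv on an in-memory string (read_csv also takes a converters argument) so that it runs without an Excel file. The column names id and amount are hypothetical, and the blank id cell is there on purpose to trigger float inference.

```python
import io

import pandas as pd

# Hypothetical sample data; the blank "id" cell in the second row
# makes pandas infer the whole column as float64.
CSV_TEXT = "id,amount\n1001,5.5\n,7.0\n1003,8.25\n"

# Default inference: the ids become floats, so converting them to
# text afterwards yields a trailing ".0".
inferred = pd.read_csv(io.StringIO(CSV_TEXT))
print(inferred["id"].dtype)
print(str(inferred["id"].iloc[0]))

# With a converter, each raw cell of "id" is passed through str,
# so the value keeps the text it had in the file.
as_text = pd.read_csv(io.StringIO(CSV_TEXT), converters={"id": str})
print(as_text["id"].iloc[0])
```

The same converters dictionary shape, a mapping from column name to a callable, is what the parse calls above receive.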
