How do I read a CSV containing NUL ('\x00') lines into pandas?

The file is full of NUL characters ('\x00'), which need to be removed. After cleaning the lines, use pandas.DataFrame to load the data from d.

    import pandas as pd
    import string  # to make column names

    # the issue is that the file is filled with NUL, not whitespace
    def import_file(filename):
        # open the file and clean it
        with open(filename) as f:
            d = list(f.readlines())
            # replace NUL, strip whitespace from the end of the strings, split each string into a list
            d = [v.replace('\x00', '').strip().split(',') for v in d]
            # remove some empty rows
            d = [v for v in d if len(v) > 2]
        # load the file with pandas
        df = pd.DataFrame(d)
        # convert column 0 and 1 to a datetime
        df['datetime'] = pd.to_datetime(df[0] + ' ' + df[1])
        # drop column 0 and 1
        df.drop(columns=[0, 1], inplace=True)
        # set datetime as the index
        df.set_index('datetime', inplace=True)
        # convert data in columns to floats
        df = df.astype('float')
        # give character column names
        df.columns = list(string.ascii_uppercase)[:len(df.columns)]
        # reset the index
        df.reset_index(inplace=True)
        return df.copy()

    # call the function
    dfs = list()
    filenames = ['67.csv']
    for filename in filenames:
        dfs.append(import_file(filename))

    # combine the per-file frames into one
    df = pd.concat(dfs, ignore_index=True)
    # display() is available in Jupyter/IPython; use print(df) elsewhere
    display(df)

Sample output, displayed with datetime as the index:

                           A    B      C    D    E      F     G     H      I     J     K     L    M    N    O
    datetime
    2020-02-03 15:13:39  5.5  5.8  42.84  7.2  6.8  10.63  60.0   0.0  300.0   1.0  30.0  79.0  0.0  0.0  0.0
    2020-02-03 15:13:49  5.5  5.8  42.84  7.2  6.8  10.63  60.0   0.0  300.0   1.0  30.0  79.0  0.0  0.0  0.0
    2020-02-03 15:13:59  5.5  5.7  34.26  7.2  6.8  10.63  60.0  22.3  300.0   1.0  30.0  79.0  0.0  0.0  0.0
    2020-02-03 15:14:09  5.5  5.7  34.26  7.2  6.8  10.63  60.0  15.3  300.0  45.0  30.0  79.0  0.0  0.0  0.0
    2020-02-03 15:14:19  5.5  5.4  17.10  7.2  6.8  10.63  60.0  50.2  300.0  86.0  30.0  79.0  0.0  0.0  0.0
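
As an alternative to splitting each line by hand, the NUL bytes could be stripped first and the result handed to pandas' own CSV parser. The following is only a minimal sketch under stated assumptions: `read_csv_without_nul` is a hypothetical helper name (not part of pandas), the file is assumed small enough to read into memory at once, and any extra keyword arguments are simply forwarded to `pd.read_csv`.

```python
import io

import pandas as pd


def read_csv_without_nul(path, **read_csv_kwargs):
    """Hypothetical helper: strip NUL bytes, then parse with pd.read_csv."""
    # Read the file as raw bytes so the NULs are removed before any parsing.
    with open(path, "rb") as f:
        raw = f.read()
    cleaned = raw.replace(b"\x00", b"")
    # pd.read_csv accepts a file-like object, so wrap the cleaned bytes.
    return pd.read_csv(io.BytesIO(cleaned), **read_csv_kwargs)
```

For a file without a header row, such as the '67.csv' used above, a call might look like `df = read_csv_without_nul('67.csv', header=None)`. Lines that consisted only of NULs become empty after cleaning and are skipped by `pd.read_csv`'s default `skip_blank_lines=True`. The datetime and column-name steps from the answer would still need to be applied afterwards.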
