How to concatenate multiple CSVs into an xarray and define the coordinates?

Recall that although xarray introduces labels in the form of dimensions, coordinates and attributes on top of raw NumPy-like arrays, it is inspired by pandas. So, to answer the question, you can proceed as follows.

    from glob import glob
    import numpy as np
    import pandas as pd

    # data_path is the folder that holds the csv files
    # Get the list of all the csv files in data_path
    csv_flist = glob(data_path + "/*.csv")

    df_list = []
    for _file in csv_flist:
        # get the file name from the data path
        file_name = _file.split("/")[-1]

        # extract the date from a file name, e.g. "data.2018-06-01.csv"
        date = file_name.split(".")[1]

        # read the data in _file
        df = pd.read_csv(_file)

        # add a date column, knowing that all the data in df are recorded at the same date
        df["date"] = np.repeat(date, df.shape[0])
        df["date"] = df.date.astype("datetime64[ns]")  # convert the date column to a proper datetime dtype

        # append df to df_list
        df_list.append(df)

Let's check, for example, the first df in df_list:

    print(df_list[0])

        status  user_id  weight       date
    0  healthy        1      72 2019-06-01
    1    obese        2     103 2019-06-01

Concatenate all the dfs along axis=0:

    df_all = pd.concat(df_list, ignore_index=True).sort_index()
    print(df_all)

        status  user_id  weight       date
    0  healthy        1      72 2019-06-01
    1    obese        2     103 2019-06-01
    2  healthy        1      70 2018-06-01
    3  healthy        2      90 2018-06-01

Set the index of df_all to a two-level MultiIndex with levels[0] = "date" and levels[1] = "user_id":

    data = df_all.set_index(["date", "user_id"]).sort_index()
    print(data)

                         status  weight
    date       user_id
    2018-06-01 1        healthy      70
               2        healthy      90
    2019-06-01 1        healthy      72
               2          obese     103

Subsequently, you can convert the resulting pandas.DataFrame into an xarray.Dataset using .to_xarray(), as follows:

    xds = data.to_xarray()
    print(xds)

    <xarray.Dataset>
    Dimensions:  (date: 2, user_id: 2)
    Coordinates:
      * date     (date) datetime64[ns] 2018-06-01 2019-06-01
      * user_id  (user_id) int64 1 2
    Data variables:
        status   (date, user_id) object 'healthy' 'healthy' 'healthy' 'obese'
        weight   (date, user_id) int64 70 90 72 103

This should fully answer the question.
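For anyone who wants to try the steps end to end without real data, here is a minimal sketch that wraps them in one function and then selects a value by coordinate label. The function name csvs_to_dataset, the temporary sample files and the "data.<date>.csv" naming pattern are assumptions made for illustration, not part of the question. It also assigns the date with pd.Timestamp instead of np.repeat, which should give the same result, and it needs xarray installed because .to_xarray() imports it.

```python
import tempfile
from pathlib import Path

import pandas as pd


def csvs_to_dataset(data_dir):
    """Read every csv in data_dir and return one Dataset indexed by (date, user_id)."""
    frames = []
    for path in sorted(Path(data_dir).glob("*.csv")):
        # "data.2018-06-01.csv" -> "2018-06-01" (assumed file naming pattern)
        date = path.name.split(".")[1]
        df = pd.read_csv(path)
        # a scalar Timestamp is broadcast to every row of the frame
        df["date"] = pd.Timestamp(date)
        frames.append(df)
    combined = pd.concat(frames, ignore_index=True)
    # the two MultiIndex levels become the coordinates of the Dataset
    return combined.set_index(["date", "user_id"]).sort_index().to_xarray()


# hypothetical sample files that mirror the tables in the answer
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "data.2018-06-01.csv").write_text(
        "status,user_id,weight\nhealthy,1,70\nhealthy,2,90\n"
    )
    Path(tmp, "data.2019-06-01.csv").write_text(
        "status,user_id,weight\nhealthy,1,72\nobese,2,103\n"
    )
    xds = csvs_to_dataset(tmp)

# label-based lookup on the coordinates defined by the index
weight = xds["weight"].sel(date="2019-06-01", user_id=2).item()
print(weight)  # prints 103
```

Sorting the file list is only there to make the read order reproducible; the final sort_index() already fixes the order of the coordinates.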

