DataFrame: cell level: convert comma-separated strings to lists

Use pandas.Series.str.split to split each string into a list.

    # use str.split on the column
    df.mgrs_grids = df.mgrs_grids.str.split(',')

    # display(df)
       driver_code  journey_code  mgrs_grids
    0  7211863  7211863-140  [18TWL927129, 18TWL888113, 18TWL888113, 18TWL887113, 18TWL888113, 18TWL887113, 18TWL887113, 18TWL887113, 18TWL903128]
    1  7211863  7211863-105  [18TWL927129, 18TWL939112, 18TWL939112, 18TWL939113, 18TWL939113, 18TWL939113, 18TWL939113, 18TWL939113, 18TWL939113, 18TWL960111, 18TWL960112]
    2  7211863  7211863-50   [18TWL927129, 18TWL889085, 18TWL889085, 18TWL888085, 18TWL888085, 18TWL888085, 18TWL888085, 18TWL888085, 18TWL890085]
    3  7211863  7211863-109  [18TWL927129, 18TWL952106, 18TWL952106, 18TWL952106, 18TWL952106, 18TWL952106, 18TWL952106, 18TWL952106, 18TWL952105, 18TWL951103]

    print(type(df.loc[0, 'mgrs_grids']))

    [out]:
    <class 'list'>

Separate row for each value

Once the column holds lists, use pandas.DataFrame.explode to create a separate row for each value in each list.

    # get a separate row for each value
    df = df.explode('mgrs_grids').reset_index(drop=True)

    # display(df.head())
       driver_code  journey_code  mgrs_grids
    0  7211863  7211863-140  18TWL927129
    1  7211863  7211863-140  18TWL888113
    2  7211863  7211863-140  18TWL888113
    3  7211863  7211863-140  18TWL887113
    4  7211863  7211863-140  18TWL888113

Update

Here is another option, which prepends 'journey_code' to 'mgrs_grids' and then splits the combined string into a list. The list is assigned back to 'mgrs_grids', but it could also be assigned to a new column.

    # add the journey code to mgrs_grids and then split
    df.mgrs_grids = (df.journey_code + ',' + df.mgrs_grids).str.split(',')

    # display(df.head())
       driver_code  journey_code  mgrs_grids
    0  7211863  7211863-140  [7211863-140, 18TWL927129, 18TWL888113, 18TWL888113, 18TWL887113, 18TWL888113, 18TWL887113, 18TWL887113, 18TWL887113, 18TWL903128]
    1  7211863  7211863-105  [7211863-105, 18TWL927129, 18TWL939112, 18TWL939112, 18TWL939113, 18TWL939113, 18TWL939113, 18TWL939113, 18TWL939113, 18TWL939113, 18TWL960111, 18TWL960112]
    2  7211863  7211863-50   [7211863-50, 18TWL927129, 18TWL889085, 18TWL889085, 18TWL888085, 18TWL888085, 18TWL888085, 18TWL888085, 18TWL888085, 18TWL890085]
    3  7211863  7211863-109  [7211863-109, 18TWL927129, 18TWL952106, 18TWL952106, 18TWL952106, 18TWL952106, 18TWL952106, 18TWL952106, 18TWL952106, 18TWL952105, 18TWL951103]

    # output to nested list
    df.mgrs_grids.tolist()

    [out]:
    [['7211863-140', '18TWL927129', '18TWL888113', '18TWL888113', '18TWL887113', '18TWL888113', '18TWL887113', '18TWL887113', '18TWL887113', '18TWL903128'],
     ['7211863-105', '18TWL927129', '18TWL939112', '18TWL939112', '18TWL939113', '18TWL939113', '18TWL939113', '18TWL939113', '18TWL939113', '18TWL939113', '18TWL960111', '18TWL960112'],
     ['7211863-50', '18TWL927129', '18TWL889085', '18TWL889085', '18TWL888085', '18TWL888085', '18TWL888085', '18TWL888085', '18TWL888085', '18TWL890085'],
     ['7211863-109', '18TWL927129', '18TWL952106', '18TWL952106', '18TWL952106', '18TWL952106', '18TWL952106', '18TWL952106', '18TWL952106', '18TWL952105', '18TWL951103']]
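
For anyone who wants to try the steps end to end, here is a minimal, self-contained sketch. The sample frame is a shortened, hypothetical stand-in for the data in the question, and the column names 'grid_list' and 'with_journey' are introduced only for illustration. It assumes a pandas version that provides DataFrame.explode (0.25 or later).

```python
import pandas as pd

# Hypothetical sample data shaped like the frame above (grid lists shortened).
df = pd.DataFrame({
    'driver_code': [7211863, 7211863],
    'journey_code': ['7211863-140', '7211863-105'],
    'mgrs_grids': ['18TWL927129,18TWL888113,18TWL903128',
                   '18TWL927129,18TWL939112'],
})

# Split each comma-separated string into a list.
# Writing to a new column keeps the original strings available.
df['grid_list'] = df['mgrs_grids'].str.split(',')

# Variant from the update: prepend journey_code, then split.
df['with_journey'] = (df['journey_code'] + ',' + df['mgrs_grids']).str.split(',')

# One row per grid value; reset_index gives a clean 0..n-1 index.
exploded = (
    df[['driver_code', 'journey_code', 'grid_list']]
    .explode('grid_list')
    .reset_index(drop=True)
)

# Nested Python list, one inner list per journey.
nested = df['with_journey'].tolist()

print(exploded)
print(nested)
```

Assigning to new columns, as here, is optional; assigning back to 'mgrs_grids' as in the answer works the same way but overwrites the original strings.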
