Split column values into sections and store the section names in a new column in pandas


I have a column containing multiple product names, for example:

    Contract
0   O.U20
1   O.Z20
2   O.H21
3   O.M21
4   O.U21
5   O.Z21
6   O.H22
7   O.M22
8   S3.U20
9   S3.Z20
10  S6.M26
11  S6.U26
12  S6.Z26
13  S6.H27
14  S9.U26
15  S9.Z26
16  F3.U26
17  F3.Z26
18  F3.H27
19  F6.H26
20  F6.M26
21  F6.U26
22  F9.U20

What I want to do is assign a section name based on the contract name, for example:

    Contract  Sections
0   O.U20     O1
1   O.Z20     O1
2   O.H21     O1
3   O.M21     O1
4   O.U21     O2
5   O.Z21     O2
6   O.H22     O2
7   O.M22     O2
8   S3.U20    S3
9   S3.Z20    S3
10  S6.M26    S6
11  S6.U26    S6
12  S6.Z26    S6
13  S6.H27    S6
14  S9.U26    S9
15  S9.Z26    S9
16  F3.U26    F3
17  F3.Z26    F3
18  F3.H27    F3
19  F6.H26    F6
20  F6.M26    F6
21  F6.U26    F6
22  F9.U20    F9

For the S and F series, I can get the desired result with this code (if there is a better way to do it, please let me know):

df.loc[df['Contract'].str.contains('S3'),'Sections'] = 'S3'
df.loc[df['Contract'].str.contains('S6'),'Sections'] = 'S6'
df.loc[df['Contract'].str.contains('S9'),'Sections'] = 'S9'
df.loc[df['Contract'].str.contains('F3'),'Sections'] = 'F3'
df.loc[df['Contract'].str.contains('F6'),'Sections'] = 'F6'
df.loc[df['Contract'].str.contains('F9'),'Sections'] = 'F9'

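This works because it simply matches a substring and assigns the section name. Unfortunately, the O series has no number attached, so I have to split it into chunks of 4 rows, as shown above: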

    Contract  Sections
0   O.U20     O1
1   O.Z20     O1
2   O.H21     O1
3   O.M21     O1
4   O.U21     O2
5   O.Z21     O2
6   O.H22     O2
7   O.M22     O2

I tried the following code:

df.loc[df['Contract'].str.contains('O'),'Sections'] = df.index // 4+1

but it raises an error:

ValueError: could not broadcast input array from shape (23) into shape (8)
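The shapes in the message line up with the sample data: the mask selects only the 8 O rows, while `df.index // 4 + 1` is computed over all 23 rows. A minimal sketch that reproduces just the length mismatch (it rebuilds the sample frame from the question and does not rerun the failing assignment, since the exact error text may differ between pandas versions):

```python
import pandas as pd

# Sample data copied from the question.
contracts = [
    "O.U20", "O.Z20", "O.H21", "O.M21", "O.U21", "O.Z21", "O.H22", "O.M22",
    "S3.U20", "S3.Z20", "S6.M26", "S6.U26", "S6.Z26", "S6.H27",
    "S9.U26", "S9.Z26", "F3.U26", "F3.Z26", "F3.H27",
    "F6.H26", "F6.M26", "F6.U26", "F9.U20",
]
df = pd.DataFrame({"Contract": contracts})

mask = df["Contract"].str.contains("O")  # True only for the O rows
rhs = df.index // 4 + 1                  # one value per row of the whole frame

n_selected = int(mask.sum())  # rows on the left-hand side of the assignment
n_values = len(rhs)           # values offered on the right-hand side
print(n_selected, n_values)
```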

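How can I get this result in a better, more efficient way? Note that this is only sample data; the original dataset has many more values like these.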

烙印99


2 Answers

www说

Change your code to:

df.loc[df['Contract'].str.contains('O'),'Sections'] = 'O' + ((df['Contract'].str.contains('O').cumsum().sub(1)//4) + 1).astype(str)
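A worked sketch of the cumulative-sum idea on the question's sample data: the running count of O rows numbers them 1 to 8, and subtracting 1 before the integer division by 4 turns that into chunk numbers 1 and 2. The names `is_o` and `block` are made up for this illustration, and `str.startswith('O.')` is used here as a slightly stricter assumption than `str.contains('O')`.

```python
import pandas as pd

# Sample data copied from the question.
contracts = [
    "O.U20", "O.Z20", "O.H21", "O.M21", "O.U21", "O.Z21", "O.H22", "O.M22",
    "S3.U20", "S3.Z20", "S6.M26", "S6.U26", "S6.Z26", "S6.H27",
    "S9.U26", "S9.Z26", "F3.U26", "F3.Z26", "F3.H27",
    "F6.H26", "F6.M26", "F6.U26", "F9.U20",
]
df = pd.DataFrame({"Contract": contracts})

# Boolean mask for the O series (assumption: O contracts start with "O.").
is_o = df["Contract"].str.startswith("O.")

# Running count of O rows (1, 2, 3, ...), shifted to start at 0,
# then grouped into chunks of 4 and renumbered from 1.
block = is_o.cumsum().sub(1) // 4 + 1

# Assign only on the O rows; pandas aligns the right-hand side by index.
df.loc[is_o, "Sections"] = "O" + block[is_o].astype(str)

o_sections = df.loc[is_o, "Sections"].tolist()
print(o_sections)
```

Rows outside the O series are left as missing values in `Sections` here, so this only covers the O part of the problem.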


函数式编程

To simplify

df.loc[df['Contract'].str.contains('S3'),'Sections'] = 'S3'
df.loc[df['Contract'].str.contains('S6'),'Sections'] = 'S6'
df.loc[df['Contract'].str.contains('S9'),'Sections'] = 'S9'
df.loc[df['Contract'].str.contains('F3'),'Sections'] = 'F3'
df.loc[df['Contract'].str.contains('F6'),'Sections'] = 'F6'
df.loc[df['Contract'].str.contains('F9'),'Sections'] = 'F9'

just replace it with the following one line of code:

df['Sections'] = df['Contract'].str.split('.').str[0]

Note that for the O series this yields plain 'O', so those rows still need the chunk numbering.
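Putting the two pieces together, one possible sketch takes the prefix before the dot for every row and then numbers only the O rows in chunks of 4. It uses `groupby(...).cumcount()` for the position within each prefix, which is a swapped-in alternative to the cumulative sum, not something either answer proposes. The chunk size of 4 comes from the question; `prefix`, `pos`, and `is_o` are illustrative names.

```python
import pandas as pd

# Sample data copied from the question.
contracts = [
    "O.U20", "O.Z20", "O.H21", "O.M21", "O.U21", "O.Z21", "O.H22", "O.M22",
    "S3.U20", "S3.Z20", "S6.M26", "S6.U26", "S6.Z26", "S6.H27",
    "S9.U26", "S9.Z26", "F3.U26", "F3.Z26", "F3.H27",
    "F6.H26", "F6.M26", "F6.U26", "F9.U20",
]
df = pd.DataFrame({"Contract": contracts})

# Everything before the first "." is the raw section prefix (O, S3, F6, ...).
prefix = df["Contract"].str.split(".").str[0]
df["Sections"] = prefix

# 0-based position of each row within its own prefix group.
pos = df.groupby(prefix).cumcount()

# Only the bare "O" prefix needs a chunk number appended.
is_o = prefix.eq("O")
df.loc[is_o, "Sections"] = "O" + (pos[is_o] // 4 + 1).astype(str)

sections = df["Sections"].tolist()
print(df)
```

Because `cumcount` counts within each group, this does not depend on the O rows sitting at the top of the frame, though it does assume they already appear in the order in which they should be chunked.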

