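I am working on a text mining problem and using Pandas for text processing. From the following example, I need to select only those rows that have the maximum span (start to end) within the same category (cat).
Given this dataframe: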
    name                           start  end  cat
0   coumadin                       0      8    DRUG
1   albuterol                      18     27   DRUG
2   albuterol sulfate              18     35   DRUG
3   sulfate                        28     35   DRUG
4   2.5                            36     39   STRENGTH
5   2.5 mg                         36     42   STRENGTH
6   2.5 mg /3 ml                   36     48   STRENGTH
7   0.083                          50     55   STRENGTH
8   0.083 %                        50     57   STRENGTH
9   2.5 mg /3 ml (0.083 %)         36     58   STRENGTH
10  solution                       59     67   FORM
11  solution for nebulization      59     84   FORM
12  nebulization                   72     84   ROUTE
13  one (1)                        90     97   FREQUENCY
14  neb                            98     101  ROUTE
15  neb inhalation                 98     112  ROUTE
16  inhalation                     102    112  ROUTE
17  q4h                            113    116  FREQUENCY
18  every                          118    123  FREQUENCY
19  every 4 hours                  118    131  FREQUENCY
20  q4h (every 4 hours)            113    132  FREQUENCY
21  q4h (every 4 hours) as needed  113    142  FREQUENCY
22  as needed                      133    142  FREQUENCY
23  dyspnea                        147    154  REASON
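To make the goal concrete, here is a minimal sketch of one possible reading of "maximum span within the same category": a row is dropped when its start/end range lies inside a strictly longer range of the same cat. That interpretation is an assumption, as is the helper name `keep_maximal_spans`; the sample frame is only a subset of the rows above.

```python
import pandas as pd

# A subset of the rows from the question, for illustration only.
df = pd.DataFrame(
    [
        ("coumadin", 0, 8, "DRUG"),
        ("albuterol", 18, 27, "DRUG"),
        ("albuterol sulfate", 18, 35, "DRUG"),
        ("sulfate", 28, 35, "DRUG"),
        ("solution", 59, 67, "FORM"),
        ("solution for nebulization", 59, 84, "FORM"),
        ("nebulization", 72, 84, "ROUTE"),
        ("neb", 98, 101, "ROUTE"),
        ("neb inhalation", 98, 112, "ROUTE"),
        ("inhalation", 102, 112, "ROUTE"),
    ],
    columns=["name", "start", "end", "cat"],
)


def keep_maximal_spans(frame):
    """Hypothetical helper: drop rows contained in a longer same-category span."""
    # Remember each row's index label so covered rows can be dropped later.
    indexed = frame.assign(row=frame.index)
    # Pair every row with every row of the same category (self-join on cat).
    pairs = indexed.merge(indexed, on="cat", suffixes=("", "_other"))
    # A row is covered when the other span contains it and is strictly longer.
    contained = (pairs["start_other"] <= pairs["start"]) & (
        pairs["end_other"] >= pairs["end"]
    )
    longer = (pairs["end_other"] - pairs["start_other"]) > (
        pairs["end"] - pairs["start"]
    )
    covered = pairs.loc[contained & longer, "row"].unique()
    return frame.drop(index=covered)


result = keep_maximal_spans(df)
print(result)
```

The self-join builds every same-category pair, so memory grows quadratically with the size of each category; for short annotation lists like this one that should be acceptable, but a sort-based sweep may be preferable on large frames. Note that "nebulization" (ROUTE) is kept even though it sits inside "solution for nebulization" (FORM), because the comparison is restricted to rows sharing the same cat.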