倚天杖
The separator in your data is not a hyphen (`-`) but an en dash (`–`), so pass the en dash to `Series.str.split`, with `expand=True` to get a `DataFrame`:

    import numpy as np
    import pandas as pd

    data = ['Jun–Nov 1976', 'Sep–Oct 1976', 'Jun 1977', 'Jul–Oct 1979',
            'Nov 1994', 'Nov 1994–Feb 1995', 'Jan–Jul 1995', 'Jan–Mar 1996',
            'Jul 1996–Jan 1997', 'Oct 2000–Feb 2001', 'Oct 2001–Mar 2002',
            'Oct 2001–Mar 2002', 'Oct 2001–Mar 2002', 'Oct 2001–Mar 2002',
            'Oct 2001–Mar 2002', 'Dec 2002–Apr 2003', 'Dec 2002–Apr 2003',
            'Dec 2002–Apr 2003', 'Oct–Dec 2003', 'Apr–Jun 2004']

    ebola = pd.DataFrame(data, columns=['Date range'])

    # Split once on the en dash; n must be passed by keyword in pandas 2.0+
    ebola1 = ebola['Date range'].str.split('–', n=1, expand=True)
    ebola1.columns = ['start date', 'end date']

Then use `numpy.where` to append the year taken from `end date` with `Series.str.extract`, but only where `start date` does not already contain a year, which is tested with `Series.str.contains`:

    # True where the start date already contains a digit, i.e. a year
    mask = ebola1['start date'].str.contains(r'\d')
    # First run of digits in the end date, i.e. the year
    years = ebola1['end date'].str.extract(r'(\d+)', expand=False)
    ebola1['start date'] = np.where(mask,
                                    ebola1['start date'],
                                    ebola1['start date'] + ' ' + years)
    print(ebola1)

        start date  end date
    0     Jun 1976  Nov 1976
    1     Sep 1976  Oct 1976
    2     Jun 1977      None
    3     Jul 1979  Oct 1979
    4     Nov 1994      None
    5     Nov 1994  Feb 1995
    6     Jan 1995  Jul 1995
    7     Jan 1996  Mar 1996
    8     Jul 1996  Jan 1997
    9     Oct 2000  Feb 2001
    10    Oct 2001  Mar 2002
    11    Oct 2001  Mar 2002
    12    Oct 2001  Mar 2002
    13    Oct 2001  Mar 2002
    14    Oct 2001  Mar 2002
    15    Dec 2002  Apr 2003
    16    Dec 2002  Apr 2003
    17    Dec 2002  Apr 2003
    18    Oct 2003  Dec 2003
    19    Apr 2004  Jun 2004
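If you need this more than once, the two steps can be wrapped in a small function. This is only a sketch: `split_date_range` is a hypothetical helper name, not part of pandas, and it assumes at least one value contains an en dash so that the split yields two columns.

```python
import numpy as np
import pandas as pd


def split_date_range(ranges: pd.Series) -> pd.DataFrame:
    """Split 'Mon–Mon YYYY' strings into start/end columns (hypothetical helper)."""
    # Split once on the en dash; single dates get a missing end value.
    # Assumption: at least one row contains an en dash, so two columns result.
    parts = ranges.str.split('–', n=1, expand=True)
    parts.columns = ['start date', 'end date']
    # True where the start date already carries a year (contains a digit).
    has_year = parts['start date'].str.contains(r'\d')
    # First run of digits in the end date, i.e. the year.
    years = parts['end date'].str.extract(r'(\d+)', expand=False)
    # Borrow the year from the end date only where the start date lacks one.
    parts['start date'] = np.where(has_year,
                                   parts['start date'],
                                   parts['start date'] + ' ' + years)
    return parts


sample = pd.Series(['Jun–Nov 1976', 'Jun 1977', 'Nov 1994–Feb 1995'])
out = split_date_range(sample)
print(out)
```

Rows without an en dash keep their start date unchanged and get a missing value in `end date`, the same as rows 2 and 4 in the output above.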