慕的地10843
The key function that can help here is pandas.merge_asof with allow_exact_matches=False. For each row it looks up the most recent earlier row with event == 1 for the same car_id; because exact matches are excluded, an event row is never matched to itself.

    import pandas as pd

    # Named `data` rather than `input` to avoid shadowing the built-in.
    data = pd.DataFrame([
        ["01/01/2019", 1, 43.5863, 7.12993, 0],
        ["01/01/2019", 2, 44.3929, 8.93832, 0],
        ["02/01/2019", 1, 43.5393, 7.03134, 1],
        ["02/01/2019", 2, 39.459462, -0.31228, 0],
        ["03/01/2019", 1, 44.3173, 84.942, 0],
        ["03/01/2019", 2, -12.3284, -9.04522, 1],
        ["04/01/2019", 1, -36.8414, 17.4762, 0],
        ["04/01/2019", 2, 43.542, 10.2958, 0],
        ["05/01/2019", 1, 43.5242, 69.473, 0],
        ["05/01/2019", 2, 37.9382, 23.668, 1],
        ["06/01/2019", 1, 4.4409, 89.218, 1],
        ["06/02/2019", 2, 25.078037, -77.3289, 0],
    ], columns=["date", "car_id", "latitude", "longitude", "event"])

    data['date'] = pd.to_datetime(data['date'])

    # Left: every row. Right: only the rows where an event occurred.
    df = pd.merge_asof(data.set_index('date'),
                       data.loc[data['event'] == 1].set_index('date'),
                       on='date', suffixes=['_l', '_r'], by='car_id',
                       allow_exact_matches=False)

At this point, every row of df already contains the elements needed for the rest of the calculation. Since I am not sure whether your Distance() function accepts a DataFrame, we can use .apply() to append a distance_since_event column.

    def getDistance(lat1, lat2, long1, long2):
        # No earlier event for this car: return a sentinel value.
        if pd.isna(lat2) or pd.isna(long2):
            return -1
        # substitute this with the actual wgs84_geod library that you eventually use
        return ((lat2 - lat1) ** 2 + (long2 - long1) ** 2) ** 0.5

    df['distance_since_event'] = df.apply(
        lambda row: getDistance(row['latitude_l'], row['latitude_r'],
                                row['longitude_l'], row['longitude_r']),
        axis=1)
    print(df)

Output:

        car_id        date  latitude_l  longitude_l  event_l  latitude_r  longitude_r  event_r  distance_since_event
     0       1  2019-01-01   43.586300      7.12993        0         NaN          NaN      NaN             -1.000000
     1       2  2019-01-01   44.392900      8.93832        0         NaN          NaN      NaN             -1.000000
     2       1  2019-02-01   43.539300      7.03134        1         NaN          NaN      NaN             -1.000000
     3       2  2019-02-01   39.459462     -0.31228        0         NaN          NaN      NaN             -1.000000
     4       1  2019-03-01   44.317300     84.94200        0     43.5393      7.03134      1.0             77.914544
     5       2  2019-03-01  -12.328400     -9.04522        1         NaN          NaN      NaN             -1.000000
     6       1  2019-04-01  -36.841400     17.47620        0     43.5393      7.03134      1.0             81.056474
     7       2  2019-04-01   43.542000     10.29580        0    -12.3284     -9.04522      1.0             59.123402
     8       1  2019-05-01   43.524200     69.47300        0     43.5393      7.03134      1.0             62.441662
     9       2  2019-05-01   37.938200     23.66800        1    -12.3284     -9.04522      1.0             59.974043
    10       1  2019-06-01    4.440900     89.21800        1     43.5393      7.03134      1.0             91.012812
    11       2  2019-06-02   25.078037    -77.32890        0     37.9382     23.66800      1.0            101.812365

From here you can rename or drop columns as needed.
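The distance above is only a Euclidean placeholder computed on raw degrees. As a rough sketch of a more realistic stand-in, the function below uses the haversine formula. This is an assumption on my part, not the wgs84_geod routine mentioned in the answer: it treats the Earth as a sphere with a mean radius of 6371.0088 km, so it only approximates a true WGS84 ellipsoidal distance. The name get_distance_haversine is hypothetical; the argument order and the -1 sentinel deliberately mirror getDistance.

```python
import math

# Assumption: spherical Earth with the mean radius in kilometres.
# This is NOT the WGS84 ellipsoid, only an approximation of it.
EARTH_RADIUS_KM = 6371.0088


def get_distance_haversine(lat1, lat2, long1, long2):
    """Great-circle distance in km between (lat1, long1) and (lat2, long2).

    Arguments are in degrees and follow the same order as getDistance.
    Returns -1 when the second point is missing (None or NaN).
    """
    if lat2 is None or long2 is None or math.isnan(lat2) or math.isnan(long2):
        return -1.0

    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(long2 - long1)

    # Haversine formula; clamp to 1.0 to guard against rounding error.
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))
```

Because the signature matches, it could be passed to the same .apply() call in place of getDistance; the resulting column would then be in kilometres rather than in degrees.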