哔哔one
Here is my solution: I use the geopy library to compute the distances. You can compute them with either geodesic() or great_circle(); the function distance is an alias for geodesic. If you prefer another unit, change .km to .miles, .m, or .ft.

```python
from geopy.distance import lonlat, distance, great_circle, geodesic

# df_actual and df_fleet are the DataFrames from the question,
# each with 'Longitude' and 'Latitude' columns.
dmin = []
for index, r in df_actual.iterrows():
    actual = lonlat(r['Longitude'], r['Latitude'])
    # distance from this actual point to every fleet point, keep the smallest
    valmin = df_fleet.apply(
        lambda x: distance(lonlat(x['Longitude'], x['Latitude']), actual).km,
        axis=1,
    ).min()
    dmin.append(valmin)

df_actual['nearest to fleet(km)'] = dmin
print(df_actual)
```

If you want all fleet points within 100 m of each actual point, do:

```python
for ai, a in df_actual.iterrows():
    actual = lonlat(a['Longitude'], a['Latitude'])
    # boolean mask: True for fleet points closer than 100 m
    mask = df_fleet.apply(
        lambda x: distance(lonlat(x['Longitude'], x['Latitude']), actual).meters < 100,
        axis=1,
    )
    print(f"for {(a['Longitude'], a['Latitude'])}")
    print(df_fleet[mask])
```

The last solution is based on a tree, and I think it is much faster. I use scipy.spatial, which finds the nearest points in space and returns their Euclidean distance. I convert lat, lon to x, y, z points in space, so the nearest neighbours match those of the great-circle (haversine) distance. The returned value is the straight-line chord in km, which for nearby points is very close to the great-circle distance. Here I generate two (lat, lon) DataFrames of 15000 and 10000 rows, and for each point of df1 I search for the five nearest points in df2.

```python
from math import radians, sin, cos
from random import uniform

import numpy as np
import pandas as pd
from scipy.spatial import cKDTree

R = 6371  # mean Earth radius in km


def to_cartesian(lat, lon):
    """Convert latitude/longitude in degrees to x, y, z in km."""
    lat, lon = radians(lat), radians(lon)
    x = R * cos(lat) * cos(lon)
    y = R * cos(lat) * sin(lon)
    z = R * sin(lat)
    return x, y, z


def newpoint():
    """Return a random (lon, lat) pair inside a 1-degree square."""
    return uniform(23, 24), uniform(66, 67)


def make_points(n):
    """Build a DataFrame of n random points with lat/lon and x, y, z columns."""
    lon, lat = zip(*(newpoint() for _ in range(n)))
    df = pd.DataFrame({'lat': lat, 'lon': lon})
    df['x'], df['y'], df['z'] = zip(*map(to_cartesian, df.lat, df.lon))
    return df


def ckdnearest(gdA, gdB):
    nA = gdA[['x', 'y', 'z']].to_numpy()
    nB = gdB[['x', 'y', 'z']].to_numpy()
    btree = cKDTree(nB)
    # for each point of df1, search the 5 (k=5) nearest points of df2
    dist, idx = btree.query(nA, k=5)
    # each cell holds an array of 5 distances (or 5 row positions in df2)
    return pd.DataFrame({'distance': list(dist), 'index of df2': list(idx)})


df1 = make_points(15000)  # first df (actual)
df2 = make_points(10000)  # second df (fleet)

df = ckdnearest(df1, df2)
print(df)
```

If you only want the single nearest point, without Cartesian coordinates (the distance is then in degrees, not km, and is only an approximation):

```python
# reuses the imports and DataFrames from the previous block
def ckdnearest(gdA, gdB):
    nA = gdA[['lat', 'lon']].to_numpy()
    nB = gdB[['lat', 'lon']].to_numpy()
    btree = cKDTree(nB)
    # search the nearest point of df2 for each point of df1
    dist, idx = btree.query(nA, k=1)
    return pd.DataFrame({'distance': dist, 'index of df2': idx})
```
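The tree on x, y, z coordinates returns straight-line (chord) distances through the Earth, not distances along the surface. If surface distances are needed, a chord can be converted to a great-circle arc afterwards. The following is a minimal sketch, assuming a spherical Earth with the same radius of 6371 km used in to_cartesian. The helper names chord_to_arc and haversine are illustrative and not part of the original answer; it uses only the standard library so it can be checked without scipy.

```python
from math import radians, sin, cos, asin, sqrt

R = 6371.0  # assumed mean Earth radius in km (same value as in to_cartesian)


def to_cartesian(lat, lon):
    """Convert latitude/longitude in degrees to x, y, z in km on a sphere."""
    lat, lon = radians(lat), radians(lon)
    return R * cos(lat) * cos(lon), R * cos(lat) * sin(lon), R * sin(lat)


def chord_to_arc(chord):
    """Convert a chord length in km to the great-circle distance in km.

    On a sphere, chord = 2 * R * sin(theta / 2), so arc = R * theta.
    """
    return 2 * R * asin(min(1.0, chord / (2 * R)))


def haversine(lat1, lon1, lat2, lon2):
    """Great-circle distance in km, used here only as a reference value."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    h = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * R * asin(sqrt(h))


# two hypothetical points, given as (lat, lon) in degrees
a = (23.0, 66.0)
b = (24.0, 67.0)

ax, ay, az = to_cartesian(*a)
bx, by, bz = to_cartesian(*b)

# this is what a Euclidean nearest-neighbour query reports for the pair
chord = sqrt((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2)
arc = chord_to_arc(chord)

print(f"chord: {chord:.4f} km")
print(f"arc:   {arc:.4f} km")
print(f"haversine reference: {haversine(*a, *b):.4f} km")
```

The same conversion can be applied element-wise to the dist values returned by the tree query. Because the arc grows monotonically with the chord, the ordering of neighbours is unchanged; only the reported numbers differ, and for points a few kilometres apart the difference is negligible.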