How to find the first and last elements in a DataFrame column and trim the values between them

I have been working with coordinate data (latitude and longitude).


Background


Act Df =

Index       Latitude            Longitude
0           66.36031097267725   23.714807357485936
1           66.36030099322495   23.71479548193769
2
.
.

Flt Df =

Index       Latitude            Longitude
0           66.34622070356742   23.687960586306179
1           66.34620931053996   23.687951092116624
2
.
.

len(Actual) = 12053 

len(Fleet) = 8000 

As the lengths above show, the Fleet coordinate points cover only a shorter stretch of the path traced by the Actual latitude/longitude data.


Note:

The Fleet Lat & Long values do not have to be equal to the Actual Lat & Long values, but they cover a shorter stretch of the region plotted by the Actual Lat/Long points.


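Requirement

I want to trim part of the Actual Lat/Long data based on the values in the Fleet Lat/Long data.

When I plot the Actual Lat/Long data and the Fleet Lat/Long data on OpenStreetMap or in matplotlib, both must follow the same path. (The individual positions need not be identical.)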


What I tried:

I used arithmetic comparisons:

```python
actual_data[(actual_data['Latitude'] <= fleet_data_Lat_start_point)
            & (actual_data['Longitude'] <= fleet_data_Long_start_point)
            & (actual_data['Latitude'] <= fleet_data_Lat_end_point)
            & (actual_data['Longitude'] <= fleet_data_Long_end_point)]
```



蓝山帝景
1 answer

哔哔one

Here is my solution. I use the geopy library to compute distances. You can compute them with either geodesic() or great_circle(); the distance function used here is the same as geodesic. To get the result in another unit, change .km to .miles, .m, or .ft.

```python
from geopy.distance import lonlat, distance, great_circle, geodesic

# For each actual point, compute the distance (km) to the nearest fleet point.
dmin = []
for index, r in df_actual.iterrows():
    valmin = df_fleet.apply(
        lambda x: distance(lonlat(x['Longitude'], x['Latitude']),
                           lonlat(r['Longitude'], r['Latitude'])).km,
        axis=1).min()
    dmin.append(valmin)

df_actual['nearest to fleet(km)'] = dmin
print(df_actual)
```

This compares every actual point with every fleet point, so it gets slow on large frames.

If you want all fleet points within 100 m of each actual point, do this:

```python
# For each actual point, print the fleet points closer than 100 m.
for ai, a in df_actual.iterrows():
    actual = lonlat(a['Longitude'], a['Latitude'])
    mask = df_fleet.apply(
        lambda x: distance(lonlat(x['Longitude'], x['Latitude']), actual).meters < 100,
        axis=1)
    print(f"for {(a['Longitude'], a['Latitude'])}")
    print(df_fleet[mask])
```

The last solution is based on a tree search, and I think it is very fast. I use scipy.spatial, which finds the nearest points in space and returns their Euclidean distance. To get a result close to the geodesic or haversine distance, I convert (lat, lon) to x, y, z points in km; the Euclidean distance between two such points is the straight-line chord, which is nearly the same as the surface distance for points that are close together. Here I generate two (lat, lon) DataFrames of 15000 and 10000 rows and, for each point of df1, search for the five nearest points of df2.

```python
from random import uniform
from math import radians, sin, cos
from scipy.spatial import cKDTree
import pandas as pd
import numpy as np

def to_cartesian(lat, lon):
    lat = radians(lat); lon = radians(lon)
    R = 6371  # mean Earth radius in km
    x = R * cos(lat) * cos(lon)
    y = R * cos(lat) * sin(lon)
    z = R * sin(lat)
    return x, y, z

def newpoint():
    # random (lon, lat) pair near the sample data
    return uniform(23, 24), uniform(66, 67)

def ckdnearest(gdA, gdB, bcol):
    nA = np.array(list(zip(gdA.x, gdA.y, gdA.z)))
    nB = np.array(list(zip(gdB.x, gdB.y, gdB.z)))
    btree = cKDTree(nB)
    # search the 5 (k=5) nearest points of df2 for each point of df1
    dist, idx = btree.query(nA, k=5)
    dist = [d for d in dist]
    idx = [s for s in idx]
    df = pd.DataFrame.from_dict({'distance': dist,
                                 'index of df2': idx})
    return df

# create the first df (actual)
n = 15000
lon, lat = [], []
for x, y in (newpoint() for x in range(n)):
    lon += [x]; lat += [y]
df1 = pd.DataFrame({'lat': lat, 'lon': lon})
df1['x'], df1['y'], df1['z'] = zip(*map(to_cartesian, df1.lat, df1.lon))

#-----------------------
# create the second df (fleet)
n = 10000
lon, lat = [], []
for x, y in (newpoint() for x in range(n)):
    lon += [x]; lat += [y]
df2 = pd.DataFrame({'lat': lat, 'lon': lon})
df2['x'], df2['y'], df2['z'] = zip(*map(to_cartesian, df2.lat, df2.lon))

#-----------------------
df = ckdnearest(df1, df2, 'unused')
print(df)
```

If you only want the single nearest point, without Cartesian coordinates:

```python
def ckdnearest(gdA, gdB, bcol):
    nA = np.array(list(zip(gdA.lat, gdA.lon)))
    nB = np.array(list(zip(gdB.lat, gdB.lon)))
    btree = cKDTree(nB)
    dist, idx = btree.query(nA, k=1)  # search the first nearest point of df2
    df = pd.DataFrame.from_dict({'distance': dist, 'index of df2': idx})
    return df
```

Note that this version measures distance in raw degrees. Around latitude 66° a degree of longitude is much shorter on the ground than a degree of latitude, so treat the result as a rough approximation.
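
The snippets in this answer compute nearest-point distances but stop short of the trim the question asks for. The following is a minimal sketch of one way to finish the job, not the answerer's own code: it assumes the Actual rows are in travel order, that both frames use the `Latitude` and `Longitude` column names from the question, and that "same path" can be approximated by a distance threshold. The helper names `to_unit_xyz` and `trim_to_fleet` and the `max_km` parameter are hypothetical. It finds the first and last Actual rows that lie within `max_km` of any Fleet point and keeps everything between them.

```python
import numpy as np
import pandas as pd
from scipy.spatial import cKDTree

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, same value as in the answer


def to_unit_xyz(lat_deg, lon_deg):
    """Convert latitude/longitude (degrees) to points on the unit sphere."""
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    return np.column_stack((np.cos(lat) * np.cos(lon),
                            np.cos(lat) * np.sin(lon),
                            np.sin(lat)))


def trim_to_fleet(df_actual, df_fleet, max_km=0.1):
    """Hypothetical helper: slice df_actual from the first to the last row
    that lies within max_km of any fleet point."""
    tree = cKDTree(to_unit_xyz(df_fleet["Latitude"], df_fleet["Longitude"]))
    chord, _ = tree.query(
        to_unit_xyz(df_actual["Latitude"], df_actual["Longitude"]), k=1)
    # Convert the straight-line chord on the unit sphere to great-circle km.
    dist_km = 2 * EARTH_RADIUS_KM * np.arcsin(np.clip(chord / 2, 0, 1))
    near = np.flatnonzero(dist_km <= max_km)
    if near.size == 0:
        return df_actual.iloc[0:0]  # no overlap: empty frame, same columns
    return df_actual.iloc[near[0]:near[-1] + 1]


# Synthetic demo (made-up data): a straight track of 100 points,
# with the fleet covering only points 30..59 of it.
lat = np.linspace(66.30, 66.40, 100)
lon = np.linspace(23.60, 23.80, 100)
df_actual = pd.DataFrame({"Latitude": lat, "Longitude": lon})
df_fleet = df_actual.iloc[30:60].reset_index(drop=True)

trimmed = trim_to_fleet(df_actual, df_fleet, max_km=0.05)
print(trimmed.index.min(), trimmed.index.max())  # 30 59
```

Because the slice runs from the first match to the last, Actual rows in the middle are kept even if a few of them drift farther than `max_km` from the Fleet track. If the Actual path passes near the Fleet track more than once, this simple version keeps everything between the earliest and latest pass, so the threshold may need tuning for real data.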