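Each row in a pandas DataFrame contains the latitude/longitude coordinates of two points. Using the Python code below, computing the distance between those two points for many (several million) rows takes a very long time!
Given that the two points are less than 50 miles apart and accuracy is not very important, is there a way to make the calculation faster?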
from math import radians, cos, sin, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    # haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    km = 6367 * c
    return km

for index, row in df.iterrows():
    df.loc[index, 'distance'] = haversine(row['a_longitude'], row['a_latitude'], row['b_longitude'], row['b_latitude'])
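One direction that may help, offered as a sketch rather than a definitive answer: most of the time is probably spent in the row-by-row `iterrows()` loop, so applying the same formula to whole columns with NumPy avoids the per-row Python overhead. The names `haversine_np` and `equirectangular_np` below are hypothetical helpers, not part of any library. The second one uses a flat-earth (equirectangular) approximation, which assumes the points are close together, as in the "under 50 miles" case described here, and trades a little accuracy for fewer trigonometric calls.

```python
import numpy as np

EARTH_RADIUS_KM = 6367  # same radius as in the original haversine function


def haversine_np(lon1, lat1, lon2, lat2):
    """Vectorized haversine distance in km; inputs are decimal degrees
    (scalars, NumPy arrays, or pandas Series of equal length)."""
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = np.sin(dlat / 2.0) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2.0) ** 2
    return EARTH_RADIUS_KM * 2 * np.arcsin(np.sqrt(a))


def equirectangular_np(lon1, lat1, lon2, lat2):
    """Approximate distance in km, treating the small patch of the earth
    between the two points as flat. Assumes short distances only."""
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    # scale the longitude difference by the cosine of the mean latitude
    x = (lon2 - lon1) * np.cos((lat1 + lat2) / 2.0)
    y = lat2 - lat1
    return EARTH_RADIUS_KM * np.sqrt(x * x + y * y)
```

With the DataFrame from the question, the loop could then be replaced by a single assignment such as `df['distance'] = haversine_np(df['a_longitude'], df['a_latitude'], df['b_longitude'], df['b_latitude'])`. Whether the approximate version is accurate enough depends on the latitudes and distances involved, so it is worth comparing both on a sample of the data first.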
婷婷同学_