Calling a traffic API with Python and getting data in the wrong format.
#!/usr/bin/env python
# make sure to install these packages before running:
# pip install pandas
# pip install sodapy
import pandas as pd
from sodapy import Socrata
# Unauthenticated client only works with public data sets. Note 'None'
# in place of application token, and no username or password:
client = Socrata("data.pa.gov", None)
# Example authenticated client (needed for non-public datasets):
# client = Socrata("data.pa.gov",
#                  MyAppToken,
#                  username="user@example.com",
#                  password="AFakePassword")
# First 50000 results, returned as JSON from API / converted to Python list of
# dictionaries by sodapy.
results = client.get("dc5b-gebx", limit=50000)
# Convert to pandas DataFrame
results_df = pd.DataFrame.from_records(results)
results_df.latitude comes out like this:
latitude
0 40 36:56.627
This is clearly not right. I assume it is caused by the way the API call is handled?
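One possible reading is that the value is not corrupted but written as degrees, minutes and seconds: 40 36:56.627 taken as 40° 36' 56.627" works out to roughly 40.6157, which is the latitude the location_1 field gives for row 0. A minimal sketch of that conversion, assuming every value has the form `D M:S` and that a leading minus sign on the degrees applies to the whole value. The helper name `dms_to_decimal` is made up for illustration and is not part of pandas or sodapy.

```python
import pandas as pd


def dms_to_decimal(value):
    """Convert a 'D M:S' string such as '40 36:56.627' to decimal degrees."""
    degrees_part, rest = value.strip().split(" ", 1)
    minutes_part, seconds_part = rest.split(":")
    fraction = float(minutes_part) / 60 + float(seconds_part) / 3600
    # Assumption: a leading '-' on the degrees applies to the whole value.
    sign = -1.0 if degrees_part.startswith("-") else 1.0
    return sign * (abs(float(degrees_part)) + fraction)


# Sample value copied from the latitude output above.
sample = pd.Series(["40 36:56.627"], name="latitude")
decimal_latitude = sample.apply(dms_to_decimal)
print(decimal_latitude.round(4).iloc[0])  # 40.6157
```

If the real column contains missing values or a different layout, the split would raise an error, so this would need a guard before being applied to all rows.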
There is also another column, location_1, which has string data like this.
location_1
0 {'latitude': '40.6157', 'longitude': '-75.4621'}
1 {'latitude': '40.4587', 'longitude': '-79.9985'}
2 {'latitude': '39.9328', 'longitude': '-75.2891'}
3 {'latitude': '40.4435', 'longitude': '-80.0046'}
4 {'latitude': '40.5994', 'longitude': '-75.4703'}
I need the lat and lon as separate columns
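For reference, a sketch of one way this could be done, assuming each location_1 entry is already a Python dict (as the printed output suggests) and that no row is missing its location. The frame `df` below is a small stand-in built from the rows shown above, not the real API result, and the column names `lat` and `lon` are my own choice.

```python
import pandas as pd

# Stand-in data copied from the location_1 output above.
df = pd.DataFrame({
    "location_1": [
        {"latitude": "40.6157", "longitude": "-75.4621"},
        {"latitude": "40.4587", "longitude": "-79.9985"},
    ]
})

# Expand each dict into its own columns, keeping df's index for alignment.
coords = pd.DataFrame(df["location_1"].tolist(), index=df.index)

# The values shown are strings, so convert them to floats.
df["lat"] = pd.to_numeric(coords["latitude"])
df["lon"] = pd.to_numeric(coords["longitude"])
print(df[["lat", "lon"]])
```

Rows where location_1 is missing would break the expansion step, so those would have to be dropped or filled first.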
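I'm really confused about the best approach, and this feels odd to me at the moment. I'm thinking of simply processing the data frame like this,
list(results_df.location_1.values)
and then looping over the inner values,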
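latitudes = []
longitudes = []
for x in results_df.location_1.values:
    # assuming each x is a dict like {'latitude': '40.6157', 'longitude': '-75.4621'}
    latitudes.append(float(x['latitude']))
    longitudes.append(float(x['longitude']))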