Parsing JSON-formatted elements in a DataFrame

159 [{"geoid":"02020000101"},{"geoid":"02020000204"},{"geoid":"02020000300"},{"geoid":"02020000400"},{"geoid":"02020000500"},{"geoid":"02020000600"},{"geoid":"02020000802"},{"geoid":"02020000901"},{"geoid":"02020000902"},{"geoid":"02020001000"},{"geoid":"02020001500"},{"geoid":"02020001601"},{"geoid":"02020001602"},{"geoid":"02020001701"},{"geoid":"02020001802"},{"geoid":"02020001900"},{"geoid":"02020002000"},{"geoid":"02020002100"},{"geoid":"02020002201"},{"geoid":"02020002400"},{"geoid":"02020002501"},{"geoid":"02020002502"},{"geoid":"02020002601"},{"geoid":"02020002712"},{"geoid":"02020002811"},{"geoid":"02020002812"},{"geoid":"02020002813"},{"geoid":"02122000100"},{"geoid":"02122000300"},{"geoid":"02170001300"},{"geoid":"02170000300"},{"geoid":"02170001100"},{"geoid":"02170000800"},{"geoid":"02261000300"},{"geoid":"02290000400"},{"geoid":"02240000400"},{"geoid":"02170000102"},{"geoid":"02170000402"},{"geoid":"02170000101"},{"geoid":"02170001201"},{"geoid":"02170001001"},{"geoid":"02170000706"},{"geoid":"02170001202"},{"geoid":"02170001004"},{"geoid":"02170000705"},{"geoid":"02170000603"},{"geoid":"02020000102"},{"geoid":"02020000201"},{"geoid":"02020000202"},{"geoid":"02020000203"},{"geoid":"02020000701"},{"geoid":"02020000702"},{"geoid":"02020000703"},{"geoid":"02020000801"},{"geoid":"02020001100"},{"geoid":"02020001200"},

I hope this example helps:

import pandas as pd

# create a DataFrame for the example
d = [{'A': 3, 'B': [{'id': '001'}, {'id': '002'}]},
     {'A': 4, 'B': [{'id': '003'}, {'id': '004'}]},
     {'A': 5, 'B': [{'id': '005'}, {'id': '006'}]},
     {'A': 6, 'B': [{'id': '007'}, {'id': '008'}]}]
df = pd.DataFrame(d)
df
#    A                               B
# 0  3  [{'id': '001'}, {'id': '002'}]
# 1  4  [{'id': '003'}, {'id': '004'}]
# 2  5  [{'id': '005'}, {'id': '006'}]
# 3  6  [{'id': '007'}, {'id': '008'}]

# explode column B so each list element gets its own row, then reset the index
df = df.explode('B')
df.reset_index(drop=True, inplace=True)
df
# now it looks like this:
#    A              B
# 0  3  {'id': '001'}
# 1  3  {'id': '002'}
# 2  4  {'id': '003'}
# 3  4  {'id': '004'}
# 4  5  {'id': '005'}
# 5  5  {'id': '006'}
# 6  6  {'id': '007'}
# 7  6  {'id': '008'}

# pull the 'id' value out of each dict and rename the column from B to id
df['B'] = df['B'].apply(lambda x: x['id'])
df.rename(columns={'B': 'id'}, inplace=True)
df
# this is the final product:
#    A   id
# 0  3  001
# 1  3  002
# 2  4  003
# 3  4  004
# 4  5  005
# 5  5  006
# 6  6  007
# 7  6  008
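The same steps can be applied to data shaped like the `geoid` sample in the question. This is only a sketch under a few assumptions: the column names `row_id` and `tracts`, the second row (160), and the helper `explode_json_column` are all made up for illustration, and the cells are assumed to hold JSON strings rather than already-parsed lists (the helper handles both).

```python
import json
import pandas as pd

# Hypothetical sample shaped like the question's data.
# Column names and the second row are assumptions, not from the original post.
df = pd.DataFrame({
    "row_id": [159, 160],
    "tracts": [
        '[{"geoid":"02020000101"},{"geoid":"02020000204"}]',
        '[{"geoid":"02122000100"}]',
    ],
})

def explode_json_column(frame, column, key):
    """Return a copy with one row per list element and `key` as a plain column."""
    out = frame.copy()
    # Parse JSON strings into Python lists; leave already-parsed values alone.
    out[column] = out[column].apply(
        lambda v: json.loads(v) if isinstance(v, str) else v
    )
    # One row per dict in the list, with a fresh 0..n-1 index.
    out = out.explode(column).reset_index(drop=True)
    # Pull the wanted value out of each dict.
    out[key] = out[column].apply(lambda item: item[key])
    return out.drop(columns=[column])

result = explode_json_column(df, "tracts", "geoid")
print(result)
```

Note that an empty list explodes to a missing value, so the lookup step would fail on such rows; they would need to be dropped or filtered first.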
