Is there a way to convert this data format to CSV?

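I have a large number of extracted JSON files in the format shown below. Is there any way to convert them to CSV, with the features as columns and the values in the rows?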


{"state": "New Jersey", "text": "RT @joncoopertweets: Register to join the #WeThePeopleMarch on September 21st in Washington, D.C. \u2014 or one of the 50+ marches that will be\u2026", "has_emoji": false, "created_at": "Mon Sep 02 16:32:05 +0000 2019", "id": 1168562246349467649, "entities": {"hashtags": [{"text": "WeThePeopleMarch", "indices": [42, 59]}], "urls": [], "user_mentions": [{"screen_name": "joncoopertweets", "name": "Jon Cooper", "id": 27493883, "id_str": "27493883", "indices": [3, 19]}], "symbols": []}, "source": "Twitter for iPad", "location": "Leonia, NJ", "verified": false, "geocode": null}

{"state": "Indiana", "text": "RT @dariusherron1: Don\u2019t nobody love they girl like Mexicans ", "has_emoji": false, "created_at": "Mon Sep 02 16:32:05 +0000 2019", "id": 1168562246378827776, "entities": {"hashtags": [], "urls": [{"url": "", "expanded_url": "", "display_url": "", "indices": [61, 84]}], "user_mentions": [{"screen_name": "dariusherron1", "name": "Darius Herron", "id": 1680891876, "id_str": "1680891876", "indices": [3, 17]}], "symbols": []}, "source": "Twitter for iPhone", "location": "Indianapolis, IN", "verified": false, "geocode": null}

http://img4.mukewang.com/628c466b000120ea12770670.jpg

http://img.mukewang.com/628c46700001557909820408.jpg

ibeautiful
2 answers

跃然一笑

I'm not entirely clear on your expected output (see the comments and discussion under @user5783745's answer). Your JSON strings contain some nested objects, which end up as a nested list if you use jsonlite::fromJSON. Since you haven't provided expected output matching your sample data, there may be different ways to deal with those nested entries.

One possibility is to parse the JSON strings, then flatten the resulting list twice before binding the rows.

    library(tidyverse)
    library(jsonlite)

    # purrr::flatten is named explicitly: jsonlite is attached last and, as far
    # as I know, its own flatten() masks purrr's and only accepts data frames.
    map(json, ~ fromJSON(.x) %>% purrr::flatten() %>% purrr::flatten()) %>%
        bind_rows()
    ## A tibble: 2 x 15
    #  state text  has_emoji created_at     id indices screen_name name  id_str
    #  <chr> <chr> <lgl>     <chr>       <dbl> <list>  <chr>       <chr> <chr>
    #1 New … WeTh… FALSE     Mon Sep 0… 2.75e7 <int [… joncoopert… Jon … 27493…
    #2 Indi… "RT … FALSE     Mon Sep 0… 1.68e9 <int [… dariusherr… Dari… 16808…
    ## … with 6 more variables: source <chr>, location <chr>, verified <lgl>,
    ##   url <chr>, expanded_url <chr>, display_url <chr>

The resulting object is a tibble with some list columns. To store it as a CSV, you can exclude those list columns.

Sample data

    json <- c(
        '{"state": "New Jersey", "text": "RT @joncoopertweets: Register to join the #WeThePeopleMarch on September 21st in Washington, D.C. \u2014 or one of the 50+ marches that will be\u2026", "has_emoji": false, "created_at": "Mon Sep 02 16:32:05 +0000 2019", "id": 1168562246349467649, "entities": {"hashtags": [{"text": "WeThePeopleMarch", "indices": [42, 59]}], "urls": [], "user_mentions": [{"screen_name": "joncoopertweets", "name": "Jon Cooper", "id": 27493883, "id_str": "27493883", "indices": [3, 19]}], "symbols": []}, "source": "Twitter for iPad", "location": "Leonia, NJ", "verified": false, "geocode": null}',
        '{"state": "Indiana", "text": "RT @dariusherron1: Don\u2019t nobody love they girl like Mexicans ", "has_emoji": false, "created_at": "Mon Sep 02 16:32:05 +0000 2019", "id": 1168562246378827776, "entities": {"hashtags": [], "urls": [{"url": "", "expanded_url": "", "display_url": "", "indices": [61, 84]}], "user_mentions": [{"screen_name": "dariusherron1", "name": "Darius Herron", "id": 1680891876, "id_str": "1680891876", "indices": [3, 17]}], "symbols": []}, "source": "Twitter for iPhone", "location": "Indianapolis, IN", "verified": false, "geocode": null}')
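Since this question is also filed under Python, here is a rough sketch of the same flattening idea using only the Python standard library. It is not the approach from the answer above, just one possible alternative. The function names `flatten_record` and `json_lines_to_csv`, the dotted column names, and the choice to store lists as JSON text are all my own assumptions, and the sample strings are shortened stand-ins for the tweets in the question.

```python
import csv
import io
import json


def flatten_record(obj, prefix=""):
    """Flatten nested dicts into one level, joining keys with '.'.

    Lists are stored as JSON text so that each CSV cell holds one value
    (an assumption; dropping those columns would work as well).
    """
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_record(value, prefix=f"{name}."))
        elif isinstance(value, list):
            flat[name] = json.dumps(value)
        else:
            flat[name] = value
    return flat


def json_lines_to_csv(lines, out):
    """Parse one JSON object per line and write all of them as CSV rows."""
    rows = [flatten_record(json.loads(line)) for line in lines if line.strip()]
    # Collect every key in first-seen order so rows with different fields fit.
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    writer = csv.DictWriter(out, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return columns


# Shortened, hypothetical stand-ins for the two tweets in the question.
sample = [
    '{"state": "New Jersey", "id": 1168562246349467649, "entities": {"hashtags": [{"text": "WeThePeopleMarch", "indices": [42, 59]}], "urls": []}, "verified": false, "geocode": null}',
    '{"state": "Indiana", "id": 1168562246378827776, "entities": {"hashtags": [], "urls": []}, "verified": false, "geocode": null}',
]

buffer = io.StringIO()
columns = json_lines_to_csv(sample, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file, you could pass `open("tweets.csv", "w", newline="", encoding="utf-8")` instead of the `StringIO` buffer; the file name is only an example.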

一只斗牛犬

You can easily convert it into a data format that is easier to work with (a list), but what you do with it after that is up to you. In this case the list does not automatically become a data.frame - you have to think about how to convert it (given that some list items are single values, while others are themselves data.frames).

    a <- '{"state": "New Jersey", "text": "RT @joncoopertweets: Register to join the #WeThePeopleMarch on September 21st in Washington, D.C. \u2014 or one of the 50+ marches that will be\u2026", "has_emoji": false, "created_at": "Mon Sep 02 16:32:05 +0000 2019", "id": 1168562246349467649, "entities": {"hashtags": [{"text": "WeThePeopleMarch", "indices": [42, 59]}], "urls": [], "user_mentions": [{"screen_name": "joncoopertweets", "name": "Jon Cooper", "id": 27493883, "id_str": "27493883", "indices": [3, 19]}], "symbols": []}, "source": "Twitter for iPad", "location": "Leonia, NJ", "verified": false, "geocode": null}'

    library(jsonlite)
    library(dplyr)

    # Parse the JSON string into a nested list
    a <- a %>% fromJSON

    # Empty data.frame whose columns match, in order, the values assigned below
    new_dataframe <- data.frame(state = character(),
                                text = character(),
                                has_emoji = character(),
                                created_at = character(),
                                id = character(),
                                stringsAsFactors = FALSE)

    # c() coerces every value to character before it is stored in the first row
    new_dataframe[1, ] <- c(a$state, a$text, a$has_emoji, a$created_at, a$id)
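For comparison, hand-picking the single-valued fields can be sketched in Python like this. It is only an illustration of the idea, not a translation of the R code: the `COLUMNS` list and the helper `pick_scalar_fields` are hypothetical names of mine, and the tweet text is shortened. One side effect worth noting is that Python's `json` module reads the large `id` as an exact integer.

```python
import csv
import io
import json

# Hypothetical column choice: top-level fields that hold a single value.
COLUMNS = ["state", "text", "has_emoji", "created_at", "id"]


def pick_scalar_fields(record, columns=COLUMNS):
    """Keep only the chosen keys; nested fields such as 'entities' are left out."""
    return {name: record.get(name) for name in columns}


# Shortened version of the first tweet from the question.
raw = (
    '{"state": "New Jersey", '
    '"text": "RT @joncoopertweets: Register to join the #WeThePeopleMarch", '
    '"has_emoji": false, '
    '"created_at": "Mon Sep 02 16:32:05 +0000 2019", '
    '"id": 1168562246349467649, '
    '"entities": {"hashtags": [], "urls": []}}'
)

record = json.loads(raw)
row = pick_scalar_fields(record)

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerow(row)
csv_text = buffer.getvalue()
print(csv_text)
```

Which fields count as "single values" is a judgment call; anything nested would still need its own handling if you want it in the CSV.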