How to create a Spark table from a nested list



How can I use this answer (List to DataFrame in pyspark) to create a table with Spark from a nested list?

lst = [{'sfObject': 'event',
        'objID': 'Id',
        'interimRun': 'True',
        'numAttributes_Total': 140,
        'numAttributes_Compounded': 0,
        'numAttributes_nonCompounded': 140,
        'chunks': 1,
        'compoundStatus': 'False',
        'allAttributes': ['Id',
                          'RecordTypeId',
                          'WhoId',
                          'Advisor_Team__c',…],
        'compoundAttributes': [],
        'nonCompoundAttributes': ['Id',
                                  'RecordTypeId',
                                  'WhoId',
                                  'WhatId'…]},
       {'sfObject': 'fund__c',
        'objID': 'Id',
        'interimRun': 'False',
        'numAttributes_Total': 40,
        'numAttributes_Compounded': 0,
        'numAttributes_nonCompounded': 40,
        'chunks': 1,
        'compoundStatus': 'False',
        'allAttributes': ['Id',
                          'IsDeleted',
                          'Name'…],
        'compoundAttributes': [],
        'nonCompoundAttributes': ['Id',
                                  'IsDeleted',
                                  'Name',
                                  'RecordTypeId'…]}]

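I want to store this list in a table, so it needs to have the following structure.

The link below is an image of the table I need to create from the lst above: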

[image of the target table; not available]

This nested list contains up to 30 different items, so the answer needs to create up to 30 rows dynamically, one per item.

Thanks!

慕婉清6462132


1 Answer

慕神8447489

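Once you have the list of dictionaries, run the following command. It will infer the schema.

    df = sc.parallelize(lst).toDF()

If you want to treat it as a table and run SQL queries against it, run:

    df.createOrReplaceTempView("df_table")
    new_df = spark.sql("SELECT * FROM df_table")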
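As a possible variation on the above, here is a minimal sketch that declares the schema explicitly instead of relying on inference. The motivation is an assumption on my part: inference may not be able to pick an element type for a column such as compoundAttributes, whose list is empty in every record shown in the question. The names COLUMNS, ARRAY_COLUMNS, LONG_COLUMNS, records_to_rows, build_table and sample are hypothetical helpers introduced for illustration, and sample only holds the attribute names that are visible in the question (the elided ones are left out). It produces one row per dictionary, so a list of 30 items gives 30 rows.

```python
# Column order for the table, taken from the keys of the records in the question.
COLUMNS = [
    "sfObject", "objID", "interimRun",
    "numAttributes_Total", "numAttributes_Compounded",
    "numAttributes_nonCompounded", "chunks", "compoundStatus",
    "allAttributes", "compoundAttributes", "nonCompoundAttributes",
]
# Assumed types: list-valued columns become array<string>, counters become bigint,
# everything else (including the 'True'/'False' strings) stays a string.
ARRAY_COLUMNS = {"allAttributes", "compoundAttributes", "nonCompoundAttributes"}
LONG_COLUMNS = {"numAttributes_Total", "numAttributes_Compounded",
                "numAttributes_nonCompounded", "chunks"}


def records_to_rows(records):
    """Turn each dict into a tuple in COLUMNS order; missing keys become None."""
    return [tuple(rec.get(col) for col in COLUMNS) for rec in records]


def build_table(spark, records, view_name="df_table"):
    """Create a DataFrame with an explicit schema and register it as a temp view."""
    # Imported here so records_to_rows stays usable without Spark installed.
    from pyspark.sql.types import (ArrayType, LongType, StringType,
                                   StructField, StructType)

    fields = []
    for col in COLUMNS:
        if col in ARRAY_COLUMNS:
            dtype = ArrayType(StringType())
        elif col in LONG_COLUMNS:
            dtype = LongType()
        else:
            dtype = StringType()
        fields.append(StructField(col, dtype, True))  # nullable column

    df = spark.createDataFrame(records_to_rows(records), StructType(fields))
    df.createOrReplaceTempView(view_name)
    return df


# Trimmed copies of the two records from the question (visible attributes only).
sample = [
    {"sfObject": "event", "objID": "Id", "interimRun": "True",
     "numAttributes_Total": 140, "numAttributes_Compounded": 0,
     "numAttributes_nonCompounded": 140, "chunks": 1, "compoundStatus": "False",
     "allAttributes": ["Id", "RecordTypeId", "WhoId", "Advisor_Team__c"],
     "compoundAttributes": [],
     "nonCompoundAttributes": ["Id", "RecordTypeId", "WhoId", "WhatId"]},
    {"sfObject": "fund__c", "objID": "Id", "interimRun": "False",
     "numAttributes_Total": 40, "numAttributes_Compounded": 0,
     "numAttributes_nonCompounded": 40, "chunks": 1, "compoundStatus": "False",
     "allAttributes": ["Id", "IsDeleted", "Name"],
     "compoundAttributes": [],
     "nonCompoundAttributes": ["Id", "IsDeleted", "Name", "RecordTypeId"]},
]
```

With an active SparkSession named spark, calling build_table(spark, lst) should register the view, after which spark.sql("SELECT sfObject, numAttributes_Total FROM df_table") can query it. If the real records carry extra keys, they would need to be added to COLUMNS, since this sketch ignores anything not listed there.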
