I ran the following commands in Jupyter to extract the data and load it into Spark.
!7z x stackoverflow.com-Posts.7z -oposts
# load xml file into spark data frame.
posts = spark.read.format("xml").option("rowTag", "row").load("./posts/Posts.xml")
This produces the following error:
Py4JJavaError: An error occurred while calling o532.load.
: java.lang.ClassNotFoundException: Failed to find data source: xml. Please find packages at http://spark.apache.org/third-party-projects.html
at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:657)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:194)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
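The `ClassNotFoundException` says that Spark has no data source registered under the name `xml`. The Spark version in this trace does not include an XML reader, so one has to be supplied by an external package, most commonly spark-xml. Below is a minimal sketch of one way to do that, assuming the Databricks spark-xml package is pulled in through `spark.jars.packages`. The helper names (`spark_xml_coordinate`, `build_session`, `load_posts`) are hypothetical. The Scala version `2.11` and package version `0.4.1` are assumptions that must be matched to the actual Spark build.

```python
def spark_xml_coordinate(scala_version="2.11", package_version="0.4.1"):
    """Build the Maven coordinate for spark-xml.

    Both defaults are assumptions: the Scala suffix must match the Scala
    version your Spark distribution was built with.
    """
    return f"com.databricks:spark-xml_{scala_version}:{package_version}"


def build_session(coordinate, app_name="stackoverflow-posts"):
    """Create a SparkSession that resolves the given package at startup."""
    # Imported here so the coordinate helper works without pyspark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        # Resolved from Maven when the JVM starts; it has no effect on a
        # session that is already running.
        .config("spark.jars.packages", coordinate)
        .getOrCreate()
    )


def load_posts(spark, path="./posts/Posts.xml"):
    """Read the Posts dump, treating each <row> element as one record."""
    return spark.read.format("xml").option("rowTag", "row").load(path)
```

In a notebook this would be used as `posts = load_posts(build_session(spark_xml_coordinate()))`. Because `getOrCreate()` reuses any running session, the existing `spark` session probably needs to be stopped, or the kernel restarted, before the package setting takes effect.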