When building and deploying Spark applications, all dependencies require compatible versions.

Scala version

All packages have to use the same major Scala version (2.10, 2.11, 2.12). Consider the following (incorrect) build.sbt:

name := "Simple Project"
version := "1.0"
libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.11" % "2.0.1",
  "org.apache.spark" % "spark-streaming_2.10" % "2.0.1",
  "org.apache.bahir" % "spark-streaming-twitter_2.11" % "2.0.1"
)

Here spark-streaming is built for Scala 2.10 while the remaining packages are built for Scala 2.11. A valid file could be:

name := "Simple Project"
version := "1.0"
libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.11" % "2.0.1",
  "org.apache.spark" % "spark-streaming_2.11" % "2.0.1",
  "org.apache.bahir" % "spark-streaming-twitter_2.11" % "2.0.1"
)

It is better, though, to specify the Scala version globally and use %%, which appends the Scala suffix automatically:

name := "Simple Project"
version := "1.0"
scalaVersion := "2.11.7"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.0.1",
  "org.apache.spark" %% "spark-streaming" % "2.0.1",
  "org.apache.bahir" %% "spark-streaming-twitter" % "2.0.1"
)

Similarly, in Maven:

<project>
  <groupId>com.example</groupId>
  <artifactId>simple-project</artifactId>
  <modelVersion>4.0.0</modelVersion>
  <name>Simple Project</name>
  <packaging>jar</packaging>
  <version>1.0</version>
  <properties>
    <spark.version>2.0.1</spark.version>
  </properties>
  <dependencies>
    <dependency> <!-- Spark dependency -->
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.11</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.11</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.bahir</groupId>
      <artifactId>spark-streaming-twitter_2.11</artifactId>
      <version>${spark.version}</version>
    </dependency>
  </dependencies>
</project>

Spark version

All packages have to use the same major Spark version (1.6, 2.0, 2.1, ...). Consider the following (incorrect) build.sbt:

name := "Simple Project"
version := "1.0"
libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.11" % "1.6.1",
  "org.apache.spark" % "spark-streaming_2.10" % "2.0.1",
  "org.apache.bahir" % "spark-streaming-twitter_2.11" % "2.0.1"
)

Here spark-core is for Spark 1.6 while the remaining components are for Spark 2.0. A valid file could be:

name := "Simple Project"
version := "1.0"
libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.11" % "2.0.1",
  "org.apache.spark" % "spark-streaming_2.11" % "2.0.1",
  "org.apache.bahir" % "spark-streaming-twitter_2.11" % "2.0.1"
)

It is better, though, to use a variable for the version:

name := "Simple Project"
version := "1.0"
val sparkVersion = "2.0.1"
libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.11" % sparkVersion,
  "org.apache.spark" % "spark-streaming_2.11" % sparkVersion,
  "org.apache.bahir" % "spark-streaming-twitter_2.11" % sparkVersion
)

Similarly, in Maven:

<project>
  <groupId>com.example</groupId>
  <artifactId>simple-project</artifactId>
  <modelVersion>4.0.0</modelVersion>
  <name>Simple Project</name>
  <packaging>jar</packaging>
  <version>1.0</version>
  <properties>
    <spark.version>2.0.1</spark.version>
    <scala.version>2.11</scala.version>
  </properties>
  <dependencies>
    <dependency> <!-- Spark dependency -->
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_${scala.version}</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_${scala.version}</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.bahir</groupId>
      <artifactId>spark-streaming-twitter_${scala.version}</artifactId>
      <version>${spark.version}</version>
    </dependency>
  </dependencies>
</project>

The Spark version used in the dependencies has to match the Spark version of the Spark installation. For example, if you use 1.6.1 on the cluster, you have to use 1.6.1 to build your jars; minor version mismatches are not always accepted.

The Scala version used to build the jars has to match the Scala version used to build the deployed Spark. By default (downloadable binaries and default builds):

Spark 1.x -> Scala 2.10
Spark 2.x -> Scala 2.11

Additional packages that are not included in the fat jar have to be made accessible on the worker nodes. There are a number of options, including:

--jars argument for spark-submit - to distribute local jar files.
--packages argument for spark-submit - to fetch dependencies from a Maven repository.

When submitting on a cluster node, you should include the application jar in --jars.
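As an illustration, the two spark-submit options above might be used like this. This is a sketch only: the main class name, the local jar path, and the output jar path are hypothetical placeholders, while the Bahir coordinates match the dependency used earlier.

```shell
# Distribute a local jar file to the executors
# (com.example.Main and the paths are placeholders):
spark-submit --class com.example.Main \
  --jars /path/to/extra-lib.jar \
  target/scala-2.11/simple-project_2.11-1.0.jar

# Or fetch the dependency from a Maven repository instead:
spark-submit --class com.example.Main \
  --packages org.apache.bahir:spark-streaming-twitter_2.11:2.0.1 \
  target/scala-2.11/simple-project_2.11-1.0.jar
```

With --packages, transitive dependencies are resolved automatically; with --jars, you have to list every required jar yourself.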