Unable to run a PySpark job on Kubernetes in client mode

I am deploying PySpark in my AKS Kubernetes cluster using the following guide:

I have deployed my driver pod following the instructions in the link above:

apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: spark
  name: my-notebook-deployment
  labels:
    app: my-notebook
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-notebook
  template:
    metadata:
      labels:
        app: my-notebook
    spec:
      serviceAccountName: spark
      containers:
      - name: my-notebook
        image: pidocker-docker-registry.default.svc.cluster.local:5000/my-notebook:latest
        ports:
          - containerPort: 8888
        volumeMounts:
          - mountPath: /root/data
            name: my-notebook-pv
        workingDir: /root
        resources:
          limits:
            memory: 2Gi
      volumes:
        - name: my-notebook-pv
          persistentVolumeClaim:
            claimName: my-notebook-pvc
---
apiVersion: v1
kind: Service
metadata:
  namespace: spark
  name: my-notebook-deployment
spec:
  selector:
    app: my-notebook
  ports:
    - protocol: TCP
      port: 29413
  clusterIP: None
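One detail in this manifest matters later: because the Service sets clusterIP: None, it is a headless service, so its cluster DNS name resolves directly to the notebook pod's IP. A quick way to check this from inside any pod in the cluster (a minimal sketch; the DNS name follows the standard service-name.namespace.svc.cluster.local convention):

import socket

# Headless service: the cluster DNS name resolves straight to the pod's IP,
# not to a virtual cluster IP.
print(socket.gethostbyname("my-notebook-deployment.spark.svc.cluster.local"))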

I can then create a Spark cluster with the following code:


import os
from pyspark import SparkContext, SparkConf
from pyspark.sql import SparkSession

# Create Spark config for our Kubernetes based cluster manager
sparkConf = SparkConf()
sparkConf.setMaster("k8s://https://kubernetes.default.svc.cluster.local:443")
sparkConf.setAppName("spark")
sparkConf.set("spark.kubernetes.container.image", "<MYIMAGE>")
sparkConf.set("spark.kubernetes.namespace", "spark")
sparkConf.set("spark.executor.instances", "7")
sparkConf.set("spark.executor.cores", "2")
sparkConf.set("spark.driver.memory", "512m")
sparkConf.set("spark.executor.memory", "512m")
sparkConf.set("spark.kubernetes.pyspark.pythonVersion", "3")


As far as I understand, I am running my Spark cluster in client mode: the Jupyter pod acts as the driver and spawns the executor pods. The code works when I run it from inside the Jupyter pod, but it fails when the executor pods try to connect back to it.

How can I fix this?


1 Answer

喵喵时光机

I ran into a similar problem and ended up manually creating the services the client pod needs. In my case I wanted to deploy a Spark Thrift server, which does not support cluster mode. First, you need to create the service required by the Spark block manager and by the driver itself:

apiVersion: v1
kind: Service
metadata:
  name: spark-thrift
spec:
  type: ClusterIP
  ports:
    - protocol: TCP
      port: 4000
      name: driver
    - protocol: TCP
      port: 4001
      name: block-manager

Now you can start your driver like this (apiVersion updated to apps/v1, which also requires the selector and template labels):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: spark-thrift
  labels:
    app: spark-thrift
spec:
  selector:
    matchLabels:
      app: spark-thrift
  template:
    metadata:
      labels:
        app: spark-thrift
    spec:
      containers:
        - name: spark-thrift-driver
          image: image:latest
          command:
            - /opt/spark/bin/spark-submit
          args:
            - "--name"
            - "spark-thrift"
            - "--class"
            - "org.apache.spark.sql.hive.thriftserver.HiveThriftServer2"
            - "--conf"
            - "spark.driver.port=4000"
            - "--conf"
            - "spark.driver.host=spark-thrift"
            - "--conf"
            - "spark.driver.bindAddress=0.0.0.0"
            - "--conf"
            - "spark.driver.blockManager.port=4001"
          imagePullPolicy: Always
          ports:
            - name: driver
              containerPort: 4000
            - name: blockmanager
              containerPort: 4001

The important settings here are:

- spark.driver.host=spark-thrift: the host the executors connect back to, hence the service name
- spark.driver.port=4000: the driver port
- spark.driver.bindAddress=0.0.0.0: required so Spark does not try to bind to spark-thrift as a local address (the service name does not resolve to a local interface inside the pod, so binding to it fails)
- spark.driver.blockManager.port=4001: the block manager port

Obviously this is not a complete example of a working pod; you still need to configure your own ports and volumes in the spec.
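Applied to the setup in the question, the same idea would look roughly like this in the notebook. This is a sketch under two assumptions: it reuses the existing headless service my-notebook-deployment (which already exposes port 29413) as the driver host, and it assumes you add a hypothetical second port, 29414, to that service for the block manager:

import os
from pyspark import SparkConf
from pyspark.sql import SparkSession

sparkConf = SparkConf()
sparkConf.setMaster("k8s://https://kubernetes.default.svc.cluster.local:443")
sparkConf.setAppName("spark")
sparkConf.set("spark.kubernetes.container.image", "<MYIMAGE>")
sparkConf.set("spark.kubernetes.namespace", "spark")
# Client mode: tell the executors how to reach the driver running in this pod.
# The host is the headless service that fronts the notebook pod.
sparkConf.set("spark.driver.host", "my-notebook-deployment.spark.svc.cluster.local")
sparkConf.set("spark.driver.port", "29413")  # must match the service port
sparkConf.set("spark.driver.blockManager.port", "29414")  # hypothetical extra port; expose it on the service too
sparkConf.set("spark.driver.bindAddress", "0.0.0.0")  # bind locally, advertise the service name

spark = SparkSession.builder.config(conf=sparkConf).getOrCreate()

With spark.driver.host pointing at the headless service and bindAddress left at 0.0.0.0, the executor pods resolve the service name to the notebook pod and can connect back to the driver and its block manager.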
