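Before looking at how an Executor runs a Task, we first need to look at an important class: CoarseGrainedExecutorBackend.
Its onStart method is called when the backend starts up.
It is the coarse-grained implementation of ExecutorBackend.
It sends the Executor registration request to the Driver.
It is a communication endpoint that exchanges messages with the Driver in both directions.
It is also the name of the process that hosts the Executor. The Executor is the object that actually processes Tasks, and it does so through a thread pool.
It receives the Executor registration response from the Driver and then creates the Executor object.
It receives the LaunchTask message sent from the Driver side (where the TaskScheduler assigns Tasks) and starts launching and computing the Task.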
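How an Executor runs a Task:
When CoarseGrainedExecutorBackend receives the RegisteredExecutor message from the Driver, it creates the Executor.
When it later receives a LaunchTask message from the Driver, it starts running the Task: it first deserializes the TaskDescription carried by the message, then calls launchTask to hand the Task over to the Executor.
launchTask creates a TaskRunner, which implements the Runnable interface, records it in the running-task cache, and submits it to the thread pool. The pool's execute method then starts the Task.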
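The flow described above can be sketched as a toy model. This is not Spark's code: MiniBackend, MiniExecutor and the simplified message classes are hypothetical stand-ins, and the payload is a plain string instead of a serialized TaskDescription. It only illustrates the order of events: registration creates the executor, and each LaunchTask is wrapped in a Runnable and handed to a thread pool.

```scala
import java.util.concurrent.{ConcurrentHashMap, Executors, TimeUnit}

// Hypothetical, simplified stand-ins for Spark's messages (not the real classes).
sealed trait Message
final case class RegisteredExecutor(hostname: String) extends Message
final case class LaunchTask(taskId: Long, payload: String) extends Message

// Toy executor: wraps each task in a Runnable and hands it to a thread pool.
final class MiniExecutor(hostname: String) {
  private val threadPool = Executors.newCachedThreadPool()
  val results = new ConcurrentHashMap[Long, String]()

  def launchTask(taskId: Long, payload: String): Unit =
    threadPool.execute(new Runnable {
      override def run(): Unit = results.put(taskId, s"$hostname ran '$payload'")
    })

  // Stop accepting tasks and wait for the submitted ones to finish.
  def stop(): Unit = {
    threadPool.shutdown()
    threadPool.awaitTermination(5, TimeUnit.SECONDS)
  }
}

// Toy backend: creates the executor on registration, then forwards tasks to it.
// The real backend uses a null check; an Option is used here instead.
final class MiniBackend {
  private var executor: Option[MiniExecutor] = None

  def receive: PartialFunction[Message, Unit] = {
    case RegisteredExecutor(hostname) =>
      executor = Some(new MiniExecutor(hostname))
    case LaunchTask(taskId, payload) =>
      executor match {
        case Some(e) => e.launchTask(taskId, payload)
        case None    => sys.error("Received LaunchTask but executor was not created")
      }
  }

  // Stops the executor (if any) and returns the result recorded for one task.
  def stopAndGet(taskId: Long): Option[String] =
    executor.flatMap { e => e.stop(); Option(e.results.get(taskId)) }
}

object MiniBackendDemo {
  def run(): Option[String] = {
    val backend = new MiniBackend
    backend.receive(RegisteredExecutor("host-1"))   // step 1: registration succeeded
    backend.receive(LaunchTask(7L, "count words"))  // step 2: a task arrives
    backend.stopAndGet(7L)
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Sending LaunchTask before RegisteredExecutor fails in this sketch, which matches the guard in the real receive method where a missing executor is treated as a fatal error.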
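Source code walkthrough of how an Executor runs a Task:
The onStart method of CoarseGrainedExecutorBackend: this method runs when the CoarseGrainedExecutorBackend endpoint starts, and it registers the Executor with the Driver.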
override def onStart() {
  logInfo("Connecting to driver: " + driverUrl)
  rpcEnv.asyncSetupEndpointRefByURI(driverUrl).flatMap { ref =>
    // This is a very fast action so we can use "ThreadUtils.sameThread"
    driver = Some(ref)
    // Send the Executor registration request to the Driver
    ref.ask[RegisterExecutorResponse](
      RegisterExecutor(executorId, self, hostPort, cores, extractLogUrls))
  }(ThreadUtils.sameThread).onComplete {
    // This is a very fast action so we can use "ThreadUtils.sameThread"
    case Success(msg) => Utils.tryLogNonFatalError {
      Option(self).foreach(_.send(msg)) // msg must be RegisterExecutorResponse
    }
    case Failure(e) => {
      logError(s"Cannot register with driver: $driverUrl", e)
      System.exit(1)
    }
  }(ThreadUtils.sameThread)
}
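The shape of onStart (look up the Driver asynchronously, ask it to register, then forward the reply to yourself or give up on failure) can be sketched with plain Scala futures. Everything named here is an assumption made for illustration: DriverRef, lookupDriver, askRegister and the string replies are hypothetical, a Promise stands in for the endpoint's own inbox, and the caller supplies the execution context instead of ThreadUtils.sameThread.

```scala
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._
import scala.util.{Failure, Success}

object RegisterSketch {
  // Hypothetical stand-in for an RPC reference to the Driver.
  final case class DriverRef(url: String)

  // Hypothetical lookup: succeeds only for URLs that look like Spark URLs.
  def lookupDriver(url: String): Future[DriverRef] =
    if (url.startsWith("spark://")) Future.successful(DriverRef(url))
    else Future.failed(new IllegalArgumentException(s"bad driver url: $url"))

  // Hypothetical "ask": the Driver always accepts the registration here.
  def askRegister(ref: DriverRef, executorId: String): Future[String] =
    Future.successful(s"RegisteredExecutor($executorId)")

  // Chain lookup -> ask, then deliver the outcome to the inbox.
  def register(url: String, executorId: String)(
      implicit ec: ExecutionContext): Future[String] = {
    val inbox = Promise[String]()
    lookupDriver(url)
      .flatMap(ref => askRegister(ref, executorId))
      .onComplete {
        // Plays the role of self.send(msg) in the real method.
        case Success(msg) => inbox.success(msg)
        // The real method logs the error and exits the JVM instead.
        case Failure(e) => inbox.success(s"RegisterFailed(${e.getMessage})")
      }
    inbox.future
  }

  def main(args: Array[String]): Unit = {
    import ExecutionContext.Implicits.global
    println(Await.result(register("spark://driver:7077", "exec-1"), 5.seconds))
  }
}
```

The reply is not handled inside the callback itself. It is forwarded as a message, so all state changes still happen in receive, which is where the Executor gets created.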
The receive method of CoarseGrainedExecutorBackend: this method handles the various messages sent to the backend.
override def receive: PartialFunction[Any, Unit] = {
  // The Driver reports that the Executor registered successfully, so create the Executor object.
  case RegisteredExecutor(hostname) =>
    logInfo("Successfully registered with driver")
    executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false)
  // The Driver reports that Executor registration failed, so the process exits.
  case RegisterExecutorFailed(message) =>
    logError("Slave registration failed: " + message)
    System.exit(1)
  // The Driver sends a LaunchTask message, which asks the Executor to start running a Task.
  case LaunchTask(data) =>
    if (executor == null) {
      logError("Received LaunchTask command but executor was null")
      System.exit(1)
    } else {
      // First deserialize the TaskDescription that was sent over.
      val taskDesc = ser.deserialize[TaskDescription](data.value)
      logInfo("Got assigned task " + taskDesc.taskId)
      // Call the executor's launchTask method to start running the Task.
      // this: the ExecutorBackend, taskId: the Task's ID, attemptNumber: which attempt this is,
      // taskDesc.name: the Task's name, taskDesc.serializedTask: the serialized Task bytes
      executor.launchTask(this, taskId = taskDesc.taskId, attemptNumber = taskDesc.attemptNumber,
        taskDesc.name, taskDesc.serializedTask)
    }
  case KillTask(taskId, _, interruptThread) =>
    if (executor == null) {
      logError("Received KillTask command but executor was null")
      System.exit(1)
    } else {
      executor.killTask(taskId, interruptThread)
    }
  case StopExecutor =>
    logInfo("Driver commanded a shutdown")
    // Cannot shutdown here because an ack may need to be sent back to the caller. So send
    // a message to self to actually do the shutdown.
    self.send(Shutdown)
  case Shutdown =>
    executor.stop()
    stop()
    rpcEnv.shutdown()
}
The launchTask method of Executor: this method creates a TaskRunner for each Task, records the TaskRunner in an in-memory cache, and then submits it to the thread pool, where a pool thread runs it.
def launchTask(
    context: ExecutorBackend,
    taskId: Long,
    attemptNumber: Int,
    taskName: String,
    serializedTask: ByteBuffer): Unit = {
  // Create one TaskRunner per Task. TaskRunner implements Java's Runnable interface.
  val tr = new TaskRunner(context, taskId = taskId, attemptNumber = attemptNumber, taskName,
    serializedTask)
  // Record the TaskRunner in the in-memory cache of running tasks
  runningTasks.put(taskId, tr)
  // The Executor holds a Java thread pool. The TaskRunner that wraps the Task is submitted
  // to the pool and runs on one of its threads. What happens when no thread is idle
  // (a new thread or a wait) depends on how the pool is configured.
  threadPool.execute(tr)
}
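The bookkeeping in launchTask can be sketched on its own: register the runner under its task ID, submit it to a pool, and have the runner remove itself when it finishes. This is a minimal sketch, not Spark's implementation. SketchRunner, runningCount and demo are hypothetical names, and the cached thread pool is an assumption about the pool type.

```scala
import java.util.concurrent.{ConcurrentHashMap, CountDownLatch, Executors, TimeUnit}

object LaunchTaskSketch {
  // Cache of tasks that have been launched and have not finished yet.
  private val runningTasks = new ConcurrentHashMap[Long, Runnable]()
  // Assumption: a cached pool, which creates threads on demand.
  private val threadPool = Executors.newCachedThreadPool()

  // Hypothetical runner: executes the body, then removes itself from the cache.
  final class SketchRunner(taskId: Long, body: () => Unit) extends Runnable {
    override def run(): Unit =
      try body()
      finally runningTasks.remove(taskId)
  }

  def launchTask(taskId: Long)(body: => Unit): Unit = {
    val tr = new SketchRunner(taskId, () => body)
    runningTasks.put(taskId, tr)  // register first, so a kill request can find it
    threadPool.execute(tr)        // then hand it to the pool
  }

  def runningCount: Int = runningTasks.size()

  // Returns (tasks registered while the task is blocked, tasks left after it finished).
  // Shuts the pool down, so it is meant to be called once.
  def demo(): (Int, Int) = {
    val gate = new CountDownLatch(1)
    launchTask(1L) { gate.await() }  // the task blocks until the gate opens
    val whileRunning = runningCount
    gate.countDown()
    threadPool.shutdown()
    threadPool.awaitTermination(5, TimeUnit.SECONDS)
    (whileRunning, runningCount)
  }

  def main(args: Array[String]): Unit = println(demo())
}
```

Putting the runner into the map before submitting it means a later lookup by task ID, such as the killTask call in the receive method, can always find a task that has been launched.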
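TaskRunner implements the Runnable interface, and all the logic for running a Task lives in its run method. Each incoming Task gets its own TaskRunner object, which is submitted to the thread pool and executed by one of the pool's threads. Below is a walkthrough of the run method's source code.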
override def run(): Unit = {
  // Create a memory manager for this Task
  val taskMemoryManager = new TaskMemoryManager(env.memoryManager, taskId)
  // Record when deserialization starts
  val deserializeStartTime = System.currentTimeMillis()
  Thread.currentThread.setContextClassLoader(replClassLoader)
  // Create a serializer instance, used to deserialize the Task data
  val ser = env.closureSerializer.newInstance()
  logInfo(s"Running $taskName (TID $taskId)")
  // Report the Task's current state to the Driver
  execBackend.statusUpdate(taskId, TaskState.RUNNING, EMPTY_BYTE_BUFFER)
  var taskStart: Long = 0
  startGCTime = computeTotalGcTime()
  try {
    // Split the serialized Task data into its dependencies and the Task bytes
    val (taskFiles, taskJars, taskBytes) = Task.deserializeWithDependencies(serializedTask)
    // Fetch over the network the files, resources and jars the Task depends on,
    // for example Hadoop configuration files
    updateDependencies(taskFiles, taskJars)
    // Deserialize the Task itself.
    // The class loader is passed in so that classes can be loaded dynamically
    // (via reflection) and instantiated.
    task = ser.deserialize[Task[Any]](taskBytes, Thread.currentThread.getContextClassLoader)
    task.setTaskMemoryManager(taskMemoryManager)
    // If the Task was already killed before deserialization finished, exit immediately.
    // Otherwise continue running the Task.
    if (killed) {
      throw new TaskKilledException
    }
    logDebug("Task " + taskId + "'s epoch is " + task.epoch)
    env.mapOutputTracker.updateEpoch(task.epoch)
    // Record the time at which the Task starts running
    taskStart = System.currentTimeMillis()
    .........
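The run method passes the thread's context class loader to ser.deserialize, so that classes can be resolved through that loader while the Task is rebuilt. The sketch below shows the general technique with plain Java serialization. It is an illustration under assumptions, not the serializer Spark uses: LoaderObjectInputStream and the helper methods are hypothetical names.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream,
  ObjectInputStream, ObjectOutputStream, ObjectStreamClass}

object LoaderAwareDeserialization {
  // ObjectInputStream that resolves classes through an explicitly supplied loader.
  final class LoaderObjectInputStream(in: InputStream, loader: ClassLoader)
      extends ObjectInputStream(in) {
    override def resolveClass(desc: ObjectStreamClass): Class[_] =
      try Class.forName(desc.getName, false, loader)
      // Fall back to the default lookup, which also handles primitive types.
      catch { case _: ClassNotFoundException => super.resolveClass(desc) }
  }

  // Serialize a value to bytes with standard Java serialization.
  def serialize(value: AnyRef): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(value) finally out.close()
    bytes.toByteArray
  }

  // Rebuild the value, looking classes up through the given loader.
  def deserialize[T](data: Array[Byte], loader: ClassLoader): T = {
    val in = new LoaderObjectInputStream(new ByteArrayInputStream(data), loader)
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val original = new java.util.ArrayList[String](java.util.Arrays.asList("a", "b"))
    val loader = Thread.currentThread.getContextClassLoader
    println(deserialize[java.util.ArrayList[String]](serialize(original), loader))
  }
}
```

In a setup like the one described in the article, the loader would be one that can see the jars fetched by updateDependencies, which is why the dependencies are fetched before the Task bytes are deserialized.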
Author: 清风_d587
Link: https://www.jianshu.com/p/c42695ee8c4f