Node SN2, where the job is submitted in yarn-client mode
In yarn-client mode, the Spark driver runs inside the client process that initiates the Spark application.
In yarn-cluster mode, the driver runs in the Application Master. This means that the same process is responsible for both driving the application and requesting resources from YARN, and this process runs inside a YARN container. The client that starts the app doesn’t need to stick around for its entire lifetime.
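The difference between the two modes shows up directly in how the job is launched. A minimal sketch using the same SparkPi example jar and flags that appear in the process listing below (only `--deploy-mode` changes):

```shell
# yarn-client mode: the driver runs inside this spark-submit JVM,
# so killing this process kills the application.
spark-submit --master yarn --deploy-mode client \
  --class org.apache.spark.examples.SparkPi \
  --executor-memory 2g --executor-cores 1 \
  /opt/mapr/spark/spark-2.1.0/examples/jars/spark-examples_2.11-2.1.0-mapr-1703.jar 10000

# yarn-cluster mode: the driver runs inside the Application Master
# container; spark-submit can exit once the application is accepted.
spark-submit --master yarn --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  --executor-memory 2g --executor-cores 1 \
  /opt/mapr/spark/spark-2.1.0/examples/jars/spark-examples_2.11-2.1.0-mapr-1703.jar 10000
```

These are command fragments that need a running YARN cluster; they are shown for comparison, not as a runnable script.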
1) Spark Driver (SparkSubmit client process)
[root@sn2 logs]# ps aux | grep spark
mapr 27239 119 5.5 7124764 911280 pts/0 Sl+ 16:47 0:47 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el6_8.x86_64/jre/bin/java -cp /opt/mapr/spark/spark-2.1.0/conf/:/opt/mapr/spark/spark-2.1.0/jars/*:/opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop/:/opt/mapr/conf/:/opt/mapr/lib/maprbuildversion-5.2.0-mapr.jar:/opt/mapr/lib/maprfs-5.2.0-mapr.jar:/opt/mapr/lib/maprfs-diagnostic-tools-5.2.0-mapr.jar:/opt/mapr/lib/maprdb-5.2.0-mapr.jar:/opt/mapr/lib/maprdb-5.2.0-mapr-tests.jar:/opt/mapr/lib/maprdb-mapreduce-5.2.0-mapr.jar:/opt/mapr/lib/maprdb-mapreduce-5.2.0-mapr-tests.jar:/opt/mapr/lib/maprdb-shell-5.2.0-mapr.jar:/opt/mapr/lib/mapr-hbase-5.2.0-mapr.jar:/opt/mapr/lib/mapr-hbase-5.2.0-mapr-tests.jar:/opt/mapr/lib/mapr-java-utils-5.2.0-mapr-tests.jar:/opt/mapr/lib/mapr-streams-5.2.0-mapr.jar:/opt/mapr/lib/mapr-streams-5.2.0-mapr-tests.jar:/opt/mapr/lib/mapr-tools-5.2.0-mapr.jar:/opt/mapr/lib/mapr-tools-5.2.0-mapr-tests.jar:/opt/mapr/lib/slf4j-api-1.7.12.jar:/opt/mapr/lib/slf4j-log4j12-1.7.12.jar:/opt/mapr/lib/log4j-1.2.17.jar:/opt/mapr/lib/central-logging-5.2.0-mapr.jar:/opt/mapr/lib/antlr4-runtime-4.5.jar:/opt/mapr/lib/guava-14.0.1.jar:/opt/mapr/lib/jackson-annotations-2.7.2.jar:/opt/mapr/lib/jackson-core-2.7.2.jar:/opt/mapr/lib/jackson-databind-2.7.2.jar:/opt/mapr/lib/jline-2.11.jar:/opt/mapr/lib/kafka-clients-0.9.0.0-mapr-1703.jar:/opt/mapr/lib/ojai-1.1.jar:/opt/mapr/lib/ojai-mapreduce-1.1.jar:/opt/mapr/lib/protobuf-java-2.5.0.jar:/opt/mapr/lib/spring-asm-3.0.3.RELEASE.jar:/opt/mapr/lib/spring-beans-3.0.3.RELEASE.jar:/opt/mapr/lib/spring-context-3.0.3.RELEASE.jar:/opt/mapr/lib/spring-core-3.0.3.RELEASE.jar:/opt/mapr/lib/spring-expression-3.0.3.RELEASE.jar:/opt/mapr/lib/spring-shell-1.2.0.M1-mapr-1607.jar:/opt/mapr/lib/zookeeper-3.4.5-mapr-1604.jar:/opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop/:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/common/lib/*:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/common/*:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/hdfs/:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/hdfs/lib/*:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/hdfs/*:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/yarn/lib/*:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/yarn/*:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/mapreduce/lib/*:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/mapreduce/*:/opt/mapr/hadoop/hadoop-2.7.0/contrib/capacity-scheduler/*.jar:/opt/mapr/lib/kvstore*.jar:/opt/mapr/lib/libprotodefs*.jar:/opt/mapr/lib/baseutils*.jar:/opt/mapr/lib/maprutil*.jar:/opt/mapr/lib/json-20080701.jar:/opt/mapr/lib/flexjson-2.1.jar -Xmx1g -Dlog4j.configuration=file:/home/mapr/log4j.properties org.apache.spark.deploy.SparkSubmit --master yarn --deploy-mode client --conf spark.driver.memory=1g --conf spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/home/mapr/log4j.properties --class org.apache.spark.examples.SparkPi --executor-memory 2g --executor-cores 1 /opt/mapr/spark/spark-2.1.0/examples/jars/spark-examples_2.11-2.1.0-mapr-1703.jar 10000
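Rather than grepping the very long `ps` output, the driver JVM can be spotted with `jps` (the JDK's JVM process lister): in yarn-client mode the driver shows up under its launcher main class, `org.apache.spark.deploy.SparkSubmit`. A command fragment to run on the submitting node:

```shell
# jps -m lists running JVMs with their main class and arguments.
# A SparkSubmit JVM on this node is the driver of a yarn-client job.
jps -m | grep SparkSubmit
```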
2) Application Master Container
mapr 27878 0.0 0.0 108228 1352 ? Ss 16:47 0:00 /bin/bash -c /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el6_8.x86_64/jre/bin/java -server -Xmx512m -Djava.io.tmpdir=/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000001/tmp -Dspark.yarn.app.container.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000001 org.apache.spark.deploy.yarn.ExecutorLauncher --arg '10.10.70.179:36548' --properties-file /tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000001/__spark_conf__/__spark_conf__.properties 1> /opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000001/stdout 2> /opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000001/stderr
mapr 27882 29.0 2.2 2445076 372544 ? Sl 16:47 0:09 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el6_8.x86_64/jre/bin/java -server -Xmx512m -Djava.io.tmpdir=/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000001/tmp -Dspark.yarn.app.container.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000001 org.apache.spark.deploy.yarn.ExecutorLauncher --arg 10.10.70.179:36548 --properties-file /tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000001/__spark_conf__/__spark_conf__.properties
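Note that in yarn-client mode the Application Master container runs the lightweight `org.apache.spark.deploy.yarn.ExecutorLauncher` class, visible in the command line above: it only negotiates executor containers with YARN, because the driver lives in the client process. In yarn-cluster mode the same container would run `org.apache.spark.deploy.yarn.ApplicationMaster` and host the driver as well. The AM's host and the application's state can be checked from any node with the YARN CLI; the application ID below is the one from this run:

```shell
# Show where the Application Master is running and the app's state.
yarn application -status application_1494451147099_0011

# Fetch the AM and executor container logs after the run finishes
# (log aggregation must be enabled for this to work).
yarn logs -applicationId application_1494451147099_0011
```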
3) Executors
mapr 27950 0.0 0.0 108232 1360 ? Ss 16:47 0:00 /bin/bash -c /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el6_8.x86_64/jre/bin/java -server -Xmx2048m -Djava.io.tmpdir=/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000002/tmp '-Dspark.authenticate.enableSaslEncryption=true' '-Dspark.driver.port=36548' '-Dspark.authenticate=true' -Dspark.yarn.app.container.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000002 -XX:OnOutOfMemoryError='kill %p' org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@10.10.70.179:36548 --executor-id 1 --hostname sn2 --cores 1 --app-id application_1494451147099_0011 --user-class-path file:/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000002/__app__.jar 1>/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000002/stdout 2>/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000002/stderr
mapr 27954 127 5.0 4084152 825108 ? Sl 16:47 0:34 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el6_8.x86_64/jre/bin/java -server -Xmx2048m -Djava.io.tmpdir=/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000002/tmp -Dspark.authenticate.enableSaslEncryption=true -Dspark.driver.port=36548 -Dspark.authenticate=true -Dspark.yarn.app.container.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000002 -XX:OnOutOfMemoryError=kill %p org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@10.10.70.179:36548 --executor-id 1 --hostname sn2 --cores 1 --app-id application_1494451147099_0011 --user-class-path file:/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000002/__app__.jar
root 28128 0.0 0.0 103320 868 pts/1 S+ 16:48 0:00 grep spark
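Every executor above is started with `-Dspark.driver.port=36548` and a `--driver-url` pointing back at the driver, which is how a `CoarseGrainedExecutorBackend` process can be tied to its driver. A small sketch that pulls the driver endpoint out of a captured command line (the sample string is abbreviated from the executor output above):

```shell
# Extract the driver RPC endpoint from an executor's command line.
cmdline="--driver-url spark://CoarseGrainedScheduler@10.10.70.179:36548 --executor-id 1 --hostname sn2"
driver_url=$(printf '%s\n' "$cmdline" | grep -o 'spark://[^ ]*')
echo "$driver_url"   # spark://CoarseGrainedScheduler@10.10.70.179:36548
```

In practice `cmdline` would come from `ps -o args= -p <pid>` for the executor process.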
SN1, the data node where executors are spawned
1) Executors - only executor processes are running on the SN1 node
[mapr@sn1 ~]$ ps aux | grep spark
mapr 6779 0.0 0.0 108220 1364 ? Ss 16:47 0:00 /bin/bash -c /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el6_8.x86_64/jre/bin/java -server -Xmx2048m -Djava.io.tmpdir=/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000003/tmp '-Dspark.authenticate.enableSaslEncryption=true' '-Dspark.driver.port=36548' '-Dspark.authenticate=true' -Dspark.yarn.app.container.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000003 -XX:OnOutOfMemoryError='kill %p' org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@10.10.70.179:36548 --executor-id 2 --hostname sn1 --cores 1 --app-id application_1494451147099_0011 --user-class-path file:/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000003/__app__.jar 1>/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000003/stdout 2>/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000003/stderr
mapr 6787 168 3.0 4111728 820532 ? Sl 16:47 0:26 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el6_8.x86_64/jre/bin/java -server -Xmx2048m -Djava.io.tmpdir=/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000003/tmp -Dspark.authenticate.enableSaslEncryption=true -Dspark.driver.port=36548 -Dspark.authenticate=true -Dspark.yarn.app.container.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000003 -XX:OnOutOfMemoryError=kill %p org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@10.10.70.179:36548 --executor-id 2 --hostname sn1 --cores 1 --app-id application_1494451147099_0011 --user-class-path file:/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000003/__app__.jar
mapr 8077 0.0 0.0 103256 848 pts/0 S+ 16:47 0:00 grep spark
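The spread of containers across SN1 and SN2 can also be verified from the ResourceManager side instead of grepping `ps` on each node. `yarn container -list` takes an application *attempt* ID; assuming this is the first attempt of the application above, it would be:

```shell
# List all live containers (AM + executors) for the first attempt,
# including the node each container was allocated on.
yarn container -list appattempt_1494451147099_0011_000001
```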
Reference: http://blog.cloudera.com/blog/2014/05/apache-spark-resource-management-and-yarn-app-models/