Wednesday, May 17, 2017

Spark Client Mode - PID

Node SN2, where the job is submitted in yarn-client mode


In yarn-client mode, the Spark driver runs inside the client process that initiates the Spark application; the Application Master is used only to request executor resources from YARN.

In yarn-cluster mode, the driver runs in the Application Master. This means that the same process is responsible for both driving the application and requesting resources from YARN, and this process runs inside a YARN container. The client that starts the app doesn’t need to stick around for its entire lifetime.
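The client-mode run examined below was started with a spark-submit command along these lines (reconstructed from the driver's command line in the ps output further down; the paths and example jar are specific to this MapR 5.2 / Spark 2.1.0 install):

```shell
# yarn-client mode: the driver runs inside this spark-submit JVM on SN2
/opt/mapr/spark/spark-2.1.0/bin/spark-submit \
  --master yarn \
  --deploy-mode client \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.driver.memory=1g \
  --executor-memory 2g \
  --executor-cores 1 \
  /opt/mapr/spark/spark-2.1.0/examples/jars/spark-examples_2.11-2.1.0-mapr-1703.jar 10000

# yarn-cluster mode would differ only in the deploy mode:
#   --deploy-mode cluster
# The driver would then run inside the YARN Application Master container
# instead of this local process.
```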

1) Driver PID
[root@sn2 logs]# ps aux | grep spark
mapr     27239  119  5.5 7124764 911280 pts/0  Sl+  16:47   0:47 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el6_8.x86_64/jre/bin/java -cp /opt/mapr/spark/spark-2.1.0/conf/:/opt/mapr/spark/spark-2.1.0/jars/*:/opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop/:/opt/mapr/conf/:/opt/mapr/lib/maprbuildversion-5.2.0-mapr.jar:/opt/mapr/lib/maprfs-5.2.0-mapr.jar:/opt/mapr/lib/maprfs-diagnostic-tools-5.2.0-mapr.jar:/opt/mapr/lib/maprdb-5.2.0-mapr.jar:/opt/mapr/lib/maprdb-5.2.0-mapr-tests.jar:/opt/mapr/lib/maprdb-mapreduce-5.2.0-mapr.jar:/opt/mapr/lib/maprdb-mapreduce-5.2.0-mapr-tests.jar:/opt/mapr/lib/maprdb-shell-5.2.0-mapr.jar:/opt/mapr/lib/mapr-hbase-5.2.0-mapr.jar:/opt/mapr/lib/mapr-hbase-5.2.0-mapr-tests.jar:/opt/mapr/lib/mapr-java-utils-5.2.0-mapr-tests.jar:/opt/mapr/lib/mapr-streams-5.2.0-mapr.jar:/opt/mapr/lib/mapr-streams-5.2.0-mapr-tests.jar:/opt/mapr/lib/mapr-tools-5.2.0-mapr.jar:/opt/mapr/lib/mapr-tools-5.2.0-mapr-tests.jar:/opt/mapr/lib/slf4j-api-1.7.12.jar:/opt/mapr/lib/slf4j-log4j12-1.7.12.jar:/opt/mapr/lib/log4j-1.2.17.jar:/opt/mapr/lib/central-logging-5.2.0-mapr.jar:/opt/mapr/lib/antlr4-runtime-4.5.jar:/opt/mapr/lib/guava-14.0.1.jar:/opt/mapr/lib/jackson-annotations-2.7.2.jar:/opt/mapr/lib/jackson-core-2.7.2.jar:/opt/mapr/lib/jackson-databind-2.7.2.jar:/opt/mapr/lib/jline-2.11.jar:/opt/mapr/lib/kafka-clients-0.9.0.0-mapr-1703.jar:/opt/mapr/lib/ojai-1.1.jar:/opt/mapr/lib/ojai-mapreduce-1.1.jar:/opt/mapr/lib/protobuf-java-2.5.0.jar:/opt/mapr/lib/spring-asm-3.0.3.RELEASE.jar:/opt/mapr/lib/spring-beans-3.0.3.RELEASE.jar:/opt/mapr/lib/spring-context-3.0.3.RELEASE.jar:/opt/mapr/lib/spring-core-3.0.3.RELEASE.jar:/opt/mapr/lib/spring-expression-3.0.3.RELEASE.jar:/opt/mapr/lib/spring-shell-1.2.0.M1-mapr-1607.jar:/opt/mapr/lib/zookeeper-3.4.5-mapr-1604.jar:/opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop/:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/common/lib/*:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/common/*:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/hdfs/:/opt/mapr/hadoop/hado
op-2.7.0/share/hadoop/hdfs/lib/*:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/hdfs/*:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/yarn/lib/*:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/yarn/*:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/mapreduce/lib/*:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/mapreduce/*:/opt/mapr/hadoop/hadoop-2.7.0/contrib/capacity-scheduler/*.jar:/opt/mapr/lib/kvstore*.jar:/opt/mapr/lib/libprotodefs*.jar:/opt/mapr/lib/baseutils*.jar:/opt/mapr/lib/maprutil*.jar:/opt/mapr/lib/json-20080701.jar:/opt/mapr/lib/flexjson-2.1.jar -Xmx1g -Dlog4j.configuration=file:/home/mapr/log4j.properties org.apache.spark.deploy.SparkSubmit --master yarn --deploy-mode client --conf spark.driver.memory=1g --conf spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/home/mapr/log4j.properties --class org.apache.spark.examples.SparkPi --executor-memory 2g --executor-cores 1 /opt/mapr/spark/spark-2.1.0/examples/jars/spark-examples_2.11-2.1.0-mapr-1703.jar 10000
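To pull just the driver PID out of that output, you can filter on the SparkSubmit main class. A minimal sketch against an abridged copy of the line above (on a live node you would pipe `ps aux` straight into the same filter):

```shell
# Abridged copy of the driver line from `ps aux` above; field 2 is the PID
line='mapr     27239  119  5.5 7124764 911280 pts/0  Sl+  16:47   0:47 java -cp ... org.apache.spark.deploy.SparkSubmit --master yarn --deploy-mode client'

# Keep only the SparkSubmit JVM line, then print its PID column
driver_pid=$(echo "$line" | grep 'org.apache.spark.deploy.SparkSubmit' | awk '{print $2}')
echo "$driver_pid"    # prints 27239
```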


2) Application Master Container - in client mode the AM runs ExecutorLauncher, not the driver. The first process below is the NodeManager's /bin/bash -c launcher; the second is the AM JVM itself.
mapr     27878  0.0  0.0 108228  1352 ?        Ss   16:47   0:00 /bin/bash -c /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el6_8.x86_64/jre/bin/java -server -Xmx512m -Djava.io.tmpdir=/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000001/tmp -Dspark.yarn.app.container.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000001 org.apache.spark.deploy.yarn.ExecutorLauncher --arg '10.10.70.179:36548' --properties-file /tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000001/__spark_conf__/__spark_conf__.properties 1> /opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000001/stdout 2> /opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000001/stderr

mapr     27882 29.0  2.2 2445076 372544 ?      Sl   16:47   0:09 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el6_8.x86_64/jre/bin/java -server -Xmx512m -Djava.io.tmpdir=/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000001/tmp -Dspark.yarn.app.container.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000001 org.apache.spark.deploy.yarn.ExecutorLauncher --arg 10.10.70.179:36548 --properties-file /tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000001/__spark_conf__/__spark_conf__.properties
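Both processes above belong to the same YARN container, and its ID can be recovered from the -Djava.io.tmpdir path. A sketch using an abridged copy of the ExecutorLauncher line:

```shell
# Abridged AM line; the container ID is embedded in the tmpdir path
line='mapr 27882 java -Djava.io.tmpdir=/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000001/tmp org.apache.spark.deploy.yarn.ExecutorLauncher'

# grep -o prints only the matching substring; take the first occurrence
container_id=$(echo "$line" | grep -o 'container_[a-z0-9_]*' | head -n 1)
echo "$container_id"    # prints container_e57_1494451147099_0011_01_000001
```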

3) Executors - each executor container likewise shows a bash launcher plus the CoarseGrainedExecutorBackend JVM it starts.
mapr     27950  0.0  0.0 108232  1360 ?        Ss   16:47   0:00 /bin/bash -c /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el6_8.x86_64/jre/bin/java -server -Xmx2048m -Djava.io.tmpdir=/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000002/tmp '-Dspark.authenticate.enableSaslEncryption=true' '-Dspark.driver.port=36548' '-Dspark.authenticate=true' -Dspark.yarn.app.container.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000002 -XX:OnOutOfMemoryError='kill %p' org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@10.10.70.179:36548 --executor-id 1 --hostname sn2 --cores 1 --app-id application_1494451147099_0011 --user-class-path file:/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000002/__app__.jar 1>/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000002/stdout 2>/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000002/stderr

mapr     27954  127  5.0 4084152 825108 ?      Sl   16:47   0:34 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el6_8.x86_64/jre/bin/java -server -Xmx2048m -Djava.io.tmpdir=/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000002/tmp -Dspark.authenticate.enableSaslEncryption=true -Dspark.driver.port=36548 -Dspark.authenticate=true -Dspark.yarn.app.container.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000002 -XX:OnOutOfMemoryError=kill %p org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@10.10.70.179:36548 --executor-id 1 --hostname sn2 --cores 1 --app-id application_1494451147099_0011 --user-class-path file:/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000002/__app__.jar
root     28128  0.0  0.0 103320   868 pts/1    S+   16:48   0:00 grep spark
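The --executor-id and --hostname arguments on the CoarseGrainedExecutorBackend command line identify each executor. A sketch that pulls them out of an abridged copy of the line above with sed:

```shell
# Abridged executor line; each backend JVM carries its id and host as arguments
line='java -server -Xmx2048m org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@10.10.70.179:36548 --executor-id 1 --hostname sn2 --cores 1 --app-id application_1494451147099_0011'

# Capture the two arguments and print a one-line summary
summary=$(echo "$line" | sed -n 's/.*--executor-id \([0-9]*\) --hostname \([a-z0-9]*\).*/executor \1 on \2/p')
echo "$summary"    # prints: executor 1 on sn2
```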



SN1 Data Node, where executors are spawned

1) Executors - only executor processes are running on the SN1 node
[mapr@sn1 ~]$ ps aux | grep spark
mapr      6779  0.0  0.0 108220  1364 ?        Ss   16:47   0:00 /bin/bash -c /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el6_8.x86_64/jre/bin/java -server -Xmx2048m -Djava.io.tmpdir=/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000003/tmp '-Dspark.authenticate.enableSaslEncryption=true' '-Dspark.driver.port=36548' '-Dspark.authenticate=true' -Dspark.yarn.app.container.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000003 -XX:OnOutOfMemoryError='kill %p' org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@10.10.70.179:36548 --executor-id 2 --hostname sn1 --cores 1 --app-id application_1494451147099_0011 --user-class-path file:/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000003/__app__.jar 1>/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000003/stdout 2>/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000003/stderr

mapr      6787  168  3.0 4111728 820532 ?      Sl   16:47   0:26 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el6_8.x86_64/jre/bin/java -server -Xmx2048m -Djava.io.tmpdir=/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000003/tmp -Dspark.authenticate.enableSaslEncryption=true -Dspark.driver.port=36548 -Dspark.authenticate=true -Dspark.yarn.app.container.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1494451147099_0011/container_e57_1494451147099_0011_01_000003 -XX:OnOutOfMemoryError=kill %p org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@10.10.70.179:36548 --executor-id 2 --hostname sn1 --cores 1 --app-id application_1494451147099_0011 --user-class-path file:/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1494451147099_0011/container_e57_1494451147099_0011_01_000003/__app__.jar
mapr      8077  0.0  0.0 103256   848 pts/0    S+   16:47   0:00 grep spark
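A quick way to confirm that SN1 hosts neither the driver nor the AM is to count SparkSubmit/ExecutorLauncher matches in its process list. A sketch against an abridged copy of SN1's only Spark JVM line:

```shell
# The only Spark JVM on SN1 is an executor backend
sn1_procs='java org.apache.spark.executor.CoarseGrainedExecutorBackend --executor-id 2 --hostname sn1'

# grep -c exits non-zero when the count is 0, hence the || true
count=$(printf '%s\n' "$sn1_procs" | grep -c -e 'SparkSubmit' -e 'ExecutorLauncher' || true)
echo "$count"    # prints 0: no driver and no Application Master on SN1
```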


Reference: http://blog.cloudera.com/blog/2014/05/apache-spark-resource-management-and-yarn-app-models/
