
Adding Debug Information on the Spark Driver and Executor Side


Preface

Because a Spark application itself runs on the JVM, both the `--verbose` and `-verbose:class` options are available. `--verbose` prints detailed information about how the JVM is running and about the Spark configuration, while the JVM option `-verbose:class` shows the classes loaded by the Driver and the Executors. These debugging options can help users identify classpath conflicts on both the Driver and Executor side.
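For reference, `-verbose:class` is a standard JVM flag rather than anything Spark-specific, so you can try it on any Java program first. A minimal sketch, where the jar and main class names are hypothetical:

# Print every class the JVM loads and the jar it was loaded from
java -verbose:class -cp app.jar com.example.Main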

Applicable Versions

Spark 2.0, Apache Spark, Spark 2.1, Spark 2.x

Configuration

To list details of the classes the JVM loads while running a Java program, use the `-verbose:class` option; it prints a list of every class loaded by the class loaders. The following example passes `-verbose:class` to both the Driver and the Executors:

[your_account@36.36.36.36 spark-2.2.0-bin-hadoop2.6]$ bin/spark-submit \
--master yarn \
--deploy-mode cluster \
--queue root.default \
--num-executors 20 \
--executor-memory 2048MB \
--conf "spark.driver.extraJavaOptions=-verbose:class" \
--conf "spark.executor.extraJavaOptions=-verbose:class"  \
--class org.apache.spark.examples.SparkPi examples/jars/spark-examples_2.11-2.2.0.jar 100000

Log in to the YARN Web UI; the class-loading details appear in the stdout of the Driver and Executor containers:

[Loaded sun.reflect.GeneratedSerializationConstructorAccessor76 from __JVM_DefineClass__]
[Loaded sun.reflect.GeneratedSerializationConstructorAccessor77 from __JVM_DefineClass__]
[Loaded sun.reflect.GeneratedSerializationConstructorAccessor78 from __JVM_DefineClass__]
[Loaded scala.collection.parallel.immutable.ParIterable from file:/data1/yarn/nm/usercache/your_account/filecache/10/__spark_libs__1451447860571528177.zip/scala-library-2.11.8.jar]
[Loaded scala.collection.parallel.immutable.ParSeq from file:/data1/yarn/nm/usercache/your_account/filecache/10/__spark_libs__1451447860571528177.zip/scala-library-2.11.8.jar]
[Loaded scala.collection.parallel.immutable.ParVector from file:/data1/yarn/nm/usercache/your_account/filecache/10/__spark_libs__1451447860571528177.zip/scala-library-2.11.8.jar]
[Loaded org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$8$$anonfun$9 from file:/data1/yarn/nm/usercache/your_account/filecache/10/__spark_libs__1451447860571528177.zip/spark-core_2.11-2.2.0.jar]
[Loaded org.apache.spark.scheduler.TaskSchedulerImpl$$anonfun$8$$anonfun$apply$10 from file:/data1/yarn/nm/usercache/your_account/filecache/10/__spark_libs__1451447860571528177.zip/spark-core_2.11-2.2.0.jar]
[Loaded io.netty.buffer.PoolArena$1 from file:/data1/yarn/nm/usercache/your_account/filecache/10/__spark_libs__1451447860571528177.zip/netty-all-4.0.43.Final.jar]
[Loaded scala.Product4$class from file:/data1/yarn/nm/usercache/your_account/filecache/10/__spark_libs__1451447860571528177.zip/scala-library-2.11.8.jar]
[Loaded org.apache.spark.ui.jobs.JobProgressListener$$anonfun$onExecutorMetricsUpdate$2$$anonfun$41 from file:/data1/yarn/nm/usercache/your_account/filecache/10/__spark_libs__1451447860571528177.zip/spark-core_2.11-2.2.0.jar]
[Loaded org.apache.spark.executor.TaskMetrics$$anonfun$fromAccumulatorInfos$1 from file:/data1/yarn/nm/usercache/your_account/filecache/10/__spark_libs__1451447860571528177.zip/spark-core_2.11-2.2.0.jar]
[Loaded org.apache.spark.executor.TaskMetrics$$anonfun$fromAccumulatorInfos$2 from file:/data1/yarn/nm/usercache/your_account/filecache/10/__spark_libs__1451447860571528177.zip/spark-core_2.11-2.2.0.jar]
[Loaded org.apache.spark.executor.TaskMetrics$$anonfun$fromAccumulatorInfos$2$$anonfun$apply$1 from file:/data1/yarn/nm/usercache/your_account/filecache/10/__spark_libs__1451447860571528177.zip/spark-core_2.11-2.2.0.jar]
[Loaded org.apache.spark.ui.jobs.JobProgressListener$$anonfun$onExecutorMetricsUpdate$2$$anonfun$apply$17 from file:/data1/yarn/nm/usercache/your_account/filecache/10/__spark_libs__1451447860571528177.zip/spark-core_2.11-2.2.0.jar]
[Loaded org.apache.hadoop.hdfs.LeaseRenewer$2 from file:/data1/yarn/nm/usercache/your_account/filecache/10/__spark_libs__1451447860571528177.zip/hadoop-hdfs-2.6.5.jar]
[Loaded org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$RenewLeaseRequestProto$Builder from file:/data1/yarn/nm/usercache/your_account/filecache/10/__spark_libs__1451447860571528177.zip/hadoop-hdfs-2.6.5.jar]
[Loaded org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$RenewLeaseResponseProto$Builder from file:/data1/yarn/nm/usercache/your_account/filecache/10/__spark_libs__1451447860571528177.zip/hadoop-hdfs-2.6.5.jar]
[Loaded org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$RenewLeaseResponseProto$1 from file:/data1/yarn/nm/usercache/your_account/filecache/10/__spark_libs__1451447860571528177.zip/hadoop-hdfs-2.6.5.jar]
[Loaded sun.reflect.GeneratedConstructorAccessor69 from __JVM_DefineClass__]
[Loaded sun.reflect.GeneratedConstructorAccessor70 from __JVM_DefineClass__]
[Loaded sun.reflect.GeneratedMethodAccessor61 from __JVM_DefineClass__]
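When tracking down a classpath conflict, it is often faster to grep the aggregated container logs than to read them in the browser. A minimal sketch, assuming YARN log aggregation is enabled; the application id and class name below are placeholders:

# Pull all container logs for one application and find which jar a suspect class came from
yarn logs -applicationId application_1638258470147_0001 | grep "Loaded com.example.SuspectClass"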

When starting spark-shell or running a Spark application with spark-submit, you can also enable the `--verbose` option. With `--verbose` added, spark-submit prints fine-grained debug information, such as the application's parsed arguments, every configuration option in effect, and the classpath elements it resolves.

[your_account@36.36.36.36 spark-2.2.0-bin-hadoop2.6]$ bin/spark-shell --verbose

After the command runs, the following information is printed:

Using properties file: /home/your_path/spark/spark-2.2.0-bin-hadoop2.6/conf/spark-defaults.conf
Adding default property: spark.port.maxRetries=1000
Adding default property: spark.serializer=org.apache.spark.serializer.KryoSerializer
Adding default property: spark.sql.hive.convertMetastoreParquet=false
Adding default property: spark.history.fs.logDirectory=hdfs://demo/user/your_account/spark-2.2.0/history/
Adding default property: spark.eventLog.enabled=true
Adding default property: spark.yarn.historyServer.address=36.36.36.36:18090
Adding default property: spark.history.fs.cleaner.enabled=true
Adding default property: spark.sql.hive.caseSensitiveInferenceMode=NEVER_INFER
Adding default property: spark.history.fs.cleaner.interval=1d
Adding default property: spark.history.fs.cleaner.maxAge=7d
Adding default property: spark.history.retainedApplications=200
Adding default property: spark.eventLog.dir=hdfs://demo/user/your_account/spark-2.2.0/history
Adding default property: spark.eventLog.compress=true
Adding default property: spark.history.ui.port=18090
Parsed arguments:
  master                  local[*]
  deployMode              null
  executorMemory          null
  executorCores           null
  totalExecutorCores      null
  propertiesFile          /home/your_path/spark/spark-2.2.0-bin-hadoop2.6/conf/spark-defaults.conf
  driverMemory            null
  driverCores             null
  driverExtraClassPath    null
  driverExtraLibraryPath  null
  driverExtraJavaOptions  null
  supervise               false
  queue                   null
  numExecutors            null
  files                   null
  pyFiles                 null
  archives                null
  mainClass               org.apache.spark.repl.Main
  primaryResource         spark-shell
  name                    Spark shell
  childArgs               []
  jars                    null
  packages                null
  packagesExclusions      null
  repositories            null
  verbose                 true

Spark properties used, including those specified through
 --conf and those from the properties file /home/your_path/spark/spark-2.2.0-bin-hadoop2.6/conf/spark-defaults.conf:
  (spark.history.fs.cleaner.enabled,true)
  (spark.yarn.historyServer.address,36.36.36.36:18090)
  (spark.eventLog.enabled,true)
  (spark.eventLog.compress,true)
  (spark.history.ui.port,18090)
  (spark.history.fs.cleaner.maxAge,7d)
  (spark.history.retainedApplications,200)
  (spark.serializer,org.apache.spark.serializer.KryoSerializer)
  (spark.history.fs.logDirectory,hdfs://demo/user/your_account/spark-2.2.0/history/)
  (spark.sql.hive.caseSensitiveInferenceMode,NEVER_INFER)
  (spark.eventLog.dir,hdfs://demo/user/your_account/spark-2.2.0/history)
  (spark.port.maxRetries,1000)
  (spark.sql.hive.convertMetastoreParquet,false)
  (spark.history.fs.cleaner.interval,1d)


Main class:
org.apache.spark.repl.Main
Arguments:

System properties:
(spark.history.fs.cleaner.enabled,true)
(spark.yarn.historyServer.address,36.36.36.36:18090)
(spark.eventLog.enabled,true)
(spark.eventLog.compress,true)
(spark.history.ui.port,18090)
(spark.history.fs.cleaner.maxAge,7d)
(SPARK_SUBMIT,true)
(spark.history.retainedApplications,200)
(spark.serializer,org.apache.spark.serializer.KryoSerializer)
(spark.history.fs.logDirectory,hdfs://demo/user/your_account/spark-2.2.0/history/)
(spark.app.name,Spark shell)
(spark.sql.hive.caseSensitiveInferenceMode,NEVER_INFER)
(spark.jars,)
(spark.submit.deployMode,client)
(spark.eventLog.dir,hdfs://demo/user/your_account/spark-2.2.0/history)
(spark.master,local[*])
(spark.port.maxRetries,1000)
(spark.sql.hive.convertMetastoreParquet,false)
(spark.history.fs.cleaner.interval,1d)
Classpath elements:



Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
21/11/30 15:47:50 WARN util.Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
21/11/30 15:47:50 WARN util.Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
21/11/30 15:47:50 WARN util.Utils: Service 'SparkUI' could not bind on port 4042. Attempting port 4043.
21/11/30 15:47:50 WARN util.Utils: Service 'SparkUI' could not bind on port 4043. Attempting port 4044.
21/11/30 15:47:50 WARN util.Utils: Service 'SparkUI' could not bind on port 4044. Attempting port 4045.
Spark context Web UI available at http://36.36.36.36:4045
Spark context available as 'sc' (master = local[*], app id = local-1638258470147).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.0
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_112)
Type in expressions to have them evaluated.
Type :help for more information.
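To avoid passing the two --conf flags on every submission, the same JVM options can be set once in conf/spark-defaults.conf. A minimal sketch:

# conf/spark-defaults.conf: enable class-loading traces for all applications by default
spark.driver.extraJavaOptions    -verbose:class
spark.executor.extraJavaOptions  -verbose:class

Note that a value supplied on the command line with --conf replaces the value from the properties file for the same key, so individual submissions can still override these defaults.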

