Are Log Messages too Verbose in Spark Shell? Then Fix it!

Spark’s Interactive Shell (spark-shell/pyspark) is one of the great features offered by Apache Spark.

Some of the advantages of Spark’s interactive shell:

  • Analysing Data Interactively - No need to create, build and deploy an application
  • Learning the Spark API - You can easily practice, learn and master the API
  • Debugging Code - It is easy to debug parts of a Spark application interactively

This is really useful! But many people are annoyed by the flood of log messages produced while Spark code executes in the interactive shell.

Fortunately, Spark lets you turn these logs down. To do that, run the following commands as soon as you launch the respective Spark shell. (You are free to choose a log level such as INFO, WARN or ERROR.)

Scala Users:

sc.setLogLevel("WARN")

OR

import org.apache.log4j.Logger
import org.apache.log4j.Level
Logger.getLogger("org").setLevel(Level.WARN)
Logger.getLogger("akka").setLevel(Level.WARN)

Python Users:

sc.setLogLevel("WARN")

OR

log4j = sc._jvm.org.apache.log4j
log4j.LogManager.getRootLogger().setLevel(log4j.Level.WARN)

P.S.: It is still good practice to observe and understand the logs.
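Note that `sc.setLogLevel` can be called again at any point in the session, so you can temporarily bring the detail back while investigating a problem and silence it afterwards. A small sketch (the same call works in both spark-shell and pyspark):

```scala
sc.setLogLevel("INFO")   // verbose output again while investigating a job
// ... run the job and inspect the log output ...
sc.setLogLevel("WARN")   // quiet once you are done
```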


Can we set the Spark shell to WARN globally? The INFO messages we see in spark-shell are mostly unnecessary.
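One common way to make WARN the default for every session is through Spark's log4j configuration file rather than per-session commands. A sketch, assuming a log4j 1.x based Spark release (newer Spark versions use a log4j2 configuration file instead): copy the bundled template to `$SPARK_HOME/conf/log4j.properties` (e.g. `cp conf/log4j.properties.template conf/log4j.properties`) and change the root category line:

```properties
# $SPARK_HOME/conf/log4j.properties
# Change the root logger from INFO to WARN
log4j.rootCategory=WARN, console
```

With this in place, every spark-shell and pyspark session on that installation starts at WARN without any extra commands.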