Apache Spark 1.6 - Transform, Stage and Store - Initializing Spark job using pyspark

Initializing the job

  • Initializing the job using the pyspark shell
  • Running in YARN mode (client or cluster deploy mode); see the launch command after this list
  • Passing control arguments such as --num-executors, --executor-memory and --executor-cores
  • Deciding on the number of executors
  • Setting up additional properties with --conf
  • Programmatic initialization of the job; see the SparkConf/SparkContext sketch after this list
  • Creating the configuration object (SparkConf)
  • Creating the Spark context object (SparkContext)
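As a rough sketch, launching the pyspark shell against YARN in Spark 1.6 looks like the command below; the executor count, memory size and the spark.ui.port value are placeholders to be adjusted for your cluster. Note that the interactive shell runs in client mode, while cluster deploy mode applies when submitting a packaged application with spark-submit.

    pyspark --master yarn \
      --deploy-mode client \
      --num-executors 2 \
      --executor-memory 2g \
      --executor-cores 2 \
      --conf spark.ui.port=12345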

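A minimal sketch of the programmatic route, creating a SparkConf and then a SparkContext from it. The application name, master and property value shown here are illustrative; in practice the master is often supplied by spark-submit rather than hard-coded.

    from pyspark import SparkConf, SparkContext

    # Build the configuration object (app name, master and port are placeholders)
    conf = SparkConf() \
        .setAppName("InitializingSparkJob") \
        .setMaster("yarn-client") \
        .set("spark.ui.port", "12345")

    # Create the Spark context from the configuration
    sc = SparkContext(conf=conf)

    # ... build and process RDDs using sc ...

    # Release cluster resources when done
    sc.stop()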