As part of this topic, we will explore YARN.
- In certification environments, Spark typically runs in YARN mode
- We should be able to check the memory configuration to understand the cluster capacity
- Spark default settings
  - Number of executors – 2
  - Executor memory – 1 GB
- Quite often we underutilize resources. By understanding the memory settings thoroughly and matching them to the size of the data we are processing, we can speed up job execution.
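
To make the capacity reasoning above concrete, here is a rough sketch that estimates how many executors a cluster could hold. The cluster size (5 nodes with 24 GB each available to YARN) and the function names are hypothetical, chosen only for illustration. The sketch assumes Spark's default executor memory overhead (the larger of 384 MB and 10% of executor memory) and assumes YARN rounds each container request up to a multiple of 1024 MB. It ignores the application master container and CPU limits, so treat the numbers as an approximation.

```python
import math


def container_mb(executor_memory_mb, overhead_factor=0.10,
                 min_overhead_mb=384, yarn_increment_mb=1024):
    """Approximate size of the YARN container requested for one executor.

    Assumptions: overhead is max(384 MB, 10% of executor memory), and YARN
    rounds the request up to a multiple of yarn_increment_mb.
    """
    overhead = max(min_overhead_mb, int(executor_memory_mb * overhead_factor))
    requested = executor_memory_mb + overhead
    return math.ceil(requested / yarn_increment_mb) * yarn_increment_mb


def max_executors(node_memory_mb, num_nodes, executor_memory_mb):
    """Upper bound on executors, limited by memory only (hypothetical helper)."""
    per_node = node_memory_mb // container_mb(executor_memory_mb)
    return per_node * num_nodes


if __name__ == "__main__":
    # Hypothetical cluster: 5 worker nodes, 24 GB per node available to YARN.
    node_mb, nodes = 24 * 1024, 5

    # Defaults: 2 executors with 1 GB each.
    default_used = 2 * container_mb(1024)
    print("Memory used by defaults (MB):", default_used)
    print("Share of cluster used: {:.1%}".format(default_used / (node_mb * nodes)))

    # How many executors would fit at different executor sizes?
    for mem in (1024, 2048, 4096):
        print(mem, "MB executors ->", max_executors(node_mb, nodes, mem))
```

On this hypothetical cluster, the default settings would occupy only a small fraction of the available memory, which is the underutilization the notes describe. The executor count and memory can then be raised, for example with the `--num-executors` and `--executor-memory` options of `spark-submit`, to match the data being processed.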