A few questions related to Spark and Scala


Hi Everyone,

I need your input. An answer to even a single question is appreciated.

  1. What is NameNode RPC?
  2. What is Livy Server?
  3. What is a DAG? Why do we require a DAG in Spark?
  4. Case class vs. normal class
  5. How do you decide which components have to be installed on which nodes?
  6. What is a Journal Node?




  1. DAG - Directed Acyclic Graph. As transformations are applied to an RDD, Spark records them as a lineage graph without running anything. When an action is invoked, the scheduler turns that lineage into a DAG of stages and tasks, evaluates it, and returns the result to the driver program. Unless an RDD is cached, each action evaluates its lineage again. Why do we need that? If an executor fails, Spark uses the lineage to recompute only the lost partitions, without affecting the work already completed on the healthy executors.

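  2. Case Class vs Normal Class.
    In my opinion, a case class is an easy way to create a hassle-free class. The compiler generates a factory method (apply) in the companion object, so instances are created without the new keyword, and the constructor parameters are exposed as read-only fields, so there are no getters to write. Because those fields are immutable by default, there are no setters either. The compiler also generates equals, hashCode, toString and copy, and case classes work with pattern matching. With a normal class you have to declare the fields as val or var and write those methods yourself.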
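
To make the case class point concrete, here is a minimal sketch. The names Employee, PlainEmployee and CaseClassSketch are made up for illustration and do not come from the question. It only shows the generated factory method, structural equality, copy and pattern matching next to a hand-written normal class.

```scala
// Hypothetical example types, not taken from the original question.

// Case class: the compiler generates apply, unapply, equals, hashCode, toString and copy.
case class Employee(name: String, dept: String)

// Normal class: fields must be marked `val` to be readable from outside,
// and equals/hashCode/toString keep their default (reference-based) behaviour.
class PlainEmployee(val name: String, val dept: String)

object CaseClassSketch {
  val a = Employee("Asha", "Data")          // factory method, no `new`
  val b = Employee("Asha", "Data")
  val moved = a.copy(dept = "Platform")     // new instance with one field changed

  val p1 = new PlainEmployee("Asha", "Data")
  val p2 = new PlainEmployee("Asha", "Data")

  val sameCase: Boolean = a == b            // structural equality
  val samePlain: Boolean = p1 == p2         // reference equality

  // Pattern matching relies on the generated unapply.
  val described: String = a match {
    case Employee(n, d) => s"$n works in $d"
  }

  def main(args: Array[String]): Unit = {
    println(a)          // prints "Employee(Asha,Data)"
    println(sameCase)   // prints "true"
    println(samePlain)  // prints "false"
    println(described)  // prints "Asha works in Data"
  }
}
```

If the plain class needed the same behaviour, equals, hashCode, toString and a copy-style method would each have to be written by hand.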
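
For the DAG answer in point 1, a small sketch may help show the lineage that Spark records before any action runs. It assumes Spark is on the classpath and runs in local mode. DagSketch, sumsByRemainder and the app name are hypothetical names chosen for this example. The reduceByKey step is included only because it introduces a shuffle, which is where the scheduler splits the DAG into stages.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object DagSketch {
  // Hypothetical helper: builds a lineage of transformations. Nothing runs yet,
  // because map and reduceByKey are lazy.
  def sumsByRemainder(spark: SparkSession): RDD[(Int, Int)] =
    spark.sparkContext
      .parallelize(1 to 10)
      .map(n => (n % 3, n))   // narrow transformation, stays in the same stage
      .reduceByKey(_ + _)     // shuffle, so a new stage starts here

  def main(args: Array[String]): Unit = {
    // Local master is an assumption made so the sketch runs on one machine.
    val spark = SparkSession.builder()
      .appName("dag-sketch")
      .master("local[*]")
      .getOrCreate()

    val sums = sumsByRemainder(spark)

    // Prints the recorded lineage; indentation marks the shuffle boundary.
    println(sums.toDebugString)

    // collect() is the action: only now is the DAG scheduled and executed.
    println(sums.collect().toMap)

    spark.stop()
  }
}
```

If a partition of the shuffled RDD were lost, Spark would use this same lineage to recompute just that partition rather than rerun the whole job.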