I noticed lots of people (myself included) have been posting requests for PySpark solutions. If anyone needs to validate their answers to arunsBlog and jaysQuestions, here are solutions for reference using the PySpark API.
I hope this helps. Please share if needed.
http://arun-teaches-u-tech.blogspot.com (some PySpark solutions are scattered across itversity threads)
jaysQuestions (others have posted some PySpark solutions in the thread too):
If you can solve these problems, you may be ready for CCA-175. Give it a shot!