CCA175: recap

I’m going to take the certification exam soon.
I would like to recap some of the things I’ve read about the exam:

The questions are typically simple.
The data model of the MySQL tables is similar to that of the VM, but with much more data.
The screen resolution is very small; this can be worked around with Ctrl +.
For Spark (both Scala and Python), you must fill in the blanks and run a .sh script associated with a .scala or .py file.
The usual mix of questions is: 3 Hive, 2 Sqoop (1 export, 1 import), 1 Avro, 2 Scala, 2 Python.

Some doubts:

  1. For Hive, is only the creation of DDL required (partitioned tables, tables from Avro files, …), or also the execution of queries?
  2. For Hive, might the use of regex functions be required?
  3. Has anyone had questions on Flume?
  4. Are connection strings and hosts provided?



Ans: Depending on the question, you need to create Hive DDL statements, execute them, and verify the output.
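
As a rough illustration of the kind of DDL mentioned in the question (partitioned, Avro-backed), here is a minimal Python sketch that only assembles the HiveQL text. The table name, columns, partition column, and HDFS location are all hypothetical, and `STORED AS AVRO` assumes a Hive version that supports that shorthand (0.14 or later). It is not taken from the exam.

```python
def build_partitioned_avro_ddl(table, columns, partition_columns, location):
    """Return a HiveQL CREATE EXTERNAL TABLE statement for a partitioned Avro table."""
    # Each column is a (name, hive_type) pair, rendered as "name TYPE".
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    parts = ", ".join(f"{name} {hive_type}" for name, hive_type in partition_columns)
    return (
        f"CREATE EXTERNAL TABLE {table} ({cols})\n"
        f"PARTITIONED BY ({parts})\n"
        f"STORED AS AVRO\n"
        f"LOCATION '{location}'"
    )


# Hypothetical names, for illustration only.
ddl = build_partitioned_avro_ddl(
    table="orders_part",
    columns=[("order_id", "INT"), ("order_status", "STRING")],
    partition_columns=[("order_month", "STRING")],
    location="/user/cloudera/orders_part",
)
print(ddl)
```

The printed statement could then be run from the Hive shell, followed by a simple SELECT or SHOW PARTITIONS to verify the output.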

Ans: Regex functions are not in scope.

Ans: So far no one has reported getting Flume questions, but to prepare for the worst case, please be ready for Flume as well.

Ans: Yes, for MySQL they will provide connection parameters.
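
To show where the provided connection parameters would go, here is a small Python sketch that assembles Sqoop import and export command lines without running them. The host, database, user, password, table names, and HDFS paths are placeholders I made up; only the Sqoop options themselves (`--connect`, `--username`, `--password`, `--table`, `--target-dir`, `--export-dir`, `--as-avrodatafile`) are standard.

```python
import shlex


def sqoop_base(tool, host, database, user, password, port=3306):
    """Return the common part of a Sqoop command for a MySQL source."""
    return [
        "sqoop", tool,
        "--connect", f"jdbc:mysql://{host}:{port}/{database}",
        "--username", user,
        "--password", password,
    ]


# Placeholder connection parameters; the real ones come with the question.
conn = dict(host="db.example.com", database="retail_db",
            user="exam_user", password="exam_password")

# Import a MySQL table into HDFS as Avro data files.
import_cmd = sqoop_base("import", **conn) + [
    "--table", "orders",
    "--target-dir", "/user/cloudera/orders_avro",
    "--as-avrodatafile",
]

# Export an HDFS directory back into an existing MySQL table.
export_cmd = sqoop_base("export", **conn) + [
    "--table", "order_summary",
    "--export-dir", "/user/cloudera/order_summary",
]

for cmd in (import_cmd, export_cmd):
    # Quote each argument so the line can be pasted into a shell.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building the command as a list keeps each option and its value paired, which makes it easy to swap in the parameters given in the question.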