Hive is very slow in Cloudera CDH 5.8

I have started using CDH 5.8 in Oracle VirtualBox and have allocated 10240 MB of my 16384 MB of memory to the VM. In spite of this, CDH feels very slow. When I open the Hive shell and run "show databases", it takes a long time to display results. When I run a create table command, it never completes; it just hangs and I have to forcefully stop it. I also have CDH 4.7 in the same VirtualBox, and it runs very fast. Is there any specific setting I need for CDH 5.x versions so that the execution environment is fast?

Yes, that is typical behavior on virtual machines.

@anveshan_reddy How many processors have been allocated to the VM?

I have allocated 2 processors.

Try stopping, from Cloudera Manager, whichever services you do not need.


Hi Anveshan - I am using CDH 5.8 on VirtualBox, but the services are not coming up. I have given the VM 2 CPUs and 8 GB RAM (out of 16 GB on the Windows host).

I am getting the error "Failed to start Hadoop namenode. Return value: 1 " when I try to start the HDFS service.

Can you guide me?

Since your computer has 16 GB RAM, you can try giving the VM more RAM, maybe 10 GB or more if possible.

I have the same issue. I have allocated 4 processors and 10240 MB of memory, the same as demonstrated in your videos. In spite of that, Hive queries are very slow, even simple ones such as “show databases” or “show tables”. However, when I run the same queries in spark-shell/pyspark using HiveContext, they return immediately, which is really strange. Can you think of anything that would explain this?

Typical timing using Hive / Beeline is around 60-68 seconds.
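To make timings like the one above easier to compare between the Hive shell, Beeline, and spark-shell, the wall-clock time of a single command can be measured with a small script. This is only a minimal sketch: `time_command` is a hypothetical helper, not part of CDH, and it measures the whole process, including JVM and session startup, not just the query. It sticks to older Python features in case the VM does not have a recent interpreter.

```python
import os
import subprocess
import sys
import time


def time_command(argv):
    """Run argv once and return (elapsed_seconds, exit_code).

    Output is discarded so that printing results does not skew the timing.
    """
    with open(os.devnull, "w") as devnull:
        start = time.time()
        exit_code = subprocess.call(argv, stdout=devnull, stderr=devnull)
        elapsed = time.time() - start
    return elapsed, exit_code


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is treated as the command to time.
    elapsed, exit_code = time_command(sys.argv[1:])
    print("exit code %d after %.1f seconds" % (exit_code, elapsed))
```

Saved under an assumed name such as `time_cmd.py`, it could be run as `python time_cmd.py beeline -u jdbc:hive2://localhost:10000 -e "show databases;"`. The JDBC URL here is an assumption (a HiveServer2 on the local machine at its default port) and may differ on your VM.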

I am still facing this issue.

That’s the problem with CDH 5.8: it needs 16 GB RAM because of Cloudera Manager. As mentioned by Durga sir, CDH will be slow in a VM. An expert once mentioned that Hadoop runs best on a bare-metal server rather than a virtual machine (the reason given was that a VM uses a virtual kernel, not the host system kernel). That’s why Hadoop people prefer cloud setups (AWS, Azure, etc.) or bare-metal setups.

I would suggest two options: use the Hortonworks distribution instead of CDH, or use Itversity labs or some other Hadoop labs available in the market.
