Vectorization in Hive


#1

Hi all,
I want to learn about vectorization in Hive and how it helps improve performance.
Could you suggest a good link or video to follow?

Thanks in advance,
Aparna


#2

@AparnaSen,

Vectorization is an efficient way to run computations in big data and distributed systems. Normally, Hive processes data one row at a time; with vectorization, Hive processes a batch of rows (1024 by default) in a single operation instead of handling each row separately.

Please have an in-depth read here:
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_data-access/content/query-vectorization.html


#3

Hi Ravi,
Thank you so much for your reply.
My question is: is enabling vectorization enough to improve performance, or do we need additional coding as well?

Thanks in advance,
Aparna


#4

@AparnaSen,

Enabling vectorization is just a configuration parameter in Hive (hive.vectorized.execution.enabled) that speeds up query execution; no additional coding is needed.
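
A minimal sketch of turning it on for a session is shown here. The table name sales_orc and its columns are hypothetical, and the sketch assumes the data is stored as ORC, which vectorized execution has traditionally required (newer Hive releases extend it to some other formats):

```sql
-- Enable vectorized execution for the current session.
SET hive.vectorized.execution.enabled = true;

-- Hypothetical ORC-backed table used only for illustration.
CREATE TABLE sales_orc (id INT, amount DOUBLE) STORED AS ORC;

-- Inspect the plan: it should indicate whether a stage runs in
-- vectorized mode (the exact wording varies by Hive version).
EXPLAIN
SELECT COUNT(*), SUM(amount)
FROM sales_orc
WHERE amount > 100;
```

The query itself does not change. If the plan does not report vectorized execution, the usual suspects are the storage format or an unsupported data type or function in the query.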


#5

Thanks a lot Ravi... :slight_smile: