Query Example - Word Count

Let us see how we can perform word count using Hive QL.

  • Create a table named wordcount with a single STRING column.
  • Insert a few lines of sample text into the table.

CREATE TABLE wordcount (s STRING);

INSERT INTO wordcount VALUES
  ('Hello World'),
  ('How are you'),
  ('Let us perform the word count'),
  ('The definition of word count is'),
  ('to get the count of each word from this data');
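
Before moving on, it can help to confirm that the rows were loaded. The following is a minimal check, assuming the table was created and populated exactly as above; the alias line_count is only an illustrative name.

```sql
-- Show the raw lines stored in the table
SELECT * FROM wordcount;

-- Count the lines; the INSERT above adds five rows
SELECT count(1) AS line_count FROM wordcount;
```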

Now let us develop the logic to get the word count.

  • Split each line into an array of words using split.
  • Explode each array into one record per word using explode.

SELECT split(s, ' ') FROM wordcount;
SELECT explode(split(s, ' ')) FROM wordcount;
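
To see what the two functions do in isolation, they can be tried on a literal first. This is a small sketch rather than part of the original steps: the aliases words and word are illustrative, and the last query assumes you want runs of whitespace treated as a single separator (the second argument of split is a regular expression).

```sql
-- split on a literal returns a single array value
SELECT split('Hello World', ' ') AS words;

-- explode on the same array returns one row per element
SELECT explode(split('Hello World', ' ')) AS word;

-- Assumption: the data may contain repeated spaces or tabs.
-- Splitting on the regex \s+ avoids empty-string "words" in that case.
SELECT explode(split(s, '\\s+')) AS word FROM wordcount;
```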

Let us come up with the query to get word count.

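  • explode is a table-generating function, and Hive does not allow it to be combined with GROUP BY in the same SELECT. So we explode the words in a subquery and count them in the outer query.
  • We will cover queries and nested queries in more detail later.
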
SELECT word, count(1) FROM (
  SELECT explode(split(s, ' ')) AS word FROM wordcount
) q
GROUP BY word;
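
The same result can also be obtained without a subquery by using LATERAL VIEW. The sketch below is a variation on the query above, not a replacement for it: the names exploded and word_count are illustrative, and lower() and ORDER BY are additions. They are included because the sample data contains both 'The' and 'the', which the original query counts as two different words (1 and 2 occurrences); after lowercasing they are counted together as 3.

```sql
-- LATERAL VIEW joins each input row with the rows produced by explode
SELECT word, count(1) AS word_count
FROM wordcount
LATERAL VIEW explode(split(lower(s), ' ')) exploded AS word
GROUP BY word
-- Most frequent words first, ties broken alphabetically
ORDER BY word_count DESC, word;
```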
