Big Data


Apache HBase: This category is for discussing HBase.
Apache NiFi
Apache Hadoop: This is to discuss all the topics around Hadoop core components such as
Workshop Exercises: This category is for creating exercises that are part of live training sessions.
Apache Pig: This is all about Apache Pig, a data flow language used to process both structured and unstructured data.
Virtual Machines: This category is to discuss all issues related to virtual machine images for big data.
Administration: This is to discuss all about Big Data administration.
CCA 131 - Cloudera Certified Associate - Admin
Apache Flume: This is the topic to track issues with respect to Flume.
cca131
Apache Sqoop: This is to discuss all about Apache Sqoop, which is used to export and import data between relational databases and Hadoop.
Apache Hive: Let us start discussing all topics with respect to Apache Hive.
Apache Kafka: This category is to discuss all about Kafka.
Apache Spark: This subcategory of big data is all about discussing Apache Spark.
Topic Replies Created
About the Big Data category 1 November 6, 2016
Unable to use Spark2X in lab 2 July 18, 2019
Sqoop import not working 5 July 16, 2019
Getting error when trying below code on labs 1 July 17, 2019
How do I use the latest Python, say 3.6, in pyspark version 2+ on the lab 4 July 12, 2019
Exercise 13 - Sqoop import and export 8 February 14, 2017
Not enough replicas available for query at consistency LOCAL_QUORUM (1 required but only 0 alive) 1 July 16, 2019
Getting error Exception in thread "main" java.lang.NoSuchMethodException 1 July 16, 2019
Install CM and CDH - Setup CM, Install CDH and Setup Cloudera Management Service - Install CM and CDH on all nodes 4 June 4, 2019
Bigdata CCA 131 EXAM LAB 2 July 14, 2019
Kafka connectors | unable to run connect-standalone.sh 2 July 15, 2019
Kafka connect is failing while trying to flush the message to consumer 7 July 5, 2018
Load dat file into hive table 2 July 13, 2019
Why use pyspark.sql.Row()? 1 July 15, 2019
Load .DAT file into HDFS 8 November 15, 2018
Pyspark zip two RDDs 4 May 12, 2019
Unable to process spark job on lab 2 July 14, 2019
Bucketing with Sorting 1 July 14, 2019
Inserting Data Into Bucketed Tables 1 July 14, 2019
Creating Bucketed Tables 1 July 14, 2019
My solutions in pyspark to Arun's Blog questions 4 July 11, 2019
How to copy data from a remote server to HDFS when the file size is in 1000 GBs, so scp will not work? What are the other options? 2 January 25, 2019
Truncating Tables in Hive 1 July 12, 2019
Dropping Tables and Databases in Hive 1 July 12, 2019
Overview of File Formats - STORED AS Clause 1 July 12, 2019
Managed Tables vs. External Tables 1 July 12, 2019
Running Hive Queries using Beeline 1 July 12, 2019
Overview of beeline 1 July 12, 2019
Role of Hive Metastore 1 July 12, 2019
Retrieve metadata of Hive Tables 1 July 12, 2019