Big Data

Apache NiFi
Apache HBase: This category is for discussing HBase.
Workshop Exercises: This category holds exercises that are part of live training sessions.
Administration: This category is for discussing Big Data administration.
Apache Hadoop: This category is for discussing all topics around Hadoop core components such as
Apache Kafka: This category is for discussing Kafka.
Apache Pig: This category is about Apache Pig, a data flow language used to process both structured and unstructured data.
Apache Flume: This category tracks issues with Flume.
Virtual Machines: This category is for discussing issues related to virtual machine images for big data.
CCA 131 - Cloudera Certified Associate - Admin
Apache Sqoop: This category is for discussing Apache Sqoop, which is used to export and import data between relational databases and Hadoop.
Apache Hive: This category is for discussing all topics related to Apache Hive.
Apache Spark: This subcategory of Big Data is for discussing Apache Spark.
Topic Replies Created
About the Big Data category 1 November 6, 2016
Need help while running pyspark in yarn mode 1 May 23, 2019
Metrics jar file not found 1 May 23, 2019
No Sqoop client: how do I install the Sqoop client tool in my lab environment 2 May 22, 2019
Impact of having different number of rows in files after import 2 June 13, 2018
CCA 175 Cleared on 18 May, 2019 4 May 19, 2019
Pyspark handling null values 3 May 14, 2019
Writing JSON file in compressed format 5 May 15, 2019
Integration of Talend with Hadoop Cluster 6 July 25, 2017
Flume-ng agent is now working 4 May 13, 2019
How many vCores allocated for Tasks within the Executors? 2 July 22, 2018
Unable to push/pull message to kafka topic in lab from IDE installed local 8 February 28, 2019
Pyspark save text file for each group and the name of the file should reflect the group name (salary) 2 May 12, 2019
Unable to run flume job for HDFS Sink for Webserver Logs 4 May 11, 2019
Pyspark zip two RDDs 1 May 12, 2019
How to split when the primary key is alphanumeric 1 May 12, 2019
Apache Spark GCP cluster Integrate with local machine Pycharm 2 May 12, 2019
Unable to LOAD DATA from hdfs://host/dir/file.txt because Impala does not have WRITE permissions on its parent directory hdfs://host/dir 3 September 26, 2018
Unable to create hive database from pyspark 6 May 6, 2019
Spark reading data from csv shows no values? 1 May 10, 2019
Use Pyspark in jupyter notebook 6 June 26, 2018
How to pass date/timestamp as lowerBound/upperBound in spark-sql-2.4.1v? 1 May 8, 2019
Hdfs read error using spark 2 May 7, 2019
While submitting the Spark job in cmd on my desktop, getting the following error 5 April 20, 2019
Set up problem in command line prompt 2 May 6, 2019
Error creating hive table using avro schema 6 August 1, 2017
Prepare HR Database with EMPLOYEES Table 1 May 3, 2019
Perform aggregations using Windowing Functions 1 May 3, 2019
Lambda function giving error 3 May 2, 2019
Spark SQL error while partitioning data based on date 1 May 3, 2019