Big Data

Apache NiFi
Apache HBase: This category is for further discussion of HBase.
Workshop Exercises: This category is for creating exercises that are part of live training sessions.
Apache Hadoop: This is to discuss all the topics around Hadoop core components such as
Administration: This is to discuss all aspects of Big Data administration.
Apache Pig: This is all about Apache Pig - a data flow language used to process both structured and unstructured data.
Virtual Machines: This category is to discuss all issues related to virtual machine images for big data.
CCA 131 - Cloudera Certified Associate - Admin
Apache Spark: This subcategory of big data is all about discussing Apache Spark.
Apache Sqoop: This is to discuss all aspects of Apache Sqoop, which is used to export and import data between relational databases and Hadoop.
Apache Flume: This is the topic to track issues with respect to Flume.
Apache Hive: Let us start discussing all topics with respect to Apache Hive.
cca131
Apache Kafka: This category is to discuss all aspects of Kafka.
Topic Replies Activity
About the Big Data category 1 November 6, 2016
Difference launching pyspark with and without conf parameter? 1 January 24, 2020
Finished executors containing jar files 1 January 24, 2020
Not able to create hdfs user home directory 2 January 22, 2020
How to find the failed tables and resume sqoop --import-all of 100 tables 1 January 3, 2020
Join in Rdd ValueError 1 January 21, 2020
Sqoop Compress/ UnCompress 1 January 21, 2020
No such file or directory 6 January 18, 2020
Sqoop Uncompress gzip not working 1 January 17, 2020
Pyspark, input path does not exist while creating the RDD from HDFS location 1 January 17, 2020
Bash: command not found 9 January 13, 2020
Not able to initialize spark context 4 January 13, 2020
AWS EMR - Submitting Spark Jobs 1 December 23, 2019
Cannot import SparkSession in spark-submit 7 January 10, 2020
Hive import is failing when used --as-parquetfile 1 January 10, 2020
Syntax error while running a pyspark program 1 January 9, 2020
Cloudera client installed failed on all nodes 1 January 8, 2020
Dataset for Udemy Course for local set up 2 January 8, 2020
self-training from scratch in big data to get cca 175 1 January 8, 2020
Tab Autocomplete does not work in pyspark shell 6 January 6, 2020
Json Nested Struct File 1 January 6, 2020
Exception in thread "main" org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'retail_db' not found; 2 January 6, 2020
Load 100gb file to db 1 January 5, 2020
Appears flume is not running correctly 4 January 5, 2020
No log4j2 configuration file found while running agent 2 January 5, 2020
'DataFrameReader' object has no attribute 'csv' 1 January 4, 2020
'path does not exist' error - when I invoked pyspark from python3 and tried to create dataframe 1 December 30, 2019
Cluster Nodes explanation 1 December 9, 2019
Unable to create dataframe using local file 5 January 5, 2020
Apache Spark - How to determine executors for a given Spark Job? 2 January 3, 2020