Big Data

Apache Hadoop This is to discuss all the topics around Hadoop core components such as
Apache Flume This is the topic to track the issues with respect to Flume.
Apache Spark This subcategory of big data is all about discussing Apache Spark
Apache Pig This is all about Apache Pig - a data flow language used to process both structured and unstructured data.
Apache HBase This category is to discuss more about HBase.
Virtual Machines This category is to discuss all the issues related to virtual machine images for big data
Apache Sqoop This is to discuss all about Apache Sqoop which is used to export and import data between relational databases and Hadoop
Workshop Exercises This category is for exercises that are part of live training sessions.
Apache Hive Let us start discussing all topics with respect to Apache Hive.
Apache Kafka This category is to discuss all about Kafka
Administration This is to discuss all about Big Data administration

About the Big Data category [Big Data] (1)
Trying to import MySQL data using Sqoop. Please let me know the credentials for MySQL [Big Data] (4)
Not able to launch sqoop [Apache Sqoop] (3)
Nameserver problem [Big Data] (1)
Spark-shell is not launching [Apache Spark] (1)
Where to find subtractByKey in RDD [Apache Spark] (1)
Flume error : Telnet Connection refused [Apache Flume] (4)
Is there any way to run a .hql or .hive file using Pyspark? If there is a way, could anyone help me with how to do that? I know that with spark-sql we can execute .hql files by using the -i or -f options [Apache Spark] (1)
Error when import com.databricks.spark.avro._ [Apache Spark] (5)
Hadoop fs -ls /user/cloudera - command not working [Virtual Machines] (7)
ERROR: Not able to read sequence file through scala [Apache Spark] (11)
Unable to avro data in spark shell [Apache Spark] (4)
Sqoop Password file error [Apache Sqoop] (4)
How to create parquet files out of orc tables data? [Big Data] (2)
Mechanism behind -copyFromLocal command [Apache Hadoop] (1)
Hive tables Creation [Apache Hadoop] (4)
Big Data Development Lifecycle [Big Data] (1)
Snappy/Gzip compression on ORC files using Scala [Big Data] (6)
DAG failed due to vertex failure [Big Data] (1) [Apache Hadoop] (1)
Compression Codecs list [Apache Hadoop] (1)
Error while creating external table in hive [Apache Hive] (2)
Ranking per key - Example 1 [Apache Spark] (1)
Oozie on YARN vs MR [Apache Hadoop] (1)
Sqoop import error - importing to already existing db [Apache Sqoop] (4)
Umask for hdfs directory [Apache Hadoop] (1)
Spark Streaming joining DStream with RDD [Apache Spark] (5)
Spark-shell --master yarn failing [Apache Spark] (2)
Pyspark - Windows 8 - “cmd not recognized as an internal or external command” in Windows [Apache Spark] (5)