Urgent: Unable to read sequence file even though the class is defined explicitly, thank you very much


#1

Here is the script:

case class Orders(order_id: Int, order_date: String, order_customer_id: Int, order_status: String)

val seqRDD=sc.sequenceFile("orders03132_seq",classOf[org.apache.hadoop.io.Text],classOf[org.apache.hadoop.io.Text])

Here is the error:

18/03/20 07:37:04 ERROR Executor: Exception in task 0.0 in stage 16.0 (TID 111)
java.lang.RuntimeException: java.io.IOException: WritableName can't load class: orders


#2

Now I know why it is throwing the error: to read a sequence file, the types of both the key and the value are needed, and in this example the sequence file appears to contain only one type.

But this sequence file was generated using Sqoop, so why is only one type seen in the file?

sqoop import -m 1 \
--connect=jdbc:mysql://ms.itversity.com/retail_db \
--username=retail_user \
--password=itversity \
--table=orders \
--as-sequencefile \
--target-dir=order20180320_seq

[paslechoix@gw03 ~]$ hdfs dfs -cat order20180320_seq/part-m-00000 |head
SEQ!org.apache.hadoop.io.LongWritableorders7▒▒P▒ U3▒3▒$@▒▒-OCLOSED@▒▒PENDING_PAYMENT@▒▒/COMPLETE@▒▒"{CLOSED@▒▒,COMPLETE@▒COMPLETE@▒▒COMPLET@▒▒
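
For reference, the header in the output above can be decoded by hand. A sequence file starts with the magic bytes SEQ and a version byte, followed by the key class name and then the value class name, each stored as a length prefix plus UTF-8 text. The `!` right after SEQ is byte 0x21 (33), which is exactly the length of org.apache.hadoop.io.LongWritable, and the `orders` that follows is most likely the value class name, since Sqoop generates a record class named after the table. The Scala sketch below parses such a header. SeqHeader, Header and parse are hypothetical names, the sample bytes are rebuilt by hand rather than read from HDFS, the version byte 6 is an assumption, and only single-byte lengths (class names under 128 bytes) are handled.

```scala
import java.nio.charset.StandardCharsets

// Sketch: pull the key and value class names out of a SequenceFile header.
// Assumption: each class name is shorter than 128 bytes, so its length fits in one byte.
object SeqHeader {
  final case class Header(version: Int, keyClass: String, valueClass: String)

  def parse(bytes: Array[Byte]): Header = {
    require(
      bytes.length >= 4 && new String(bytes, 0, 3, StandardCharsets.US_ASCII) == "SEQ",
      "not a SequenceFile: missing SEQ magic")
    val version = bytes(3).toInt
    val (keyClass, afterKey) = readName(bytes, 4)
    val (valueClass, _) = readName(bytes, afterKey)
    Header(version, keyClass, valueClass)
  }

  // Reads one length-prefixed UTF-8 string; returns it with the offset of the next field.
  private def readName(bytes: Array[Byte], offset: Int): (String, Int) = {
    val len = bytes(offset).toInt
    require(len >= 0, "multi-byte lengths are not handled in this sketch")
    (new String(bytes, offset + 1, len, StandardCharsets.UTF_8), offset + 1 + len)
  }
}

// Header bytes rebuilt by hand from the hdfs dfs -cat output (version byte 6 is assumed).
val keyName = "org.apache.hadoop.io.LongWritable"
val valueName = "orders"
val sample: Array[Byte] =
  "SEQ".getBytes(StandardCharsets.US_ASCII) ++
    Array[Byte](6.toByte, keyName.length.toByte) ++
    keyName.getBytes(StandardCharsets.UTF_8) ++
    Array[Byte](valueName.length.toByte) ++
    valueName.getBytes(StandardCharsets.UTF_8)

val header = SeqHeader.parse(sample)
println(s"key=${header.keyClass} value=${header.valueClass}")
```

If this reading is right, the file does carry two types, and the WritableName error would mean that the `orders` value class is not on the Spark classpath, rather than that a type is missing from the file.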


Why does the command work elsewhere but not in the paid lab environment? Can anyone explain?