How to read an ORC File using Spark Core APIs

Hello Everyone,

Has anyone tried to access ORC files using the Spark Core APIs? I see that it can be done with Spark SQL, but I am trying to do it with the Spark Core APIs. If anyone has tried this, please let me know.

@email2dgk Here is one approach, using a DataFrame.


@Rahul Thanks for sharing. I am not interested in doing that with Spark SQL (DataFrame).

I could not find any other approach.

I also tried, but could not find any direct way to read ORC using an RDD.

Here is what I found. I didn't test it, so if you get time, please check whether it works.

SparkContext provides an option to read a file through the Hadoop file API (hadoopFile), like below:

sparkContext.hadoopFile(String path, Class<? extends org.apache.hadoop.mapred.InputFormat<K,V>> inputFormatClass, Class<K> keyClass, Class<V> valueClass, int minPartitions)

You can specify
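As a rough illustration of how the hadoopFile call above might be filled in for ORC, here is a minimal, untested sketch in Java. It assumes Hive's OrcInputFormat and OrcStruct (from hive-exec) are on the classpath and that the input format yields NullWritable keys and OrcStruct values. The class name ReadOrcWithRdd, the default path /tmp/example.orc, and the local[*] master are placeholders for illustration only, not anything confirmed in this thread.

```java
import org.apache.hadoop.hive.ql.io.orc.OrcInputFormat;
import org.apache.hadoop.hive.ql.io.orc.OrcStruct;
import org.apache.hadoop.io.NullWritable;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class ReadOrcWithRdd {
    public static void main(String[] args) {
        // Hypothetical input path; pass a real ORC file or directory as the first argument.
        String path = args.length > 0 ? args[0] : "/tmp/example.orc";

        // Assumption: local mode, just for trying this out.
        SparkConf conf = new SparkConf()
                .setAppName("ReadOrcWithRdd")
                .setMaster("local[*]");

        try (JavaSparkContext jsc = new JavaSparkContext(conf)) {
            // Assumption: Hive's OrcInputFormat is an
            // org.apache.hadoop.mapred.InputFormat<NullWritable, OrcStruct>,
            // so it fits the hadoopFile signature quoted above.
            JavaPairRDD<NullWritable, OrcStruct> records = jsc.hadoopFile(
                    path,
                    OrcInputFormat.class,
                    NullWritable.class,
                    OrcStruct.class,
                    1);

            // Hadoop record readers may reuse the same value object, so turn each
            // row into a String before collecting anything to the driver.
            JavaRDD<String> rows = records.values().map(OrcStruct::toString);

            // Print the first few rows.
            rows.take(10).forEach(System.out::println);
        }
    }
}
```

The keys carry no data here, so only the values are kept. Pulling individual fields out of each OrcStruct would need the ORC object inspector for the file's schema, which this sketch does not attempt.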

Suresh Selvaraj