Problem statement confusion

apache-spark

#1

Below is part of the problem statement.

I am confused about the meaning of the line quoted below:

HDFS location: /public/nyse_symbols
First line is header and it should be included
Instructions
Get the name of stocks displayed along with other information

Data Description
NYSE data with “,” as delimiter is available in HDFS

NYSE data information:

HDFS location: /public/nyse
There is no header in the data
NYSE Symbols data with tab character (\t) as delimiter is available in HDFS

NYSE Symbols data information:

HDFS location: /public/nyse_symbols
First line is header and it should be included


#2

nyse_symbols is the file name; it is located in HDFS in the “/public/” folder. The file has a header, which means the first row of the file contains the column names.
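
To make the header point concrete, here is a small plain-Python sketch (no Spark needed) of reading tab-delimited text whose first line is a header. The sample rows and the column names `Symbol` and `Description` are made up for illustration, since the actual contents of /public/nyse_symbols are not shown in this thread, and `read_with_header` is a hypothetical helper name.

```python
import csv
import io

# Made-up sample standing in for the nyse_symbols data; the real column
# names and rows are not shown in this thread.
SAMPLE = "Symbol\tDescription\nAAA\tExample Company A\nBBB\tExample Company B\n"


def read_with_header(text, delimiter="\t"):
    """Treat the first line as column names and return (header, rows)."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)  # the first line holds the column names
    # Every remaining line becomes a dict keyed by those column names.
    rows = [dict(zip(header, fields)) for fields in reader]
    return header, rows


header, rows = read_with_header(SAMPLE)
print(header)
for row in rows:
    print(row)
```

In Spark the same idea is usually expressed through the DataFrame CSV reader options, e.g. `header` set to `true` and `sep` set to a tab, so the first line is used for column names instead of being read as a data row.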


#3

Hi Vinod,

I am confused about the second line, i.e. “First line is header and it should be included”. In fact, I have already solved the problem as well.