Regular expression for dates using pyspark (CCA175)

pyspark

#1

Hello guys,
I am looking for a solution to this problem. Please help.
ABCTECH.com has run a survey on feedback for their exam products using a web-based form, with the following free-text fields as input in the web UI:
Name: String, Subscription Date: String, Rating: String
The survey data has been saved in a file called spark9/feedback.txt:
Christopher|Jan 11, 2015|5
Kapil|11 Jan, 2015|5
Thomas|6/17/2014|5
John|22-08-2013|5
Mithun|2013|5
Jitendra||5
Write a Spark program using regular expressions that separates the records with valid dates from the rest and saves them in two separate files (good records and bad records).


#2

Can you let me know what the valid date formats are in this case? Is it:
dd mon,yyyy
mm/dd/yyyy
dd-mm-yyyy

  1. re.findall(r'\d\d\s(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec),\ \d{4}', date)
  2. re.findall(r'\d\d[$-/:-?{-~!"^_\[\]]\d\d[$-/:-?{-~!"^_\[\]]\d{4}', date)

Try these and let me know.
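
For what it's worth, here is a minimal sketch of the good/bad split in plain Python. It assumes that only the three layouts asked about in this thread (dd Mon, yyyy; mm/dd/yyyy; dd-mm-yyyy) count as valid, which the original question does not actually confirm. The names `VALID_DATE_PATTERNS` and `is_valid_record` are made up for illustration, and the patterns only check the shape of the date, not whether the day or month is in range.

```python
import re

MONTHS = r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)"

# Assumption: only these three layouts are treated as valid dates.
VALID_DATE_PATTERNS = [
    re.compile(r"\d{1,2} " + MONTHS + r", \d{4}"),  # dd Mon, yyyy
    re.compile(r"\d{1,2}/\d{1,2}/\d{4}"),           # mm/dd/yyyy
    re.compile(r"\d{1,2}-\d{1,2}-\d{4}"),           # dd-mm-yyyy
]


def is_valid_record(line):
    """Return True when the second pipe-delimited field looks like a valid date."""
    fields = line.split("|")
    if len(fields) != 3:
        return False
    date = fields[1].strip()
    # fullmatch rejects partial hits such as a bare year or an empty field.
    return any(p.fullmatch(date) for p in VALID_DATE_PATTERNS)


# The sample rows from the question.
sample = [
    "Christopher|Jan 11, 2015|5",
    "Kapil|11 Jan, 2015|5",
    "Thomas|6/17/2014|5",
    "John|22-08-2013|5",
    "Mithun|2013|5",
    "Jitendra||5",
]

good = [rec for rec in sample if is_valid_record(rec)]
bad = [rec for rec in sample if not is_valid_record(rec)]

print("good:", good)
print("bad:", bad)
```

Under that assumption the Christopher row (Jan 11, 2015) ends up in the bad records; if "Mon dd, yyyy" should count as valid too, add one more pattern to the list. In PySpark the same predicate can be passed to `RDD.filter`: read the file with `sc.textFile("spark9/feedback.txt")`, filter once with `is_valid_record` and once with its negation, and write each result with `saveAsTextFile` to an output directory of your choice.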