Median of Medians (PIVOT Style) using Spark API

Hello Friends,

I am struggling to find the median of medians. I have the following data, and I want to find the median of each of the 1st, 2nd and 3rd lines, which gives 3 numbers (one per line), and then the median of those 3 numbers:

324 22 123 234 322 32 2 444
420 212 224 22
134 44 1 2 1 233 134

Note: this is how I compute the median, taking the 1st line as an example: 324 22 123 234 322 32 2 444

  1. Split the line by space and convert Array[String] to Array[Int]: 324 22 123 234 322 32 2 444
  2. Sort the complete list: 2 22 32 123 234 322 324 444
  3. This group has an even count (8), so there are 2 middle numbers, 123 and 234, and the median is their average: (123+234)/2 = 357/2 = 178.5. For a line with an odd count, the median is simply the single middle number.
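
The three steps above can be sketched in plain Scala as follows. This is only an illustration of the per-line calculation; the names LineMedian, median and lineMedian are made up for this sketch and do not come from Spark.

```scala
object LineMedian {
  // Median of a non-empty sequence: the middle element for an odd count,
  // the average of the two middle elements for an even count.
  def median(values: Seq[Double]): Double = {
    require(values.nonEmpty, "median of an empty sequence is undefined")
    val sorted = values.sortWith(_ < _)   // step 2: sort the complete list
    val mid = sorted.length / 2
    if (sorted.length % 2 == 1) sorted(mid)
    else (sorted(mid - 1) + sorted(mid)) / 2.0   // step 3: average the 2 middle numbers
  }

  // Step 1: split by whitespace and convert Array[String] to numbers,
  // then take the median of the line.
  def lineMedian(line: String): Double =
    median(line.trim.split("\\s+").map(_.toInt.toDouble).toSeq)
}
```

For the three lines in the question, LineMedian.lineMedian returns 178.5, 218.0 and 44.0 respectively.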

Doing this for each line gives me 3 numbers, and then I have to calculate the median of these 3 numbers. That’s the final answer.
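
One possible way to express the whole calculation with the Spark RDD API is sketched below. It assumes a local SparkSession and that the number of lines is small enough to collect the per-line medians to the driver; the object name MedianOfMedians and the helper median are hypothetical, and the sample lines are hard-coded here in place of a real input file.

```scala
import org.apache.spark.sql.SparkSession

object MedianOfMedians {
  // Middle element for an odd count, average of the two middle elements for an even count.
  def median(values: Seq[Double]): Double = {
    require(values.nonEmpty, "median of an empty sequence is undefined")
    val sorted = values.sortWith(_ < _)
    val mid = sorted.length / 2
    if (sorted.length % 2 == 1) sorted(mid)
    else (sorted(mid - 1) + sorted(mid)) / 2.0
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, for illustration only.
    val spark = SparkSession.builder()
      .appName("MedianOfMedians")
      .master("local[*]")
      .getOrCreate()

    // The sample data from the question, one string per line.
    val lines = spark.sparkContext.parallelize(Seq(
      "324 22 123 234 322 32 2 444",
      "420 212 224 22",
      "134 44 1 2 1 233 134"
    ))

    // One median per line, computed on the executors.
    val lineMedians = lines.map { line =>
      median(line.trim.split("\\s+").map(_.toDouble).toSeq)
    }

    // Assumption: few enough lines that collecting the medians to the driver is fine.
    val result = median(lineMedians.collect().toSeq)
    println(result)   // 178.5 for the sample data (medians 178.5, 218.0, 44.0)

    spark.stop()
  }
}
```

If the number of lines were very large, collecting all per-line medians to the driver might not be practical, and the second median would need a distributed approach instead (for example, sorting the RDD of medians and looking up the middle position).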

Thank you!!