Thread 84 spilling sort data of 334.0 MiB to disk (3 times so far)

Hello Team

I am using a Spark Standalone cluster.

Currently, I am facing an issue when reading a large table from an Oracle database and writing it out in Parquet format, partitioned by month.
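
For reference, the read and write are roughly of the shape sketched below (the JDBC URL, table name, credentials, and the "month" column are placeholders, not my real values):

    import org.apache.spark.sql.SparkSession

    // Placeholder connection details and names, only to show the shape of the job.
    val spark = SparkSession.builder()
      .appName("oracle-to-parquet")
      .getOrCreate()

    val df = spark.read
      .format("jdbc")
      .option("url", "jdbc:oracle:thin:@//dbhost:1521/SERVICE")  // placeholder URL
      .option("dbtable", "SCHEMA.BIG_TABLE")                     // placeholder table
      .option("user", "user")
      .option("password", "password")
      .option("driver", "oracle.jdbc.OracleDriver")
      .load()

    df.write
      .mode("overwrite")
      .partitionBy("month")                // assumes a "month" column exists in the data
      .parquet("/data/big_table_parquet")  // placeholder output path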

I am seeing log messages like the one below, and the write is taking far too long:

UnsafeExternalSorter: Thread 84 spilling sort data of 334.0 MiB to disk (1 time so far)

Is there a better way to do this, or do I need to adjust the configuration?

It is doing a shuffle internally.
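
For example, I am not sure whether settings along these lines are the right ones to adjust; the JDBC partitioning options and the numbers below are only guesses, not what I currently run:

    // Guessed knobs only, reusing the SparkSession from the sketch above.
    // The JDBC partitioning options split the Oracle read across several
    // tasks instead of a single one; the values are placeholders.
    val partitionedRead = spark.read
      .format("jdbc")
      .option("url", "jdbc:oracle:thin:@//dbhost:1521/SERVICE")  // placeholder URL
      .option("dbtable", "SCHEMA.BIG_TABLE")                     // placeholder table
      .option("user", "user")
      .option("password", "password")
      .option("partitionColumn", "ID")   // assumes a numeric or date column to split on
      .option("lowerBound", "1")
      .option("upperBound", "100000000")
      .option("numPartitions", "32")
      .option("fetchsize", "10000")      // rows fetched per round trip from Oracle
      .load()

    // Shuffle setting that looks related to the spill messages.
    spark.conf.set("spark.sql.shuffle.partitions", "200")
    // Executor memory can only be set at submit time, e.g.:
    //   spark-submit --executor-memory 8g --conf spark.memory.fraction=0.6 ...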

Please help

