Spark SQL valid functions

spark-sql

#1

Hi, which functions can be used in Spark SQL? I noticed that some MySQL functions, e.g. ADDDATE, don't work in Spark SQL; instead there is pyspark.sql.functions.date_add. But then I found that pyspark.sql.functions.countDistinct works on a DataFrame but not in Spark SQL. I'm a bit confused now. Can someone help?

  1. orderdataframe.agg(F.countDistinct(odf.order_customer_id)).show() ==> works
    orderdataframe.registerTempTable("orders")

  2. sqlContext.sql("select countDistinct(order_customer_id) from orders").show() ==> throws an error

  3. sqlContext.sql("select approx_count_distinct(order_customer_id) from orders").show() ==> works, but the result differs from (1)

  4. s.sql("select count(distinct(order_customer_id)) from orders").show() ==> works