Hi, which functions can be used in Spark SQL? I noticed that some MySQL functions, e.g. ADDDATE, don't work in Spark SQL; instead there is pyspark.sql.functions.date_add. But then I found that pyspark.sql.functions.countDistinct works on a DataFrame but not in Spark SQL. I'm a bit confused now. Can someone help?
orderdataframe.agg(F.countDistinct(odf.order_customer_id)).show() ==> works
sqlContext.sql("select countDistinct(order_customer_id) from orders").show() ==> throws error
sqlContext.sql("select approx_count_distinct(order_customer_id) from orders").show() ==> works, but the result differs from the DataFrame countDistinct result
s.sql("select count(distinct(order_customer_id)) from orders").show() ==> works