PySpark RDD manipulation


I have been learning Spark using Python as the language (PySpark).

Now this might sound like a dumb question :slightly_smiling_face:

Do we manipulate RDDs using PySpark, or do we just work on DataFrames?

I was not able to find proper documentation that shows the RDD operations in PySpark.

Everyone was mentioning that you need to know reduceByKey(), JoinByKey(), etc. for certification, but I am not sure how to use these in PySpark. Any help?