# Apache Spark Python - Basic Transformations - Total Aggregations

Let us go through the details related to total aggregations using Spark.

• We can perform total aggregations directly on a DataFrame, or we can perform aggregations after grouping by one or more keys.

• Here are the functions which we typically use to perform aggregations:

  • `count`

  • `sum`, `avg`

  • `min`, `max`

The sections below cover counting rows, counting distinct values, and using `sum` on a derived column and on a filtered DataFrame.

## Aggregation Functions

A total aggregation reduces all the rows of a DataFrame to a single result. The simplest case is `count`, which returns the total number of rows in the DataFrame:

``````
# Counting total number of rows
airtraffic.count()
``````

## Distinct Values

To count distinct values, select the columns of interest, drop duplicate rows with `distinct`, and then call `count`. The example below counts the distinct dates in `airtraffic`:

``````
# Counting distinct values
airtraffic. \
    select('Year', 'Month', 'DayOfMonth'). \
    distinct(). \
    count()
``````

## Total Bonus Amount

The total bonus amount can be calculated with the `sum` function. In the expression below, `bonus` is treated as a percentage of `salary`, and `coalesce` replaces a missing bonus with 0:

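``````
# These imports shadow Python's built-in sum
from pyspark.sql.functions import coalesce, col, lit, sum

# Calculating total bonus amount
employeesDF. \
    select((sum(coalesce(col('bonus').cast('int'), lit(0)) * col('salary')) / lit(100)).alias('total_bonus')). \
    show()
``````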

## Revenue Calculation

To get the revenue generated by a given order, filter `order_items` down to that order and apply `sum` to `order_item_subtotal`:

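``````
# These imports shadow Python's built-in sum
from pyspark.sql.functions import col, lit, sum

# Calculating order revenue
# order_id is assumed to be defined earlier, as an int or a numeric string
order_items. \
    filter(col('order_item_order_id') == lit(int(order_id))). \
    select(sum('order_item_subtotal').alias('order_revenue')). \
    show()
``````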

Hands-on tasks to practice the examples above:

1. Calculate the total number of rows in the `airtraffic` DataFrame.
2. Find the distinct count of dates from the `airtraffic` DataFrame.
3. Calculate the total bonus amount from the `employeesDF` DataFrame.
4. Determine the revenue generated for a specific order from the `order_items` dataset.