Programming Essentials Python - Map Reduce Libraries - Row Level Transformations using map

In this article, we will explore how to perform row-level transformations using the map function in Python. By applying map to a dataset, we can derive new fields from existing ones and perform various data transformations.

Key Concepts

Derive New Fields

One common use case for row-level transformations is deriving new fields from existing data. This can involve extracting specific information, formatting values, or creating new indicators based on existing fields.

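# Example: Derive a new field by extracting information from an existing field
# (assumes orders is a pandas DataFrame and order_date is a datetime column)
# Timestamp.weekday_name was removed in pandas 1.0; day_name() replaces it
orders['order_day_name'] = orders['order_date'].apply(lambda x: x.day_name())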
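
The same idea can be expressed with Python's built-in map, without pandas. The following is a minimal sketch, assuming each row is an (order_id, order_date) tuple; the sample rows and the helper name with_day_name are hypothetical and only serve as illustration.

```python
import datetime as dt

# Hypothetical sample rows: (order_id, order_date)
sample_orders = [
    (1, dt.date(2013, 7, 25)),
    (2, dt.date(2013, 7, 27)),
]

def with_day_name(row):
    """Return the row with a derived day-name field appended."""
    order_id, order_date = row
    return order_id, order_date, order_date.strftime('%A')

# map applies the function to every row; list() materializes the results
orders_with_day = list(map(with_day_name, sample_orders))

for row in orders_with_day:
    print(row)
```

Each output row keeps the original fields and gains one derived field, which is the general shape of a row-level transformation.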

Weekend Flag

Another typical transformation is adding a weekend flag based on the day of the week. This helps categorize dates as either weekdays or weekends for further analysis.

# Example: Add a weekend flag based on the day of the week
orders['weekend_flag'] = orders['order_date'].apply(lambda x: x.weekday() in [5, 6])
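
For comparison, here is a small sketch of the weekend check using the built-in map and a lambda. It relies on date.weekday() returning 0 for Monday through 6 for Sunday, so values 5 and 6 mean Saturday and Sunday. The sample dates are made up for illustration.

```python
import datetime as dt

# Hypothetical sample dates: a Friday, a Saturday, and a Sunday
sample_dates = [
    dt.date(2013, 7, 26),
    dt.date(2013, 7, 27),
    dt.date(2013, 7, 28),
]

# weekday() is 0 for Monday ... 6 for Sunday, so >= 5 means weekend
weekend_flags = list(map(lambda d: d.weekday() >= 5, sample_dates))

print(weekend_flags)
```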

Hands-On Tasks

Let’s apply these concepts with practical tasks on a dataset.

  1. Task 1: Get the day name of each date in the orders dataset and output the order ID, order date, and day name.
import datetime as dt

def get_order_day_info(order):
    # Each record is a comma-separated string: order ID first, then a timestamp
    fields = order.split(',')
    order_date = dt.datetime.strptime(fields[1].split(' ')[0], '%Y-%m-%d')
    return fields[0], str(order_date), order_date.strftime('%A')

# map is lazy: wrap it in list() or iterate over it to see the results
order_day_info = map(get_order_day_info, orders)

  2. Task 2: Add a weekend flag for Saturday and Sunday dates in the orders dataset.
import datetime as dt

def add_weekend_flag(order):
    # Split once and reuse the fields instead of splitting the record twice
    fields = order.split(',')
    order_date = dt.datetime.strptime(fields[1].split(' ')[0], '%Y-%m-%d')
    # weekday() returns 5 for Saturday and 6 for Sunday
    return fields[0], str(order_date), order_date.strftime('%A'), order_date.weekday() in [5, 6]

order_with_weekend_flag = map(add_weekend_flag, orders)
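
To try the tasks end to end, the sketch below runs a Task 2 style transformation on a few made-up records. The record layout (order ID, timestamp, customer ID, status) and the sample values are assumptions for illustration, not the actual dataset. It also shows that map returns a lazy iterator that can be consumed only once.

```python
import datetime as dt

# Hypothetical sample records; the real orders dataset may differ
orders = [
    '1,2013-07-25 00:00:00.0,11599,CLOSED',
    '2,2013-07-27 00:00:00.0,256,PENDING_PAYMENT',
    '3,2013-07-28 00:00:00.0,12111,COMPLETE',
]

def add_weekend_flag(order):
    """Return (order_id, date, day name, is_weekend) for one record."""
    fields = order.split(',')
    order_date = dt.datetime.strptime(fields[1].split(' ')[0], '%Y-%m-%d')
    return (
        fields[0],
        order_date.strftime('%Y-%m-%d'),
        order_date.strftime('%A'),
        order_date.weekday() >= 5,
    )

result = map(add_weekend_flag, orders)

first_pass = list(result)   # consumes the iterator
second_pass = list(result)  # the iterator is now exhausted

for row in first_pass:
    print(row)
print(len(second_pass))
```

If the transformed rows are needed more than once, store the list rather than the map object.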

Conclusion

In this article, we have explored how to perform row-level transformations using the map function in Python. By following the hands-on tasks provided, you can practice applying these concepts to your own datasets. Feel free to engage with the community for further learning and support.
