Programming Essentials Python - Overview of Pandas Libraries - Data Frames - Basic Operations

Here are some of the basic operations we typically perform on a Pandas Data Frame:

  • Getting number of records and columns.
  • Getting data types of the columns.
  • Replacing NaN with some standard values.
  • Dropping a column from the Data Frame.
  • Getting or updating column names.
  • Sorting by index or values.

import pandas as pd

# Creating Pandas Data Frame using list of dicts.
sals_ld = [
    {'id': 1, 'sal': 1500.0},
    {'id': 2, 'sal': 2000.0, 'comm': 10.0},
    {'id': 3, 'sal': 2200.0, 'active': False}
]

# Column names are inferred from the dict keys; keys missing from a record become NaN.
sals_df = pd.DataFrame(sals_ld)

# Printing Data Frame
print(sals_df)

# Getting specific columns
print(sals_df['id'])
print(sals_df[['id', 'sal']])

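# Getting shape and count
# shape is a (number of records, number of columns) tuple.
print(sals_df.shape)
# count() returns the number of non-NaN values in each column.
print(sals_df.count())
print(sals_df.count()[:2])
print(sals_df.count()['id'])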

# Data types
print(sals_df.dtypes)

# Filling NaN values
print(sals_df.fillna(0.0))
print(sals_df.fillna({'comm': 0.0}))
print(sals_df.fillna({'comm': 0.0, 'active': True}))

# Dropping columns
print(sals_df.drop(columns='comm'))
print(sals_df.drop(columns=['comm', 'active']))
print(sals_df.drop(['comm', 'active'], axis=1))

sals_df = sals_df.drop(columns='comm')
print(sals_df.columns)

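# Updating column names (positional: one name per remaining column, in order).
# After dropping 'comm', the remaining columns are id, sal and active.
sals_df.columns = ['employee_id', 'salary', 'active']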
print(sals_df)

# Sorting
print(sals_df.sort_index())
print(sals_df.sort_values(by='employee_id', ascending=False))
print(sals_df.sort_values(by='salary'))
print(sals_df.sort_values(by='salary', ascending=False))

# Sorting with multiple columns
sals_ld = [
    {'id': 1, 'sal': 1500.0},
    {'id': 2, 'sal': 2000.0, 'comm': 10.0},
    {'id': 3, 'sal': 2200.0, 'active': False},
    {'id': 4, 'sal': 2000.0}
]
sals_df = pd.DataFrame(sals_ld)
print(sals_df.sort_values(by=['sal', 'id']))
print(sals_df.sort_values(by=['sal', 'id'], ascending=[False, True]))
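
Two details in the listing above are easy to miss: fillna (like drop and sort_values) returns a new Data Frame rather than changing the original, and assigning a list to columns is positional, so the names must follow the current column order. The sketch below illustrates both points on the same sample data. The variable names filled_df and renamed_df are only illustrative, and rename with a columns mapping is shown as one possible alternative to positional assignment, not as the approach used in the original tutorial.

```python
import pandas as pd

sals_df = pd.DataFrame([
    {'id': 1, 'sal': 1500.0},
    {'id': 2, 'sal': 2000.0, 'comm': 10.0},
    {'id': 3, 'sal': 2200.0, 'active': False},
])

# fillna returns a new Data Frame; the original still has NaN in 'comm'.
filled_df = sals_df.fillna({'comm': 0.0})
print(sals_df['comm'].isna().sum())    # 2
print(filled_df['comm'].isna().sum())  # 0

# rename maps old names to new names, so it does not depend on column order.
# Columns that are not mentioned (here 'active') keep their names.
renamed_df = sals_df.rename(
    columns={'id': 'employee_id', 'sal': 'salary', 'comm': 'commission'}
)
print(list(renamed_df.columns))
```

To keep a result, assign it back to a variable, as the listing does with sals_df = sals_df.drop(columns='comm').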

The accompanying video tutorial walks through these basic operations on Pandas Data Frames with a hands-on demonstration.


Summary

This article covered the basic operations on Pandas Data Frames: getting the shape and data types, handling NaN values, dropping columns, updating column names, and sorting by index or values. Practice these operations to build your data manipulation skills.
