Programming Essentials Python - Overview of Pandas Libraries - Pandas Data Structures

Pandas is a powerful data manipulation library for Python that provides high-level data structures and the functions to work with them. It is not part of the Python standard library, so you need to install it first, for example with pip install pandas. Pandas has two primary data structures: Series and DataFrame.
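As a quick check that the installation worked, the following minimal sketch imports the library and prints its version. The variable name installed_version is only illustrative, and the printed version depends on your environment.

```python
import pandas as pd

# If this import succeeds, Pandas is installed in the active environment.
# The version string varies from one installation to another.
installed_version = pd.__version__
print(installed_version)
```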

Series

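A Series is a one-dimensional labeled array: every value has an index label, and the Series as a whole represents a single attribute or column. You can create a Series from a list of values. If you do not supply an index, Pandas assigns integer labels starting at 0.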

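import pandas as pd

# A plain Python list of salaries
sals_l = [1500.0, 2000.0, 2200.00]
# Build a Series from the list; name labels the Series
sals_s = pd.Series(sals_l, name='sal')
print(sals_s)
# Slice the first two elements by position
print(sals_s[:2])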
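To make the role of the index more concrete, here is a small sketch that inspects the index, name, and dtype of the same Series and then builds a second Series with custom labels. The labels 'e1', 'e2', and 'e3' and the name sals_by_emp are hypothetical and used only for illustration.

```python
import pandas as pd

sals_s = pd.Series([1500.0, 2000.0, 2200.0], name='sal')

# Without an explicit index, Pandas assigns the integer labels 0, 1, 2.
print(sals_s.index)
print(sals_s.name)
print(sals_s.dtype)

# Hypothetical employee codes used as custom index labels.
sals_by_emp = pd.Series(
    [1500.0, 2000.0, 2200.0],
    index=['e1', 'e2', 'e3'],
    name='sal',
)

# Look up a value by its label rather than by its position.
print(sals_by_emp['e2'])
```

With custom labels, values can be looked up by a meaningful key instead of by position.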

DataFrame

A DataFrame is a two-dimensional tabular data structure with an index for each row and multiple columns. Each column in a DataFrame is a Series. You can create a DataFrame using a list of tuples or a list of dictionaries.

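# A list of tuples, one tuple per row: (id, sal)
sals_ld = [(1, 1500.0), (2, 2000.0), (3, 2200.00)]
# columns supplies the column names in tuple order
sals_df = pd.DataFrame(sals_ld, columns=['id', 'sal'])
print(sals_df)
# Selecting a single column returns a Series
print(sals_df['id'])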
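The example above uses a list of tuples. A list of dictionaries works in a similar way, except that the column names come from the dictionary keys, so no columns argument is needed. The sketch below assumes the same illustrative salary data; the names sals_dicts and id_col are introduced here only for the example.

```python
import pandas as pd

# Each dictionary is one row; the keys become the column names.
sals_dicts = [
    {'id': 1, 'sal': 1500.0},
    {'id': 2, 'sal': 2000.0},
    {'id': 3, 'sal': 2200.0},
]
sals_df = pd.DataFrame(sals_dicts)
print(sals_df)

# Selecting one column gives back a Series.
id_col = sals_df['id']
print(type(id_col))

# shape reports (number of rows, number of columns).
print(sals_df.shape)
```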

Hands-On Tasks

  1. Install the Pandas library using pip install pandas.
  2. Import the Pandas library in your Python script.
  3. Create a Series from a list of values.
  4. Create a DataFrame from a list of tuples or a list of dictionaries.

Conclusion

In this article, we covered the two fundamental Pandas data structures: Series and DataFrame. Understanding these structures is crucial for data manipulation tasks. We encourage you to practice creating Series and DataFrame objects and to explore the many other features Pandas offers for data analysis. Happy coding!
