Day 8: Data Visualization with Matplotlib (Part 1)

Python for Data Science

Muhammad Dawood
5 min read · Jul 23, 2023

Welcome to Day 8 of our Python for data science challenge! Data visualization is vital to data analysis, allowing us to communicate insights and patterns effectively. Today, we will explore Matplotlib, one of the most popular libraries for creating captivating visualizations in Python. Matplotlib enables us to generate various plots, customize appearances, and convey complex information visually. Let’s dive into the world of Matplotlib and discover the art of data visualization!

Introduction to Matplotlib:

Matplotlib is a versatile, user-friendly Python library for creating a wide range of visualizations. Whether you need static, interactive, or publication-quality plots, Matplotlib has you covered. In this introduction, we’ll walk through importing and setting up Matplotlib in your Python environment and introduce the fundamental components of a Matplotlib figure.

To get started with Matplotlib, make sure you have it installed in your Python environment. If not, you can install it using pip:

pip install matplotlib

Once installed, you can import Matplotlib using the following convention:

import matplotlib.pyplot as plt
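As a quick sanity check after installing, you can print the installed version and the active backend. This is only a minimal sketch; the values printed depend on your environment.

```python
import matplotlib
import matplotlib.pyplot as plt

# Version of the installed package (varies by environment)
print(matplotlib.__version__)

# Name of the backend Matplotlib will use to render figures
print(matplotlib.get_backend())
```

If both lines print without an ImportError, Matplotlib is installed and ready to use.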

Matplotlib revolves around two concepts: figures and axes. A figure is the canvas that holds one or more plots, while an Axes object represents an individual plot within the figure (the plotting area together with its x-axis and y-axis). For most simple plots, you’ll work with a single figure containing a single Axes.
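The relationship between a figure and its axes can be made explicit with plt.subplots(), which returns both objects. The sketch below uses made-up sample values and an arbitrary output file name; it saves the figure to disk so it also runs without a display.

```python
import matplotlib.pyplot as plt

# With no arguments, plt.subplots() creates one Figure holding one Axes
fig, ax = plt.subplots()

# The plot, labels, and title all live on the Axes
ax.plot([1, 2, 3], [2, 4, 8])
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_title('One figure, one Axes')

# The Figure keeps a list of the Axes it contains
print(type(fig).__name__)  # Figure
print(len(fig.axes))       # 1
print(fig.axes[0] is ax)   # True

# Write the figure to a file (the file name is arbitrary)
fig.savefig('figure_anatomy.png')
```

In an interactive session you can call plt.show() instead of saving. The plt.plot() style used in the rest of this article draws on an implicit "current" Axes, so both styles produce the same kind of plot.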

Creating Line Plots and Scatter Plots:

Two of the most commonly used plot types are line plots and scatter plots. Line plots show trends and variation in continuous data over a range, such as a time series. Scatter plots show the relationship between two variables by drawing one point per observation, which makes it easy to see whether the variables tend to move together.

To create a line plot using Matplotlib, you can use the plt.plot() function:

import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
y = [10, 25, 18, 30, 15]

plt.plot(x, y)
plt.xlabel('X-axis label')
plt.ylabel('Y-axis label')
plt.title('Line Plot Example')
plt.show()

Figure: A line plot using Matplotlib

For scatter plots, you can use the plt.scatter() function:

import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
y = [10, 25, 18, 30, 15]

plt.scatter(x, y)
plt.xlabel('X-axis label')
plt.ylabel('Y-axis label')
plt.title('Scatter Plot Example')
plt.show()

Figure: A scatter plot using Matplotlib
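Since a scatter plot is often read as a picture of how two variables relate, it can help to put a number next to it. The sketch below reuses the sample data above and computes the Pearson correlation coefficient with NumPy (an assumption on my part; NumPy is installed alongside Matplotlib but is not otherwise used in this article).

```python
import numpy as np
import matplotlib.pyplot as plt

# Same sample data as the scatter plot above
x = [1, 2, 3, 4, 5]
y = [10, 25, 18, 30, 15]

# Pearson correlation: np.corrcoef returns a 2x2 matrix,
# and the off-diagonal entry is the correlation between x and y
r = np.corrcoef(x, y)[0, 1]

fig, ax = plt.subplots()
ax.scatter(x, y)
ax.set_xlabel('X-axis label')
ax.set_ylabel('Y-axis label')
ax.set_title(f'Scatter Plot Example (r = {r:.2f})')

# Write the figure to a file (the file name is arbitrary)
fig.savefig('scatter_with_r.png')
```

For this sample data the coefficient is weakly positive, which matches the loose upward drift of the points.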

Customizing Plot Appearance:

Customizing a plot’s appearance makes it clearer and easier to interpret. Axis labels, a title, and a legend give the reader the context needed to understand the data.

Here’s how you can customize the appearance of your plots:

import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
y = [10, 25, 18, 30, 15]

plt.plot(x, y, marker='o', linestyle='--', color='b', label='Data')
plt.xlabel('X-axis label')
plt.ylabel('Y-axis label')
plt.title('Customized Line Plot')
plt.legend()
plt.grid(True)
plt.show()
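The same keyword arguments extend naturally to several series on one plot, which is where the legend becomes useful. The following is a minimal sketch: the second series and the labels 'Series A' and 'Series B' are illustrative values, not part of any real dataset.

```python
import matplotlib.pyplot as plt

# Sample data (the second series is made up for illustration)
x = [1, 2, 3, 4, 5]
y1 = [10, 25, 18, 30, 15]
y2 = [5, 20, 12, 28, 10]

fig, ax = plt.subplots(figsize=(8, 5))

# Each call adds one line; label is the text shown in the legend
ax.plot(x, y1, marker='o', linestyle='--', color='b', label='Series A')
ax.plot(x, y2, marker='s', linestyle='-', color='tab:orange',
        linewidth=2, label='Series B')

ax.set_xlabel('X-axis label')
ax.set_ylabel('Y-axis label')
ax.set_title('Two Customized Lines')
ax.legend(loc='upper left')  # place the legend explicitly
ax.grid(True, alpha=0.3)     # faint grid so it does not compete with the data

# Write the figure to a file (the file name is arbitrary)
fig.savefig('customized_lines.png')
```

Giving each series a distinct marker and line style, not just a distinct colour, keeps the plot readable when printed in greyscale.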

Combining Multiple Plots:

Sometimes, displaying multiple plots together is beneficial to gain a comprehensive view of the data. To achieve this, you can create subplots within a single figure using Matplotlib.

Here’s how you can create subplots:

import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
y1 = [10, 25, 18, 30, 15]
y2 = [5, 20, 12, 28, 10]

# Creating subplots
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 5))

ax1.plot(x, y1)
ax1.set_title('Line Plot 1')

ax2.scatter(x, y2)
ax2.set_title('Scatter Plot 2')

plt.show()
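The same call scales to a grid of plots. As a sketch under the same sample data, the example below builds a 2x2 grid; in that case plt.subplots() returns a two-dimensional array of Axes indexed by row and column. The titles and output file name are arbitrary.

```python
import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
y1 = [10, 25, 18, 30, 15]
y2 = [5, 20, 12, 28, 10]

# 2 rows x 2 columns; sharex/sharey give every panel the same axis limits
fig, axes = plt.subplots(2, 2, figsize=(10, 8), sharex=True, sharey=True)

# axes is a 2D array indexed as axes[row, column]
axes[0, 0].plot(x, y1)
axes[0, 0].set_title('y1 as a line')

axes[0, 1].scatter(x, y1)
axes[0, 1].set_title('y1 as points')

axes[1, 0].plot(x, y2)
axes[1, 0].set_title('y2 as a line')

axes[1, 1].scatter(x, y2)
axes[1, 1].set_title('y2 as points')

fig.suptitle('Same data, four views')
fig.tight_layout()  # adjust spacing so titles and labels do not overlap

fig.savefig('subplot_grid.png')
```

Sharing the axes makes the panels directly comparable, because a given height means the same value in every panel.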

Practical Application:

Let’s start with a simple example of visualizing a time series dataset. For this example, we’ll use a hypothetical dataset that contains monthly sales data for a company over a year.

Assuming you have the following data:

Month | Sales
January | 1000
February | 1200
March | 800
April | 1500
May | 1800
June | 2000
July | 2200
August | 2400
September | 1800
October | 1600
November | 1900
December | 2100

We’ll use Matplotlib to create a line plot to visualize the sales trend over the year:

import matplotlib.pyplot as plt

# Sample data (replace this with your actual dataset)
months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']
sales = [1000, 1200, 800, 1500, 1800, 2000, 2200, 2400, 1800, 1600, 1900, 2100]

# Create a line plot
plt.figure(figsize=(10, 6))
plt.plot(months, sales, marker='o', color='b', linestyle='-')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.title('Monthly Sales Trend')
plt.grid(True)
plt.xticks(rotation=45) # Rotate x-axis labels for better readability
plt.show()

Figure: A line plot of the sales trend over the year, created with Matplotlib

This code generates a line plot showing the company’s monthly sales trend. You can customize the plot further by adjusting colours, adding labels, and modifying other plot properties to make it more informative and easier to read.
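One possible customization, sketched below, is to mark the best month and draw the yearly average as a reference line. It plots against numeric positions and sets the month names as tick labels so that the annotation coordinates are plain numbers. The styling choices and the output file name are illustrative assumptions.

```python
import matplotlib.pyplot as plt

# Same hypothetical sales data as above
months = ['January', 'February', 'March', 'April', 'May', 'June', 'July',
          'August', 'September', 'October', 'November', 'December']
sales = [1000, 1200, 800, 1500, 1800, 2000, 2200, 2400, 1800, 1600, 1900, 2100]

# Derived values used for the annotation and the reference line
positions = list(range(len(months)))
peak_index = sales.index(max(sales))
peak_month = months[peak_index]
average = sum(sales) / len(sales)

fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(positions, sales, marker='o', color='b', linestyle='-', label='Sales')

# Horizontal reference line at the yearly average
ax.axhline(average, color='gray', linestyle=':', label=f'Average ({average:.0f})')

# Text label placed 15 points above the highest data point
ax.annotate(f'Peak: {peak_month}', xy=(peak_index, sales[peak_index]),
            xytext=(0, 15), textcoords='offset points', ha='center')

# Show month names at the numeric positions, rotated for readability
ax.set_xticks(positions)
ax.set_xticklabels(months, rotation=45)

ax.set_xlabel('Month')
ax.set_ylabel('Sales')
ax.set_title('Monthly Sales Trend')
ax.grid(True)
ax.legend()

# Save a copy for a report; bbox_inches='tight' trims surrounding whitespace
fig.savefig('monthly_sales.png', dpi=150, bbox_inches='tight')
```

Saving with fig.savefig() is handy when the chart needs to go into a report or slide deck; in an interactive session, plt.show() displays it instead.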

Congratulations on completing Day 8 of our Python for data science challenge! Today, you explored the foundations of data visualization with Matplotlib: creating line plots and scatter plots, customizing their appearance, and combining plots with subplots. Matplotlib gives you the tools to build clear, informative visualizations that communicate your findings effectively.

As you continue your Python journey, remember to leverage Matplotlib’s capabilities to present data in a compelling and insightful manner. Tomorrow, on Day 9, we will explore more advanced visualizations with Matplotlib and Seaborn, taking your data visualization skills to the next level.

Let’s embark on this exciting journey together and unlock the power of data!
