Python for Data Science: Day 4

Functions and Modules

8 min read · Jul 19, 2023

By Muhammad Dawood

Welcome to Day 4 of our Python for data science challenge! Today, we will delve into functions and modules, two essential concepts that enhance code organization, reusability, and maintainability. Functions allow you to break down complex tasks into smaller, manageable blocks, while modules enable you to leverage existing code and expand Python’s capabilities. Let’s explore how functions and modules can revolutionize your programming experience!

Defining and Calling Functions:

Functions are blocks of reusable code that perform specific tasks. They are essential for organizing code, improving readability, and avoiding repetition. To define a function in Python, we use the “def” keyword followed by the function name, a set of parentheses containing parameters (if any), and a colon. The indented block that follows is the function body.

Here’s the basic syntax for defining a function:

def function_name(parameter1, parameter2, ...):
    # Function body - code that performs the task
    # You can use the parameters inside the function to manipulate data

    # Optionally, you can return a value using the "return" keyword
    return result

Let’s create a simple function that adds two numbers together:

def add_numbers(a, b):
    result = a + b
    return result

To use this function, you can call it and pass arguments (values) to it:

result = add_numbers(5, 3)
print(result)  # Output: 8

In this example, the function add_numbers takes two parameters a and b, and it returns the sum of these two values.

When naming functions, it’s important to follow certain conventions for better code readability:

Use descriptive names: Choose a name that reflects the function's task. This makes the code easier to understand.
Use lowercase and underscores: Stick to lowercase letters and separate words with underscores for function names. For example, calculate_average instead of CalculateAverage.
Avoid naming conflicts: Ensure function names don’t clash with existing Python keywords or built-in functions.
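
As a rough illustration of the last point, the sketch below checks whether a candidate name would collide with a Python keyword or a built-in. The helper is_safe_function_name is a hypothetical name invented for this example, not part of Python itself; it relies only on the standard keyword and builtins modules.

```python
import builtins
import keyword


def is_safe_function_name(name):
    """Return True if `name` is a valid identifier that is neither a
    Python keyword nor the name of a built-in (hypothetical helper)."""
    return (
        name.isidentifier()
        and not keyword.iskeyword(name)
        and not hasattr(builtins, name)
    )


for candidate in ["calculate_average", "sum", "class"]:
    print(candidate, is_safe_function_name(candidate))
# calculate_average True
# sum False    (would shadow the built-in sum())
# class False  (reserved keyword, cannot be used as a name at all)
```

Naming a function sum is legal, but it hides the built-in sum() for the rest of that module, which is why such names are best avoided.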

Documentation is crucial for making your code understandable to others (including your future self). You can use docstrings to document your functions. A docstring is a string literal, usually written with triple quotes so it can span several lines, placed as the first statement in the function body. It describes the function’s purpose, parameters, and return value.

Here’s an example of a function with a docstring:

def multiply(a, b):
    """
    This function takes two numbers, 'a' and 'b', and returns their product.
    Parameters:
        a (int or float): The first number.
        b (int or float): The second number.
    Returns:
        int or float: The product of 'a' and 'b'.
    """
    return a * b

With a docstring in place, you can access the function’s documentation using Python’s help() function or integrated development environments (IDEs) to understand its purpose and usage.
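
Here is a small sketch of reading a docstring programmatically. It uses a shortened version of the multiply function; the variable name doc_text is just an illustrative choice. The __doc__ attribute holds the raw string, while inspect.getdoc returns it with indentation cleaned up.

```python
import inspect


def multiply(a, b):
    """Return the product of a and b."""
    return a * b


# The raw docstring is stored on the function object itself
print(multiply.__doc__)

# inspect.getdoc() returns the same text with indentation normalized
doc_text = inspect.getdoc(multiply)
print(doc_text)

# In an interactive session, help(multiply) displays this text
# together with the function signature.
```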

Remember, writing clear and well-documented functions can significantly improve the maintainability and collaboration of your code projects.

Parameters and Return Values:

Function parameters play a crucial role in passing data into a function for processing, making functions versatile and reusable. In Python, there are several ways to define parameters and to pass arguments to them:

Positional Parameters:

These are the most common type of parameters and are defined by their position in the function’s parameter list. When calling the function, you need to provide values for each positional parameter in the same order they are defined in the function. Example:

def greet(name, age):
    return f"Hello, {name}! You are {age} years old."

result = greet("Alice", 30)
print(result)  # Output: "Hello, Alice! You are 30 years old."

Keyword Parameters:

Keyword arguments are matched to parameters by name rather than by position when you call the function. This allows you to pass arguments in any order, which is especially useful when a function has many parameters and relying on their order would be hard to read. Example:

def greet(name, age):
    return f"Hello, {name}! You are {age} years old."

result = greet(age=30, name="Alice")
print(result)  # Output: "Hello, Alice! You are 30 years old."

Default Parameters:

Default parameters have predefined values assigned to them in the function’s definition. If the caller does not provide a value for a default parameter, the function uses the default value. Example:

def greet(name, age=25):
    return f"Hello, {name}! You are {age} years old."

result1 = greet("Alice")
result2 = greet("Bob", 30)
print(result1)  # Output: "Hello, Alice! You are 25 years old."
print(result2)  # Output: "Hello, Bob! You are 30 years old."
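
Default values and keyword arguments work well together: you can override a single default by name and leave the rest untouched. The following is a minimal sketch; the function summarize and its city parameter are made up for illustration.

```python
def summarize(name, age=25, city="Unknown"):
    """Build a short description; age and city have default values."""
    return f"{name}, {age}, from {city}"


# Only the required argument is given, so both defaults are used
print(summarize("Alice"))                # Alice, 25, from Unknown

# Override just one default by name, skipping 'age'
print(summarize("Bob", city="Lahore"))   # Bob, 25, from Lahore

# With keywords, the order of the arguments does not matter
print(summarize(city="Paris", name="Cara", age=41))  # Cara, 41, from Paris
```

Note that in a call, positional arguments must come before keyword arguments.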

Variable-Length Argument Lists:

Sometimes, you might need to pass a varying number of arguments to a function. Python supports this with *args, which collects any extra positional arguments into a tuple, and **kwargs, which collects any extra keyword arguments into a dictionary. Example using *args:

def sum_numbers(*args):
    total = 0
    for num in args:
        total += num
    return total

result = sum_numbers(1, 2, 3, 4, 5)
print(result)  # Output: 15
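
The example above covers *args only. For completeness, here is a comparable sketch for **kwargs; the function name build_profile and the keys used are assumptions chosen for illustration.

```python
def build_profile(name, **kwargs):
    """Collect any extra keyword arguments into a profile dictionary."""
    profile = {"name": name}
    # kwargs is an ordinary dict of the extra keyword arguments
    profile.update(kwargs)
    return profile


profile = build_profile("Alice", age=30, city="Paris")
print(profile)  # {'name': 'Alice', 'age': 30, 'city': 'Paris'}
```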

Returning Values:

Functions can return values using the return statement. The returned value can be stored in a variable for later use or used directly in expressions. Example:

def multiply(a, b):
    return a * b

result = multiply(5, 3)
print(result)  # Output: 15
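
Two related behaviors are worth a quick sketch: a function can return several values at once (Python packs them into a tuple), and a function with no return statement returns None. The function names below are illustrative only.

```python
def min_and_max(values):
    """Return the smallest and largest value as a tuple."""
    return min(values), max(values)


# The returned tuple can be unpacked into separate variables
lowest, highest = min_and_max([4, 9, 1, 7])
print(lowest, highest)  # 1 9


def log_value(value):
    """Print a value; there is no return statement here."""
    print(f"value = {value}")


outcome = log_value(3)
print(outcome)  # None
```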

By understanding and utilizing different parameter types and return statements, you can create more flexible and powerful functions that can process a wide range of data and produce meaningful results. This makes your code more organized, maintainable, and efficient.

Exploring Built-in and Third-Party Modules:

Python provides a wide range of built-in modules that extend its functionality and enable developers to perform various tasks efficiently. Some of the most commonly used built-in modules include:

math module:

The math module provides various mathematical functions and constants. It is particularly useful for tasks that go beyond basic arithmetic, such as roots, factorials, logarithms, and trigonometry. Example:

import math

# Calculate the square root of a number
result = math.sqrt(25)
print(result)  # Output: 5.0

# Calculate the factorial of a number
factorial_result = math.factorial(5)
print(factorial_result)  # Output: 120

# Access the constant pi
pi_value = math.pi
print(pi_value)  # Output: 3.141592653589793

random module:

The random module allows you to work with random numbers and make random selections. It is commonly used in simulations, games, and other scenarios where randomness is required. Example:

import random

# Generate a random integer between 1 and 10 (inclusive)
random_int = random.randint(1, 10)
print(random_int)

# Generate a random float in the range [0.0, 1.0)
random_float = random.random()
print(random_float)
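
In data science work you often want random results to be repeatable. One common approach, sketched here, is to seed the generator with random.seed; the seed value 42 and the variable names are arbitrary choices for this example.

```python
import random

# Seeding makes the sequence of "random" numbers repeatable
random.seed(42)
first_run = [random.randint(1, 10) for _ in range(5)]

# Re-seeding with the same value restarts the same sequence
random.seed(42)
second_run = [random.randint(1, 10) for _ in range(5)]

print(first_run == second_run)  # True
```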

In addition to built-in modules, Python supports third-party modules, which are created and maintained by the Python community to extend the language’s capabilities. These third-party modules can be installed via the Python Package Index (PyPI) using the pip package manager.

Here’s how you can install a third-party module and use its functionality:

pip install module_name

After installation, you can import the module and use its functions in your code:

import module_name

# Use the module's functions
result = module_name.some_function()
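
If you are unsure whether a module is available in your environment, you can check before importing it. This is a minimal sketch using the standard library's importlib.util.find_spec; the helper name is_installed is an assumption made for this example.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found for import."""
    return importlib.util.find_spec(module_name) is not None


print(is_installed("math"))                       # True (standard library)
print(is_installed("a_module_that_is_not_real"))  # False
```

When the check returns False for a third-party module, installing it with pip as shown above is usually the next step.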

For example, let’s explore the popular third-party module “requests,” which is commonly used for making HTTP requests.

# Install the requests module using pip
# Run this command in your terminal or command prompt:
# pip install requests

import requests

# Make a GET request to a URL
# (api.example.com is a placeholder; replace it with a real endpoint)
response = requests.get("https://api.example.com/data", timeout=10)

# Check the status code of the response
if response.status_code == 200:
    # If the request was successful, access the response data
    data = response.json()
    print(data)
else:
    print("Error: Failed to fetch data.")

By harnessing the potential of both built-in and third-party modules, Python developers can significantly enhance their productivity and solve complex problems with ease. The vast array of available modules makes Python a versatile and powerful programming language.

Importing Modules:

Importing modules in Python allows you to access their functions and attributes, extending the capabilities of your scripts. There are different ways to import modules, each with its advantages. Let’s explore some common import styles:

Importing the Entire Module:

You can import the entire module using the import keyword, followed by the module name. This makes all the functions and attributes of the module available in your code, but you need to use the module name to access them. Example:

import math

result = math.sqrt(25)
print(result)  # Output: 5.0

Importing Specific Functions or Attributes:

If you only need specific functions or attributes from a module, you can import them directly, which makes your code more concise. Keep in mind that the imported names are placed directly in your namespace, so make sure they do not clash with names you define yourself. Example:

from math import sqrt, pi

result = sqrt(25)
print(result)  # Output: 5.0

radius = 2
circle_area = pi * (radius ** 2)
print(circle_area)  # Output: 12.566370614359172

Giving Modules Custom Names:

You can give modules custom names (also known as aliases) using the as keyword. This can be particularly helpful if you are working with long module names or to avoid naming conflicts. Example:

import math as m

result = m.sqrt(25)
print(result)  # Output: 5.0

Importing All Functions and Attributes with an Asterisk (*):

While generally discouraged for large modules, you can use the asterisk (*) to import all functions and attributes from a module. This makes all the module’s functions available without the need to prefix them with the module name. However, it might lead to namespace pollution and confusion, so use it with caution. Example:

from math import *

result = sqrt(25)
print(result)  # Output: 5.0
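
To make the risk concrete, the sketch below shows how a second wildcard import can silently replace a name brought in by the first. It runs the imports with exec inside a separate dictionary purely so the resulting namespace can be inspected; that setup is an artifact of this demonstration, not something you would do in normal code.

```python
import cmath
import math

# A separate namespace lets us observe what the wildcard imports do
namespace = {}

exec("from math import *", namespace)
print(namespace["sqrt"] is math.sqrt)   # True

# A second wildcard import silently overwrites names that already exist
exec("from cmath import *", namespace)
print(namespace["sqrt"] is cmath.sqrt)  # True: sqrt now handles complex numbers

print(namespace["sqrt"](-1))            # 1j
```

With explicit imports such as import math and import cmath, both versions of sqrt stay available and it is always clear which one is being called.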

When choosing an import style, consider the following best practices:

Import the entire module if you plan to use many of its functions or attributes.
Import specific functions or attributes when you only need a few from a module.
Give modules custom names for readability and to avoid naming conflicts.
Avoid using the asterisk (*) to import all functions, except in specific scenarios where it is necessary.

Using appropriate import styles leads to cleaner, more maintainable code, better organization, and fewer naming clashes, making your Python scripts easier to read.

Application:

Let’s walk through some practical examples that demonstrate the utility of functions and modules in data science tasks. We’ll cover creating custom functions and using built-in modules for complex calculations. For simplicity, we’ll focus on basic data analysis tasks.

Example 1: Custom Function for Calculating Mean

def calculate_mean(data):
    """
    This function takes a list of numbers 'data' and calculates the mean.
    Parameters:
        data (list): A list of numerical values.
    Returns:
        float: The mean of the data.
    """
    total = sum(data)
    count = len(data)
    mean = total / count
    return mean

data_list = [2, 4, 6, 8, 10]
mean_result = calculate_mean(data_list)
print(f"The mean of the data is: {mean_result}")
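
One edge case to keep in mind: calculate_mean divides by len(data), so an empty list raises ZeroDivisionError. A possible variant with an explicit check is sketched below; the name calculate_mean_safe and the choice of ValueError are assumptions made for this example.

```python
def calculate_mean_safe(data):
    """Return the mean of 'data', rejecting empty input explicitly."""
    if len(data) == 0:
        raise ValueError("data must contain at least one value")
    return sum(data) / len(data)


print(calculate_mean_safe([2, 4, 6, 8, 10]))  # 6.0

try:
    calculate_mean_safe([])
except ValueError as error:
    print(f"Could not compute mean: {error}")
```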

Example 2: Using the “statistics” Module

Python’s built-in statistics module provides functions for basic statistical calculations.

import statistics

data_list = [2, 4, 6, 8, 10]

# Calculate the mean
mean_result = statistics.mean(data_list)
print(f"The mean of the data is: {mean_result}")

# Calculate the median
median_result = statistics.median(data_list)
print(f"The median of the data is: {median_result}")

# Calculate the sample standard deviation (divides by n - 1)
std_dev_result = statistics.stdev(data_list)
print(f"The standard deviation of the data is: {std_dev_result}")
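
Note that "standard deviation" comes in two flavors. statistics.stdev computes the sample standard deviation (dividing by n - 1), while statistics.pstdev computes the population standard deviation (dividing by n). The short comparison below uses the same data; the variable names are illustrative.

```python
import statistics

data_list = [2, 4, 6, 8, 10]

# Sample standard deviation: divides the squared deviations by n - 1
sample_std = statistics.stdev(data_list)

# Population standard deviation: divides the squared deviations by n
population_std = statistics.pstdev(data_list)

print(round(sample_std, 4))      # 3.1623
print(round(population_std, 4))  # 2.8284
```

This matters when comparing libraries: numpy's np.std uses the population formula by default (ddof=0), so it matches pstdev rather than stdev unless you pass ddof=1.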

Example 3: Using the “numpy” Module

The numpy library is a fundamental third-party package for scientific computing in Python. If it is not already available in your environment, install it with pip install numpy.

import numpy as np

data_list = [2, 4, 6, 8, 10]

# Calculate the mean using numpy
mean_result = np.mean(data_list)
print(f"The mean of the data is: {mean_result}")

# Calculate the population standard deviation using numpy
# (np.std defaults to ddof=0, so this differs from statistics.stdev)
std_dev_result = np.std(data_list)
print(f"The standard deviation of the data is: {std_dev_result}")

# Create an array using numpy
data_array = np.array(data_list)
# Perform element-wise operations
squared_array = data_array ** 2
print(squared_array)

These examples showcase the versatility of Python functions and modules in data science tasks. Custom functions can be designed to handle specific data analysis operations, while built-in and third-party modules like statistics and numpy provide powerful tools for more complex calculations and array operations. Combining functions and modules allows data scientists to efficiently process and analyze data in Python, making it a powerful and popular language for data science tasks.

Congratulations on completing Day 4 of our Python for data science challenge! Today, you explored the power of functions and modules, understanding how they enhance code reusability and extend Python’s capabilities. Functions allow you to create modular, organized code, while modules enable you to tap into a vast library of functionalities.
Tomorrow, on Day 5, we will delve into data manipulation with Pandas, an essential library for data analysis and exploration.

Let’s embark on this exciting journey together and unlock the power of data!
