Managing scope, defaults, and side effects

In the previous section, we built our first functions. Now, we will look at how functions actually behave in your computer’s memory. This section covers critical design concepts that may save you hours of debugging when building more complex spatial data pipelines.


1. Scope and Namespaces

A common point of confusion for new programmers is understanding how variable names inside functions relate to those defined elsewhere in their notebooks.

Think of your main Python script as a giant “White Room.” When you define a function, you are building a smaller, soundproof room inside it.

Diagram showing nested local and global scopes.

A visual analogy of scope: a function creates a “soundproof” local scope inside the main global scope. Variables inside the function can “see” out to the global scope, but the global scope cannot “see” into the function.

Local Variables Stay Local

When you create a variable inside a function, it is a local variable. It only exists within that specific function’s room. Once the function finishes running, the room is demolished, and the variable disappears.

def calculate_area():
    # This variable only exists inside this function
    area = 1500
    return area


# We call the function
calculate_area()

# Guess the output if we try to print the variable from the main script...
print(area)

Output:

NameError: name 'area' is not defined

This is actually a brilliant feature! It means you can use simple variable names like area, distance, or x inside your functions without worrying about accidentally overwriting variables with the same names in your main script.
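As a minimal sketch of that isolation, assume the main script already holds its own variable named area (the value 99 is made up for illustration). The function's local area does not overwrite it, and the returned value only becomes available outside once it is captured in a name:

```python
# A global variable in the main script (hypothetical value for illustration)
area = 99


def calculate_area():
    # This 'area' is a separate, local variable; it does not touch the global one
    area = 1500
    return area


# Capture the returned value in a new name to use it outside the function
result = calculate_area()

print(result)  # 1500
print(area)    # 99 (the global variable is unchanged)
```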

The Danger of Global Variables

Python searches for variables from the inside out. If it cannot find a variable inside the function (local scope), it will peek outside into the main script (global scope) to see if it exists there.

This can lead to dangerous bugs. Look at this example:

# A global variable defined at the top of our notebook
base_elevation = 1500


def calculate_relative_elevation(elevation):
    # BAD: This function uses a global variable instead of a parameter!
    return elevation - base_elevation


print(calculate_relative_elevation(2000))  # Outputs: 500

This code works, but it is poor software design. The function calculate_relative_elevation is no longer a standalone tool. It secretly relies on base_elevation existing elsewhere in the notebook. If you copy this function into a new script where base_elevation is not defined, calling it raises a NameError. Worse, if you accidentally change base_elevation in a different cell, the function silently returns different results, and it might be hard to figure out why.
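One possible way to remove the hidden dependency is to pass the base elevation in as a second parameter. This is a sketch, not the only valid design; the parameter simply reuses the name base_elevation from the example above:

```python
def calculate_relative_elevation(elevation, base_elevation):
    # Everything the function needs now arrives through its parameters
    return elevation - base_elevation


print(calculate_relative_elevation(2000, 1500))  # 500
print(calculate_relative_elevation(2000, 1800))  # 200
```

The function now gives the same answer for the same inputs, no matter what else is defined in the notebook.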


2. Required vs. Optional Parameters

Up to this point, every parameter we have created has been a required parameter. If the user forgets to provide an argument, Python raises a TypeError and the program stops.

However, we can design our tools to be much more flexible by providing optional parameters with default values.

Setting Default Values

Imagine a function that checks if a GPS point is within a certain accuracy threshold. Most of the time, our acceptable threshold is 5 meters. We can set that as the default!

# 'threshold' is now an optional parameter with a default of 5
def check_accuracy(point_accuracy, threshold=5):
    # The comparison itself already evaluates to True or False
    return point_accuracy <= threshold

Now, the user has a choice:

# Option 1: Provide one argument. Python uses the default threshold (5).
check_accuracy(3)

# Option 2: Override the default by providing a second argument.
check_accuracy(3, threshold=2)
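The sketch below repeats check_accuracy in compact form so that it runs on its own, then prints the result of each calling style. The last part shows what happens when the required argument is left out; the exact wording of the error message depends on the Python version, so it is printed rather than quoted here.

```python
def check_accuracy(point_accuracy, threshold=5):
    return point_accuracy <= threshold


print(check_accuracy(3))               # True  (3 <= 5, default threshold)
print(check_accuracy(3, threshold=2))  # False (3 <= 2 is false)
print(check_accuracy(3, 2))            # False (same call, positional form)

# Leaving out the required argument raises a TypeError
try:
    check_accuracy()
except TypeError as error:
    print(f"TypeError: {error}")
```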

The Ordering Rule

There is a strict grammatical rule in Python regarding defaults: Required parameters must always come before optional parameters in the function definition.

# GOOD: Required comes first
def create_buffer(geometry, buffer_size=50):
    pass  # body omitted for brevity


# ERROR: Required comes after an optional default (raises a SyntaxError)
# def create_buffer(buffer_size=50, geometry):
#     pass
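Python enforces this rule before any code runs: a definition in the wrong order is rejected as a SyntaxError. As a small sketch, the built-in compile() function can check the invalid definition from a string without stopping the rest of the script (the variable names bad_definition and rejected are only for illustration):

```python
# The invalid definition, stored as text so this file itself stays valid
bad_definition = "def create_buffer(buffer_size=50, geometry):\n    pass\n"

try:
    compile(bad_definition, "<example>", "exec")
    rejected = False
except SyntaxError as error:
    rejected = True
    print(f"SyntaxError: {error.msg}")

print(rejected)  # True
```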


3. The Mutable Default Trap

This is one of Python's best-known pitfalls. Understanding it will save you from a whole class of bugs that are hard to trace.

We just learned that we can set default values (like threshold=5). But what happens if we set a default value to a mutable object, like an empty list ([])?

Let’s write a function that takes a new GPS waypoint and adds it to a route list. If no route list is provided, it should default to an empty list.

# THE DANGEROUS WAY
def add_waypoint(waypoint, route=[]):
    route.append(waypoint)
    return route

Watch what happens when we use it to track two different animals:

track_bear = add_waypoint("Point A")
print(f"Bear Track: {track_bear}")

track_wolf = add_waypoint("Point X")
print(f"Wolf Track: {track_wolf}")

Output:

Bear Track: ['Point A']
Wolf Track: ['Point A', 'Point X']

What happened?! The wolf is somehow inheriting the bear’s GPS points!

Python evaluates default arguments only once, when the def statement runs, not each time the function is called. It creates that single empty list [] and stores it with the function. Every time you call the function without providing a route, it grabs that exact same list. The state carries over from one call to the next (a “side effect”), which undermines the reproducibility of your results.
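You can see the shared list directly. Python stores default values on the function object in its __defaults__ attribute, and the is operator checks whether two names point to the same object. The sketch below repeats the dangerous version of add_waypoint so that it runs on its own:

```python
def add_waypoint(waypoint, route=[]):
    route.append(waypoint)
    return route


# The default list already exists before the function is ever called
print(add_waypoint.__defaults__)  # ([],)

track_bear = add_waypoint("Point A")
track_wolf = add_waypoint("Point X")

# Both names refer to the one list stored on the function
print(track_bear is track_wolf)   # True
print(add_waypoint.__defaults__)  # (['Point A', 'Point X'],)
```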

The Fix: Use None

To fix this, we must use None as our default value, and create the fresh list inside the function block.

# THE SAFE WAY
def add_waypoint(waypoint, route=None):
    if route is None:
        route = []  # Creates a brand new, empty list every time it runs!

    route.append(waypoint)
    return route
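A quick check, repeating the safe version so the block runs on its own, suggests that the two tracks are now independent. Note one remaining behavior: when you do pass in an existing list, the function still appends to that same list rather than to a copy.

```python
def add_waypoint(waypoint, route=None):
    if route is None:
        route = []  # a fresh list for every call without a route
    route.append(waypoint)
    return route


track_bear = add_waypoint("Point A")
track_wolf = add_waypoint("Point X")
print(track_bear)  # ['Point A']
print(track_wolf)  # ['Point X']

# Passing an existing list appends to that same list
track_bear = add_waypoint("Point B", track_bear)
print(track_bear)  # ['Point A', 'Point B']
print(track_wolf)  # ['Point X']
```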

4. Exercises

Test your understanding of function design and scope!

Exercise 1: The Scope Bug (Core)

A junior analyst wrote the following code to convert a list of elevations from feet to meters. When they try to run print(converted), Python throws a NameError.

Your Task: Without running the code, explain why the error occurs, and fix the code so it properly prints the result.

def feet_to_meters(elevation_ft):
    converted = elevation_ft * 0.3048
    return converted


feet_to_meters(5000)
print(converted)

Exercise 2: The Bounding Box (Stretch)

In spatial analysis, a common operation is to draw a square “bounding box” around a coordinate.

Diagram illustrating how to calculate a simple bounding box from a center point and a buffer.

Calculating a simplified bounding box. By adding and subtracting a buffer value from a center point’s latitude and longitude, you can define the minimum and maximum coordinates of a square area.

Your Task: Write a function create_bbox(lat, lon, buffer) that calculates a simplified bounding box around the point coordinates.

  1. Make lat and lon required parameters.

  2. Make buffer an optional parameter with a default value of 0.5 degrees.

  3. The function should return a dictionary containing min_lat, max_lat, min_lon, and max_lon. (Hint: min_lat is lat - buffer, max_lat is lat + buffer, etc.)


Exercise 3: The Contaminated River (Challenge)

You are writing a code pipeline to track pollution sampling sites along different rivers. Look at the code below.

Your Tasks:

  1. Run the code in your head. What will the output look like?

  2. Why is this happening?

  3. Rewrite the function using best practices so that the Isar and the Inn get their own independent lists of samples.

def log_sample(ph_level, river_samples=[]):
    river_samples.append(ph_level)
    return river_samples


# We take two samples from the Isar
isar_data = log_sample(7.2)
isar_data = log_sample(7.4, isar_data)

# We take one sample from the Inn
inn_data = log_sample(6.8)

print(f"Isar: {isar_data}")
print(f"Inn: {inn_data}")

5. Summary

In this section, we moved from writing functional code to writing safe code:

  • Scope: Variables created inside a function are local and cannot be accessed once the function finishes; only the returned value survives. Do not rely on global variables; pass the values a function needs as parameters.

  • Defaults: Optional parameters make your tools flexible, but they must always be listed after required parameters.

  • The Mutable Trap: Never use [] or {} (or any other mutable object) as a default value. Use None, and create the object inside the function block to prevent state leakage and side effects.

What comes next?

You now know how to build well-constrained, safe tools. But Python functions have one more superpower. Next, we will learn how to use *args and **kwargs to handle an arbitrary, unpredictable number of spatial data inputs!