How to Write Efficient Python Code Even If You’re a Beginner

You don’t need to be a Python pro to write fast, clean code. Just a few smart coding habits can go a long way.




 

When you're starting out with Python, getting your code to work correctly is your first priority. But as you grow as a developer, you'll want your code to be not just correct, but also efficient.

Efficient code runs faster, uses less memory, and scales better when handling larger datasets. The good news is that you don't need years of experience to start writing more efficient Python: a few simple techniques will take you a long way, even as a beginner.

In this article, I'll walk you through practical techniques to make your Python code more efficient. For each technique, you'll see a clear comparison between the less-than-efficient approach and the more efficient alternative.

🔗 You can find the code on GitHub

 

Use Built-In Functions Instead of Manual Implementations

 
Python comes with many built-in functions that cover common operations on collections. These functions are already optimized, so they typically outperform equivalent hand-written loops.

Instead of this:

def process_sales_data(sales):
    highest_sale = sales[0]
    for sale in sales:
        if sale > highest_sale:
            highest_sale = sale
    
    total_sales = 0
    for sale in sales:
        total_sales += sale
    
    return highest_sale, total_sales, total_sales / len(sales)

 
This approach iterates through the list twice: once to find the highest value and once to compute the total. That's unnecessary work.

Do this:

def process_sales_data(sales):
    total_sales = sum(sales)
    return max(sales), total_sales, total_sales / len(sales)

 
This approach uses Python's built-in max() and sum() functions, which are optimized for these exact operations. This version is not only faster (especially for larger datasets) but also more readable and less prone to errors.

So whenever you find yourself writing loops to perform common operations on data collections, check if there's a built-in function that could do the job more efficiently.
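A few other built-ins worth reaching for, shown here on a hypothetical list of order totals:

```python
# Built-ins that replace common hand-written loops
orders = [120, 45, 300, 75]

print(min(orders))                    # 45: smallest value
print(sorted(orders))                 # [45, 75, 120, 300]: a new sorted list
print(any(o > 250 for o in orders))   # True: at least one order exceeds 250
print(all(o > 0 for o in orders))     # True: every order is positive
```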

 

Use List Comprehensions, But Keep Them Readable

 
List comprehensions are the go-to way to create lists from existing lists and other iterables. They are more concise than equivalent for loops and are often faster, too.

Instead of this:

def get_premium_customer_emails(customers):
    premium_emails = []
    for customer in customers:
        if customer['membership_level'] == 'premium' and customer['active']:
            email = customer['email'].lower().strip()
            premium_emails.append(email)
    return premium_emails

 
This creates an empty list, then repeatedly calls .append() inside a loop. Each append operation comes with some overhead.

Do this:

def get_premium_customer_emails(customers):
    return [
        customer['email'].lower().strip()
        for customer in customers
        if customer['membership_level'] == 'premium' and customer['active']
    ]

 
The list comprehension expresses the entire operation in one statement. The result is code that runs faster while also being more readable once you're familiar with the pattern.

🔖 List comprehensions work best when the transformation is straightforward. If your logic gets complex, consider breaking it into simpler steps or using a traditional loop for clarity.

For further reading, see Why You Should Not Overuse List Comprehensions in Python.
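One way to keep a comprehension readable is to move the condition and the transformation into small named helpers. This sketch reuses the customer example above; the helper names are my own:

```python
def is_active_premium(customer):
    # The filtering condition, given a name
    return customer['membership_level'] == 'premium' and customer['active']

def clean_email(customer):
    # The transformation, given a name
    return customer['email'].lower().strip()

def get_premium_customer_emails(customers):
    # The comprehension stays short because the logic lives in the helpers
    return [clean_email(c) for c in customers if is_active_premium(c)]

customers = [
    {'email': ' Alice@Example.com ', 'membership_level': 'premium', 'active': True},
    {'email': 'bob@example.com', 'membership_level': 'basic', 'active': True},
]
print(get_premium_customer_emails(customers))  # ['alice@example.com']
```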

 

Use Sets and Dictionaries for Fast Lookups

 

When you need to check if an item exists in a collection or perform frequent lookups, sets and dictionaries are far more efficient than lists. They provide nearly constant-time operations regardless of size, while list lookups get slower as the list grows.

Instead of this:

def has_permission(user_id, permitted_users):
    # permitted_users is a list of user IDs
    for p_user in permitted_users:
        if p_user == user_id:
            return True
    return False

permitted_users = [1001, 1023, 1052, 1076, 1088, 1095, 1102, 1109]
print(has_permission(1088, permitted_users))

 
This checks each element in the list until it finds a match, which is linear time O(n).

Do this:

def has_permission(user_id, permitted_users):
    # permitted_users is now a set of user IDs
    return user_id in permitted_users

permitted_users = {1001, 1023, 1052, 1076, 1088, 1095, 1102, 1109}
print(has_permission(1088, permitted_users))

 
The second approach uses a set (note the curly braces instead of square brackets). Sets in Python use hash tables internally, which allow for very fast lookups.

When you check if an item is in a set, you get the answer almost instantly, regardless of the set's size. On average, this is constant-time complexity (O(1)).

For small collections, the difference might be negligible. But as your data grows, the set approach pulls far ahead.
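Dictionaries give you the same hash-based lookups and also map keys to values. A small sketch with a hypothetical user-to-role mapping:

```python
# Hypothetical mapping of user IDs to roles
user_roles = {1001: 'admin', 1023: 'editor', 1088: 'viewer'}

def get_role(user_id):
    # dict.get returns a default instead of raising KeyError for unknown IDs
    return user_roles.get(user_id, 'guest')

print(get_role(1023))  # 'editor'
print(get_role(9999))  # 'guest'
```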

 

Use Generators to Process Large Data Efficiently

 

When working with large datasets, trying to load everything into memory at once can cause your program to slow down or crash. Generators provide a memory-efficient solution by producing values one at a time, on demand.

Instead of this:

def find_errors(log_file):
    with open(log_file, 'r') as file:
        lines = file.readlines()
    
    error_messages = []
    for line in lines:
        if '[ERROR]' in line:
            timestamp = line.split('[ERROR]')[0].strip()
            message = line.split('[ERROR]')[1].strip()
            error_messages.append((timestamp, message))
    
    return error_messages

 
This reads the entire file into memory with readlines() before processing any data. If the log file is very large (several gigabytes, for example), this could use a lot of memory and potentially cause your program to crash.

Do this:

def find_errors(log_file):
    with open(log_file, 'r') as file:
        for line in file:
            if '[ERROR]' in line:
                timestamp = line.split('[ERROR]')[0].strip()
                message = line.split('[ERROR]')[1].strip()
                yield (timestamp, message)

# Usage:
for timestamp, message in find_errors('application.log'):
    print(f"Error at {timestamp}: {message}")

 
Here we use a generator function: note the yield keyword in place of return. It reads and processes just one line at a time, handing back each result as it's found. This means:

  1. Memory usage stays low regardless of file size
  2. You start getting results immediately without waiting for the entire file to be processed
  3. If you only need to process part of the data, you can stop early and save time

Generators are great for processing large files, web streams, database queries, or any data source that might be too large to fit comfortably in memory all at once.
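For simple cases, a generator expression gives you the same laziness in a single line:

```python
# Like a list comprehension, but with parentheses: values are produced lazily
squares_of_evens = (n * n for n in range(10) if n % 2 == 0)

# Nothing is computed until something iterates; sum() pulls one value at a time
print(sum(squares_of_evens))  # 0 + 4 + 16 + 36 + 64 = 120
```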

 

Don't Repeat Expensive Operations in Loops

 
A simple but powerful optimization is to avoid performing the same expensive calculation repeatedly in a loop. If an operation doesn't depend on the loop variable, do it only once outside the loop.

Instead of this:

import re
from datetime import datetime

def find_recent_errors(logs):
    recent_errors = []
    
    for log in logs:
        # This regex compilation happens on every iteration
        timestamp_pattern = re.compile(r'\[(.*?)\]')
        timestamp_match = timestamp_pattern.search(log)
        
        if timestamp_match and '[ERROR]' in log:
            # The datetime parsing happens on every iteration
            log_time = datetime.strptime(timestamp_match.group(1), '%Y-%m-%d %H:%M:%S')
            current_time = datetime.now()
            
            # Check if the log is from the last 24 hours
            time_diff = (current_time - log_time).total_seconds() / 3600
            if time_diff <= 24:
                recent_errors.append(log)
    
    return recent_errors

 
The first approach has two operations inside the loop that don't need to be repeated:

  1. Compiling a regular expression with re.compile() on every iteration
  2. Getting the current time with datetime.now() on every iteration

Since these values don't change during the loop execution, calculating them repeatedly is wasteful.

Do this:

import re
from datetime import datetime

def find_recent_errors(logs):
    recent_errors = []
    
    # Compile the regex once
    timestamp_pattern = re.compile(r'\[(.*?)\]')
    # Get the current time once
    current_time = datetime.now()
    
    for log in logs:
        timestamp_match = timestamp_pattern.search(log)
        
        if timestamp_match and '[ERROR]' in log:
            log_time = datetime.strptime(timestamp_match.group(1), '%Y-%m-%d %H:%M:%S')
            
            # Check if the log is recent (last 24 hours)
            time_diff = (current_time - log_time).total_seconds() / 3600
            if time_diff <= 24:
                recent_errors.append(log)
    
    return recent_errors

 
In this second approach, we move the expensive operations outside the loop so they're performed just once.

This simple change can significantly improve performance, especially for loops that run many times. The savings grow proportionally with the number of iterations: with thousands of log entries, you avoid thousands of unnecessary operations.
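A related tool, not shown above: when the expensive operation does depend on loop data but the same inputs recur, functools.lru_cache can memoize the results. This sketch uses the same timestamp format as the earlier examples:

```python
from functools import lru_cache
from datetime import datetime

@lru_cache(maxsize=None)
def parse_timestamp(raw):
    # Each distinct timestamp string is parsed once, then served from the cache
    return datetime.strptime(raw, '%Y-%m-%d %H:%M:%S')

stamps = ['2024-01-01 10:00:00', '2024-01-01 10:00:00', '2024-01-01 11:00:00']
parsed = [parse_timestamp(s) for s in stamps]
print(parse_timestamp.cache_info())  # 1 hit, 2 misses: one parse was skipped
```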

 

Don't Use += on Strings in Loops

 
When building strings incrementally, using += in a loop is inefficient. Each operation creates a new string object, which becomes increasingly expensive as the string grows larger. Instead, collect string parts in a list and join them at the end.

Instead of this:

def generate_html_report(data_points):
    html = "<html><body><h1>Data Report</h1><ul>"
    
    for point in data_points:
        # This creates a new string object on each iteration
        html += f"<li>{point['name']}: {point['value']} ({point['timestamp']})</li>"
    
    html += "</ul></body></html>"
    return html

 

The problem with the first approach is that strings in Python are immutable: they can't be changed after creation. When you use += on a string, Python:

  1. Creates a new string large enough to hold both strings
  2. Copies all the characters from the original string
  3. Adds the new content
  4. Discards the old string

As your string grows larger, this process becomes expensive.

Do this:

def generate_html_report(data_points):
    parts = ["<html><body><h1>Data Report</h1><ul>"]
    
    for point in data_points:
        parts.append(f"<li>{point['name']}: {point['value']} ({point['timestamp']})</li>")
    
    parts.append("</ul></body></html>")
    return "".join(parts)

 

The second approach builds a list of string fragments with the .append() method, then joins them all at once at the end. This avoids creating and destroying multiple intermediate string objects.

This pattern becomes particularly important when building long strings iteratively, such as when generating reports, concatenating file contents, or building large XML or HTML documents.
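Note that str.join also accepts a generator expression, so for simple cases you can skip the explicit list entirely. The benefit here is brevity rather than extra speed, since join still materializes the values internally:

```python
def generate_html_report(data_points):
    # The fragments are produced by a generator expression and joined once
    items = "".join(
        f"<li>{p['name']}: {p['value']} ({p['timestamp']})</li>"
        for p in data_points
    )
    return f"<html><body><h1>Data Report</h1><ul>{items}</ul></body></html>"

report = generate_html_report(
    [{'name': 'CPU', 'value': 42, 'timestamp': '10:00'}]
)
print(report)
```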

 

Wrapping Up

 
Writing efficient Python code doesn't require advanced knowledge. It's often about knowing which approach to use in common situations. The techniques covered in this guide focus on practical patterns that can make a real difference in your code's performance:

  • Using built-in functions instead of manual implementations
  • Choosing list comprehensions for clear and efficient transformations
  • Selecting the right data structure (sets and dictionaries) for lookups
  • Using generators to process large data efficiently
  • Moving invariant operations out of loops
  • Building strings efficiently by joining lists

Remember that code readability should still be a priority. Fortunately, many of these efficient approaches also lead to cleaner, more expressive code, giving you programs that are both easy to understand and performant.

I hope these tips help you on your journey to becoming a better Python programmer. Keep coding!

 
 

Bala Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she's working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more. Bala also creates engaging resource overviews and coding tutorials.

