Getting Started with Python Data Structures in 5 Steps

This tutorial covers Python's foundational data structures - lists, tuples, dictionaries, and sets. Learn their characteristics, use cases, and practical examples, all in 5 steps.




 

Introduction to Python Data Structures

 

When learning to program, regardless of the language you choose, you will find that most of what you are exposed to falls into a few major topics. A few of these, in general order of grokking, are: syntax (the vocabulary of the language); commands (putting the vocabulary together in useful ways); flow control (how we guide the order of command execution); algorithms (the steps we take to solve specific problems... how did this become such a confounding word?); and, finally, data structures (the virtual storage depots that we use for data manipulation during the execution of algorithms, which are, again, a series of steps).

Essentially, if you want to implement the solution to a problem, by cobbling together a series of commands into the steps of an algorithm, at some point data will need to be processed, and data structures will become essential. Such data structures provide a way to organize and store data efficiently, and are critical for creating fast, modular code that can perform useful functions and scale well. Python, a particular programming language, has a series of built-in data structures of its own.

This tutorial will focus on these four foundational Python data structures:

  • Lists - Ordered, mutable, allows duplicate elements. Useful for storing sequences of data.
  • Tuples - Ordered, immutable, allows duplicate elements. Think of them as immutable lists.
  • Dictionaries - Insertion-ordered (since Python 3.7), mutable, mapped by key-value pairs with unique keys. Useful for storing data in a key-value format.
  • Sets - Unordered, mutable, contains unique elements. Useful for membership testing and eliminating duplicates.

Beyond the fundamental data structures, Python's standard library also provides more advanced ones, such as heaps (the heapq module) and queues (queue and collections.deque), while others, such as linked lists, are straightforward to implement yourself. These advanced structures enable more complex data handling and are often used in specialized scenarios. You aren't constrained here, either; you can use any of the existing structures as a base to implement your own. However, an understanding of lists, tuples, dictionaries, and sets remains paramount, as these are the building blocks for more advanced data structures.

This guide aims to provide a clear and concise understanding of these core structures. As you start your Python journey, the following sections will guide you through the essential concepts and practical applications. From creating and manipulating lists to leveraging the unique capabilities of sets, this tutorial will equip you with the skills needed to excel in your coding.

 

Step 1: Using Lists in Python

 

What is a List in Python?

 

A list in Python is an ordered, mutable data type that can store various objects, allowing for duplicate elements. Lists are defined by the use of square brackets [ ], with elements being separated by commas.

For example:

fibs = [0, 1, 1, 2, 3, 5, 8, 13, 21]

 

Lists are incredibly useful for organizing and storing data sequences.

 

Creating a List

 

Lists can contain different data types, like strings, integers, booleans, etc. For example:

mixed_list = [42, "Hello World!", False, 3.14159]

 

Manipulating a List

 

Elements in a list can be accessed, added, changed, and removed. For example:

# Access 2nd element (indexing begins at '0')
print(mixed_list[1])

# Append element 
mixed_list.append("This is new")

# Change element
mixed_list[0] = 5

# Remove last element (pop() with no index removes and returns the last item)
mixed_list.pop()

 

Useful List Methods

 

Some handy built-in methods for lists include:

  • sort() - Sorts list in-place
  • append() - Adds element to end of list
  • insert() - Inserts element at index
  • pop() - Removes and returns element at index (the last element if no index is given)
  • remove() - Removes first occurrence of value
  • reverse() - Reverses list in-place
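
As a quick illustration of how several of these methods behave together, here is a minimal sketch. The list name and its values are made up for the example.

```python
# Illustrative values only; 'primes' is a hypothetical list
primes = [7, 3, 5]

primes.insert(0, 2)   # insert 2 at index 0      -> [2, 7, 3, 5]
primes.append(7)      # add a duplicate 7 at end -> [2, 7, 3, 5, 7]
primes.remove(7)      # removes only the first 7 -> [2, 3, 5, 7]
primes.reverse()      # reverse in place         -> [7, 5, 3, 2]
primes.sort()         # sort in place            -> [2, 3, 5, 7]
last = primes.pop()   # no index: removes and returns the last item

print(primes, last)   # [2, 3, 5] 7
```

Note that sort() and reverse() change the list in place and return None, so their result should not be assigned back to the variable.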

 

Hands-on Example with Lists

 

# Create shopping cart as a list
cart = ["apples", "oranges", "grapes"]

# Sort the list 
cart.sort()

# Add new item 
cart.append("blueberries") 

# Remove first item
cart.pop(0)

print(cart)

 

Output:

['grapes', 'oranges', 'blueberries']

 

Step 2: Understanding Tuples in Python

 

What Are Tuples?

 

Tuples are another type of sequence data type in Python, similar to lists. However, unlike lists, tuples are immutable, meaning their elements cannot be altered once created. They are defined by enclosing comma-separated elements in parentheses ( ); a single-element tuple needs a trailing comma, as in (1,).

# Defining a tuple
my_tuple = (1, 2, 3, 4)

 

When to Use Tuples

 

Tuples are generally used for collections of items that should not be modified. They are also slightly more lightweight than lists (a little less memory, and often marginally faster to create), which makes them a good fit for read-only data. Some common use cases include:

  • Storing constants or configuration data
  • Function return values with multiple components
  • Dictionary keys, since tuples are hashable as long as all of their elements are
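
Two of these use cases can be sketched briefly. The helper function min_max and the grid labels below are hypothetical, invented purely for illustration.

```python
# Hypothetical helper: returns two values packed into a single tuple
def min_max(values):
    return min(values), max(values)

low, high = min_max([4, 9, 1, 7])
print(low, high)  # 1 9

# Tuples of hashable items can serve as dictionary keys;
# the grid cells and labels are made up for this example
label_by_cell = {(0, 0): "origin", (1, 2): "treasure"}
print(label_by_cell[(1, 2)])  # treasure
```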

 

Accessing Tuple Elements

 

Accessing elements in a tuple is done in a similar manner as accessing list elements. Indexing and slicing work the same way.

# Accessing elements
first_element = my_tuple[0]
sliced_tuple = my_tuple[1:3]

 

Operations on Tuples

 

Because tuples are immutable, many list operations like append() or remove() are not applicable. However, you can still perform some operations:

  • Concatenation: Combine tuples using the + operator.
concatenated_tuple = my_tuple + (5, 6)
  • Repetition: Repeat a tuple using the * operator.
repeated_tuple = my_tuple * 2
  • Membership: Check if an element exists in a tuple with the in keyword.
exists = 1 in my_tuple

 

Tuple Methods

 

Tuples have fewer built-in methods compared to lists, given their immutable nature. Some useful methods include:

  • count(): Count the occurrences of a particular element.
count_of_ones = my_tuple.count(1)
  • index(): Find the index of the first occurrence of a value.
index_of_first_one = my_tuple.index(1)

 

Tuple Packing and Unpacking

 

Tuple packing and unpacking are convenient features in Python:

  • Packing: Assigning multiple values to a single tuple.
packed_tuple = 1, 2, 3
  • Unpacking: Assigning tuple elements to multiple variables.
a, b, c = packed_tuple
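
Packing and unpacking also enable a couple of common idioms, sketched here with arbitrary example values: swapping two variables without a temporary, and collecting "the rest" of a sequence with a starred name.

```python
# Swap two variables: the right-hand side is packed into a tuple,
# then unpacked into the names on the left
x, y = 1, 2
x, y = y, x
print(x, y)  # 2 1

# Starred unpacking: 'rest' collects the remaining items as a list
first, *rest = (10, 20, 30, 40)
print(first, rest)  # 10 [20, 30, 40]
```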

 

Immutable but Not Strictly

 

While tuples themselves are immutable, they can contain mutable elements like lists.

# Tuple with mutable list
complex_tuple = (1, 2, [3, 4])

 

Note that while you can't change the tuple itself, you can modify the mutable elements within it.
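
A minimal sketch of that distinction: the inner list can be changed in place, but trying to rebind one of the tuple's own slots raises a TypeError.

```python
complex_tuple = (1, 2, [3, 4])

# The list stored inside the tuple is still mutable
complex_tuple[2].append(5)
print(complex_tuple)  # (1, 2, [3, 4, 5])

# Reassigning a slot of the tuple itself is not allowed
error_message = None
try:
    complex_tuple[0] = 99
except TypeError as err:
    error_message = str(err)
    print("Cannot reassign:", error_message)
```

One consequence is that a tuple containing a list is not hashable, so it cannot be used as a dictionary key or a set element.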

 

Step 3: Mastering Dictionaries in Python

 

What is a Dictionary in Python?

 

A dictionary in Python is a mutable data type that stores mappings of unique keys to values. Since Python 3.7, dictionaries preserve the order in which keys were inserted. Dictionaries are written with curly braces { } and consist of key-value pairs separated by commas.

For example:

student = {"name": "Michael", "age": 22, "city": "Chicago"}

 

Dictionaries are useful for storing data in a structured manner and accessing values by keys.
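
Two properties of dictionaries are worth seeing in action: since Python 3.7 keys keep their insertion order, and keys are unique. The variable names and values in this sketch are illustrative only.

```python
# Hypothetical record, used only to show key order
profile = {"name": "Michael", "age": 22, "city": "Chicago"}
profile["major"] = "physics"

# Iterating a dictionary yields its keys in insertion order
print(list(profile))  # ['name', 'age', 'city', 'major']

# Keys are unique: repeating a key keeps only the last value
duplicate = {"age": 22, "age": 23}
print(duplicate)  # {'age': 23}
```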

 

Creating a Dictionary

 

Dictionary keys must be hashable, which in practice means immutable objects like strings, numbers, or tuples of hashable elements. Dictionary values can be any object.

student = {"name": "Susan", "age": 23, "city": "Chicago"}

prices = {"milk": 4.99, "bread": 2.89}

 

Manipulating a Dictionary

 

Elements can be accessed, added, changed, and removed via keys.

# Access value by key
print(student["name"])

# Add new key-value 
student["major"] = "computer science"  

# Change value
student["age"] = 25

# Remove key-value
del student["city"]

 

Useful Dictionary Methods

 

Some useful built-in methods include:

  • keys() - Returns a view of the keys
  • values() - Returns a view of the values
  • items() - Returns a view of (key, value) tuples
  • get() - Returns value for key, avoids KeyError
  • pop() - Removes key and returns value
  • update() - Adds multiple key-values

 

Hands-on Example with Dictionaries

 

scores = {"Francis": 95, "John": 88, "Daniel": 82}

# Add new score
scores["Zoey"] = 97

# Remove John's score
scores.pop("John")  

# Get Daniel's score
print(scores.get("Daniel"))

# Print all student names 
print(scores.keys())

 

Step 4: Exploring Sets in Python

 

What is a Set in Python?

 

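A set in Python is an unordered, mutable collection of unique, hashable objects (in practice, immutable ones such as numbers, strings, and tuples). Sets are written with curly braces { } but, unlike dictionaries, do not have key-value pairs. Note that empty braces {} create a dictionary; use set() for an empty set.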

For example:

numbers = {1, 2, 3, 4}

 

Sets are useful for membership testing, eliminating duplicates, and mathematical operations.

 

Creating a Set

 

A set can be created from a list by passing the list to the set() constructor:

my_list = [1, 2, 3, 3, 4]
my_set = set(my_list) # {1, 2, 3, 4}

 

Sets can contain mixed data types like strings, booleans, etc.

 

Manipulating a Set

 

Elements can be added and removed from sets.

numbers.add(5) 

numbers.remove(1)
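
One detail worth knowing when removing elements: remove() raises a KeyError if the element is absent, whereas discard() silently does nothing. A small sketch, using a fresh set so it runs on its own:

```python
numbers = {1, 2, 3, 4}

numbers.add(5)
numbers.add(5)        # adding an existing element has no effect

numbers.discard(42)   # absent element: no error

missing = False
try:
    numbers.remove(42)  # absent element: raises KeyError
except KeyError:
    missing = True

print(3 in numbers)   # True (membership test)
print(len(numbers))   # 5
```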

 

Useful Set Operations

 

Some useful set operations include:

  • union() - Returns union of two sets
  • intersection() - Returns intersection of sets
  • difference() - Returns difference between sets
  • symmetric_difference() - Returns symmetric difference

 

Hands-on Example with Sets

 

A = {1, 2, 3, 4}
B = {2, 3, 5, 6}

# Union - combines sets 
print(A | B) 

# Intersection 
print(A & B)

# Difference  
print(A - B)

# Symmetric difference
print(A ^ B)

 

Step 5: Comparing Lists, Tuples, Dictionaries, and Sets

 

Comparison of Characteristics

 

The following is a concise comparison of the four Python data structures we referred to in this tutorial.

Structure  | Ordered                             | Mutable | Duplicate Elements    | Use Cases
List       | Yes                                 | Yes     | Yes                   | Storing sequences
Tuple      | Yes                                 | No      | Yes                   | Storing immutable sequences
Dictionary | Yes (insertion order, Python 3.7+)  | Yes     | Keys: No; Values: Yes | Storing key-value pairs
Set        | No                                  | Yes     | No                    | Eliminating duplicates, membership testing

 

When to Use Each Data Structure

 

Treat this as a soft guideline for which structure to turn to first in a particular situation.

  • Use lists for ordered, sequence-based data. Useful for stacks; for queues, collections.deque is usually the better fit.
  • Use tuples for ordered, immutable sequences. Useful when you need a fixed collection of elements that should not be changed.
  • Use dictionaries for key-value data. Useful for storing related properties.
  • Use sets for storing unique elements and mathematical operations.
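
To make the stack and queue guidance concrete, here is a minimal sketch. It assumes collections.deque from the standard library for the queue, which this tutorial does not otherwise cover.

```python
from collections import deque

# A list works well as a stack (last in, first out)
stack = []
stack.append("first")
stack.append("second")
top = stack.pop()        # removes and returns "second"

# For a queue (first in, first out), deque avoids the cost of
# list.pop(0), which has to shift every remaining element
queue = deque()
queue.append("first")
queue.append("second")
front = queue.popleft()  # removes and returns "first"

print(top, front)  # second first
```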

 

Hands-on Example Using All Four Data Structures

 

Let's have a look at how these structures can all work together in an example that is a little more complex than a one-liner.

import random

# Make a list of person names
names = ["John", "Mary", "Bob", "Mary", "Sarah"]

# Make a tuple of additional information (e.g., email)
additional_info = ("john@example.com", "mary@example.com", "bob@example.com", "mary@example.com", "sarah@example.com")

# Make set to remove duplicates
unique_names = set(names)

# Make dictionary of name-age pairs (ages are random placeholders)
persons = {}
for name in unique_names:
    persons[name] = random.randint(20, 40)

print(persons)

 

Sample output (the ages are random and set ordering can vary between runs, so your result will differ):

{'John': 34, 'Bob': 29, 'Sarah': 25, 'Mary': 21}

 

This example utilizes a list for an ordered sequence, a tuple for storing additional immutable information, a set to remove duplicates, and a dictionary to store key-value pairs.
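
The example defines additional_info but does not go on to use it. One possible way to fold it in, offered only as a sketch and assuming each email lines up with the name at the same position, is to pair the two sequences with zip():

```python
names = ["John", "Mary", "Bob", "Mary", "Sarah"]
additional_info = ("john@example.com", "mary@example.com", "bob@example.com",
                   "mary@example.com", "sarah@example.com")

# zip() pairs items by position; dict() keeps one entry per unique name
# (for a repeated name, the last pairing wins)
emails = dict(zip(names, additional_info))

print(emails["Mary"])  # mary@example.com
print(len(emails))     # 4
```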

 

Moving Forward

 

In this comprehensive tutorial, we've taken a deep look at the foundational data structures in Python, including lists, tuples, dictionaries, and sets. These structures form the building blocks of Python programming, providing a framework for data storage, processing, and manipulation. Understanding these structures is essential for writing efficient and scalable code. From manipulating sequences with lists, to organizing data with key-value pairs in dictionaries, and ensuring uniqueness with sets, these essential tools offer immense flexibility in data handling.

As we've seen through code examples, these data structures can be combined in various ways to solve complex problems. By leveraging these data structures, you can open the doors to a wide range of possibilities in data analysis, machine learning, and beyond. Don't hesitate to explore the official Python data structures documentation for more insights.

Happy coding!

 
 
Matthew Mayo (@mattmayo13) holds a Master's degree in computer science and a graduate diploma in data mining. As Editor-in-Chief of KDnuggets, Matthew aims to make complex data science concepts accessible. His professional interests include natural language processing, machine learning algorithms, and exploring emerging AI. He is driven by a mission to democratize knowledge in the data science community. Matthew has been coding since he was 6 years old.