How to Implement Complex Filters on DataFrame Columns with Pandas

Learn how to get the data you need with Pandas filter syntax.




 

Let’s learn how to perform complex filtering in Pandas.

 

Preparation

 
Before we start, we need the Pandas package installed. You can install it with the following command:

pip install pandas

 

With the package installed, let’s jump into the article.

 

Pandas DataFrame Complex Filtering

 

A DataFrame is a Pandas object that stores tabular data and can be manipulated as needed. It is especially powerful because we can filter its rows using conditions, logical operators, and Pandas functions.

Let’s try to create a simple DataFrame object.

import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Leah', 'Jessica', 'Kenny', 'Brad'],
    'Age': [50, 27, 22, 30, 40],
    'Salary': [100000, 154000, 120000, 78000, 88000],
    'Occupation': ['Doctor', 'Soldier', 'Doctor', 'Accountant', 'Florist']
})

 

With this sample data, we will work through several ways to filter it. First, we can filter the data based on a single condition.

df[df['Age'] > 30]

 

Output:

    Name  Age  Salary Occupation
0  Alice   50  100000     Doctor
4   Brad   40   88000    Florist
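
The expression inside the brackets is itself a boolean Series, often called a mask, with one True or False value per row. As a minimal sketch reusing the sample data (the variable names `mask` and `older` are only illustrative), the mask can be built first and then passed to `.loc`:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Leah', 'Jessica', 'Kenny', 'Brad'],
    'Age': [50, 27, 22, 30, 40],
    'Salary': [100000, 154000, 120000, 78000, 88000],
    'Occupation': ['Doctor', 'Soldier', 'Doctor', 'Accountant', 'Florist']
})

# The comparison returns one True/False value per row
mask = df['Age'] > 30
print(mask.tolist())  # [True, False, False, False, True]

# Passing the mask to .loc keeps only the rows marked True
older = df.loc[mask]
print(older['Name'].tolist())  # ['Alice', 'Brad']
```

Storing the mask in a variable can make it easier to inspect or reuse a condition before applying it.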

 

We can also combine conditions with the And (&) operator, wrapping each condition in parentheses.

df[(df['Age'] > 25) & (df['Salary'] < 100000)]

 

Output:

    Name  Age  Salary  Occupation
3  Kenny   30   78000  Accountant
4   Brad   40   88000     Florist
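
The parentheses around each condition matter, because `&` binds more tightly than comparison operators such as `>` and `<`. The following is a small sketch of what happens without them; the `error_name` and `filtered` variables exist only for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Leah', 'Jessica', 'Kenny', 'Brad'],
    'Age': [50, 27, 22, 30, 40],
    'Salary': [100000, 154000, 120000, 78000, 88000],
    'Occupation': ['Doctor', 'Soldier', 'Doctor', 'Accountant', 'Florist']
})

# Without parentheses, Python evaluates `25 & df['Salary']` first, and the
# resulting chained comparison cannot be reduced to a single True/False.
try:
    df[df['Age'] > 25 & df['Salary'] < 100000]
    error_name = None
except ValueError as err:
    error_name = type(err).__name__
print(error_name)  # ValueError

# With parentheses, each comparison is evaluated before & combines them
filtered = df[(df['Age'] > 25) & (df['Salary'] < 100000)]
print(filtered['Name'].tolist())  # ['Kenny', 'Brad']
```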

 

We can combine conditions with the Or (|) operator as well.

df[(df['Salary'] < 100000) | (df['Occupation'] == 'Soldier')]

 

Output:

    Name  Age  Salary  Occupation
1   Leah   27  154000     Soldier
3  Kenny   30   78000  Accountant
4   Brad   40   88000     Florist

 

We can also filter the data with string functions. For example, we can keep the rows where a column contains a certain substring.

df[df['Occupation'].str.contains('Sol')]

 

Output:

   Name  Age  Salary Occupation
1  Leah   27  154000    Soldier
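
Real data often has inconsistent casing or missing values in text columns. As a hedged sketch, `str.contains` accepts `case` and `na` parameters for these situations; the `people` DataFrame below is a hypothetical variation of the sample data with one missing occupation:

```python
import pandas as pd

# Hypothetical variation of the sample data with one missing occupation
people = pd.DataFrame({
    'Name': ['Alice', 'Leah', 'Jessica', 'Kenny', 'Brad'],
    'Occupation': ['Doctor', 'Soldier', None, 'Accountant', 'Florist']
})

# case=False ignores letter case; na=False treats missing values as "no match"
mask = people['Occupation'].str.contains('sol', case=False, na=False)
print(people[mask]['Name'].tolist())  # ['Leah']
```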

 

To keep only the rows whose value matches one of several specific values, we can use the isin method.

df[df['Occupation'].isin(['Doctor', 'Florist'])]

 

Output:

      Name  Age  Salary Occupation
0    Alice   50  100000     Doctor
2  Jessica   22  120000     Doctor
4     Brad   40   88000    Florist
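
The same check can be inverted to exclude values instead of selecting them. A minimal sketch, assuming the sample data above and using the `~` (Not) operator on the `isin` result; the `excluded` name is illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Leah', 'Jessica', 'Kenny', 'Brad'],
    'Age': [50, 27, 22, 30, 40],
    'Salary': [100000, 154000, 120000, 78000, 88000],
    'Occupation': ['Doctor', 'Soldier', 'Doctor', 'Accountant', 'Florist']
})

# ~ flips every True/False in the mask, so listed occupations are dropped
excluded = df[~df['Occupation'].isin(['Doctor', 'Florist'])]
print(excluded['Name'].tolist())  # ['Leah', 'Kenny']
```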

 

We can also filter the data with a lambda function passed to apply.

df[df['Name'].apply(lambda x: len(x) > 5)]

 

Output:

      Name  Age  Salary Occupation
2  Jessica   22  120000     Doctor
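
For simple cases like this one, a built-in string method can usually replace the lambda. The sketch below, which assumes the same sample data, uses `str.len()` and checks that it selects the same rows as the `apply` version:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Leah', 'Jessica', 'Kenny', 'Brad'],
    'Age': [50, 27, 22, 30, 40],
    'Salary': [100000, 154000, 120000, 78000, 88000],
    'Occupation': ['Doctor', 'Soldier', 'Doctor', 'Accountant', 'Florist']
})

# Lambda version: the function is called once per value
with_apply = df[df['Name'].apply(lambda x: len(x) > 5)]

# String-method version: the length is computed for the whole column
with_str_len = df[df['Name'].str.len() > 5]

print(with_str_len['Name'].tolist())    # ['Jessica']
print(with_apply.equals(with_str_len))  # True
```

A lambda remains useful when the condition cannot be expressed with existing Pandas methods.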

 

For a more compact syntax, we can use the query method, which takes the filter condition as a string.

df.query('Age < 30 and Salary > 100000')

 

Output:

      Name  Age  Salary Occupation
1     Leah   27  154000    Soldier
2  Jessica   22  120000     Doctor
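
The query string can also refer to Python variables by prefixing them with `@`, which helps when thresholds are not hard-coded. A minimal sketch, where `max_age` and `min_salary` are hypothetical variable names chosen for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Leah', 'Jessica', 'Kenny', 'Brad'],
    'Age': [50, 27, 22, 30, 40],
    'Salary': [100000, 154000, 120000, 78000, 88000],
    'Occupation': ['Doctor', 'Soldier', 'Doctor', 'Accountant', 'Florist']
})

# Hypothetical thresholds kept in ordinary Python variables
max_age = 30
min_salary = 100000

# @name looks up the Python variable instead of a column called `name`
result = df.query('Age < @max_age and Salary > @min_salary')
print(result['Name'].tolist())  # ['Leah', 'Jessica']
```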

 

Lastly, we can combine any of the filtering conditions we have learned, like this.

df[(df['Age'] > 30) & (
    (df['Salary'] > 60000) |
    (df['Occupation'].str.contains('Doc'))
)]

 

Output:

    Name  Age  Salary Occupation
0  Alice   50  100000     Doctor
4   Brad   40   88000    Florist
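
When a combined filter grows long, one option is to give each condition a name before combining them. This is only a readability sketch of the same filter; the names `is_older`, `high_salary`, and `is_doctor` are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Leah', 'Jessica', 'Kenny', 'Brad'],
    'Age': [50, 27, 22, 30, 40],
    'Salary': [100000, 154000, 120000, 78000, 88000],
    'Occupation': ['Doctor', 'Soldier', 'Doctor', 'Accountant', 'Florist']
})

# Each condition becomes a named boolean mask
is_older = df['Age'] > 30
high_salary = df['Salary'] > 60000
is_doctor = df['Occupation'].str.contains('Doc')

# The masks combine exactly like the inline conditions
result = df[is_older & (high_salary | is_doctor)]
print(result['Name'].tolist())  # ['Alice', 'Brad']
```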

 

Master the filtering functions to improve your data analysis process.

 


Cornellius Yudha Wijaya is a data science assistant manager and data writer. While working full-time at Allianz Indonesia, he loves to share Python and data tips via social media and writing media. Cornellius writes on a variety of AI and machine learning topics.

