5 Useful DIY Python Functions for Parsing Dates and Times

Dates and times shouldn’t break your code, but they often do. These five DIY Python functions help turn real-world dates and times into clean, usable data.

By Bala Priya C, KDnuggets Contributing Editor & Technical Content Specialist on January 26, 2026 in Python

# Introduction

Parsing dates and times is one of those tasks that seems simple until you actually try to do it. Python's datetime module handles standard formats well, but real-world data is messy. User input, scraped web data, and legacy systems often throw curveballs.

This article walks you through five practical functions for handling common date and time parsing tasks. By the end, you'll understand how to build flexible parsers that handle the messy date formats that show up in real projects.

# 1. Parsing Relative Time Strings

Social media apps, chat applications, and activity feeds display timestamps like "5 minutes ago" or "2 days ago". When you scrape or process this data, you need to convert these relative strings back into actual datetime objects.

Here's a function that handles common relative time expressions:

from datetime import datetime, timedelta
import re

def parse_relative_time(time_string, reference_time=None):
    """
    Convert relative time strings to datetime objects.
    
    Examples: "2 hours ago", "3 days ago", "1 week ago"
    """
    if reference_time is None:
        reference_time = datetime.now()
    
    # Normalize the string
    time_string = time_string.lower().strip()
    
    # Pattern: number + time unit + "ago"
    pattern = r'(\d+)\s*(second|minute|hour|day|week|month|year)s?\s*ago'
    match = re.match(pattern, time_string)
    
    if not match:
        raise ValueError(f"Cannot parse: {time_string}")
    
    amount = int(match.group(1))
    unit = match.group(2)
    
    # Map units to timedelta kwargs
    unit_mapping = {
        'second': 'seconds',
        'minute': 'minutes',
        'hour': 'hours',
        'day': 'days',
        'week': 'weeks',
    }
    
    if unit in unit_mapping:
        delta_kwargs = {unit_mapping[unit]: amount}
        return reference_time - timedelta(**delta_kwargs)
    elif unit == 'month':
        # Approximate: 30 days per month
        return reference_time - timedelta(days=amount * 30)
    elif unit == 'year':
        # Approximate: 365 days per year
        return reference_time - timedelta(days=amount * 365)

The function uses a regular expression (regex) to extract the number and time unit from the string. The pattern (\d+) captures one or more digits, and (second|minute|hour|day|week|month|year) matches the time unit. The s? makes the plural 's' optional, so both "hour" and "hours" work.
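
To see what the two capturing groups hold, here is a small sketch that applies the same pattern on its own. The helper name split_relative is hypothetical and only exists for this illustration:

```python
import re

# Same pattern as in parse_relative_time
pattern = r'(\d+)\s*(second|minute|hour|day|week|month|year)s?\s*ago'

def split_relative(text):
    """Return (amount, unit), or None when the pattern does not match."""
    match = re.match(pattern, text.lower().strip())
    if match is None:
        return None
    # group(1) is the digits, group(2) is the singular unit name
    return int(match.group(1)), match.group(2)

print(split_relative("5 minutes ago"))  # (5, 'minute')
print(split_relative("1 hour ago"))     # (1, 'hour')
print(split_relative("3weeks ago"))     # (3, 'week')
print(split_relative("yesterday"))      # None
```

Because the plural 's' sits outside the second group, the captured unit is always singular, which is why the unit_mapping keys are singular too.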

For units that timedelta supports directly (seconds through weeks), we create a timedelta and subtract it from the reference time. For months and years, we approximate using 30 and 365 days respectively. This is fine for coarse labels such as activity-feed timestamps, but the error grows with the amount: "12 months ago" lands five days short of 365 days, and leap years are ignored.

The reference_time parameter lets you specify a different "now" for testing or when processing historical data.

Let's test it:

result1 = parse_relative_time("2 hours ago")
result2 = parse_relative_time("3 days ago")
result3 = parse_relative_time("1 week ago")

print(f"2 hours ago: {result1}")
print(f"3 days ago: {result2}")
print(f"1 week ago: {result3}")

Output:

2 hours ago: 2026-01-06 12:09:34.584107
3 days ago: 2026-01-03 14:09:34.584504
1 week ago: 2025-12-30 14:09:34.584558
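
If the 30-day approximation is too coarse for your data, one option is to step back by whole calendar months and clamp the day to the length of the target month. The sketch below is an assumption about how you might do that, not part of the function above, and subtract_months is a hypothetical name:

```python
import calendar
from datetime import datetime

def subtract_months(moment, months):
    """Step back whole calendar months, clamping the day to the month's length."""
    # Count months from year 0 so that borrowing across years is plain arithmetic
    index = moment.year * 12 + (moment.month - 1) - months
    year, month_index = divmod(index, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return moment.replace(year=year, month=month, day=min(moment.day, last_day))

reference = datetime(2026, 3, 31, 9, 0)
print(subtract_months(reference, 1))   # 2026-02-28 09:00:00
print(subtract_months(reference, 3))   # 2025-12-31 09:00:00
print(subtract_months(reference, 12))  # 2025-03-31 09:00:00
```

For comparison, subtracting timedelta(days=30) from the same reference gives March 1, 2026, which is still in the same month.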

# 2. Extracting Dates from Natural Language Text

Sometimes you need to find dates buried in text: "The meeting is scheduled for January 15th, 2026" or "Please respond by March 3rd". Instead of manually parsing the entire sentence, you want to extract just the date.

Here's a function that finds and extracts dates from natural language:

import re
from datetime import datetime

def extract_date_from_text(text, current_year=None):
    """
    Extract dates from natural language text.
    
    Handles formats like:
    - "January 15th, 2024"
    - "March 3rd"
    - "Dec 25th, 2023"
    """
    if current_year is None:
        current_year = datetime.now().year
    
    # Month names (full and abbreviated)
    months = {
        'january': 1, 'jan': 1,
        'february': 2, 'feb': 2,
        'march': 3, 'mar': 3,
        'april': 4, 'apr': 4,
        'may': 5,
        'june': 6, 'jun': 6,
        'july': 7, 'jul': 7,
        'august': 8, 'aug': 8,
        'september': 9, 'sep': 9, 'sept': 9,
        'october': 10, 'oct': 10,
        'november': 11, 'nov': 11,
        'december': 12, 'dec': 12
    }
    
    # Pattern: Month Day(st/nd/rd/th), Year (year optional)
    # The \b anchors keep "mar" from matching inside "grammar" and stop
    # "January 2026" from being read as January 20
    pattern = r'\b(january|jan|february|feb|march|mar|april|apr|may|june|jun|july|jul|august|aug|september|sep|sept|october|oct|november|nov|december|dec)\s+(\d{1,2})(?:st|nd|rd|th)?\b(?:,?\s+(\d{4}))?'
    
    matches = re.findall(pattern, text.lower())
    
    if not matches:
        return None
    
    # Take the first match
    month_str, day_str, year_str = matches[0]
    
    month = months[month_str]
    day = int(day_str)
    year = int(year_str) if year_str else current_year
    
    return datetime(year, month, day)

The function builds a dictionary mapping month names (both full and abbreviated) to their numeric values. The regex pattern matches month names followed by day numbers with optional ordinal suffixes (st, nd, rd, th) and an optional year.

The (?:...) syntax creates a non-capturing group. This means we match the pattern but don't save it separately. This is useful for optional parts like the ordinal suffixes and the year.
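
The difference is easiest to see with re.findall, which returns one tuple per match containing only the capturing groups. This sketch uses a deliberately shortened month list so the two patterns stay readable:

```python
import re

text = "due march 3rd, then again on dec 25th, 2026"

# Suffix in a capturing group: it shows up in every result tuple
with_capture = re.findall(r'(march|dec)\s+(\d{1,2})(st|nd|rd|th)?', text)

# Suffix in a non-capturing group: still matched, but left out of the results
without_capture = re.findall(r'(march|dec)\s+(\d{1,2})(?:st|nd|rd|th)?', text)

print(with_capture)     # [('march', '3', 'rd'), ('dec', '25', 'th')]
print(without_capture)  # [('march', '3'), ('dec', '25')]
```

Keeping the suffix out of the results is what lets the function unpack each match into exactly three values: month, day, and year.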

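When no year is provided, the function defaults to the current year. This is a reasonable default: if someone mentions "March 3rd" in January, they usually mean the upcoming March, not the previous year's. It is less safe late in the year, when "March 3rd" written in December most likely refers to the following year, so pass current_year explicitly if your data spans a year boundary.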

Let's test it with various text formats:

text1 = "The meeting is scheduled for January 15th, 2026 at 3pm"
text2 = "Please respond by March 3rd"
text3 = "Deadline: Dec 25th, 2026"

date1 = extract_date_from_text(text1)
date2 = extract_date_from_text(text2)
date3 = extract_date_from_text(text3)

print(f"From '{text1}': {date1}")
print(f"From '{text2}': {date2}")
print(f"From '{text3}': {date3}")

Output:

From 'The meeting is scheduled for January 15th, 2026 at 3pm': 2026-01-15 00:00:00
From 'Please respond by March 3rd': 2026-03-03 00:00:00
From 'Deadline: Dec 25th, 2026': 2026-12-25 00:00:00
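
Defaulting to the current year works until the text is written near a year boundary. If you know whether your dates point forward (deadlines) or backward (log entries), you can choose the year from that. The following is a minimal sketch under that assumption; resolve_year and its prefer_future flag are hypothetical and not part of the function above:

```python
from datetime import date

def resolve_year(month, day, today, prefer_future=True):
    """Pick a year for a month/day pair that arrived without one."""
    # Comparing (month, day) tuples avoids building a date, so Feb 29 is safe
    already_passed = (month, day) < (today.month, today.day)
    not_yet_reached = (month, day) > (today.month, today.day)

    if prefer_future and already_passed:
        return today.year + 1   # next occurrence is in the following year
    if not prefer_future and not_yet_reached:
        return today.year - 1   # most recent occurrence was last year
    return today.year

december = date(2026, 12, 10)
print(resolve_year(3, 3, december))                       # 2027
print(resolve_year(3, 3, december, prefer_future=False))  # 2026

january = date(2026, 1, 10)
print(resolve_year(3, 3, january))                        # 2026
print(resolve_year(3, 3, january, prefer_future=False))   # 2025
```

The result can be passed to extract_date_from_text as current_year.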

# 3. Parsing Flexible Date Formats with Smart Detection

Real-world data comes in many formats. Writing separate parsers for each format is tedious. Instead, let's build a function that tries multiple formats automatically.

Here's a smart date parser that handles common formats:

from datetime import datetime

def parse_flexible_date(date_string):
    """
    Parse dates in multiple common formats.
    
    Tries various formats and returns the first match.
    """
    date_string = date_string.strip()
    
    # List of common date formats
    formats = [
        '%Y-%m-%d',           
        '%Y/%m/%d',           
        '%d-%m-%Y',           
        '%d/%m/%Y',         
        '%m/%d/%Y',           
        '%d.%m.%Y',          
        '%Y%m%d',            
        '%B %d, %Y',      
        '%b %d, %Y',         
        '%d %B %Y',          
        '%d %b %Y',           
    ]
    
    # Try each format
    for fmt in formats:
        try:
            return datetime.strptime(date_string, fmt)
        except ValueError:
            continue
    
    # If nothing worked, raise an error
    raise ValueError(f"Unable to parse date: {date_string}")

This function uses a brute-force approach. It tries each format until one works. The strptime function raises a ValueError if the date string doesn't match the format, so we catch that exception and move to the next format.

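The order of formats matters. We put the International Organization for Standardization (ISO) format (%Y-%m-%d) first because it's the most common in technical contexts. The ambiguous pair %d/%m/%Y and %m/%d/%Y appears later, and because the day-first format is tried first, a string like "03/04/2026" is parsed as April 3, not March 4. The month-first format only wins when the day-first reading is invalid, as in "01/15/2026". If you know your data uses one convention consistently, reorder the list to prioritize it.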
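
When you cannot be sure which convention a source uses, it can help to detect ambiguous strings instead of silently picking one reading. Here is a rough sketch of that idea; candidate_dates is a hypothetical helper, and the two formats it checks are only the slash-separated pair from the list above:

```python
from datetime import datetime

def candidate_dates(date_string, formats=('%d/%m/%Y', '%m/%d/%Y')):
    """Return every distinct date the string could mean under the given formats."""
    found = set()
    for fmt in formats:
        try:
            found.add(datetime.strptime(date_string, fmt).date())
        except ValueError:
            continue  # this format does not fit the string
    return sorted(found)

for sample in ["03/04/2026", "15/01/2026", "05/05/2026"]:
    readings = [d.isoformat() for d in candidate_dates(sample)]
    print(sample, '->', readings)
# 03/04/2026 -> ['2026-03-04', '2026-04-03']
# 15/01/2026 -> ['2026-01-15']
# 05/05/2026 -> ['2026-05-05']
```

More than one reading means the string is ambiguous, so you could log it or route it for review before trusting the parsed value.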

Let's test it with various date formats:

# Test different formats
dates = [
    "2026-01-15",
    "15/01/2026",
    "01/15/2026",
    "15.01.2026",
    "20260115",
    "January 15, 2026",
    "15 Jan 2026"
]

for date_str in dates:
    parsed = parse_flexible_date(date_str)
    print(f"{date_str:20} -> {parsed}")

Output:

2026-01-15           -> 2026-01-15 00:00:00
15/01/2026           -> 2026-01-15 00:00:00
01/15/2026           -> 2026-01-15 00:00:00
15.01.2026           -> 2026-01-15 00:00:00
20260115             -> 2026-01-15 00:00:00
January 15, 2026     -> 2026-01-15 00:00:00
15 Jan 2026          -> 2026-01-15 00:00:00

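This approach isn't the most efficient, since a string may go through up to eleven strptime calls before it matches, but it's simple and covers the numeric and month-name formats listed above. Note that %B and %b match month names in the current locale, so the month-name formats assume an English locale.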

# 4. Parsing Time Durations

Video players, workout trackers, and time-tracking apps display durations like "1h 30m" or "2:45:30". When parsing user input or scraped data, you need to convert these to timedelta objects for calculations.

Here's a function that parses common duration formats:

from datetime import timedelta
import re

def parse_duration(duration_string):
    """
    Parse duration strings into timedelta objects.
    
    Handles formats like:
    - "1h 30m 45s"
    - "2:45:30" (H:M:S)
    - "90 minutes"
    - "1.5 hours"
    """
    duration_string = duration_string.strip().lower()
    
    # Try colon format first (H:M:S or M:S)
    if ':' in duration_string:
        parts = duration_string.split(':')
        if len(parts) == 2:
            # M:S format
            minutes, seconds = map(int, parts)
            return timedelta(minutes=minutes, seconds=seconds)
        elif len(parts) == 3:
            # H:M:S format
            hours, minutes, seconds = map(int, parts)
            return timedelta(hours=hours, minutes=minutes, seconds=seconds)
    
    # Try unit-based format (1h 30m 45s)
    total_seconds = 0
    
    # Find hours
    hours_match = re.search(r'(\d+(?:\.\d+)?)\s*h(?:ours?)?', duration_string)
    if hours_match:
        total_seconds += float(hours_match.group(1)) * 3600
    
    # Find minutes
    minutes_match = re.search(r'(\d+(?:\.\d+)?)\s*m(?:in(?:ute)?s?)?', duration_string)
    if minutes_match:
        total_seconds += float(minutes_match.group(1)) * 60
    
    # Find seconds
    seconds_match = re.search(r'(\d+(?:\.\d+)?)\s*s(?:ec(?:ond)?s?)?', duration_string)
    if seconds_match:
        total_seconds += float(seconds_match.group(1))
    
    # Accept any string where at least one unit matched, including "0s"
    if hours_match or minutes_match or seconds_match:
        return timedelta(seconds=total_seconds)
    
    raise ValueError(f"Unable to parse duration: {duration_string}")

The function handles two main formats: colon-separated time and unit-based strings. For colon format, we split on the colon and interpret the parts as hours, minutes, and seconds (or just minutes and seconds for two-part durations).

For unit-based format, we use three separate regex patterns to find hours, minutes, and seconds. The pattern (\d+(?:\.\d+)?) matches integers or decimals like "1.5". The pattern \s*h(?:ours?)? matches "h", "hour", or "hours" with optional whitespace.
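
As a quick check of how the number and unit pieces fit together, this sketch isolates the hours pattern. The names NUMBER, HOURS, and hours_to_seconds are hypothetical and used only here:

```python
import re

NUMBER = r'(\d+(?:\.\d+)?)'                      # "2" or "1.5"
HOURS = re.compile(NUMBER + r'\s*h(?:ours?)?')   # "h", "hour", or "hours"

def hours_to_seconds(text):
    """Seconds contributed by the hours part of the string, or None."""
    match = HOURS.search(text.lower())
    return float(match.group(1)) * 3600 if match else None

print(hours_to_seconds("2h"))          # 7200.0
print(hours_to_seconds("2 hours"))     # 7200.0
print(hours_to_seconds("1.5 hour"))    # 5400.0
print(hours_to_seconds("90 minutes"))  # None
```

The minutes and seconds patterns follow the same shape, with a different unit suffix and multiplier.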

Each matched value is converted to seconds and added to the total. This approach lets the function handle partial durations like "45s" or "2h 15m" without requiring all units to be present.
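
Worked through by hand for "1h 30m 45s", the arithmetic looks like this:

```python
from datetime import timedelta

# 1 hour + 30 minutes + 45 seconds, all expressed in seconds
total_seconds = 1 * 3600 + 30 * 60 + 45
duration = timedelta(seconds=total_seconds)

print(total_seconds)  # 5445
print(duration)       # 1:30:45

# A partial duration simply leaves the missing units at zero
print(timedelta(seconds=45))  # 0:00:45

# Decimal hours and whole minutes end up as the same timedelta
print(timedelta(seconds=1.5 * 3600) == timedelta(minutes=90))  # True
```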

Let's now test the function with various duration formats:

durations = [
    "1h 30m 45s",
    "2:45:30",
    "90 minutes",
    "1.5 hours",
    "45s",
    "2h 15m"
]

for duration in durations:
    parsed = parse_duration(duration)
    print(f"{duration:15} -> {parsed}")

Output:

1h 30m 45s      -> 1:30:45
2:45:30         -> 2:45:30
90 minutes      -> 1:30:00
1.5 hours       -> 1:30:00
45s             -> 0:00:45
2h 15m          -> 2:15:00
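
One limitation of searching for each unit separately is that unrelated text can slip through: the minutes pattern above reads "5ms" or "3 months" as minutes. If you need stricter input handling, a possible variant is to tokenize the whole string and reject anything left over. This is a sketch under that assumption, and parse_duration_strict is a hypothetical name:

```python
import re
from datetime import timedelta

UNIT_SECONDS = {'h': 3600, 'm': 60, 's': 1}

# One token = number + unit; the lookahead rejects units that run into more letters
TOKEN = re.compile(
    r'(\d+(?:\.\d+)?)\s*'
    r'(h(?:ours?)?|m(?:in(?:ute)?s?)?|s(?:ec(?:ond)?s?)?)(?![a-z])'
)

def parse_duration_strict(text):
    """Parse unit-based durations, raising ValueError on any unrecognized text."""
    text = text.strip().lower()
    total = 0.0
    position = 0
    for match in TOKEN.finditer(text):
        # Only whitespace may sit between consecutive tokens
        if text[position:match.start()].strip():
            raise ValueError(f"Unexpected text in duration: {text}")
        total += float(match.group(1)) * UNIT_SECONDS[match.group(2)[0]]
        position = match.end()
    # Fail if nothing matched or something trails the last token
    if position == 0 or text[position:].strip():
        raise ValueError(f"Unable to parse duration: {text}")
    return timedelta(seconds=total)

print(parse_duration_strict("1h 30m 45s"))  # 1:30:45
print(parse_duration_strict("1h30m"))       # 1:30:00
print(parse_duration_strict("0s"))          # 0:00:00
```

With this variant, "5ms" and "3 months" raise ValueError instead of being counted as minutes. It does not handle the colon format, which would still need the split-based branch.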

# 5. Parsing ISO Week Dates

Some systems use ISO week dates instead of regular calendar dates. An ISO week date like "2026-W03-2" means "week 3 of 2026, day 2 (Tuesday)". This format is common in business contexts where planning happens weekly.

Here's a function to parse ISO week dates:

from datetime import datetime, timedelta

def parse_iso_week_date(iso_week_string):
    """
    Parse ISO week date format: YYYY-Www-D
    
    Example: "2024-W03-2" = Week 3 of 2024, Tuesday
    
    ISO week numbering:
    - Week 1 is the week with the first Thursday of the year
    - Days are numbered 1 (Monday) through 7 (Sunday)
    """
    # Parse the format: YYYY-Www-D
    parts = iso_week_string.split('-')
    
    if len(parts) != 3 or not parts[1].startswith('W'):
        raise ValueError(f"Invalid ISO week format: {iso_week_string}")
    
    year = int(parts[0])
    week = int(parts[1][1:])  # Remove 'W' prefix
    day = int(parts[2])
    
    if not (1 <= week <= 53):
        raise ValueError(f"Week must be between 1 and 53: {week}")
    
    if not (1 <= day <= 7):
        raise ValueError(f"Day must be between 1 and 7: {day}")
    
    # Find January 4th (always in week 1)
    jan_4 = datetime(year, 1, 4)
    
    # Find Monday of week 1
    week_1_monday = jan_4 - timedelta(days=jan_4.weekday())
    
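    # Calculate the target date
    target_date = week_1_monday + timedelta(weeks=week - 1, days=day - 1)
    
    # Reject week 53 in years that only have 52 ISO weeks
    iso_year, iso_week, _ = target_date.isocalendar()
    if (iso_year, iso_week) != (year, week):
        raise ValueError(f"Year {year} has no ISO week {week}")
    
    return target_date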

ISO week dates follow specific rules. Week 1 is defined as the week containing the year's first Thursday. This means week 1 might start in December of the previous year.

The function uses a reliable approach: find January 4th (which is always in week 1), then find the Monday of that week. From there, we add the appropriate number of weeks and days to reach the target date.

The calculation jan_4.weekday() returns 0 for Monday through 6 for Sunday. Subtracting this from January 4th gives us the Monday of week 1. Then we add (week - 1) weeks and (day - 1) days to get the final date.
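
Here are the same steps traced by hand for "2026-W03-2", a year in which week 1 starts in the previous December:

```python
from datetime import datetime, timedelta

# Step 1: January 4th is always in ISO week 1
jan_4 = datetime(2026, 1, 4)
print(jan_4.strftime('%A'), jan_4.weekday())  # Sunday 6

# Step 2: walk back to the Monday of that week
week_1_monday = jan_4 - timedelta(days=jan_4.weekday())
print(week_1_monday.date())                   # 2025-12-29

# Step 3: add (week - 1) weeks and (day - 1) days for week 3, day 2
target = week_1_monday + timedelta(weeks=3 - 1, days=2 - 1)
print(target.strftime('%Y-%m-%d (%A)'))       # 2026-01-13 (Tuesday)
```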

Let's test it:

# Test ISO week dates
iso_dates = [
    "2024-W01-1",  # Week 1, Monday
    "2024-W03-2",  # Week 3, Tuesday
    "2024-W10-5",  # Week 10, Friday
]

for iso_date in iso_dates:
    parsed = parse_iso_week_date(iso_date)
    print(f"{iso_date} -> {parsed.strftime('%Y-%m-%d (%A)')}")

Output:

2024-W01-1 -> 2024-01-01 (Monday)
2024-W03-2 -> 2024-01-16 (Tuesday)
2024-W10-5 -> 2024-03-08 (Friday)

This format is less common than regular dates, but when encountered, having a parser ready saves significant time.
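
If you are on Python 3.8 or later, the standard library's date.fromisocalendar does the same calendar math, which makes it a handy cross-check for a hand-written parser. The wrapper below is a hypothetical sketch, not a replacement for the function above:

```python
from datetime import date

def parse_iso_week_stdlib(iso_week_string):
    """Parse YYYY-Www-D by delegating the calendar math to date.fromisocalendar."""
    year_part, week_part, day_part = iso_week_string.split('-')
    if not week_part.startswith('W'):
        raise ValueError(f"Invalid ISO week format: {iso_week_string}")
    # fromisocalendar raises ValueError for weeks or days that do not exist
    return date.fromisocalendar(int(year_part), int(week_part[1:]), int(day_part))

parsed = parse_iso_week_stdlib("2024-W03-2")
print(parsed)                       # 2024-01-16
print(tuple(parsed.isocalendar()))  # (2024, 3, 2)

print(parse_iso_week_stdlib("2026-W01-1"))  # 2025-12-29
```

The isocalendar() call goes the other way, from a date back to (year, week, day), so the two together give you a round-trip test. fromisocalendar also raises ValueError for a week that a year does not have, such as week 53 of 2024.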

# Wrapping Up

The functions in this article combine regex patterns, strptime format strings, and datetime arithmetic to handle variations in formatting. These techniques transfer to other parsing challenges: you can adapt the same patterns for custom date formats in your projects.

Building your own parsers helps you understand how date parsing operates. When you run into a non-standard date format that standard libraries cannot handle, you will be ready to write a custom solution.

These functions are particularly useful for small scripts, prototypes, and learning projects where adding heavy external dependencies might be overkill. Happy coding!

Bala Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she's working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more. Bala also creates engaging resource overviews and coding tutorials.
