5 Useful DIY Python Functions for Parsing Dates and Times
Dates and times shouldn’t break your code, but they often do. These five DIY Python functions help turn real-world dates and times into clean, usable data.

# Introduction
Parsing dates and times is one of those tasks that seems simple until you actually try to do it. Python's datetime module handles standard formats well, but real-world data is messy. User input, scraped web data, and legacy systems often throw curveballs.
This article walks you through five practical functions for handling common date and time parsing tasks. By the end, you'll understand how to build flexible parsers that handle the messy date formats you see in projects.
# 1. Parsing Relative Time Strings
Social media apps, chat applications, and activity feeds display timestamps like "5 minutes ago" or "2 days ago". When you scrape or process this data, you need to convert these relative strings back into actual datetime objects.
Here's a function that handles common relative time expressions:
from datetime import datetime, timedelta
import re

def parse_relative_time(time_string, reference_time=None):
    """
    Convert relative time strings to datetime objects.
    Examples: "2 hours ago", "3 days ago", "1 week ago"
    """
    if reference_time is None:
        reference_time = datetime.now()
    # Normalize the string
    time_string = time_string.lower().strip()
    # Pattern: number + time unit + "ago"
    pattern = r'(\d+)\s*(second|minute|hour|day|week|month|year)s?\s*ago'
    match = re.match(pattern, time_string)
    if not match:
        raise ValueError(f"Cannot parse: {time_string}")
    amount = int(match.group(1))
    unit = match.group(2)
    # Map units to timedelta kwargs
    unit_mapping = {
        'second': 'seconds',
        'minute': 'minutes',
        'hour': 'hours',
        'day': 'days',
        'week': 'weeks',
    }
    if unit in unit_mapping:
        delta_kwargs = {unit_mapping[unit]: amount}
        return reference_time - timedelta(**delta_kwargs)
    elif unit == 'month':
        # Approximate: 30 days per month
        return reference_time - timedelta(days=amount * 30)
    elif unit == 'year':
        # Approximate: 365 days per year
        return reference_time - timedelta(days=amount * 365)
The function uses a regular expression (regex) to extract the number and time unit from the string. The pattern (\d+) captures one or more digits, and (second|minute|hour|day|week|month|year) matches the time unit. The s? makes the plural 's' optional, so both "hour" and "hours" work.
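One detail worth seeing in isolation is that re.match only looks at the start of the string, so any leading text makes the parse fail. The short sketch below reuses the same pattern to compare re.match, re.search, and re.fullmatch; the sample strings are made up for illustration.

```python
import re

pattern = r'(\d+)\s*(second|minute|hour|day|week|month|year)s?\s*ago'

# re.match anchors at the beginning of the string
print(bool(re.match(pattern, "5 minutes ago")))         # True
print(bool(re.match(pattern, "posted 5 minutes ago")))  # False

# re.search scans the whole string for the first occurrence
print(re.search(pattern, "posted 5 minutes ago").groups())  # ('5', 'minute')

# re.fullmatch also rejects trailing text, which re.match tolerates
print(bool(re.match(pattern, "5 minutes ago today")))      # True
print(bool(re.fullmatch(pattern, "5 minutes ago today")))  # False
```

Which one fits depends on the data: re.search suits scraped text with surrounding words, while re.fullmatch suits fields that should contain nothing but the timestamp.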
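For units that timedelta supports directly (seconds through weeks), we create a timedelta and subtract it from the reference time. For months and years, we approximate using 30 and 365 days respectively. The approximation drifts: "12 months ago" resolves to 360 days back rather than a full year, and "1 year ago" is off by a day whenever the span crosses February 29. That's acceptable for feed-style timestamps, where the source string is itself imprecise, but not where exact calendar dates matter.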
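If the fixed-length approximation is too coarse, one option is to step back whole calendar months instead. The sketch below is an illustration, not part of the function above: subtract_months is a hypothetical helper name, and clamping to the last day of a shorter month (March 31 minus one month becomes February 28) is just one possible policy.

```python
import calendar
from datetime import datetime


def subtract_months(dt, months):
    """Step back whole calendar months, clamping the day to the month's length."""
    # Count months from year 0 so borrowing across years is plain integer math
    total = dt.year * 12 + (dt.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # A day like the 31st may not exist in the target month, so clamp it
    last_day = calendar.monthrange(year, month)[1]
    return dt.replace(year=year, month=month, day=min(dt.day, last_day))


reference = datetime(2026, 3, 31, 9, 0)
print(subtract_months(reference, 1))   # 2026-02-28 09:00:00
print(subtract_months(reference, 12))  # 2025-03-31 09:00:00
```

Years can be handled the same way by passing amount * 12 as the number of months.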
The reference_time parameter lets you specify a different "now" for testing or when processing historical data.
Let's test it:
result1 = parse_relative_time("2 hours ago")
result2 = parse_relative_time("3 days ago")
result3 = parse_relative_time("1 week ago")
print(f"2 hours ago: {result1}")
print(f"3 days ago: {result2}")
print(f"1 week ago: {result3}")
Output:
2 hours ago: 2026-01-06 12:09:34.584107
3 days ago: 2026-01-03 14:09:34.584504
1 week ago: 2025-12-30 14:09:34.584558
# 2. Extracting Dates from Natural Language Text
Sometimes you need to find dates buried in text: "The meeting is scheduled for January 15th, 2026" or "Please respond by March 3rd". Instead of manually parsing the entire sentence, you want to extract just the date.
Here's a function that finds and extracts dates from natural language:
import re
from datetime import datetime

def extract_date_from_text(text, current_year=None):
    """
    Extract dates from natural language text.
    Handles formats like:
    - "January 15th, 2024"
    - "March 3rd"
    - "Dec 25th, 2023"
    """
    if current_year is None:
        current_year = datetime.now().year
    # Month names (full and abbreviated)
    months = {
        'january': 1, 'jan': 1,
        'february': 2, 'feb': 2,
        'march': 3, 'mar': 3,
        'april': 4, 'apr': 4,
        'may': 5,
        'june': 6, 'jun': 6,
        'july': 7, 'jul': 7,
        'august': 8, 'aug': 8,
        'september': 9, 'sep': 9, 'sept': 9,
        'october': 10, 'oct': 10,
        'november': 11, 'nov': 11,
        'december': 12, 'dec': 12
    }
    # Pattern: Month Day(st/nd/rd/th), Year (year optional)
    # The \b anchors stop matches inside longer words ("dismay 3") or numbers ("March 300")
    pattern = r'\b(january|jan|february|feb|march|mar|april|apr|may|june|jun|july|jul|august|aug|september|sep|sept|october|oct|november|nov|december|dec)\s+(\d{1,2})(?:st|nd|rd|th)?\b(?:,?\s+(\d{4}))?'
    matches = re.findall(pattern, text.lower())
    if not matches:
        return None
    # Take the first match
    month_str, day_str, year_str = matches[0]
    month = months[month_str]
    day = int(day_str)
    year = int(year_str) if year_str else current_year
    # datetime raises ValueError for impossible dates such as "February 30th"
    return datetime(year, month, day)
The function builds a dictionary mapping month names (both full and abbreviated) to their numeric values. The regex pattern matches month names followed by day numbers with optional ordinal suffixes (st, nd, rd, th) and an optional year.
The (?:...) syntax creates a non-capturing group. This means we match the pattern but don't save it separately. This is useful for optional parts like the ordinal suffixes and the year.
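To see what the non-capturing groups buy us, here is a small comparison using shortened, single-month versions of the pattern (trimmed down purely for illustration). With capturing groups everywhere, re.findall returns extra tuple slots that we would have to skip over.

```python
import re

text = "dec 25th, 2023"

# Capturing groups everywhere: the suffix and the ", 2023" chunk get their own slots
capturing = re.findall(r'(dec)\s+(\d{1,2})(st|nd|rd|th)?(,?\s+(\d{4}))?', text)

# Non-capturing groups for the parts we only need to match, not keep
non_capturing = re.findall(r'(dec)\s+(\d{1,2})(?:st|nd|rd|th)?(?:,?\s+(\d{4}))?', text)

print(capturing)      # [('dec', '25', 'th', ', 2023', '2023')]
print(non_capturing)  # [('dec', '25', '2023')]

# When the optional year is absent, findall fills that slot with an empty string
no_year = re.findall(r'(mar)\s+(\d{1,2})(?:st|nd|rd|th)?(?:,?\s+(\d{4}))?', "mar 3rd")
print(no_year)        # [('mar', '3', '')]
```

The empty string in the last result is why a simple truthiness check on the year is enough to decide whether to fall back to a default.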
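When no year is provided, the function defaults to the current year. This works when someone mentions "March 3rd" in January and means the upcoming March. It is a simplification, though: "January 5th" written in late December most likely means next year, yet the function still returns a date in the current year. Pass current_year explicitly when you know which year the text refers to.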
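If the text you process mostly refers to upcoming dates, one possible refinement is to roll a date that has already passed into the next year. The sketch below is an assumption about what "upcoming" should mean, not part of the function above; resolve_year and its prefer_future flag are hypothetical names, and February 29 is not handled specially (date raises ValueError when the chosen year is not a leap year).

```python
from datetime import date


def resolve_year(month, day, today, prefer_future=True):
    """Pick a year for a month/day pair that was written without one."""
    candidate = date(today.year, month, day)
    if prefer_future and candidate < today:
        # The date has already passed this year, so assume next year
        candidate = date(today.year + 1, month, day)
    return candidate


print(resolve_year(3, 3, today=date(2026, 1, 10)))   # 2026-03-03
print(resolve_year(1, 5, today=date(2026, 12, 20)))  # 2027-01-05
print(resolve_year(1, 5, today=date(2026, 12, 20), prefer_future=False))  # 2026-01-05
```

For historical data such as logs or archives, the opposite policy (prefer the most recent past date) is usually the better fit.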
Let's test it with various text formats:
text1 = "The meeting is scheduled for January 15th, 2026 at 3pm"
text2 = "Please respond by March 3rd"
text3 = "Deadline: Dec 25th, 2026"
date1 = extract_date_from_text(text1)
date2 = extract_date_from_text(text2)
date3 = extract_date_from_text(text3)
print(f"From '{text1}': {date1}")
print(f"From '{text2}': {date2}")
print(f"From '{text3}': {date3}")
Output:
From 'The meeting is scheduled for January 15th, 2026 at 3pm': 2026-01-15 00:00:00
From 'Please respond by March 3rd': 2026-03-03 00:00:00
From 'Deadline: Dec 25th, 2026': 2026-12-25 00:00:00
# 3. Parsing Flexible Date Formats with Smart Detection
Real-world data comes in many formats. Writing separate parsers for each format is tedious. Instead, let's build a function that tries multiple formats automatically.
Here's a smart date parser that handles common formats:
from datetime import datetime

def parse_flexible_date(date_string):
    """
    Parse dates in multiple common formats.
    Tries various formats and returns the first match.
    """
    date_string = date_string.strip()
    # List of common date formats
    formats = [
        '%Y-%m-%d',
        '%Y/%m/%d',
        '%d-%m-%Y',
        '%d/%m/%Y',
        '%m/%d/%Y',
        '%d.%m.%Y',
        '%Y%m%d',
        '%B %d, %Y',
        '%b %d, %Y',
        '%d %B %Y',
        '%d %b %Y',
    ]
    # Try each format
    for fmt in formats:
        try:
            return datetime.strptime(date_string, fmt)
        except ValueError:
            continue
    # If nothing worked, raise an error
    raise ValueError(f"Unable to parse date: {date_string}")
This function uses a brute-force approach. It tries each format until one works. The strptime function raises a ValueError if the date string doesn't match the format, so we catch that exception and move to the next format.
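The order of formats matters. We put the International Organization for Standardization (ISO) format (%Y-%m-%d) first because it's the most common in technical contexts. Ambiguous formats like %d/%m/%Y and %m/%d/%Y appear later, and because the day-first format is listed before the month-first one, a string such as "01/02/2026" parses as 1 February, not January 2. The month-first format only wins when the day-first reading is invalid, as in "01/15/2026". If you know your data uses one convention consistently, reorder the list to prioritize it.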
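Because the first matching format wins silently, it can help to detect when a string has more than one valid reading before trusting the result. The following is a minimal sketch of that idea; matching_interpretations, is_ambiguous, and the AMBIGUOUS_FORMATS list are hypothetical names introduced here, and only the two slash-separated formats are checked.

```python
from datetime import datetime

# Assumption for this sketch: only these two formats compete with each other
AMBIGUOUS_FORMATS = ['%d/%m/%Y', '%m/%d/%Y']


def matching_interpretations(date_string, formats=AMBIGUOUS_FORMATS):
    """Return a dict of format -> parsed datetime for every format that fits."""
    results = {}
    for fmt in formats:
        try:
            results[fmt] = datetime.strptime(date_string, fmt)
        except ValueError:
            continue
    return results


def is_ambiguous(date_string):
    """True when the formats disagree about which date the string means."""
    return len(set(matching_interpretations(date_string).values())) > 1


print(is_ambiguous("01/02/2026"))  # True: 1 February or January 2
print(is_ambiguous("15/01/2026"))  # False: only the day-first reading is valid
print(is_ambiguous("03/03/2026"))  # False: both readings give the same date
```

A caller could log or reject ambiguous inputs instead of guessing, depending on how costly a swapped day and month would be.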
Let's test it with various date formats:
# Test different formats
dates = [
    "2026-01-15",
    "15/01/2026",
    "01/15/2026",
    "15.01.2026",
    "20260115",
    "January 15, 2026",
    "15 Jan 2026"
]
for date_str in dates:
    parsed = parse_flexible_date(date_str)
    print(f"{date_str:20} -> {parsed}")
Output:
2026-01-15           -> 2026-01-15 00:00:00
15/01/2026           -> 2026-01-15 00:00:00
01/15/2026           -> 2026-01-15 00:00:00
15.01.2026           -> 2026-01-15 00:00:00
20260115             -> 2026-01-15 00:00:00
January 15, 2026     -> 2026-01-15 00:00:00
15 Jan 2026          -> 2026-01-15 00:00:00
This approach isn't the most efficient, since a string in the last format costs one failed strptime call for every format before it. But it's simple, and covering a new format only takes one more entry in the list.
# 4. Parsing Time Durations
Video players, workout trackers, and time-tracking apps display durations like "1h 30m" or "2:45:30". When parsing user input or scraped data, you need to convert these to timedelta objects for calculations.
Here's a function that parses common duration formats:
from datetime import timedelta
import re

def parse_duration(duration_string):
    """
    Parse duration strings into timedelta objects.
    Handles formats like:
    - "1h 30m 45s"
    - "2:45:30" (H:M:S)
    - "90 minutes"
    - "1.5 hours"
    """
    duration_string = duration_string.strip().lower()
    # Try colon format first (H:M:S or M:S)
    if ':' in duration_string:
        parts = duration_string.split(':')
        if len(parts) == 2:
            # M:S format
            minutes, seconds = map(int, parts)
            return timedelta(minutes=minutes, seconds=seconds)
        elif len(parts) == 3:
            # H:M:S format
            hours, minutes, seconds = map(int, parts)
            return timedelta(hours=hours, minutes=minutes, seconds=seconds)
    # Try unit-based format (1h 30m 45s)
    total_seconds = 0
    # Find hours
    hours_match = re.search(r'(\d+(?:\.\d+)?)\s*h(?:ours?)?', duration_string)
    if hours_match:
        total_seconds += float(hours_match.group(1)) * 3600
    # Find minutes
    minutes_match = re.search(r'(\d+(?:\.\d+)?)\s*m(?:in(?:ute)?s?)?', duration_string)
    if minutes_match:
        total_seconds += float(minutes_match.group(1)) * 60
    # Find seconds
    seconds_match = re.search(r'(\d+(?:\.\d+)?)\s*s(?:ec(?:ond)?s?)?', duration_string)
    if seconds_match:
        total_seconds += float(seconds_match.group(1))
    # Check for a match rather than a positive total, so "0s" is a valid zero duration
    if hours_match or minutes_match or seconds_match:
        return timedelta(seconds=total_seconds)
    raise ValueError(f"Unable to parse duration: {duration_string}")
The function handles two main formats: colon-separated time and unit-based strings. For colon format, we split on the colon and interpret the parts as hours, minutes, and seconds (or just minutes and seconds for two-part durations).
For unit-based format, we use three separate regex patterns to find hours, minutes, and seconds. The pattern (\d+(?:\.\d+)?) matches integers or decimals like "1.5". The pattern \s*h(?:ours?)? matches "h", "hour", or "hours" with optional whitespace.
Each matched value is converted to seconds and added to the total. This approach lets the function handle partial durations like "45s" or "2h 15m" without requiring all units to be present.
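The separate re.search calls are forgiving: each one uses only the first match for its unit and ignores any text it does not recognize. Where stricter validation is wanted, one alternative is to tokenize the whole string and reject anything left over. The sketch below takes that approach; parse_units_strict and the UNIT_SECONDS table are hypothetical names, and the accepted unit spellings are an assumption, not an exhaustive list.

```python
import re
from datetime import timedelta

# Assumed spellings for this sketch, mapped to their length in seconds
UNIT_SECONDS = {
    'h': 3600, 'hr': 3600, 'hrs': 3600, 'hour': 3600, 'hours': 3600,
    'm': 60, 'min': 60, 'mins': 60, 'minute': 60, 'minutes': 60,
    's': 1, 'sec': 1, 'secs': 1, 'second': 1, 'seconds': 1,
}

# A token is a number (integer or decimal) followed by a run of letters
TOKEN = re.compile(r'(\d+(?:\.\d+)?)\s*([a-z]+)')


def parse_units_strict(duration_string):
    """Sum every 'number + unit' token; reject unknown units and stray text."""
    text = duration_string.strip().lower()
    total = 0.0
    position = 0
    token_count = 0
    for match in TOKEN.finditer(text):
        # Only whitespace may sit between two tokens
        if text[position:match.start()].strip():
            raise ValueError(f"Unexpected text in duration: {duration_string}")
        value, unit = match.groups()
        if unit not in UNIT_SECONDS:
            raise ValueError(f"Unknown unit '{unit}' in: {duration_string}")
        total += float(value) * UNIT_SECONDS[unit]
        position = match.end()
        token_count += 1
    # Require at least one token and nothing but whitespace after the last one
    if token_count == 0 or text[position:].strip():
        raise ValueError(f"Unable to parse duration: {duration_string}")
    return timedelta(seconds=total)


print(parse_units_strict("1h 30m 45s"))  # 1:30:45
print(parse_units_strict("1.5 hours"))   # 1:30:00
```

With this variant, an input like "2 months" raises a ValueError instead of being read as two minutes.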
Let's now test the function with various duration formats:
durations = [
    "1h 30m 45s",
    "2:45:30",
    "90 minutes",
    "1.5 hours",
    "45s",
    "2h 15m"
]
for duration in durations:
    parsed = parse_duration(duration)
    print(f"{duration:15} -> {parsed}")
Output:
1h 30m 45s      -> 1:30:45
2:45:30         -> 2:45:30
90 minutes      -> 1:30:00
1.5 hours       -> 1:30:00
45s             -> 0:00:45
2h 15m          -> 2:15:00
# 5. Parsing ISO Week Dates
Some systems use ISO week dates instead of regular calendar dates. An ISO week date like "2026-W03-2" means "week 3 of 2026, day 2 (Tuesday)". This format is common in business contexts where planning happens weekly.
Here's a function to parse ISO week dates:
from datetime import datetime, timedelta

def parse_iso_week_date(iso_week_string):
    """
    Parse ISO week date format: YYYY-Www-D
    Example: "2024-W03-2" = Week 3 of 2024, Tuesday
    ISO week numbering:
    - Week 1 is the week with the first Thursday of the year
    - Days are numbered 1 (Monday) through 7 (Sunday)
    """
    # Parse the format: YYYY-Www-D
    parts = iso_week_string.split('-')
    if len(parts) != 3 or not parts[1].startswith('W'):
        raise ValueError(f"Invalid ISO week format: {iso_week_string}")
    year = int(parts[0])
    week = int(parts[1][1:])  # Remove 'W' prefix
    day = int(parts[2])
    if not (1 <= week <= 53):
        raise ValueError(f"Week must be between 1 and 53: {week}")
    if not (1 <= day <= 7):
        raise ValueError(f"Day must be between 1 and 7: {day}")
    # Find January 4th (always in week 1)
    jan_4 = datetime(year, 1, 4)
    # Find Monday of week 1
    week_1_monday = jan_4 - timedelta(days=jan_4.weekday())
    # Calculate the target date
    target_date = week_1_monday + timedelta(weeks=week - 1, days=day - 1)
    # Most years have only 52 ISO weeks; reject week 53 when it spills into the next ISO year
    iso_year, iso_week, _ = target_date.isocalendar()
    if (iso_year, iso_week) != (year, week):
        raise ValueError(f"{year} has no ISO week {week}")
    return target_date
ISO week dates follow specific rules. Week 1 is defined as the week containing the year's first Thursday. This means week 1 might start in December of the previous year.
The function uses a reliable approach: find January 4th (which is always in week 1), then find the Monday of that week. From there, we add the appropriate number of weeks and days to reach the target date.
The calculation jan_4.weekday() returns 0 for Monday through 6 for Sunday. Subtracting this from January 4th gives us the Monday of week 1. Then we add (week - 1) weeks and (day - 1) days to get the final date.
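A quick way to gain confidence in the January 4th arithmetic is to compare it against the standard library, which offers date.fromisocalendar (Python 3.8 and later) and the ISO directives %G, %V, and %u for strptime. The sketch below repeats the arithmetic on plain integers under the hypothetical name iso_week_to_date and checks a few cases, including week 1 of 2026, which starts in December 2025.

```python
from datetime import date, datetime, timedelta


def iso_week_to_date(year, week, day):
    """The January 4th arithmetic described above, applied to plain integers."""
    jan_4 = date(year, 1, 4)
    week_1_monday = jan_4 - timedelta(days=jan_4.weekday())
    return week_1_monday + timedelta(weeks=week - 1, days=day - 1)


for year, week, day in [(2024, 3, 2), (2026, 1, 1), (2020, 53, 7)]:
    manual = iso_week_to_date(year, week, day)
    # date.fromisocalendar requires Python 3.8+
    builtin = date.fromisocalendar(year, week, day)
    print(f"{year}-W{week:02d}-{day}: {manual} (matches built-in: {manual == builtin})")

# strptime reads the same layout through the ISO directives
print(datetime.strptime("2026-W01-1", "%G-W%V-%u").date())  # 2025-12-29
```

If your Python version supports them, the built-ins are the simpler choice in production code; the hand-written version remains useful for understanding how the rule works.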
Let's test it:
# Test ISO week dates
iso_dates = [
    "2024-W01-1",  # Week 1, Monday
    "2024-W03-2",  # Week 3, Tuesday
    "2024-W10-5",  # Week 10, Friday
]
for iso_date in iso_dates:
    parsed = parse_iso_week_date(iso_date)
    print(f"{iso_date} -> {parsed.strftime('%Y-%m-%d (%A)')}")
Output:
2024-W01-1 -> 2024-01-01 (Monday)
2024-W03-2 -> 2024-01-16 (Tuesday)
2024-W10-5 -> 2024-03-08 (Friday)
This format is less common than regular dates, but when encountered, having a parser ready saves significant time.
# Wrapping Up
The functions in this article rely on three building blocks: regex patterns, strptime format strings, and datetime arithmetic. These techniques transfer to other parsing challenges, and you can adapt them for custom date formats in your projects.
Building your own parsers helps you understand how date parsing operates. When you run into a non-standard date format that standard libraries cannot handle, you will be ready to write a custom solution.
These functions are particularly useful for small scripts, prototypes, and learning projects where adding heavy external dependencies might be overkill. Happy coding!
Bala Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she's working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more. Bala also creates engaging resource overviews and coding tutorials.