Guide

How to clean up messy date formats in CSVs

Tame the chaos of MM/DD/YYYY, DD-MM-YY, and everything in between.

Jan 12, 2025 · 5 min read
Is 01/02/03 January 2nd, 2003, February 1st, 2003, or February 3rd, 2001? Date format inconsistency is the silent killer of data accuracy.

Identify what you have

Scan the date column for patterns. Look for separators (/, -, .), component order (MDY, DMY, YMD), and year format (2 or 4 digits).

  • US systems typically use MM/DD/YYYY
  • Many European systems use DD/MM/YYYY or DD.MM.YYYY
  • ISO 8601 uses YYYY-MM-DD
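
One way to run the scan described above is to reduce every value to its "shape" and count how often each shape appears. The following is a minimal sketch, not a definitive tool: the function names `date_shape` and `profile_dates` and the sample values are made up for illustration.

```python
import re
from collections import Counter


def date_shape(value: str) -> str:
    """Replace every digit with 'd', so '03/04/2025' becomes 'dd/dd/dddd'."""
    return re.sub(r"\d", "d", value.strip())


def profile_dates(values):
    """Count how often each shape appears in a column of date strings."""
    return Counter(date_shape(v) for v in values)


# Hypothetical sample column with mixed separators and year lengths.
sample = ["01/02/2025", "2025-01-03", "3.1.25", "01/04/2025"]
counts = profile_dates(sample)

for shape, count in counts.most_common():
    print(shape, count)
```

A shape only reveals separators and digit counts. It cannot tell MDY from DMY, so component order still has to be checked separately.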

Pick a target format

Standardize on one format. ISO 8601 (YYYY-MM-DD) sorts correctly and is unambiguous. It's the best choice for data work.

  • YYYY-MM-DD sorts chronologically as text
  • No ambiguity between day and month
  • International standard, widely supported
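
As a rough illustration of the conversion, the sketch below parses each value with a known source format and writes it back as YYYY-MM-DD. It assumes you already know the column is MM/DD/YYYY; the helper name `to_iso` and the sample dates are illustrative only.

```python
from datetime import datetime


def to_iso(value: str, source_format: str) -> str:
    """Parse a date string with a known format and return it as YYYY-MM-DD."""
    return datetime.strptime(value.strip(), source_format).date().isoformat()


# Hypothetical column that is known to be MM/DD/YYYY.
us_dates = ["12/25/2024", "01/02/2025", "03/15/2024"]
iso_dates = [to_iso(d, "%m/%d/%Y") for d in us_dates]

# Sorting the original strings as text does not give chronological order,
# but sorting the ISO strings does.
print(sorted(us_dates))
print(sorted(iso_dates))
```

`strptime` raises a `ValueError` for values that do not match the given format, which is usually what you want: a row that fails loudly is easier to fix than one that is silently misread.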

See dates clearly

Readable CSV displays dates as they are, making inconsistencies obvious at a glance.

Handle edge cases

Watch for dates that could be read in either format. 03/04/2025 could be March 4th or April 3rd. You may need context or knowledge of the source to resolve these.

  • A date is ambiguous when both the day and month positions hold a value from 1 to 12
  • Check the source system's locale settings
  • When in doubt, verify against other data
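
The ambiguity rule can be turned into a quick check. The sketch below is one possible approach under stated assumptions: it only handles values with two short numbers followed by a four-digit year, and the names `classify` and `infer_order` are hypothetical. A value above 12 in one position reveals the order for that row, and if every revealing row agrees, that order is a reasonable guess for the whole column.

```python
import re

# Two 1-2 digit numbers and a 4-digit year, separated by /, . or -
PATTERN = re.compile(r"(\d{1,2})[/.-](\d{1,2})[/.-](\d{4})")


def classify(value: str) -> str:
    """Return 'DMY', 'MDY', 'ambiguous', or 'invalid' for one date string."""
    match = PATTERN.fullmatch(value.strip())
    if not match:
        return "invalid"
    first, second = int(match.group(1)), int(match.group(2))
    if not (1 <= first <= 31 and 1 <= second <= 31):
        return "invalid"
    if first > 12 and second > 12:
        return "invalid"  # neither position can be a month
    if first > 12:
        return "DMY"  # first position must be the day
    if second > 12:
        return "MDY"  # second position must be the day
    return "ambiguous"  # both positions could be a month


def infer_order(values):
    """Return the order implied by the unambiguous rows, or None if unclear."""
    seen = {classify(v) for v in values} & {"DMY", "MDY"}
    return seen.pop() if len(seen) == 1 else None


# Hypothetical column: one ambiguous row, two rows that reveal day-first order.
sample = ["03/04/2025", "25/12/2024", "13/01/2025"]
print([classify(v) for v in sample])
print(infer_order(sample))
```

When `infer_order` returns `None`, the column either has no revealing rows or mixes both orders, and the source system is the only reliable guide.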

Validate after conversion

After reformatting, sort by date and scan for outliers. Future dates, very old dates, or dates that break the expected pattern often indicate conversion errors.

  • Sort to find min and max dates
  • Check for dates in the future
  • Verify against known events or records
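
These checks are easy to script once the column is in YYYY-MM-DD form. The following is a minimal sketch: the function name `validation_report`, the 1900 cutoff for "very old", and the sample values are all assumptions to adjust for your own data.

```python
from datetime import date


def validation_report(iso_dates, earliest=date(1900, 1, 1), today=None):
    """Summarize a converted date column: range, outliers, and parse failures."""
    today = today or date.today()
    parsed, unparseable = [], []
    for text in iso_dates:
        try:
            parsed.append(date.fromisoformat(text))
        except ValueError:
            unparseable.append(text)
    return {
        "min": min(parsed) if parsed else None,
        "max": max(parsed) if parsed else None,
        "future": [d for d in parsed if d > today],
        "too_old": [d for d in parsed if d < earliest],
        "unparseable": unparseable,
    }


# Hypothetical converted column; "today" is fixed so the result is repeatable.
converted = ["2025-01-12", "2052-01-12", "1025-03-04", "2025-13-01"]
report = validation_report(converted, today=date(2025, 6, 1))
for key, value in report.items():
    print(key, value)
```

A future date is not always wrong (think of scheduled deliveries or expiry dates), so treat the report as a list of rows to review, not rows to delete.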

Key takeaway

Standardize on ISO 8601 (YYYY-MM-DD) and you'll never confuse January 2nd with February 1st again.