Tips

CSV red flags that signal bad data

Warning signs that should make you pause and investigate.

Dec 15, 2024 · 4 min read

Some patterns in data scream 'something is wrong.' Learn to recognize them and you'll catch problems early.

Suspiciously round numbers

Real data is messy. If every value ends in 00 or every percentage is a multiple of 5, the numbers were probably estimated, rounded, or fabricated rather than measured.

  • Real metrics have irregular decimals
  • Too-round numbers suggest estimates
  • Check if precision makes sense
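One rough way to put a number on "too round" is to measure what share of a column lands exactly on a round multiple. The sketch below is only an illustration: the `revenue` column, the sample values, and the `round_share` helper are all made up for this example, and the right multiple to test depends on your data.

```python
import csv
import io


def round_share(values, multiple=100):
    """Return the fraction of numeric values that are exact multiples of `multiple`."""
    numbers = []
    for v in values:
        try:
            numbers.append(float(v))
        except (TypeError, ValueError):
            continue  # skip blanks and non-numeric cells
    if not numbers:
        return 0.0
    round_count = sum(1 for n in numbers if n % multiple == 0)
    return round_count / len(numbers)


# Hypothetical sample data; the "revenue" column is an assumption for this sketch.
sample = io.StringIO("revenue\n1200\n3400\n5000\n1873.42\n")
rows = list(csv.DictReader(sample))

share = round_share([r["revenue"] for r in rows], multiple=100)
print(f"{share:.0%} of revenue values are multiples of 100")  # 75%
```

A high share is not proof of anything on its own. Prices and quotas are often round by design, so compare the result against what the column is supposed to contain.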

Impossible values

Negative ages, birth dates in the future, percentages of a whole over 100. These should never exist but somehow always appear.

  • Age should be positive and reasonable
  • Dates should be within expected range
  • Shares of a whole should fall between 0 and 100 (growth rates can legitimately exceed 100)
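Range checks like these are easy to script. Here is a minimal sketch, assuming a file with `age`, `signup_date`, and `completion_pct` columns. Those names, the 120-year age ceiling, and the sample rows are all assumptions for illustration, so swap in whatever limits make sense for your data.

```python
import csv
import io
from datetime import date

MAX_AGE = 120  # assumed upper bound for a plausible age


def find_impossible(rows, today):
    """Return (row_number, problem) pairs for values outside plausible ranges."""
    problems = []
    for i, row in enumerate(rows, start=1):
        age = int(row["age"])
        if not 0 <= age <= MAX_AGE:
            problems.append((i, f"age {age}"))

        signup = date.fromisoformat(row["signup_date"])
        if signup > today:
            problems.append((i, f"future date {signup}"))

        pct = float(row["completion_pct"])
        if not 0 <= pct <= 100:
            problems.append((i, f"percentage {pct}"))
    return problems


# Hypothetical sample data: the second row breaks all three rules.
sample = io.StringIO(
    "age,signup_date,completion_pct\n"
    "34,2024-03-01,87.5\n"
    "-2,2031-01-01,140\n"
)
rows = list(csv.DictReader(sample))

# A fixed "today" keeps the example reproducible; use date.today() in practice.
problems = find_impossible(rows, today=date(2024, 12, 15))
for row_number, problem in problems:
    print(f"row {row_number}: {problem}")
```

This version assumes every cell parses cleanly. Real files usually need a try/except around the conversions, and a cell that fails to parse is itself worth flagging.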

Too many nulls

Some nulls are normal. A column that's 80% empty suggests a data collection problem or a field no one fills out.

  • Calculate null percentage per column
  • Investigate columns with high nulls
  • Consider dropping mostly-empty columns
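The null percentage per column takes only a few lines to compute. The sketch below treats empty strings and a handful of common placeholders as null. That marker list, the sample columns, and the 80% threshold are assumptions, not a standard.

```python
import csv
import io

# Assumed set of strings that count as "null"; adjust for your data source.
NULL_MARKERS = {"", "null", "na", "n/a", "none"}


def null_percentages(rows, columns):
    """Return {column: percent of rows where the cell is empty or a null marker}."""
    total = len(rows)
    result = {}
    for col in columns:
        empty = sum(
            1 for r in rows
            if (r.get(col) or "").strip().lower() in NULL_MARKERS
        )
        result[col] = 100 * empty / total if total else 0.0
    return result


# Hypothetical sample data: "fax" is the field no one fills out.
sample = io.StringIO(
    "id,email,fax\n"
    "1,a@example.com,\n"
    "2,,\n"
    "3,c@example.com,NULL\n"
    "4,d@example.com,\n"
    "5,e@example.com,555-0100\n"
)
reader = csv.DictReader(sample)
rows = list(reader)

pcts = null_percentages(rows, reader.fieldnames)
mostly_empty = [col for col, pct in pcts.items() if pct >= 80]

for col, pct in pcts.items():
    print(f"{col}: {pct:.0f}% null")
print("Investigate:", mostly_empty)  # ['fax']
```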

Duplicate explosion

When your row count is much higher than expected, look for duplicates. A join on a key that isn't unique, or an import that runs twice, multiplies rows.

  • Compare row count to expectation
  • Check for duplicate key values
  • Verify joins didn't multiply rows
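To see how a join multiplies rows, and how a duplicate-key check catches it, here is a small sketch. The `orders` and `customers` tables, their column names, and the `duplicate_keys` helper are invented for this example.

```python
from collections import Counter


def duplicate_keys(rows, key):
    """Return {key_value: count} for every key value that appears more than once."""
    counts = Counter(r[key] for r in rows)
    return {k: n for k, n in counts.items() if n > 1}


# Hypothetical tables: customer "c2" was imported twice with slightly different names.
orders = [
    {"order_id": "A1", "customer_id": "c1"},
    {"order_id": "A2", "customer_id": "c2"},
]
customers = [
    {"customer_id": "c1", "name": "Ada"},
    {"customer_id": "c2", "name": "Grace"},
    {"customer_id": "c2", "name": "Grace H."},
]

# A naive inner join on customer_id: every matching pair produces a row.
joined = [
    {**o, **c}
    for o in orders
    for c in customers
    if o["customer_id"] == c["customer_id"]
]

print(len(orders), "orders before the join,", len(joined), "rows after")  # 2, 3
print("Duplicate order ids:", duplicate_keys(joined, "order_id"))  # {'A2': 2}
print("Duplicate customer ids:", duplicate_keys(customers, "customer_id"))  # {'c2': 2}
```

Checking the lookup table for duplicate keys before joining is usually cheaper than untangling the multiplied rows afterwards.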

Key takeaway

Trust but verify. Red flags don't always mean bad data, but they always mean you should check.