Guide

How to fix broken CSV encoding

Why you see weird characters and how to get your text back.

Jan 18, 2025 · 5 min read

When étranger turns into Ã©tranger or prices show Â£ instead of £, you have an encoding mismatch. It's fixable once you understand what went wrong.

What encoding actually means

An encoding is a rule for mapping characters to bytes. UTF-8, Latin-1 (ISO-8859-1), and Windows-1252 are common encodings. When a program opens a file with a different encoding from the one it was saved in, the bytes are interpreted as the wrong characters.

  • UTF-8 is the modern standard
  • Latin-1 covers Western European characters
  • Windows-1252 is common in older Windows exports
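
To make the mismatch concrete, here is a small Python sketch showing that the same character is stored as different bytes in different encodings, and what happens when UTF-8 bytes are decoded as Latin-1. The variable names are only illustrative.

```python
# One character, two encodings: the stored bytes differ,
# so the decoder must match the encoder.
text = "é"

utf8_bytes = text.encode("utf-8")        # two bytes: 0xC3 0xA9
latin1_bytes = text.encode("latin-1")    # one byte:  0xE9

# Decoding UTF-8 bytes as Latin-1 turns each byte into its own character.
misread = utf8_bytes.decode("latin-1")   # "Ã©"

# Decoding with the matching encoding round-trips cleanly.
correct = utf8_bytes.decode("utf-8")     # "é"

print(utf8_bytes, latin1_bytes, misread, correct)
```

The two-byte UTF-8 sequence is why one accented letter becomes two garbled characters on screen.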

Spot the signs

Garbled characters follow patterns. é usually means UTF-8 read as Latin-1. Blank boxes or question marks mean missing characters. Once you recognize the pattern, you can guess the correct encoding.

  • Ã followed by a character suggests UTF-8 misread
  • Diamond question marks mean unknown characters
  • Blank boxes indicate missing font glyphs

Re-open with correct encoding

Most CSV tools let you specify encoding on import. Try UTF-8 first, then Latin-1, then Windows-1252. One of these usually fixes the problem.

  • UTF-8 handles most modern files
  • Latin-1 works for older European data
  • Windows-1252 covers legacy Windows exports
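
When you are working in a script rather than a CSV tool, the same trial-and-error can be sketched in Python. `decode_with_fallback` is a hypothetical helper. It tries Latin-1 last rather than second, because Python's Latin-1 decoder accepts every possible byte and would otherwise always win before Windows-1252 is tried.

```python
def decode_with_fallback(raw: bytes,
                         encodings=("utf-8", "cp1252", "latin-1")):
    """Return (text, encoding_name) for the first encoding that decodes cleanly."""
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue  # not valid in this encoding, try the next one
    raise ValueError("none of the candidate encodings could decode the data")


# Example with in-memory bytes; for a real file, read it with open(path, "rb").
text, used = decode_with_fallback(b"caf\xe9,\xa34.50")
print(used, text)  # cp1252 café,£4.50
```

A clean decode only shows that the bytes are valid in that encoding, not that it is the right one, so still check the accented characters by eye.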

Convert to UTF-8 for safety

Once you identify the correct encoding, convert the file to UTF-8 for future compatibility. This prevents the same problem from recurring.

  • UTF-8 is universally supported
  • Save with a BOM (byte order mark) so Excel recognizes the file as UTF-8
  • Verify special characters after conversion
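
A minimal conversion sketch in Python, assuming you have already identified the source encoding. `convert_to_utf8` and the file names in the usage comment are hypothetical. Python's `utf-8-sig` codec writes the BOM for you.

```python
def convert_to_utf8(src: str, dst: str, source_encoding: str,
                    with_bom: bool = True) -> None:
    """Re-save a text file as UTF-8, optionally with a BOM for Excel."""
    # newline="" keeps the original line endings untouched, which matters
    # for CSV files that contain line breaks inside quoted fields.
    with open(src, "r", encoding=source_encoding, newline="") as f:
        text = f.read()

    target = "utf-8-sig" if with_bom else "utf-8"
    with open(dst, "w", encoding=target, newline="") as f:
        f.write(text)


# Usage (hypothetical file names):
# convert_to_utf8("export.csv", "export-utf8.csv", "cp1252")
```

Writing to a new file rather than overwriting the original keeps the source available if the chosen encoding turns out to be wrong.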

Key takeaway

Encoding issues look scary but follow predictable patterns. Try UTF-8 first, then work backwards to older encodings.