Guide
How to fix broken CSV encoding
Why you see weird characters and how to get your text back.
What encoding actually means
An encoding is a scheme that maps characters to the bytes stored in a file. UTF-8, Latin-1, and Windows-1252 are common encodings. When you open a file with the wrong encoding, its bytes are interpreted as different characters from the ones that were written.
- UTF-8 is the modern standard
- Latin-1 covers Western European characters
- Windows-1252 is common in older Windows exports
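A minimal Python sketch of the idea above, showing how one character becomes different bytes under each encoding. The variable names are illustrative only; `cp1252` is Python's codec name for Windows-1252.

```python
# The same character is stored as different bytes depending on the encoding.
text = "\u00e9"  # e with acute accent

utf8_bytes = text.encode("utf-8")      # b'\xc3\xa9' (two bytes)
latin1_bytes = text.encode("latin-1")  # b'\xe9' (one byte)
cp1252_bytes = text.encode("cp1252")   # b'\xe9' (same as Latin-1 here)

print(utf8_bytes, latin1_bytes, cp1252_bytes)

# Latin-1 and Windows-1252 are not identical: the euro sign exists in
# Windows-1252 (byte 0x80) but cannot be encoded in Latin-1 at all.
euro_cp1252 = "\u20ac".encode("cp1252")
try:
    "\u20ac".encode("latin-1")
    euro_in_latin1 = True
except UnicodeEncodeError:
    euro_in_latin1 = False

print(euro_cp1252, euro_in_latin1)
```

Because the byte values differ, a reader that assumes the wrong encoding will turn those bytes back into the wrong characters.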
Spot the signs
Garbled characters follow patterns. é usually means UTF-8 read as Latin-1. Blank boxes or question marks mean missing characters. Once you recognize the pattern, you can guess the correct encoding.
- Ã followed by a character suggests UTF-8 misread
- Diamond question marks mean unknown characters
- Blank boxes indicate missing font glyphs
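The first two patterns can be reproduced in a few lines of Python. This is a sketch for illustration, with a made-up sample word; it also shows that the first kind of damage is reversible as long as the garbled text has not been saved again in a lossy way.

```python
original = "caf\u00e9"  # "cafe" with an accented final e

# Pattern 1: UTF-8 bytes decoded as Latin-1.
# The two-byte sequence C3 A9 turns into two characters, U+00C3 and U+00A9.
misread = original.encode("utf-8").decode("latin-1")

# Pattern 2: Latin-1 bytes decoded as UTF-8.
# The lone byte E9 is not valid UTF-8, so with errors="replace" it becomes
# U+FFFD, the diamond question mark.
replaced = original.encode("latin-1").decode("utf-8", errors="replace")

# Pattern 1 can be undone by reversing the wrong step.
repaired = misread.encode("latin-1").decode("utf-8")

# Pattern 2 cannot be undone from the text alone: U+FFFD no longer records
# which byte was there, so go back to the original file instead.
```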
Encoding handled automatically
Readable CSV detects encoding and displays your text correctly without manual configuration.
Re-open with correct encoding
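Most CSV tools let you specify the encoding on import. Try UTF-8 first, then Latin-1, then Windows-1252, and check the text each time: Latin-1 can decode any sequence of bytes, so the absence of an error does not prove the choice is right. One of these three usually fixes the problem.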
- UTF-8 handles most modern files
- Latin-1 works for older European data
- Windows-1252 covers legacy Windows exports
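If you prefer to check the candidates in a script rather than in an import dialog, the approach might look like the sketch below. The function names `preview_decodings` and `parse_rows` and the sample bytes are hypothetical, chosen for this example. The script decodes the same bytes with each candidate so you can compare the results, since a clean Latin-1 decode on its own tells you nothing.

```python
import csv
import io

# Candidates in the order suggested above; "cp1252" is Python's name
# for Windows-1252.
CANDIDATES = ("utf-8", "latin-1", "cp1252")


def preview_decodings(raw, candidates=CANDIDATES):
    """Map each candidate encoding to the decoded text, or None if it fails."""
    previews = {}
    for encoding in candidates:
        try:
            previews[encoding] = raw.decode(encoding)
        except UnicodeDecodeError:
            previews[encoding] = None
    return previews


def parse_rows(text):
    """Parse already-decoded CSV text into a list of rows."""
    return list(csv.reader(io.StringIO(text, newline="")))


# Made-up sample: a legacy Windows export containing an accented
# letter (byte E9) and a euro sign (byte 80).
sample = b"name,price\ncaf\xe9,\x805\n"

for encoding, text in preview_decodings(sample).items():
    status = "failed to decode" if text is None else ascii(text)
    print(encoding, "->", status)
```

For this sample, UTF-8 fails outright, Latin-1 decodes but turns the euro sign into an invisible control character, and only Windows-1252 gives the intended text. That is why the result needs a visual check.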
Convert to UTF-8 for safety
Once you identify the correct encoding, convert the file to UTF-8 for future compatibility. This prevents the same problem from recurring.
- UTF-8 is universally supported
- Save with a BOM (byte order mark) if the file will be opened in Excel, which may otherwise not recognize it as UTF-8
- Verify special characters after conversion
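The conversion step can be sketched in Python as follows, assuming you have already identified the source encoding. The helper names (`to_utf8_bytes`, `convert_file`, `has_mojibake_markers`) are hypothetical. The marker check is only a rough heuristic, and it can flag legitimate text that happens to contain those characters.

```python
import re
from pathlib import Path

# Heuristic only: U+FFFD, or U+00C2/U+00C3 followed by a character in the
# range U+0080 to U+00BF, which is what UTF-8 read as Latin-1 looks like.
_MOJIBAKE = re.compile("[\u00c2\u00c3][\u0080-\u00bf]|\ufffd")


def has_mojibake_markers(text):
    """Return True if the text contains typical signs of a wrong decode."""
    return _MOJIBAKE.search(text) is not None


def to_utf8_bytes(raw, source_encoding, bom=True):
    """Decode raw bytes from source_encoding and re-encode them as UTF-8.

    With bom=True the output starts with the UTF-8 byte order mark,
    which Python writes via the "utf-8-sig" codec.
    """
    text = raw.decode(source_encoding)
    if has_mojibake_markers(text):
        raise ValueError("decoded text looks garbled; check source_encoding")
    return text.encode("utf-8-sig" if bom else "utf-8")


def convert_file(src, dst, source_encoding, bom=True):
    """Convert the file at src to UTF-8 and write the result to dst."""
    raw = Path(src).read_bytes()
    Path(dst).write_bytes(to_utf8_bytes(raw, source_encoding, bom))
```

Writing to a separate destination path keeps the original file intact, so you can retry with a different source encoding if the check fails or the output still looks wrong.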
Key takeaway
Encoding issues look scary but follow predictable patterns. Try UTF-8 first, then work backwards to older encodings.