Comparison

UTF-8 vs Latin-1: CSV encoding guide

Character encoding explained for people who just want their text to work.

Dec 27, 2024 · 5 min read
Encoding errors turn names into garbage and break imports. Understanding the basics prevents most problems.

What encoding actually does

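An encoding maps characters to numbers, which are stored as bytes. Different encodings use different mappings, so opening a file with the wrong encoding reads the bytes as the wrong characters. For example, the UTF-8 bytes for "é" show up as "Ã©" when read as Latin-1.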

  • Encoding = character-to-number mapping
  • Same bytes, different encoding = different text
  • Mismatch causes garbled characters
  • Correct encoding restores text
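
The points above can be seen in a few lines of Python. This is a minimal sketch; the variable names are illustrative only.

```python
# The same two bytes, read under two different encodings.
text = "é"
raw = text.encode("utf-8")           # b'\xc3\xa9'

as_utf8 = raw.decode("utf-8")        # correct encoding: 'é'
as_latin1 = raw.decode("latin-1")    # wrong encoding: 'Ã©'

# Undoing the wrong decode and applying the right one restores the text.
restored = as_latin1.encode("latin-1").decode("utf-8")

print(raw, as_utf8, as_latin1, restored)
```

The bytes never change. Only the interpretation does, which is why a garbled file can usually be recovered.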

UTF-8: the modern standard

UTF-8 can encode every Unicode character, which covers virtually every writing system in use. It's backwards compatible with ASCII and is the default for most modern systems.

  • Supports all languages
  • Backwards compatible with ASCII
  • Variable width (1-4 bytes per character)
  • The safe default choice
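
As a rough illustration of variable width and ASCII compatibility, the sketch below measures how many bytes a few sample characters take in UTF-8. The sample characters are arbitrary choices for this example.

```python
# One sample character for each UTF-8 width, from 1 to 4 bytes.
samples = ["A", "é", "€", "\U0001F600"]   # the last one is an emoji
widths = [len(ch.encode("utf-8")) for ch in samples]
print(widths)  # [1, 2, 3, 4]

# Pure ASCII text produces identical bytes in ASCII and UTF-8.
ascii_text = "name,city"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")
print(same_bytes)  # True
```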

Latin-1 and Windows-1252

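These older encodings cover Western European characters but cannot represent most other scripts. Legacy systems and older exports often use them. The two are nearly identical: Windows-1252 puts printable characters such as € and curly quotes in the byte range 0x80 to 0x9F, where Latin-1 has control characters.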

  • Western European characters only
  • Fixed width (1 byte per character)
  • Common in older Windows exports
  • Often mislabeled as UTF-8
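
The sketch below illustrates these limits using Python's built-in `latin-1` and `cp1252` codecs (`cp1252` is Python's name for Windows-1252). The helper `can_encode` is a hypothetical name introduced for this example.

```python
# Byte 0x80 decodes differently under the two encodings.
byte = b"\x80"
in_cp1252 = byte.decode("cp1252")    # '€'
in_latin1 = byte.decode("latin-1")   # U+0080, an invisible control character

# Fixed width: one byte per character for Western European text.
one_byte_each = len("café".encode("latin-1")) == len("café")

def can_encode(text, encoding):
    """Return True if every character in text exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

print(can_encode("café", "latin-1"))    # True
print(can_encode("日本語", "latin-1"))  # False
print(can_encode("€", "latin-1"))       # False
print(can_encode("€", "cp1252"))        # True
```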

How to fix encoding issues

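When you see garbled text, try opening the file with different encodings. Start with UTF-8: a strict UTF-8 decoder rejects invalid bytes, whereas Latin-1 accepts any byte sequence and can silently produce the wrong text. Once you find the right encoding, convert the file to UTF-8 so other tools read it correctly.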

  • Try UTF-8 first
  • Then try Latin-1 or Windows-1252
  • Convert to UTF-8 once identified
  • Verify special characters display correctly
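
The steps above could be sketched in Python roughly as follows. This is one possible approach, not a definitive implementation: the function names, the candidate order, and the file paths passed in are all assumptions made for the example.

```python
from pathlib import Path

# Assumed candidate order: strictest first. Latin-1 goes last because it
# accepts any byte sequence and therefore never fails.
CANDIDATES = ("utf-8", "cp1252", "latin-1")

def decode_with_fallback(raw, candidates=CANDIDATES):
    """Return (text, encoding) for the first candidate that decodes cleanly."""
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")

def convert_to_utf8(src, dst):
    """Read src, guess its encoding, write dst as UTF-8, return the guess."""
    raw = Path(src).read_bytes()
    text, encoding = decode_with_fallback(raw)
    Path(dst).write_bytes(text.encode("utf-8"))
    return encoding
```

A clean decode is not proof that the guess is right, since Latin-1 will decode anything. The last step in the list still applies: check that names and other special characters look correct before relying on the converted file.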

Key takeaway

Use UTF-8 for new files. For existing files, try different encodings until the text looks right.