Charset Converter

Free Charset Converter

Quick Tips

  • This tool runs entirely in your browser - your data stays private.
  • Press Ctrl+V (Cmd+V on Mac) to quickly paste text.
  • Use the Copy button to save your result to clipboard.

Convert text between different character encodings (UTF-8, Latin-1, etc.).

Examples

Input
é
Output
Ã© (UTF-8 misread as Latin-1)
Input
café
Output
caf\xe9 (UTF-8 to Latin-1 bytes)
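
The two examples above can be reproduced outside the tool. The following is a minimal Python sketch using only the built-in `str.encode` and `bytes.decode` methods; the variable names are illustrative and are not part of the tool itself.

```python
# Example 1: the UTF-8 bytes for "é" misread as Latin-1.
utf8_bytes = "\u00e9".encode("utf-8")      # "é" becomes b'\xc3\xa9'
misread = utf8_bytes.decode("latin-1")     # two characters: "Ã" and "©"

# Example 2: "café" converted to Latin-1 bytes.
latin1_bytes = "caf\u00e9".encode("latin-1")

print(misread)        # Ã©
print(latin1_bytes)   # b'caf\xe9'
```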

Why Use This Tool?

What problems does this solve?

Character encoding mismatches cause garbled text. This tool helps identify the correct encoding and convert between formats, fixing Mojibake and enabling data migration from legacy systems.

Common use cases:

  • Fixing garbled text (Mojibake) from encoding mismatches
  • Converting legacy database exports to UTF-8
  • Diagnosing encoding issues in file imports
  • Preparing text for systems requiring specific encodings
  • Testing how text survives encoding conversions
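
As an illustration of the last use case, a round-trip check can show whether text survives a given encoding. This is a sketch only; `survives` is a hypothetical helper name, not a function of the tool.

```python
def survives(text: str, encoding: str) -> bool:
    """Return True if text can be encoded with `encoding` and decoded back unchanged."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeEncodeError:
        # At least one character has no representation in this encoding.
        return False


print(survives("caf\u00e9", "latin-1"))         # True
print(survives("caf\u00e9 \u20ac", "latin-1"))  # False: Latin-1 has no Euro sign
print(survives("caf\u00e9 \u20ac", "cp1252"))   # True: Windows-1252 has one
```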

Who benefits from this tool?

Developers migrating legacy systems, database administrators fixing encoding issues, and anyone dealing with garbled international text.

Privacy first: All conversion happens locally in your browser. Your text never leaves your device.

Frequently Asked Questions

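What is Mojibake?

Mojibake is garbled text produced when bytes are decoded with the wrong character encoding. For example, UTF-8 text decoded as Latin-1 shows "Ã©" instead of "é". The reverse mistake, Latin-1 text decoded as UTF-8, typically produces invalid sequences that are shown as replacement characters (U+FFFD).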

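How can I tell which encoding my text uses?

Look for byte patterns. In UTF-8, every non-ASCII character is a lead byte (0xC2-0xF4) followed by one to three continuation bytes (0x80-0xBF), so data that decodes cleanly as UTF-8 is very likely UTF-8. Latin-1 stores each accented character as a single byte. The tool can suggest likely encodings based on byte analysis.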
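
One way to sketch this kind of byte analysis is shown below. It is a rough heuristic under stated assumptions, not the tool's actual detection logic: `guess_encoding` is a hypothetical helper, it only distinguishes four candidates, and real-world detection usually needs statistical checks as well.

```python
def guess_encoding(data: bytes) -> str:
    """Heuristic guess among ASCII, UTF-8, Windows-1252 and Latin-1."""
    if data.isascii():
        return "ascii"
    try:
        # Strict decoding fails on any malformed multibyte sequence.
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass
    # Bytes 0x80-0x9F are control characters in Latin-1 but mostly
    # printable in Windows-1252, so their presence hints at the latter.
    if any(0x80 <= b <= 0x9F for b in data):
        return "cp1252"
    return "latin-1"


print(guess_encoding("caf\u00e9".encode("utf-8")))  # utf-8
print(guess_encoding(b"caf\xe9"))                   # latin-1
print(guess_encoding(b"\x93hi\x94"))                # cp1252
```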

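Why does UTF-8 text look garbled when read as Latin-1?

UTF-8 encodes each non-ASCII character as two to four bytes. If software decodes each of those bytes as a separate Latin-1 character, one accented letter turns into several garbled characters. For example, "é" is the bytes C3 A9 in UTF-8, which Latin-1 renders as "Ã©".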
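
Because every Latin-1 character maps back to exactly one byte, this kind of damage can often be reversed by re-encoding the garbled text as Latin-1 and decoding the result as UTF-8. The sketch below assumes the text was garbled exactly once in this way; `fix_mojibake` is a hypothetical helper name.

```python
def fix_mojibake(garbled: str) -> str:
    """Undo one round of 'UTF-8 bytes decoded as Latin-1'; return input unchanged on failure."""
    try:
        return garbled.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text was not produced by this particular mix-up.
        return garbled


print(fix_mojibake("caf\u00c3\u00a9"))  # "cafÃ©" is repaired to "café"
print(fix_mojibake("caf\u00e9"))        # already correct, returned unchanged
```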

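What happens to characters that Latin-1 cannot represent?

Latin-1 defines only 256 code points, one per byte value. Characters outside this set (emoji, CJK, many symbols) cannot be represented. Depending on the converter, they are dropped, replaced with question marks, or reported as an error.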
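
The loss can be seen with Python's built-in error handlers. This is only an illustration of the general behavior; the tool's own replacement policy may differ.

```python
text = "na\u00efve \u2713 \u65e5\u672c"  # "naïve ✓ 日本"

# Strict encoding (the default) raises instead of losing data silently.
try:
    text.encode("latin-1")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# errors="replace" substitutes "?" for each unrepresentable character.
lossy = text.encode("latin-1", errors="replace")

print(strict_ok)  # False
print(lossy)      # b'na\xefve ? ??'
```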

Should I always use UTF-8?

Yes, for new projects. UTF-8 can encode every Unicode character and is the standard encoding on the web. Use other encodings only for compatibility with legacy systems.

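What is the difference between Latin-1 and Windows-1252?

Windows-1252 extends Latin-1 by assigning printable characters (smart quotes, the Euro sign, etc.) to most of the 128-159 (0x80-0x9F) range, where Latin-1 has control characters. The two encodings are identical everywhere else, so they are often confused with each other.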
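
The difference is easy to see by decoding the same bytes both ways. A short Python sketch, using the standard codec names `cp1252` and `latin-1`:

```python
raw = b"\x93quoted\x94 \x80"

as_cp1252 = raw.decode("cp1252")    # curly quotes and the Euro sign
as_latin1 = raw.decode("latin-1")   # invisible C1 control characters instead

print(as_cp1252)
print([hex(ord(ch)) for ch in as_latin1 if ord(ch) >= 0x80])  # ['0x93', '0x94', '0x80']

# From 0xA0 upward the two encodings agree.
print(b"\xe9".decode("cp1252") == b"\xe9".decode("latin-1"))  # True
```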