How to Fix Character Encoding Issues

Last updated: February 8, 2026 | Reading time: 12 minutes

Character encoding problems can turn perfectly good text into unreadable gibberish. Whether you're dealing with garbled Chinese characters in Excel, broken Japanese text on a website, or mysterious symbols in your database, this guide will help you fix it.

🎯 Quick Fix: If you just need to fix garbled text right now, use our Free Garbled Text Fixer →

Understanding the Problem

Character encoding issues occur when text is saved in one encoding but read in another. It's like writing a letter in Spanish and having someone read it as if it were French: the words don't make sense.
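To make the mismatch concrete, here is a minimal sketch in Python. The sample word is only an illustration: the same bytes produce readable or garbled text depending on which encoding is used to read them.

```python
# Text is stored as bytes; the encoding decides which bytes are written.
original = "café"
data = original.encode("utf-8")      # b'caf\xc3\xa9'

# Reading the same bytes with the wrong encoding yields gibberish.
wrong = data.decode("cp1252")        # 'cafÃ©'

# Reading them with the encoding they were written in restores the text.
right = data.decode("utf-8")         # 'café'

print(wrong, right)
```

Nothing is damaged at this point. The bytes are intact, and only the interpretation is wrong, which is why most garbled text can be repaired.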

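Common Symptoms:

  • Sequences like â€™ or â€œ where quotation marks should be
  • Chinese text showing up as accented Latin letters, such as ÄãºÃ
  • Runs of question marks (???) or empty boxes (□□□)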

Fix #1: Text Files and Documents

1. Open in a Text Editor with Encoding Support

Use editors like Notepad++, VS Code, or Sublime Text that let you change encoding.

2. Try Different Encodings

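In the editor menu, look for "Encoding" or "Character Set" and try the likely candidates one at a time, for example UTF-8, GBK, or Big5, until the text displays correctly.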

3. Save as UTF-8

Once you find the encoding that displays text correctly, immediately save the file as UTF-8 to prevent future issues.
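The same three steps can be scripted when you have many files. The following is a rough sketch in Python, not a definitive detector: the function names and the candidate list are assumptions chosen for illustration, and you should adjust the list to the languages you expect.

```python
from pathlib import Path

# Candidate encodings, tried in order. This list is an assumption;
# put the encodings you consider most likely first.
CANDIDATES = ("utf-8", "gb18030", "big5", "shift_jis")

def decode_with_candidates(data: bytes, candidates=CANDIDATES):
    """Return (text, encoding) for the first candidate that decodes without error."""
    for name in candidates:
        try:
            return data.decode(name), name
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings matched")

def convert_to_utf8(src: str, dst: str) -> str:
    """Re-save src as UTF-8 at dst and return the encoding that was used to read it."""
    text, detected = decode_with_candidates(Path(src).read_bytes())
    Path(dst).write_text(text, encoding="utf-8")
    return detected
```

Decoding without an error does not prove the guess is right, because some legacy encodings accept almost any byte sequence. Look at the converted text before discarding the original file.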

Fix #2: Excel and CSV Files

Excel often causes encoding issues with Chinese, Japanese, or special characters.

Opening a Garbled CSV in Excel:

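1. Don't Double-Click the CSV

Opening the file directly makes Excel guess the encoding, and it often guesses wrong for files that are not in your system's default code page.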

2. Import via Data Tab

  1. Open Excel
  2. Go to Data → Get Data → From File → From Text/CSV
  3. Select your file
  4. In the import dialog, change File Origin to:
    • 65001: Unicode (UTF-8) for UTF-8 files
    • 936: Chinese Simplified (GB2312) for GBK files
    • 950: Chinese Traditional (Big5) for Big5 files
  5. Click Load

Saving Excel to UTF-8 CSV:

File → Save As → CSV UTF-8 (Comma delimited) (*.csv)
⚠️ Warning: Regular "CSV (Comma delimited)" in Excel does NOT save as UTF-8. Always use "CSV UTF-8" if you have non-English characters.
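If the CSV is produced by a script rather than by Excel, a similar result can usually be had by writing UTF-8 with a byte order mark (BOM), which Excel typically uses to recognise the file as UTF-8. A minimal Python sketch follows; the function names are hypothetical, while `utf-8-sig` is the standard Python codec that adds or strips the BOM.

```python
import csv

def write_excel_friendly_csv(path, rows):
    """Write rows as UTF-8 with a leading BOM (bytes EF BB BF)."""
    # newline="" is what the csv module documentation asks for when writing.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        csv.writer(f).writerows(rows)

def read_csv_any_bom(path):
    """Read a UTF-8 CSV whether or not it starts with a BOM."""
    # "utf-8-sig" strips the BOM if present and is harmless if it is absent.
    with open(path, "r", encoding="utf-8-sig", newline="") as f:
        return list(csv.reader(f))
```

Some non-Excel tools treat the BOM as part of the first column name, so only write it when Excel is the intended reader.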

Fix #3: Database Encoding Issues

MySQL / MariaDB

Check current encoding:

SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';

Set database to UTF-8:

ALTER DATABASE your_database CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE your_table CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

In your PHP connection code:

// mysqli
mysqli_set_charset($conn, "utf8mb4");
// or in PDO, via the DSN
$pdo = new PDO('mysql:host=localhost;dbname=test;charset=utf8mb4', $user, $pass);
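💡 Tip: Use utf8mb4 (not just utf8) in MySQL. MySQL's utf8 is an alias for utf8mb3, which stores at most 3 bytes per character and so cannot hold emoji or other characters outside the Basic Multilingual Plane. utf8mb4 supports the full Unicode range.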
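The difference comes down to how many bytes a character needs in UTF-8. As a small illustration, this Python sketch (with hypothetical helper names) checks whether a string would fit within a 3-byte-per-character limit:

```python
def max_utf8_bytes(text: str) -> int:
    """Byte length of the longest single character when encoded as UTF-8."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)

def fits_three_byte_limit(text: str) -> bool:
    """True if every character needs at most 3 bytes in UTF-8."""
    return max_utf8_bytes(text) <= 3

print(fits_three_byte_limit("你好"))      # these CJK characters take 3 bytes each
print(fits_three_byte_limit("你好 😀"))   # the emoji takes 4 bytes
```

A check like this can flag rows that would be rejected or truncated by a column that has not yet been converted to utf8mb4.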

PostgreSQL

-- Check encoding
SHOW SERVER_ENCODING;
-- Create database with UTF-8
CREATE DATABASE mydb ENCODING 'UTF8';

Fix #4: Website Encoding Issues

HTML Files

Add this in the <head> section:

<meta charset="UTF-8">
⚠️ Important: This meta tag must be within the first 1024 bytes of your HTML file. Place it as early as possible in the <head>.
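The 1024-byte rule is easy to check automatically. Below is a simplified Python sketch under stated assumptions: the function name and the regular expression are illustrative, and the pattern only recognises the short `<meta charset=...>` form, not the older `http-equiv` variant.

```python
import re

# Matches <meta charset="..."> with optional quotes, case-insensitively.
META_CHARSET = re.compile(
    rb'<meta\s+charset\s*=\s*["\']?([A-Za-z0-9_-]+)', re.IGNORECASE
)

def charset_in_first_1024(html_bytes: bytes):
    """Return the declared charset if it is found within the first 1024 bytes, else None."""
    match = META_CHARSET.search(html_bytes[:1024])
    return match.group(1).decode("ascii").lower() if match else None
```

A result of None for a page that does contain the tag suggests that large comments, inline scripts, or styles are pushing it too far down the file.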

HTTP Headers

Apache (.htaccess):

AddDefaultCharset UTF-8

Nginx:

charset utf-8;

PHP:

header('Content-Type: text/html; charset=utf-8');
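After changing the server configuration, it helps to confirm what the Content-Type header actually says. This Python sketch extracts the charset parameter from a header value using the standard library's `email.message.Message`. The function name is hypothetical, and fetching the header from your server is left out.

```python
from email.message import Message

def charset_from_content_type(header_value: str, default=None):
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header_value
    # Returns the charset in lower case, or `default` if none is declared.
    return msg.get_content_charset(default)

print(charset_from_content_type("text/html; charset=UTF-8"))
```

If this returns the default for your pages, the server is not declaring a charset and browsers fall back to the meta tag or to guessing.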

Fix #5: Email Encoding Problems

Emails with garbled subjects or body text usually have encoding issues in MIME headers.

Fixing Email Subject Lines:

Email subjects should be encoded using RFC 2047 format:

=?UTF-8?B?5L2g5aW977yB?=
(decodes to: 你好！)

If you see raw encoded text like this in your email subject, your email client isn't decoding it properly. Try a different email client or use our decoder tool.
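To inspect such a subject yourself, the encoded-word can be decoded with Python's standard `email.header` module. This is a minimal sketch; the two wrapper function names are assumptions for illustration.

```python
from email.header import Header, decode_header, make_header

def decode_subject(raw: str) -> str:
    """Decode RFC 2047 encoded-words in a header value into plain text."""
    return str(make_header(decode_header(raw)))

def encode_subject(text: str) -> str:
    """Encode text as RFC 2047 encoded-words using UTF-8."""
    return Header(text, "utf-8").encode()

print(decode_subject("=?UTF-8?B?5L2g5aW977yB?="))
```

Round-tripping a subject through `encode_subject` and `decode_subject` is a quick way to confirm that a sending script produces headers a compliant client can read.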

PHP Mail Example:

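// Encode the subject as an RFC 2047 encoded-word
$subject = "=?UTF-8?B?" . base64_encode($subject) . "?=";
// Declare the body as UTF-8 HTML
$headers = "MIME-Version: 1.0\r\n";
$headers .= "Content-Type: text/html; charset=UTF-8\r\n";
mail($to, $subject, $message, $headers);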

Fix #6: Programming Language Specifics

Python

# Reading files
with open('file.txt', 'r', encoding='utf-8') as f:
    content = f.read()

# Writing files
with open('output.txt', 'w', encoding='utf-8') as f:
    f.write(text)

JavaScript (Node.js)

const fs = require('fs');

// Reading
const text = fs.readFileSync('file.txt', 'utf8');

// Writing
fs.writeFileSync('output.txt', text, 'utf8');

Java

import java.io.*;
import java.nio.charset.StandardCharsets;

// Reading
BufferedReader reader = new BufferedReader(
    new InputStreamReader(new FileInputStream("file.txt"), StandardCharsets.UTF_8)
);

// Writing
BufferedWriter writer = new BufferedWriter(
    new OutputStreamWriter(new FileOutputStream("output.txt"), StandardCharsets.UTF_8)
);

Prevention Tips

✅ Best Practices:

  1. Always use UTF-8 for new projects, files, and databases
  2. Declare encoding explicitly in HTML, HTTP headers, and database connections
  3. Test with international characters before going live
  4. Use modern tools that default to UTF-8
  5. Validate data entry to ensure proper encoding from the start
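For the validation step, one simple approach is to reject input that is not well-formed UTF-8 before it reaches storage. A minimal Python sketch, with a hypothetical function name:

```python
def is_valid_utf8(data: bytes) -> bool:
    """True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True

print(is_valid_utf8("你好".encode("utf-8")))  # well-formed UTF-8
print(is_valid_utf8("你好".encode("gbk")))    # GBK bytes are not valid UTF-8 here
```

This only catches malformed bytes. Text that was already garbled and then saved as UTF-8 will still pass, so it complements the other practices rather than replacing them.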

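❌ Avoid:

  1. Double-clicking CSV files to open them in Excel
  2. Saving as plain "CSV (Comma delimited)" when the data contains non-English characters
  3. Using MySQL's utf8 instead of utf8mb4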

Quick Diagnosis Flowchart

Q: Do you see â€™ or â€œ?

→ UTF-8 text displayed as Windows-1252. Fix: Decode as UTF-8.
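In code, this fix means turning the garbled characters back into their original bytes and then decoding those bytes as UTF-8. A hedged Python sketch, with a hypothetical function name:

```python
def repair_utf8_read_as_cp1252(garbled: str) -> str:
    """Undo one round of 'UTF-8 bytes decoded as Windows-1252'."""
    try:
        return garbled.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of damage, or already clean: leave the text untouched.
        return garbled

print(repair_utf8_read_as_cp1252("Itâ€™s"))
```

Python's cp1252 codec leaves a few byte values undefined, so characters whose UTF-8 form contains one of those bytes may not round-trip with this approach and are returned unchanged.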

Q: Do you see Chinese as ÄãºÃ?

→ GBK text displayed as Latin-1. Fix: Decode as GBK.
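The same idea works for any pair of encodings as long as the wrong decoding did not lose bytes. Latin-1 maps every byte to a character, so it is always reversible. A small general-purpose sketch in Python, where the function name and parameter names are assumptions:

```python
def reinterpret(garbled: str, read_as: str, actual: str) -> str:
    """Recover text whose `actual`-encoded bytes were decoded as `read_as`."""
    original_bytes = garbled.encode(read_as)   # back to the bytes on disk
    return original_bytes.decode(actual)       # decode them correctly this time

print(reinterpret("ÄãºÃ", read_as="latin-1", actual="gbk"))
```

Unlike a silent fallback, this version raises an exception when the guess is wrong, which is useful while you are still working out which pair of encodings is involved.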

Q: Do you see ??? or □□□?

→ Original bytes lost during save. Cannot be recovered. Prevention only.
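The reason this case is unrecoverable can be shown in a few lines of Python. The sketch below illustrates one common way the loss happens (saving to an encoding that cannot represent the characters), not the only one.

```python
text = "你好"

# errors="replace" substitutes "?" for every character the target encoding
# cannot represent, so the original characters are gone from the output.
lossy = text.encode("ascii", errors="replace")            # b'??'

# Decoding cannot bring them back: only question marks remain.
print(lossy.decode("ascii"))

# A lossless alternative keeps the information as escape sequences.
safe = text.encode("ascii", errors="backslashreplace")    # b'\\u4f60\\u597d'
print(safe.decode("unicode_escape"))
```

Every distinct character collapses to the same "?", so no tool can tell afterwards what was there. The escaped form is less pleasant to read but keeps enough information to restore the text.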

Still Having Issues?

Use our free automatic fixer to repair garbled text instantly.
