URL Encoding Unicode, Emoji, and International Characters

Last updated: April 2026

Table of Contents

  1. How UTF-8 Percent-Encoding Works
  2. Encoding in Code
  3. Internationalized Domain Names
  4. Emoji in URLs
  5. Frequently Asked Questions

Unreserved ASCII characters (letters, digits, and - . _ ~) pass through a URL unchanged, and any other ASCII character that needs escaping becomes a single %XX code (a space becomes %20). Non-ASCII characters, such as accented letters like é, CJK ideographs like 日, Arabic script, and emoji, need more work. They're first converted to UTF-8 byte sequences, then every byte is percent-encoded individually.

The result looks long and opaque, but it's completely standard and all modern servers decode it correctly.

How Non-ASCII Characters Are Encoded

The process:

  1. Take the character (e.g., é, U+00E9)
  2. Convert it to its UTF-8 byte sequence: é → 0xC3 0xA9 (two bytes)
  3. Percent-encode each byte: %C3%A9
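
The three steps above can be sketched by hand in Python. This is only an illustration of the mechanism; percent_encode_char is a hypothetical helper name, and it escapes every byte it is given, so it is meant for non-ASCII input rather than as a replacement for a real encoding function.

```python
def percent_encode_char(ch: str) -> str:
    """Percent-encode text by hand: UTF-8 bytes first, then %XX per byte."""
    utf8_bytes = ch.encode("utf-8")          # step 2: character -> UTF-8 bytes
    return "".join(f"%{byte:02X}" for byte in utf8_bytes)  # step 3: byte -> %XX


print(percent_encode_char("é"))   # %C3%A9
print(percent_encode_char("😀"))  # %F0%9F%98%80
```

The number of %XX groups always equals the number of UTF-8 bytes, which is why é yields two groups and 😀 yields four.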

More examples:

Character    UTF-8 Bytes    Encoded
é            C3 A9          %C3%A9
ü            C3 BC          %C3%BC
日           E6 97 A5       %E6%97%A5
中           E4 B8 AD       %E4%B8%AD
😀           F0 9F 98 80    %F0%9F%98%80
→            E2 86 92       %E2%86%92

Encoding Non-ASCII Characters in Code

All major language encoding functions handle Unicode automatically when you pass a string:

// JavaScript
encodeURIComponent('café')    // 'caf%C3%A9'
encodeURIComponent('日本語')  // '%E6%97%A5%E6%9C%AC%E8%AA%9E'
encodeURIComponent('😀')      // '%F0%9F%98%80'

# Python
from urllib.parse import quote
quote('café')      # 'caf%C3%A9'
quote('日本語')    # '%E6%97%A5%E6%9C%AC%E8%AA%9E'

The encoding functions take care of the UTF-8 conversion step — you don't need to do it manually. Just pass the Unicode string.
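
As a slightly larger sketch, the same standard-library functions can build a whole URL from Unicode pieces. The host example.com and the path and parameter names here are placeholders chosen for illustration, not anything from a real service.

```python
from urllib.parse import quote, urlencode

# quote() leaves "/" alone by default, so it can be applied to a whole path.
path = quote("/menu/café")

# urlencode() percent-encodes each key and value of a query string.
query = urlencode({"q": "日本語", "lang": "ja"})

url = f"https://example.com{path}?{query}"
print(url)
# https://example.com/menu/caf%C3%A9?q=%E6%97%A5%E6%9C%AC%E8%AA%9E&lang=ja
```

Encoding the path and the query separately keeps the structural characters (/, ?, &, =) intact while the Unicode text inside each part is escaped.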

A Note on Internationalized Domain Names (IDN)

Domain names have their own encoding system for non-ASCII characters: Punycode. A domain like münchen.de becomes xn--mnchen-3ya.de in Punycode. Browsers and most HTTP clients perform this conversion automatically before the DNS lookup, so you don't percent-encode domain names.

Percent-encoding applies to the path, query string, and fragment parts of a URL — not the scheme or domain. A URL like https://münchen.de/search?q=café in practice gets both Punycode encoding on the domain and percent-encoding on the query value.
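
To see both encodings side by side, here is a minimal sketch using only the Python standard library. It assumes the built-in "idna" codec is acceptable for the domain; that codec follows the older IDNA 2003 rules, and stricter IDNA 2008 handling would need a third-party package.

```python
from urllib.parse import quote

# Domain: converted to Punycode with the built-in "idna" codec (IDNA 2003).
host = "münchen.de".encode("idna").decode("ascii")

# Query value: UTF-8 percent-encoding.
query_value = quote("café")

url = f"https://{host}/search?q={query_value}"
print(url)  # https://xn--mnchen-3ya.de/search?q=caf%C3%A9
```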

Using Emoji in URLs

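Emoji in URLs expand significantly: most single emoji are one code point that takes 4 UTF-8 bytes, producing 12 characters of percent-encoded output. Emoji built from several code points (flags, skin-tone variants, family sequences) are longer still. A URL like /search?q=🍕+recipes becomes /search?q=%F0%9F%8D%95+recipes.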

This is valid and correct — servers decode them back to the original emoji. The encoded form is what's actually transmitted and stored in server logs. The readable form with emoji is a browser display convenience.
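
A quick way to check the size and the round trip is a sketch like the following. It uses unquote_plus on the assumption that the + in the query stands for a space, as it does in form-style query strings.

```python
from urllib.parse import quote, unquote_plus

encoded = quote("🍕")
print(encoded, len(encoded))  # %F0%9F%8D%95 12

# Decoding the query value restores the emoji and turns "+" back into a space.
decoded = unquote_plus("%F0%9F%8D%95+recipes")
print(decoded)  # 🍕 recipes
```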

Use the Mongoose URL Encoder to encode or decode emoji and international text instantly — paste the character and see the percent-encoded form.

Frequently Asked Questions

Does URL encoding work for Arabic and Hebrew (right-to-left text)?

Yes. URL encoding works on bytes, not on the visual representation. Arabic and Hebrew characters are converted to their UTF-8 byte sequences and then percent-encoded, just like any other non-ASCII text. The direction of the text doesn't affect the encoding.
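
As an illustration, the Hebrew word שלום encodes in logical (typing) order, two bytes per letter, and decodes back unchanged. The word is just a sample input.

```python
from urllib.parse import quote, unquote

hebrew = "שלום"
encoded_hebrew = quote(hebrew)
print(encoded_hebrew)  # %D7%A9%D7%9C%D7%95%D7%9D

# Decoding restores the original string exactly.
round_trip = unquote(encoded_hebrew)
```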

What if I need to include a character that doesn't have a UTF-8 encoding?

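Every Unicode character has a UTF-8 encoding. UTF-8 covers the full code space up to U+10FFFF (more than 1.1 million code points), so there is no Unicode character that can't be percent-encoded via UTF-8. The one edge case is an unpaired surrogate (U+D800 to U+DFFF), which is not a valid character on its own; JavaScript's encodeURIComponent throws a URIError if a string contains one.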
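
A small sketch of the boundaries, assuming Python's urllib.parse.quote with its default strict UTF-8 settings: the highest code point encodes fine, while an unpaired surrogate (half of a UTF-16 pair, not a character in its own right) is rejected.

```python
from urllib.parse import quote

# U+10FFFF is the last code point in the Unicode code space.
highest = quote(chr(0x10FFFF))
print(highest)  # %F4%8F%BF%BF

# An unpaired surrogate is not a complete character, so UTF-8 encoding fails.
try:
    quote("\ud800")
    lone_surrogate_ok = True
except UnicodeEncodeError:
    lone_surrogate_ok = False
print(lone_surrogate_ok)  # False
```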

Why do some encoded URLs use lowercase hex (%c3%a9) and some use uppercase (%C3%A9)?

Both are valid. RFC 3986 recommends uppercase hex digits, but lowercase is widely accepted. When comparing or normalizing URLs, treat uppercase and lowercase hex as equivalent.
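
One possible way to normalize before comparing is to uppercase only the hex digits inside each %XX triplet. The helper name uppercase_percent_hex is made up for this sketch, and it deliberately leaves the rest of the URL untouched.

```python
import re
from urllib.parse import unquote

_PERCENT_TRIPLET = re.compile(r"%[0-9A-Fa-f]{2}")


def uppercase_percent_hex(url: str) -> str:
    """Uppercase the hex digits of every %XX escape, leaving other text as-is."""
    return _PERCENT_TRIPLET.sub(lambda m: m.group(0).upper(), url)


print(uppercase_percent_hex("/caf%c3%a9"))  # /caf%C3%A9

# Decoders treat both cases the same way.
print(unquote("%c3%a9") == unquote("%C3%A9"))  # True
```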

Can browsers display the original Unicode characters in the address bar even though they're encoded?

Yes. Modern browsers decode and display most non-ASCII characters in the address bar for readability. The underlying request still uses the percent-encoded form. In most browsers, copying the URL from the address bar and pasting it elsewhere reveals the encoded version.

Ryan Callahan Lead Software Engineer

Ryan architected the client-side processing engine that powers every tool on WildandFree — ensuring your files never leave your browser.
