Special characters break HTML when they appear in content without encoding. A raw < character in your page content looks like the start of a tag to the browser, which then tries to parse it as markup — corrupting your content. The fix is HTML entities: standardized codes that tell the browser to render a character, not interpret it as structure.

The Five Characters You Must Always Encode

Five characters have special meaning in HTML and must be encoded when used in text content:

  • & becomes &amp;: always encode ampersands in content
  • < becomes &lt;: prevents the browser from reading it as a tag start
  • > becomes &gt;: technically safe in most contexts, but encode for consistency
  • " becomes &quot;: required inside attribute values delimited by double quotes
  • ' becomes &apos;: required inside attribute values delimited by single quotes (&#39; is the equivalent numeric form for parsers that predate HTML5)

In attribute values, the quote character you used to open the value is the one you must encode inside it. Ampersands must be encoded whichever quote you use: in href="page?a=1&amp;b=2" the ampersand is written as &amp; because it sits inside an attribute value.
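As a rough illustration of the text-versus-attribute distinction, here is a minimal Python sketch built on the standard library's html.escape(). The helper names escape_text and escape_attribute are made up for this example. Note that html.escape() writes the single quote as the numeric reference &#x27; rather than &apos;; both decode to the same character.

```python
import html

def escape_text(value: str) -> str:
    """Escape for use between tags: only &, <, and > are replaced."""
    return html.escape(value, quote=False)

def escape_attribute(value: str) -> str:
    """Escape for use inside a quoted attribute: also replaces " and '."""
    return html.escape(value, quote=True)

print(escape_text('Tom & "Jerry" <3'))       # Tom &amp; "Jerry" &lt;3
print(escape_attribute('Tom & "Jerry" <3'))  # Tom &amp; &quot;Jerry&quot; &lt;3
```

Escaping both quote characters in attributes is slightly more than strictly required, but it means the result is safe whichever quote style the surrounding markup uses.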

Why & Is the Most Common Problem

Ampersands appear constantly in URLs (query strings), company names ("Johnson & Johnson"), and writing ("cats & dogs"). In URLs inside HTML attributes, every & must become &amp;.

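A URL like href="/search?q=cats&type=all" should be written href="/search?q=cats&amp;type=all". Browsers are tolerant and usually handle the unencoded version correctly, but validators may flag it, strict parsers (XML-based ones in particular) reject it, and a parameter name that happens to match an entity name, such as &copy, can be decoded into the wrong character.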
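One way to avoid hand-editing ampersands is to build the URL first and escape it once when it goes into the attribute. The sketch below assumes Python and uses urllib.parse.urlencode and html.escape from the standard library; build_link is a hypothetical helper, not part of any library.

```python
import html
from urllib.parse import urlencode

def build_link(path: str, params: dict[str, str], label: str) -> str:
    """Return an <a> element with the query string escaped for the href."""
    query = urlencode(params)          # joins parameters with a raw "&"
    url = f"{path}?{query}"
    # Escape the finished URL once for the attribute, and the label for text.
    return f'<a href="{html.escape(url, quote=True)}">{html.escape(label)}</a>'

print(build_link("/search", {"q": "cats", "type": "all"}, "Cats & dogs"))
# <a href="/search?q=cats&amp;type=all">Cats &amp; dogs</a>
```

The browser decodes &amp; back to & before it follows the link, so the server still receives /search?q=cats&type=all.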

Non-Breaking Spaces: The Invisible Problem

Non-breaking space (&nbsp;) is the most commonly misused HTML entity. It prevents a line break between two words, which is useful for "New York" (so it never wraps onto two lines) or numbers with units ("42 km"). It also arrives unnoticed in content pasted from Word documents, where it looks like a regular space but causes layout glitches.

Symptom: extra spacing, content that won't wrap correctly, or text that shifts unexpectedly at certain viewport widths. To find hidden non-breaking spaces: paste content into an HTML encoder and look for &nbsp; in the output. Replace them with regular spaces where line-breaking is acceptable.
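If you would rather check programmatically, the non-breaking space is the single character U+00A0, so it can be located and replaced directly. This is a minimal Python sketch; find_nbsp and replace_nbsp are illustrative names chosen for this example.

```python
NBSP = "\u00a0"  # the character that &nbsp; decodes to

def find_nbsp(text: str) -> list[int]:
    """Return the index of every non-breaking space in the text."""
    return [i for i, ch in enumerate(text) if ch == NBSP]

def replace_nbsp(text: str) -> str:
    """Swap every non-breaking space for a regular space."""
    return text.replace(NBSP, " ")

sample = "New\u00a0York is 42\u00a0km away"
print(find_nbsp(sample))     # [3, 14]
print(replace_nbsp(sample))  # New York is 42 km away
```

In practice you would likely review the positions first and keep the non-breaking spaces that are intentional, such as the one between a number and its unit.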

UTF-8 vs HTML Entities: Which to Use

If your page declares <meta charset="UTF-8"> and your file is saved as UTF-8 (the default in most modern editors), you can use most special characters, such as ©, directly in content without encoding. They render correctly without entity codes.

Use HTML entities for the five reserved characters above (always), for content inserted via JavaScript where you're building HTML strings, or for characters that might not survive copy-paste through systems that strip encoding (email, some CMSs).
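For content that has to pass through a system that may mangle non-ASCII text, one option is to escape the reserved characters and then turn everything outside ASCII into numeric character references. The Python sketch below relies on the standard "xmlcharrefreplace" error handler; to_ascii_html is a hypothetical helper name.

```python
import html

def to_ascii_html(text: str) -> str:
    """Escape reserved characters, then write non-ASCII ones as &#NNN;."""
    escaped = html.escape(text, quote=True)
    # Characters that cannot be encoded as ASCII become numeric references.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")

print(to_ascii_html("© 2024 Café & Co"))
# &#169; 2024 Caf&#233; &amp; Co
```

The output contains only ASCII, so it should survive transports that would otherwise strip or corrupt the original characters.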

Frequently Asked Questions

Why does &amp; appear in my HTML instead of &?

You're seeing the entity code instead of the character. This usually means it was encoded twice — someone encoded content that was already encoded. Decode once with an HTML decoder tool and you'll get the correct ampersand.
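Double encoding is easy to reproduce. This short Python sketch uses html.escape() and html.unescape() from the standard library to show how one extra encoding pass produces the visible entity code, and how a single decode undoes it.

```python
import html

original = "Johnson & Johnson"
once = html.escape(original)   # correct: encoded exactly once
twice = html.escape(once)      # bug: the "&" of "&amp;" is encoded again

print(once)                    # Johnson &amp; Johnson
print(twice)                   # Johnson &amp;amp; Johnson

# A browser decodes one level only, so the twice-encoded text
# is displayed with a literal "&amp;" in it.
print(html.unescape(twice))    # Johnson &amp; Johnson
print(html.unescape(once))     # Johnson & Johnson
```

The durable fix is usually to find the step that encodes already-encoded content, rather than decoding after the fact.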

Do I need to encode all characters in HTML?

Only the five reserved characters: &, <, >, ", and '. For everything else, if you're using UTF-8 (you should be), you can use the characters directly without encoding. HTML entities like &copy; and &mdash; still work but aren't necessary.

What causes garbled characters like â€™ in web pages?

This is a UTF-8 encoding mismatch. The page is serving UTF-8 content but the browser or database is interpreting it as a different encoding (usually Windows-1252 or Latin-1/ISO-8859-1), so each multi-byte character shows up as two or three unrelated ones. The fix is to declare charset=UTF-8 in your meta tag and ensure your database and server both use UTF-8.
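The mismatch can be reproduced in a few lines. This Python sketch encodes a curly apostrophe as UTF-8 and then decodes the bytes as Windows-1252 (Python's "cp1252" codec), which is the assumption made here about the wrong encoding. The function names are illustrative only.

```python
def simulate_mojibake(text: str) -> str:
    """Encode as UTF-8, then wrongly decode the bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")

def repair_mojibake(garbled: str) -> str:
    """Reverse the mistake: recover the bytes, then decode them as UTF-8."""
    return garbled.encode("cp1252").decode("utf-8")

garbled = simulate_mojibake("It’s")
print(garbled)                   # Itâ€™s
print(repair_mojibake(garbled))  # It’s
```

The repair step only works when the text was mis-decoded exactly once and no bytes were lost along the way, so treat it as a diagnostic aid and fix the declared encoding at the source.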

How do I encode HTML entities automatically?

Paste your text into an HTML encode/decode tool and it converts all special characters to their entity equivalents in one step. For programmatic use: most languages have built-in functions — Python's html.escape(), PHP's htmlspecialchars(), JavaScript's textContent assignment.

What is &nbsp; and when should I use it?

&nbsp; is a non-breaking space: a space character that prevents a line break at that position. Use it to keep adjacent words together (like '10 km' or 'New York'). Don't use it for general spacing or indentation; use CSS margin and padding instead.
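When HTML is generated from code, the entity can be inserted deliberately between the two pieces that must stay together. A minimal Python sketch, with keep_together as a made-up helper name:

```python
import html

def keep_together(left: str, right: str) -> str:
    """Join two pieces of text with &nbsp; so they never wrap apart."""
    return f"{html.escape(left)}&nbsp;{html.escape(right)}"

print(keep_together("10", "km"))     # 10&nbsp;km
print(keep_together("New", "York"))  # New&nbsp;York
```

Each piece is escaped separately, and the entity is added afterwards so that its own ampersand is not encoded a second time.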