The Hidden Size Limits of Base64 Data URIs

By Alpha Loop · Published June 12, 2026 · Updated June 20, 2026 · 7 min read

The image that vanished between staging and the inbox

The first time this bit me, I was building a transactional email template — an order confirmation with a small embedded logo. On my machine it rendered perfectly. In staging it rendered perfectly. Then QA forwarded a screenshot from the actual Gmail web client and the logo was a broken-image icon. Nothing in the logs. No 404, because there was no request to make — the image was a data: URI baked straight into the HTML. The bytes were right there in the markup, and yet they were gone.

What had happened is that the email sanitizer in the delivery path had quietly truncated the src attribute. The logo was a 71 KB PNG, and once Base64-encoded it blew past a length ceiling the sanitizer enforced on individual attribute values. The browser received a data: URI that ended mid-stream, failed to decode, and showed nothing. No error surfaced anywhere I was looking.

That incident taught me that Base64 data URIs have hidden size limits that don't announce themselves. The encoding works flawlessly in your editor and your dev server, then collapses the moment the payload crosses a boundary you didn't know existed. This guide is about where those boundaries are, expressed in concrete byte counts, and how to decide — before you ship — whether a given asset should be inlined or left as a file.

The 4/3 inflation tax nobody budgets for

Base64 is not compression. It is the opposite: it makes data bigger so that arbitrary bytes survive transport through text-only channels. The mechanism is simple. Base64 takes 3 bytes (24 bits) of input and re-expresses them as 4 ASCII characters, because each output character carries only 6 usable bits (2⁶ = 64, hence the name). So the output is 4/3 the size of the input, a 33.33% inflation, and that is before you count the data: prefix that every data URI carries.

You can compute the exact encoded length for any input:

// Exact Base64 character count for an N-byte payload (with padding)
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

// A 100 KB image:
const bytes = 102400;
const chars = base64Length(bytes);          // 136536
const full  = chars + 'data:image/png;base64,'.length; // + 22 = 136558

console.log(`${bytes} bytes -> ${chars} Base64 chars -> ${full} total`);
// 102400 bytes -> 136536 Base64 chars -> 136558 total

That last line is the number that matters. A 100 KB PNG does not become a 100 KB string. It becomes a 136,558-character string once you prepend the data:image/png;base64, prefix, which is itself exactly 22 characters. Your 100 KB asset has turned into roughly 133 KB of text that has to live inside an HTML attribute, a CSS rule, or a JSON field.
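To check that arithmetic against a real encoder, here is a minimal sketch assuming Node.js, whose global Buffer does the encoding. The helper name toDataUri is made up for this illustration, and the zero-filled buffer is only a stand-in for real image bytes.

```javascript
// Hypothetical helper: build a data URI from raw bytes (Node.js, global Buffer).
function toDataUri(bytes, mimeType = 'image/png') {
  const prefix = `data:${mimeType};base64,`;
  return prefix + Buffer.from(bytes).toString('base64');
}

// Stand-in for a 100 KB image; in practice, read the bytes from the file.
const sample = Buffer.alloc(102400);
const sampleUri = toDataUri(sample);

// The encoder's output should match the closed-form count: 4 * ceil(N / 3) + prefix.
const predicted = 4 * Math.ceil(sample.length / 3) + 'data:image/png;base64,'.length;
console.log(sampleUri.length, predicted, sampleUri.length === predicted);
// 136558 136558 true
```

The prefix length changes with the MIME type (image/jpeg adds one character over image/png), so compute it from the string rather than hard-coding 22 if you handle more than PNGs.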

People assume gzip on the wire erases this tax. It does not, not for images. Gzip is excellent at squeezing repetitive text, but a PNG or JPEG is already compressed — its bytes look like high-entropy noise, and Base64-encoding that noise produces more noise. In my own measurements, gzip recovered only around 10% of the bytes Base64 added for already-compressed images. So you pay most of the 33% inflation all the way to the client. (Plain text or SVG markup is a different story, which is the whole point of the SVG tip at the end.)

The 65,519-character cliff

The inflation is predictable. The cliff is the part that ambushes you, because it lives in code you didn't write.

Many HTML sanitizers (the kind embedded in email pipelines, CMS rich-text editors, and comment systems) cap the length of an individual attribute value and silently strip or truncate anything longer. A very common threshold is 65,519 characters, which is the 64 KiB mark (65,536) minus a 17-character allowance for the attribute name and quoting. When your src="data:..." exceeds that, the sanitizer doesn't reject your submission with a helpful message. It just hands the browser a corrupted URI, and you get my broken-logo screenshot.

So the operative question becomes: how big can the original image be before the encoded data URI trips a 65,519-char limit? Solving 4 * ceil(N/3) + 22 ≤ 65519 gives a clean, useful answer:

// Largest image (bytes) whose PNG data URI fits under the sanitizer cap
function maxBytesUnderCap(maxChars = 65519, prefixLen = 22) {
  const usableChars = maxChars - prefixLen;          // 65497
  const groups = Math.floor(usableChars / 4);        // 16374
  return groups * 3;                                  // 49122
}

console.log(maxBytesUnderCap()); // 49122

The threshold is 49,122 bytes — about 48 KB. An image at or below 49,122 bytes encodes to a full data URI of 65,518 characters or fewer and slips under the cap. One byte over, at 49,123 bytes, and the URI jumps to 65,522 characters — past the line. That is a brutally narrow margin to be relying on by accident. If you are pasting data URIs into anything that passes through a sanitizer, treat ~48 KB of original image as your hard ceiling, not the 64 KB the limit nominally describes. The encoding eats the other 16 KB.
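If you know the cap, you can catch oversized attributes before anything ships. The sketch below is a rough pre-flight check, not a real HTML parser: it assumes double-quoted attribute values and uses the 65,519-character figure as its default. The function name findOversizedAttributes is hypothetical, and a given sanitizer may count length differently (in bytes rather than characters, for instance).

```javascript
// Hypothetical pre-flight check: flag attribute values longer than a cap.
// Assumes double-quoted attributes; a regex is not a full HTML parser.
function findOversizedAttributes(html, maxChars = 65519) {
  const attrPattern = /([a-zA-Z][\w:-]*)\s*=\s*"([^"]*)"/g;
  const offenders = [];
  for (const match of html.matchAll(attrPattern)) {
    const [, name, value] = match;
    if (value.length > maxChars) {
      offenders.push({ name, length: value.length, over: value.length - maxChars });
    }
  }
  return offenders;
}

// A 49,123-byte image encodes to 65,500 Base64 characters, 65,522 with the prefix.
const fakePayload = 'A'.repeat(65500);
const emailHtml = `<img alt="logo" src="data:image/png;base64,${fakePayload}">`;
console.log(findOversizedAttributes(emailHtml));
// [ { name: 'src', length: 65522, over: 3 } ]
```

Running something like this in CI against rendered templates turns a silent truncation in production into a failed build.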

When I want to know exactly where a specific image lands, I run it through Image to Base64 and read the character count off the output before deciding. Going the other direction — confirming that a URI someone handed me actually decodes to a valid image and wasn't already truncated upstream — I drop it into Base64 to Image, which fails loudly if the payload is malformed rather than silently rendering nothing.
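The same sanity check can be scripted. The sketch below assumes Node.js and a PNG payload; the function name checkPngDataUri is made up for illustration. It only inspects framing (the Base64 shape, the PNG signature, and the IEND trailer that ends a well-formed PNG), so it catches truncation but does not prove the image decodes.

```javascript
// Hypothetical check: does a PNG data URI look complete? (Node.js, global Buffer)
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
// "IEND" chunk type followed by its fixed CRC: the last 8 bytes of a well-formed PNG.
const PNG_IEND = Buffer.from([0x49, 0x45, 0x4e, 0x44, 0xae, 0x42, 0x60, 0x82]);

function checkPngDataUri(dataUri) {
  const prefix = 'data:image/png;base64,';
  if (!dataUri.startsWith(prefix)) {
    return { ok: false, reason: 'not a Base64 PNG data URI' };
  }
  const payload = dataUri.slice(prefix.length);
  // Padded Base64 always has a length that is a multiple of 4.
  if (payload.length % 4 !== 0 || !/^[A-Za-z0-9+\/]*={0,2}$/.test(payload)) {
    return { ok: false, reason: 'malformed Base64 (possibly truncated)' };
  }
  const decoded = Buffer.from(payload, 'base64');
  if (!decoded.subarray(0, 8).equals(PNG_SIGNATURE)) {
    return { ok: false, reason: 'missing PNG signature' };
  }
  if (!decoded.subarray(-8).equals(PNG_IEND)) {
    return { ok: false, reason: 'missing IEND trailer (truncated image data)' };
  }
  return { ok: true, reason: 'looks complete' };
}

// Stand-in bytes with the right framing (not a decodable image), for demonstration.
const framed = Buffer.concat([PNG_SIGNATURE, Buffer.alloc(32), PNG_IEND]);
const fullUri = 'data:image/png;base64,' + framed.toString('base64');

console.log(checkPngDataUri(fullUri).reason);              // looks complete
console.log(checkPngDataUri(fullUri.slice(0, -8)).reason); // missing IEND trailer (truncated image data)
```

The second call is the subtle case: cutting a multiple of four characters leaves valid Base64, so only the missing trailer gives the truncation away.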

Inline when, file when: the decision in bytes

Here is the rule I now apply, with numbers instead of vibes.

Inline (data URI) when the original asset is under ~8 KB (8192 bytes). This is not an arbitrary figure: it mirrors the default that build tools converged on. Webpack 5's asset modules, for example, inline a resource by default when it is under roughly 8 KB and emit it as a separate file above that. The reasoning is that for tiny assets, the extra HTTP request costs more, in latency and connection overhead, than the 33% size penalty of inlining. Below 8 KB, inlining is a net win: one fewer round trip, no extra request header overhead, and the asset can't 404. An 8 KB (8192-byte) image inlines to 10,946 characters including the PNG prefix, comfortably clear of every cliff.
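The threshold is configurable per rule. As a minimal sketch, assuming a Webpack 5 project and an illustrative file pattern, setting it explicitly looks roughly like this:

```javascript
// webpack.config.js (sketch): make the inline threshold explicit
module.exports = {
  module: {
    rules: [
      {
        test: /\.(png|jpe?g|gif)$/i, // illustrative pattern; adjust to your assets
        // 'asset' chooses between inlining as a data URI and emitting a file
        type: 'asset',
        parser: {
          dataUrlCondition: {
            maxSize: 8 * 1024, // bytes; assets over this size are emitted as files
          },
        },
      },
    ],
  },
};
```

Pinning the number in config means the inline-or-file decision does not shift under you if a tool's default changes between versions.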

Use a file when the asset exceeds ~8 KB, and absolutely when it would exceed ~48 KB through a sanitizer. Three concrete reasons stack up as the asset grows:

Caching. A file is cached by the browser and reused across pages. An inlined data URI is re-downloaded as part of every HTML response that contains it. Inline a 100 KB logo into a 50-page site and you ship 133 KB × 50 instead of 133 KB once.
Parsing cost. Giant data URIs bloat the HTML/CSS the parser must chew through before first paint. The bytes aren't deferrable the way an <img src> request is.
The sanitizer cliff. Above 49,122 bytes, any path with a sanitizer that enforces the 65,519-character cap can cut off the tail of your URI.

A quick way to make the call programmatically:

function shouldInline(byteCount, { throughSanitizer = false } = {}) {
  if (throughSanitizer && byteCount > 49122) return false; // hard cliff
  return byteCount <= 8192;                                 // request-overhead win
}

shouldInline(6000);                          // true  — tiny, inline it
shouldInline(20000);                         // false — link the file
shouldInline(40000, {throughSanitizer:true}); // false — under cliff but past 8KB

The grey zone is 8 KB to 48 KB: technically it will survive most sanitizers, but you're paying the caching and parsing penalties for no request-count benefit. My default there is to use a file unless I have a specific reason to inline (a single-file HTML deliverable, an offline bundle, an email where external images get blocked by default).

The SVG exception: URL-encode, don't Base64

There is one image format where Base64 is almost always the wrong tool: SVG. An SVG is text — XML markup — not high-entropy compressed bytes. Base64-encoding text still imposes the full 33% inflation and makes it unreadable. The better move is to URL-encode the SVG source and inline it as data:image/svg+xml,....

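Here, URL-encoding means a minimal percent-encoding: escape only the handful of characters that are unsafe inside a URI (<, >, #, %, double quotes) and leave letters, digits, spaces, and most punctuation untouched. A general-purpose encoder such as encodeURIComponent escapes far more (every space, slash, colon, and equals sign), which gives back much of the saving. For typical SVG markup, the minimal approach escapes maybe 5–15% of characters instead of inflating every byte by 33%. The result is smaller and stays human-readable, so you can eyeball it in your stylesheet.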

/* Base64 — opaque, 33% larger */
.icon-b64 {
  background-image: url("data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cD...");
}

/* URL-encoded — smaller, you can read the shape */
.icon-url {
  background-image: url("data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 16 16'%3E%3Cpath d='M2 8h12' stroke='black'/%3E%3C/svg%3E");
}

A small but real gotcha: inside data:image/svg+xml,... use single quotes for the SVG's own attributes (as above) so they don't collide with the CSS url("...") double quotes, and always escape # — an unescaped # in a fill="#000" will be read as a URI fragment and silently truncate everything after it. That last bug is the SVG-flavored cousin of the same truncation problem that cost me an afternoon with the broken email logo: a single mis-handled character, no error, just a missing image. The encoding details are invisible right up until they aren't.
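Those rules are easy to automate. Here is a minimal sketch, assuming Node.js for the size comparison; the helper name svgToDataUri is made up for illustration. It assumes attribute values contain no single quotes and that collapsing whitespace is harmless, which may not hold for SVGs with text content.

```javascript
// Hypothetical helper: minimal percent-encoding for an inline SVG data URI.
// Assumption: attribute values contain no single quotes, so swapping
// double quotes for single quotes is safe.
function svgToDataUri(svg) {
  const encoded = svg
    .trim()
    .replace(/\s+/g, ' ')   // collapse newlines and indentation
    .replace(/"/g, "'")     // avoid clashing with url("...") in CSS
    .replace(/%/g, '%25')   // escape % first so later escapes are not double-encoded
    .replace(/#/g, '%23')   // otherwise # starts a URI fragment
    .replace(/</g, '%3C')
    .replace(/>/g, '%3E');
  return `data:image/svg+xml,${encoded}`;
}

const icon = '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16"><path d="M2 8h12" stroke="#000"/></svg>';
const urlEncoded = svgToDataUri(icon);
const base64Encoded = 'data:image/svg+xml;base64,' + Buffer.from(icon).toString('base64');

console.log(urlEncoded);
// Compare the two encodings of the same icon.
console.log(urlEncoded.length, base64Encoded.length);
```

The order of the replacements matters: escaping % after # or < would re-encode the percent signs those escapes just introduced.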

Tools used in this guide

Image to Base64 Converter — Select an image file and generate a Base64 data URL that can be copied into HTML, CSS, JSON, or test fixtures.
Base64 to Image Converter — Paste a Base64 string, preview the decoded image, and download it as a file. The conversion runs locally in your browser.