Why You Can't Just atob() a JWT (and the base64url Fix)
The 401 That Started With a Stack Overflow Answer
I lost the better part of a Tuesday afternoon to a single line copied from a five-year-old Stack Overflow answer. We had a debugging session open on a staging auth flow: tokens minted by our identity provider were validating fine on the backend, but a small internal dashboard I'd thrown together to inspect token contents kept throwing. The line was the canonical one everyone reaches for:
const payload = JSON.parse(atob(token.split('.')[1]));
Most of the time it worked. Then a particular user's token came through and the console lit up with InvalidCharacterError: Failed to execute 'atob' on 'Window': The string to be decoded is not correctly encoded. The token was perfectly valid. The backend accepted it. The signature checked out. And yet atob() flatly refused to touch it. I stared at the payload segment, which looked like a normal blob of base64 with a few dashes in it, and I could not for the life of me see what was wrong.
The thing that was wrong is that a JWT is not encoded in base64. It is encoded in base64url, and those are two different alphabets with two different RFC sections. Once I understood that, the bug was obvious and a little embarrassing. But it's an embarrassing bug that ships in production code constantly, so it's worth taking apart properly.
Two Alphabets, One Letter Apart
Base64 and base64url are both defined in the same document, RFC 4648. Standard base64 lives in §4. The URL- and filename-safe variant lives in §5. They agree on 62 of their 64 characters. They disagree on exactly two, and on padding.
In standard base64 (§4), index 62 is + and index 63 is /. Those two characters are hostile to the contexts JWTs live in. A + in a URL query string gets interpreted as a space. A / inside a path segment is a delimiter. So base64url (§5) swaps them: index 62 becomes - (minus) and index 63 becomes _ (underscore). On top of that, JWTs strip the = padding entirely. That stripping isn't an accident — RFC 7515 §2, the JWS spec that defines the JWT compact serialization, explicitly mandates base64url encoding with all trailing = removed.
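To see the two alphabets diverge on real bytes, here is a minimal sketch. The helper name toBase64Url is my own for illustration, not something from a spec or a library, and the two input bytes are picked deliberately so that they land on indexes 62 and 63.

```javascript
// Hypothetical helper: standard base64 (RFC 4648 §4) -> base64url (§5), padding stripped.
function toBase64Url(b64) {
  return b64.replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}

// The bytes 0xFB 0xFF split into the sextets 62, 63, 60.
const standard = btoa(String.fromCharCode(0xfb, 0xff));
const urlSafe = toBase64Url(standard);

console.log(standard); // "+/8="
console.log(urlSafe);  // "-_8"
```

Same bytes, same first 62 characters of the alphabet, and yet the two strings share only their final character.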
So when my problem token contained a byte sequence that encoded to something with a / or + in standard base64, the issuer correctly emitted _ or - instead. Browser atob() accepts only the §4 alphabet. Hand it a -, which is not in that alphabet, and it throws InvalidCharacterError. The reason the bug is intermittent is pure statistics: only 2 out of every 64 random sextets land on index 62 or 63, and real payloads, being mostly ASCII JSON, hit those indexes even less often, so plenty of tokens never contain a single - or _ and decode fine by accident. You ship the broken code, it passes every test you wrote, and then it explodes on one specific user three weeks later. That is exactly the trap I fell into.
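The failure is easy to reproduce in isolation. In this sketch the bytes 0xFB 0xFF are spelled +/8= in standard base64 and -_8 in base64url; tryAtob is a hypothetical wrapper of mine that only exists to catch the exception.

```javascript
// Hypothetical wrapper: report whether atob() accepted the input.
function tryAtob(input) {
  try {
    return { ok: true, length: atob(input).length };
  } catch (err) {
    return { ok: false, error: err.name };
  }
}

const standardResult = tryAtob('+/8='); // §4 spelling
const urlSafeResult = tryAtob('-_8');   // §5 spelling of the same two bytes

console.log(standardResult.ok, standardResult.length); // true 2
console.log(urlSafeResult.ok);                         // false
```

In a browser the caught error's name is the InvalidCharacterError from the console message above.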
The Fix Is Three Transforms
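Converting base64url back into something atob() accepts is mechanical. Replace - with +, replace _ with /, then re-pad the string so its length is a multiple of 4. Base64 operates on 4-character groups, and the JWT spec threw the padding away. Current browsers implement atob() with the WHATWG "forgiving-base64" algorithm, which tolerates missing padding, so the two character replacements are what actually stop the exception. Re-padding is the defensive third step: it costs one line and keeps the string valid for stricter base64 decoders that insist on complete groups.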
function decodeJwtSegment(seg) {
  // base64url (RFC 4648 §5) -> base64 (§4)
  let b64 = seg.replace(/-/g, '+').replace(/_/g, '/');
  // re-pad to a multiple of 4
  while (b64.length % 4) b64 += '=';
  // atob gives a binary string; decode as UTF-8 properly
  const bytes = Uint8Array.from(atob(b64), c => c.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes));
}

function decodeJwt(token) {
  const [h, p] = token.split('.');
  return { header: decodeJwtSegment(h), payload: decodeJwtSegment(p) };
}
The padding math is worth internalizing. The remainder of length % 4 is never 1 for valid base64url — it's 0, 2, or 3. A remainder of 2 needs two = to complete the group, a remainder of 3 needs one, and 0 needs none. The while loop handles all three cases without a lookup table.
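The same arithmetic can be written without a loop, which also makes the impossible remainder explicit. This is a sketch, and repad is a hypothetical name; the decoder above does not need it.

```javascript
// Hypothetical helper: restore the padding that a JWT segment had stripped.
function repad(seg) {
  const remainder = seg.length % 4;
  if (remainder === 1) {
    // No whole number of bytes encodes to this length, so treat it as corrupt input.
    throw new Error('invalid base64url length');
  }
  // remainder 0 -> '', 2 -> '==', 3 -> '='
  return seg + '='.repeat((4 - remainder) % 4);
}

console.log(repad('YQ'));   // "YQ=="  (1 byte)
console.log(repad('YWI'));  // "YWI="  (2 bytes)
console.log(repad('YWJj')); // "YWJj"  (3 bytes)
```

Failing early on a remainder of 1 gives a clearer error than padding the segment and letting the decoder reject it.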
Notice the second subtle fix in there: I don't just atob() and call it done. Raw atob() returns a "binary string" where each character is one byte, which mangles any non-ASCII content — accented names, emoji in a display field, anything outside U+007F. Mapping through Uint8Array and TextDecoder reconstructs the real UTF-8. The Stack Overflow one-liner skipped this too, which is a second latent bug waiting for its own production incident.
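A two-byte example shows the mangling. The input w6k= is the base64 of 0xC3 0xA9, the UTF-8 encoding of "é"; I picked it only because it is the shortest non-ASCII case I could think of.

```javascript
// "w6k=" is the base64 of the two UTF-8 bytes 0xC3 0xA9, which spell "é".
const binary = atob('w6k=');

// Raw atob(): one character per byte, so the single "é" comes back as two characters.
console.log(binary.length); // 2

// Reinterpret the same bytes as UTF-8.
const bytes = Uint8Array.from(binary, c => c.charCodeAt(0));
const text = new TextDecoder().decode(bytes);
console.log(text); // "é"
```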
Here's a concrete input-output to make the alphabet difference tangible. Take a header segment:
input: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9
output: {"alg":"HS256","typ":"JWT"}
That particular segment has no - or _, so it'd survive raw atob(). But a payload carrying, say, a profile photo URL with a query string, or a display name with an accented character, will eventually produce a - or _, because a ?, a ~, or any non-ASCII byte at the wrong offset encodes to index 62 or 63. That's the token that takes you down. When I'm spot-checking a token by hand rather than in code, I just paste the whole thing into the JWT decoder, which does all three transforms locally in the browser so I'm not pasting a bearer token into someone's server.
Decoding Is Not Verifying — and alg:none Proves It
Here's the part that turns a formatting footgun into a security one. Everything above decodes a JWT. None of it verifies a JWT. A JWT has three dot-separated segments — header.payload.signature — and decoding only reads the first two. The signature is the third segment, and checking it requires a secret (for HS256) or a public key (for RS256). The browser-side decode I wrote above can't and shouldn't do that.
This matters because of a token shaped like this:
eyJhbGciOiJub25lIn0.eyJzdWIiOiJhZG1pbiJ9.
Decode the header and you get {"alg":"none"}. Decode the payload and you get {"sub":"admin"}. Note the trailing dot with nothing after it: the signature segment is empty. This is an "Unsecured JWT" as described in RFC 7519 §6, and it is a legitimate part of the spec. The catastrophe, documented across a wave of library CVEs, is a verifier that reads the attacker-controlled alg field from the header and trusts it. Send alg:none, and a naive verifier skips signature checking entirely and waves through any claims you like. I have watched a junior engineer demo exactly this against a homegrown middleware: he changed "sub" to an admin id, dropped the signature, and the endpoint happily logged him in.
The defense is to never let the token tell you which algorithm to use. Your server pins the expected algorithm and rejects everything else:
import jwt from 'jsonwebtoken';
// pin the algorithm; reject alg:none and algorithm-confusion attacks
const claims = jwt.verify(token, secret, { algorithms: ['HS256'] });
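At the header level, pinning amounts to the check sketched below. It is illustrative only: readHeader and assertPinnedAlg are hypothetical helpers of mine, they verify nothing, and in real code the algorithms option does this job inside the library. The example token is the alg:none one from earlier.

```javascript
// Hypothetical helper: decode just the header segment of a compact JWT.
function readHeader(token) {
  const seg = token.split('.')[0];
  let b64 = seg.replace(/-/g, '+').replace(/_/g, '/');
  while (b64.length % 4) b64 += '=';
  const bytes = Uint8Array.from(atob(b64), c => c.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes));
}

// Hypothetical helper: the server decides the algorithm, never the token.
function assertPinnedAlg(token, expectedAlg) {
  const { alg } = readHeader(token);
  if (alg !== expectedAlg) {
    throw new Error(`unexpected alg "${alg}", expected "${expectedAlg}"`);
  }
}

const unsecured = 'eyJhbGciOiJub25lIn0.eyJzdWIiOiJhZG1pbiJ9.';
let rejected = false;
try {
  assertPinnedAlg(unsecured, 'HS256');
} catch (err) {
  rejected = true; // alg is "none", so the token is refused before any claim is read
}
console.log(rejected); // true
```

Passing this check is necessary, not sufficient. A token whose header says HS256 still has to have its signature verified with the server's key before any claim is trusted.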
One more trap lives in the claims you just decoded. The time fields (iat, exp, and nbf) are NumericDate values, defined in RFC 7519 §2 as seconds since the Unix epoch, not milliseconds. JavaScript's Date.now() returns milliseconds. If you compare exp against Date.now() directly, the seconds value is always about a thousand times smaller than the millisecond clock, so every token looks expired; feed exp straight into new Date() and it lands in January 1970. The correct comparison is claims.exp * 1000 > Date.now(), or claims.exp > Math.floor(Date.now() / 1000). When I need to sanity-check what wall-clock time an exp of 1718841600 actually means, I drop it into the Unix timestamp converter rather than risk an off-by-three-zeros mistake in my head.
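Here is that comparison as a small sketch, next to the buggy version, using the exp value from the paragraph above. The helper name isExpired and its nowMs parameter are assumptions of mine, added so the clock can be injected for the example.

```javascript
// Hypothetical helper: exp is NumericDate seconds; nowMs is milliseconds, as Date.now() returns.
function isExpired(claims, nowMs = Date.now()) {
  return claims.exp <= Math.floor(nowMs / 1000);
}

const claims = { exp: 1718841600 };
const oneMinuteBefore = (claims.exp - 60) * 1000; // a clock reading 60 s before expiry, in ms

console.log(isExpired(claims, oneMinuteBefore));        // false: still valid
console.log(claims.exp < oneMinuteBefore);              // true: the buggy seconds-vs-milliseconds check says "expired"
console.log(new Date(claims.exp * 1000).toISOString()); // "2024-06-20T00:00:00.000Z"
console.log(new Date(claims.exp).getUTCFullYear());     // 1970
```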
What I Actually Take Away From That Tuesday
The base64url-vs-base64 distinction is the kind of thing you learn once, painfully, and never forget. The fix is six lines. But the deeper lesson is the layering: atob() failing was the loud, obvious problem, and chasing it forced me to confront the two quiet ones sitting right behind it — that I was mangling UTF-8, and that the entire decode path was being mistaken for a trust decision by people downstream of me.
So the rule I keep now is small and blunt. Use base64url transforms, never raw atob(), on anything JWT-shaped. Treat a decoded payload as untrusted display data until a signature has been checked server-side with a pinned algorithm. And remember that the timestamps are in seconds. None of those three is hard. All three of them have cost me, or someone on a team near me, real hours — and the first one started with a single confident line of copied code that worked just often enough to be dangerous.
Tools used in this guide
- JWT Decoder — Paste a JSON Web Token and inspect its header, payload, and signature segment locally in your browser.
- Unix Timestamp Converter — Convert Unix epoch seconds or milliseconds into local and UTC dates, or generate the current timestamp instantly.