Canonical JSON for Hashing and Digital Signatures — When Key Order Matters
- Canonical JSON is a deterministic byte representation — same data always produces identical bytes.
- Required for cryptographic hashing, digital signatures, and content-addressed storage.
- Alphabetical sort gets you 90 percent there; RFC 8785 (JCS) covers the rest.
Table of Contents
Canonical JSON is a deterministic serialization: the same logical data always produces byte-identical output regardless of who generated it or what library they used. This is essential for cryptographic hashing, digital signatures, content-addressed storage (blockchain, IPFS), and anywhere you need "the JSON hasn't changed" verification.
Plain JSON serializers do not produce canonical output. This post covers what canonical means, why alphabetical key sorting is the main requirement, and how to produce strictly canonical JSON when you need it.
What "Canonical" Actually Means
Two JSON documents are semantically equal if a parser produces the same logical object from both. The spec does not require them to be byte-equal — it just requires they represent the same data.
Semantically equal but byte-different:
{"a":1,"b":2}
{"b": 2, "a": 1}
{ "a" : 1 , "b" : 2 }
All three parse to the same object. All three produce different byte sequences, which produce different hashes.
Canonical form: A set of rules that makes these byte-equal:
1. Keys sorted alphabetically (deterministic order)
2. No insignificant whitespace
3. Standardized string escaping
4. Standardized number representation (no trailing zeros, consistent decimal format)
5. Consistent Unicode normalization
The strict specification is RFC 8785 (JSON Canonicalization Scheme, JCS). Many use cases only need a subset — alphabetical sort plus consistent formatting — and that subset is what our JSON Key Sorter combined with a formatter produces.
When You Need Canonical JSON
Webhook signature verification. Stripe, GitHub, and many API providers sign webhook payloads. To verify, you hash the payload and compare to the provided signature. If the payload's byte representation differs from what the sender hashed, verification fails. Canonical form ensures sender and receiver compute the same hash.
Blockchain / content-addressed storage. IPFS, Git (for commit SHAs), and various blockchains use content hashes as identifiers. Same content must produce same ID, which requires canonical byte form.
Caching. If you cache API responses by hashing the request, non-canonical requests (same data, different key order) produce different cache keys and cache misses. Canonical form fixes this.
Audit logs with tamper-evidence. If each log entry is signed, the signature is over the canonical form of the entry. Reading later, you re-canonicalize and re-hash to verify integrity.
Replayable test fixtures. Snapshot tests compare against stored JSON. If the snapshot library is deterministic and the production serializer is not, tests flicker. Canonical fixtures prevent this.
Sell Custom Apparel — We Handle Printing & Free ShippingSimple Canonical Form — Sort + Consistent Format
For most use cases, "alphabetical keys, no whitespace, standard escaping" is enough. Produce it with:
Python:
import json
canonical = json.dumps(data, sort_keys=True, separators=(',', ':'), ensure_ascii=False)
JavaScript (with json-stable-stringify):
import stringify from 'json-stable-stringify';
const canonical = stringify(data);
Go:
// encoding/json already sorts map keys alphabetically
bytes, _ := json.Marshal(data)
Browser tool workflow:
1. Paste JSON into our JSON Key Sorter, sort recursively.
2. Paste the sorted output into our JSON Formatter, click Minify.
3. The result is simple canonical form — sorted, minified, UTF-8 JSON. Hash or sign this.
For most cache keys, webhook verification, and integrity checks, this is sufficient.
Strict Canonical (RFC 8785 / JCS) Requirements
RFC 8785 adds strict rules for edge cases:
Numbers. Integers: no leading zeros, no plus sign, no decimal point. Floats: scientific notation rules defined precisely. 1.0 serializes as 1 (but only when it is exactly 1, not 1.000000001). Handling floating-point precision correctly is the hardest part of JCS.
Unicode strings. Must be NFC-normalized. Characters with multiple Unicode representations (precomposed vs decomposed) are unified.
Escape sequences. Minimal escaping — only characters that require escaping per JSON spec. No \u00XX for printable ASCII.
Property ordering. Lexicographic ordering of UTF-16 code units (not UTF-8). This affects keys with non-ASCII characters.
Libraries that handle JCS correctly:
- Python: jcs package
- JavaScript: canonicalize package
- Java: org.webpki.jcs.JsonCanonicalizer
- Go: github.com/cyberphone/json-canonicalization
Our browser tool does simple canonical (sort + minify), not strict JCS. For cryptographic applications requiring strict compliance, use a JCS library in your language.
Webhook Verification Example
A practical example: verifying a signed webhook.
Sender side:
1. Build the payload object.
2. Canonicalize: sort keys, minify.
3. Compute HMAC-SHA256 using a shared secret.
4. Send payload + signature.
Receiver side:
1. Receive payload + signature.
2. Canonicalize the received payload: sort keys, minify.
3. Compute HMAC-SHA256 using the shared secret.
4. Compare computed signature to received signature.
If the two sides disagree on what canonical means, the signatures will not match. This is why many webhook providers (Stripe is a notable example) do not actually parse the payload into an object before hashing — they hash the raw received bytes directly. The wire format becomes canonical by fiat. Less elegant, more foolproof.
If you are designing a new signed-payload protocol, raw-bytes-hashing is simpler than JSON canonicalization. Reserve canonical form for cases where you need to re-serialize data (transforming before storing, comparing objects you built from different sources).
Produce Simple Canonical JSON
Sort keys recursively, then minify. Free browser tool chain.
Open Free JSON Key SorterFrequently Asked Questions
Is sorting keys enough for canonical JSON?
For many use cases, yes — alphabetical keys plus consistent formatting (minified, no extra whitespace) covers cache keys, simple signatures, and content identification. For strict cryptographic compliance (RFC 8785 / JCS), you also need Unicode normalization, specific number formatting, and lexicographic key ordering by UTF-16 code units.
Why does Stripe hash the raw webhook bytes instead of canonicalizing?
Because raw-bytes-hashing removes any disagreement about canonical form. The sender hashes the exact bytes they send; the receiver hashes the exact bytes they receive. If any proxy, library, or middleware modifies the JSON representation (even whitespace), the signature fails. This is foolproof but requires the bytes to reach the receiver unchanged.
Does Go encoding/json produce canonical JSON by default?
For map[string]interface{} inputs, partially — Go sorts map keys alphabetically. For struct inputs, fields serialize in declaration order, which may not be alphabetical. Whitespace is minified by default with json.Marshal. For strict canonical form from structs, convert to map[string]interface{} first or use a JCS library.
Can the browser tool produce RFC 8785 canonical JSON?
Not strictly. Our sorter + formatter produces "simple canonical" form — sorted keys plus minified whitespace. This is enough for most hashing and caching use cases but does not guarantee RFC 8785 compliance for edge cases involving Unicode normalization or number formatting. For strict JCS, use a certified library.

