Decentralized Identity (DID)

Protecting User Privacy with SD-JWT and Selective Disclosure

Discover how to implement privacy-preserving protocols that allow users to share specific attributes from a credential without revealing their entire digital identity.


Beyond the All or Nothing Identity Model

In the traditional digital identity landscape, sharing a credential often feels like handing over an entire physical wallet just to prove you have a library card. When a user authenticates via a centralized provider, the service provider frequently receives a massive payload containing the user's full name, email, profile picture, and sometimes even their contact list. This excessive data transfer creates a significant privacy risk and increases the liability for the party receiving the data.

The decentralized identity model introduces a paradigm shift by separating the proof of an attribute from the attribute itself. Instead of sharing a static document, users can generate dynamic proofs that satisfy a specific request without revealing unrelated information. This capability is essential for building systems that comply with modern data minimization principles and privacy regulations.

Selective disclosure allows a holder to choose which specific claims within a verifiable credential are shared with a verifier. For instance, if a digital credential contains a user's name, date of birth, home address, and nationality, the user can choose to reveal only the date of birth claim to a liquor store app. The app receives a cryptographically signed proof that the date of birth is valid without ever seeing the home address or nationality.

The goal of selective disclosure is to transform identity from a static set of attributes into a fluid, privacy-preserving conversation where only the minimum necessary information is exchanged.

This approach requires a robust cryptographic foundation to ensure that the partial data shared remains as tamper-evident as the original full credential. Developers must move away from simple signed JSON objects and toward structures such as Selective Disclosure JSON Web Tokens (SD-JWT) or zero-knowledge proofs. These technologies let a verifier trust a fragment of the data as if it had seen the whole set.

The Problem of Data Correlation

Even when sharing minimal attributes, developers must be wary of unique identifiers that can lead to correlation across different services. If a user presents the same public key or Decentralized Identifier to every verifier, those verifiers can collude to build a comprehensive profile of the user's activity. Selective disclosure is most effective when paired with techniques that ensure unlinkability.

Using pairwise-unique identifiers or ephemeral keys allows a user to present proofs to different parties without those parties being able to link the interactions. This ensures that while the attribute is verified as true, the identity of the person holding that attribute cannot be tracked across the web.
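One way to picture pairwise keys is a wallet that lazily creates a separate key pair for each verifier it meets. The sketch below is illustrative only: `PairwiseKeyStore` and the verifier URLs are hypothetical names, and a real wallet would keep keys in secure storage rather than an in-memory Map.

```javascript
const crypto = require('crypto');

// Hypothetical wallet helper: one key pair per verifier, never shared between them
class PairwiseKeyStore {
    constructor() {
        this.keys = new Map(); // verifierId -> { publicKey, privateKey }
    }

    // Return the key pair for this verifier, creating it on first contact
    getKeyPair(verifierId) {
        if (!this.keys.has(verifierId)) {
            // prime256v1 is the OpenSSL name for the NIST P-256 curve
            this.keys.set(verifierId, crypto.generateKeyPairSync('ec', { namedCurve: 'prime256v1' }));
        }
        return this.keys.get(verifierId);
    }

    // Export the public half as a JWK, the form a verifier would typically receive
    getPublicJwk(verifierId) {
        return this.getKeyPair(verifierId).publicKey.export({ format: 'jwk' });
    }
}

const store = new PairwiseKeyStore();
const shopKey = store.getPublicJwk('https://shop.example');
const bankKey = store.getPublicJwk('https://bank.example');

// Each verifier sees a different public key, so the key alone cannot link the two visits
console.log(shopKey.x !== bankKey.x); // true
```

Reusing the same key for repeat visits to one verifier is a design choice in this sketch; a wallet that wants unlinkability even within a single service could generate a fresh key for every presentation instead.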

The Mechanics of Selective Disclosure

To understand how selective disclosure works, we must look at how data is structured before it is signed by an issuer. In a standard credential, the entire object is hashed and signed, meaning any change to the data would invalidate the signature. To allow for selective disclosure, the issuer must hash each claim individually and then sign a collection of these hashes.

When a user wants to disclose a specific claim, they provide the original value of that claim along with its corresponding salt. The verifier then hashes the provided value and salt and checks whether the result matches one of the hashes in the signed collection. This allows the verifier to confirm the claim's authenticity without needing the other hidden claims that make up the rest of the collection.

Claims are salted to prevent brute-force attacks on common values like dates or zip codes.
Individual hashes are aggregated into a top-level structure, often using a Merkle Tree or a flat array of digests.
The issuer signs the aggregate structure rather than the raw data claims.
The holder selectively reveals the salt and value for only the requested claims.

This mechanism relies on the one-way nature of cryptographic hash functions, ensuring that a verifier cannot reverse-engineer hidden claims from the digests it receives. It also ensures that the holder cannot lie about the values, as any modification would produce a hash that is absent from the collection the issuer signed. This creates a chain of trust from the issuer, through the holder, to the verifier.
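The steps above can be sketched end to end with Node's built-in `crypto` module. This is a simplified illustration, not the SD-JWT wire format: the `[salt, name, value]` encoding, the helper names `digestClaim`, `issue`, and `verifyRevealed`, the Ed25519 key type, and the sample claims are all assumptions made for the example.

```javascript
const crypto = require('crypto');

// Hash one salted claim. The [salt, name, value] encoding is an assumption of this sketch.
function digestClaim(salt, name, value) {
    return crypto.createHash('sha256')
        .update(JSON.stringify([salt, name, value]))
        .digest('base64url');
}

// Issuer: salt and hash every claim, then sign only the list of digests
function issue(claims, privateKey) {
    const salted = Object.entries(claims).map(([name, value]) => ({
        salt: crypto.randomBytes(16).toString('base64url'),
        name,
        value,
    }));
    // Sorting hides the original claim order from the verifier
    const digests = salted.map(c => digestClaim(c.salt, c.name, c.value)).sort();
    const signature = crypto.sign(null, Buffer.from(JSON.stringify(digests)), privateKey);
    return { salted, digests, signature };
}

// Verifier: check the signature over the digests, then check the one revealed claim
function verifyRevealed(revealed, digests, signature, publicKey) {
    const signatureOk = crypto.verify(null, Buffer.from(JSON.stringify(digests)), publicKey, signature);
    const digest = digestClaim(revealed.salt, revealed.name, revealed.value);
    return signatureOk && digests.includes(digest);
}

const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519');
const credential = issue({ name: 'Alice', birthdate: '1990-04-12', nationality: 'NZ' }, privateKey);

// Holder reveals only the birthdate entry (salt, name, value)
const birthdate = credential.salted.find(c => c.name === 'birthdate');
console.log(verifyRevealed(birthdate, credential.digests, credential.signature, publicKey)); // true

// A tampered value no longer matches any signed digest
const forged = { ...birthdate, value: '2010-01-01' };
console.log(verifyRevealed(forged, credential.digests, credential.signature, publicKey)); // false
```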

Salting and Hashing Claims

Salting is a critical step in selective disclosure because many identity attributes have a limited range of possible values. Without a salt, a malicious verifier could pre-compute the hashes for every possible birth date and compare them against the hidden hashes in the user's credential. By adding a high-entropy random string to each claim before hashing, we make this type of dictionary attack computationally infeasible.

During the disclosure process, the salt must be revealed alongside the attribute value so the verifier can reconstruct the hash accurately. This salt is unique to each claim and each credential issuance, preventing cross-credential correlation even for the same user attribute.
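To make the dictionary attack concrete, the sketch below hashes a birth date with and without a salt and then tries every date in a hundred-year range. The helper names (`allDates`, `crackUnsalted`), the year range, and the sample date are assumptions chosen for illustration.

```javascript
const crypto = require('crypto');

const sha256 = input => crypto.createHash('sha256').update(input).digest('hex');

// Enumerate every date in a year range as YYYY-MM-DD (roughly 37,000 values per century)
function* allDates(fromYear, toYear) {
    const day = new Date(Date.UTC(fromYear, 0, 1));
    while (day.getUTCFullYear() <= toYear) {
        yield day.toISOString().slice(0, 10);
        day.setUTCDate(day.getUTCDate() + 1);
    }
}

// Attacker: hash every candidate date until one matches the target
function crackUnsalted(targetHash) {
    for (const candidate of allDates(1920, 2020)) {
        if (sha256(candidate) === targetHash) return candidate;
    }
    return null;
}

const birthdate = '1990-04-12';

// Without a salt, the hidden value falls to a simple enumeration
console.log(crackUnsalted(sha256(birthdate))); // 1990-04-12

// With a 128-bit salt the same search finds nothing, because the attacker
// would also have to guess the salt for every candidate date
const salt = crypto.randomBytes(16).toString('base64url');
console.log(crackUnsalted(sha256(salt + birthdate))); // null
```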

Implementing Selective Disclosure with SD-JWT

Selective Disclosure JSON Web Tokens represent one of the most practical and developer-friendly implementations of these concepts. SD-JWT extends the familiar JWT format by introducing a mechanism to hide claims within a standard token structure. This allows developers to leverage existing JWT libraries and infrastructure while gaining advanced privacy features.

An SD-JWT consists of three main parts: the standard JWT header and payload, the issuer's cryptographic signature over them, and a set of disclosures containing the salted claims. The payload contains hashes of the claims rather than the claims themselves, and the disclosures are appended after the signed JWT as a series of base64url-encoded strings separated by tilde (~) characters.
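As a rough illustration of that layout, the sketch below splits a combined token into its parts. It assumes the tilde-separated form used by the IETF SD-JWT format, in which the issuer-signed JWT comes first, each disclosure follows a `~`, and an optional key binding JWT may follow the final `~`. The function name `parseSdJwt` and the toy token contents (issuer URL, digest strings, signature segment) are placeholders, and no signature is checked here.

```javascript
// Split a combined SD-JWT string into its parts.
// Assumed layout: <issuer-signed JWT>~<disclosure 1>~...~<disclosure N>~[key binding JWT]
function parseSdJwt(compact) {
    const parts = compact.split('~');
    if (parts.length < 2) {
        throw new Error('Not an SD-JWT: missing "~" separator');
    }
    const issuerJwt = parts[0];
    const keyBindingJwt = parts[parts.length - 1] || null; // empty string when absent
    const disclosures = parts.slice(1, -1);

    // Decode the header and payload segments; the signature segment is left untouched
    const [header, payload] = issuerJwt
        .split('.')
        .slice(0, 2)
        .map(segment => JSON.parse(Buffer.from(segment, 'base64url').toString('utf8')));

    return { header, payload, disclosures, keyBindingJwt };
}

// Build a toy token. The signature segment is a placeholder, not a real signature.
const encode = obj => Buffer.from(JSON.stringify(obj)).toString('base64url');
const issuerJwt = [
    encode({ alg: 'ES256', typ: 'example+sd-jwt' }),
    encode({ iss: 'https://issuer.example', _sd: ['digest-1', 'digest-2'], _sd_alg: 'sha-256' }),
    'placeholder-signature',
].join('.');
const disclosure = encode(['c2FsdA', 'given_name', 'Satoshi']);

const parsed = parseSdJwt(`${issuerJwt}~${disclosure}~`);
console.log(parsed.payload._sd);        // [ 'digest-1', 'digest-2' ]
console.log(parsed.disclosures.length); // 1
console.log(parsed.keyBindingJwt);      // null
```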

Structure of an SD-JWT Claim Disclosure

```javascript
// A disclosure is a JSON array: [salt, claim_name, claim_value]
// This array is then base64url encoded to be appended to the JWT

const crypto = require('crypto');

function createDisclosure(attributeName, attributeValue) {
    // 16 random bytes give the salt 128 bits of entropy
    const salt = crypto.randomBytes(16).toString('base64url');
    const disclosureArray = [salt, attributeName, attributeValue];
    const disclosureString = Buffer.from(JSON.stringify(disclosureArray)).toString('base64url');

    // The hash of this disclosure is what goes into the JWT payload
    const digest = crypto.createHash('sha256').update(disclosureString).digest('base64url');

    return { disclosureString, digest };
}

const nameDisclosure = createDisclosure('given_name', 'Satoshi');
console.log(`Disclosure: ${nameDisclosure.disclosureString}`);
console.log(`Digest to put in JWT: ${nameDisclosure.digest}`);
```

When a holder presents an SD-JWT to a verifier, they simply strip away the disclosure strings for the claims they wish to keep private. The verifier receives the JWT, validates the issuer's signature over the hashes, and then verifies that the provided disclosures match the hashes present in the token. This design ensures that the issuer does not need to be involved in the specific transaction between the holder and the verifier.
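On the holder side, "stripping away" disclosures amounts to filtering the stored list before joining it back onto the token. The following sketch assumes the wallet keeps the issuer-signed JWT and the disclosure strings separately; `buildPresentation` and `decodeDisclosure` are hypothetical helper names, and the short salts are toy values where a real issuer would use random ones.

```javascript
// Decode a disclosure string back into [salt, claimName, claimValue]
function decodeDisclosure(disclosure) {
    return JSON.parse(Buffer.from(disclosure, 'base64url').toString('utf8'));
}

// Holder side: keep only the disclosures whose claim name was requested
function buildPresentation(issuerJwt, allDisclosures, requestedClaims) {
    const kept = allDisclosures.filter(d => requestedClaims.includes(decodeDisclosure(d)[1]));
    return [issuerJwt, ...kept].join('~') + '~';
}

const encode = arr => Buffer.from(JSON.stringify(arr)).toString('base64url');
const walletDisclosures = [
    encode(['salt-a', 'birthdate', '1990-04-12']),
    encode(['salt-b', 'address', '1 Example Street']),
    encode(['salt-c', 'nationality', 'NZ']),
];

// 'header.payload.signature' stands in for the real issuer-signed JWT
const presentation = buildPresentation('header.payload.signature', walletDisclosures, ['birthdate']);

// Only one of the three disclosures leaves the wallet
console.log(presentation.split('~').length - 2); // 1
```

Because the issuer's signature covers only the digests, removing disclosures does not invalidate it, which is why the holder can do this without contacting the issuer.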

Handling Nested Claims

Real-world identity data is often hierarchical, such as an address object containing street, city, and postal code fields. SD-JWT supports recursive disclosure, allowing a holder to reveal an entire object or just specific fields within that object. This granularity is vital for complex use cases like financial KYC where a bank might only need to verify a specific region.

Implementing nested disclosure requires a recursive hashing strategy where the hash of a child object is treated as a claim within the parent object. This allows for a flexible data model that can adapt to various verification requirements without bloating the token size excessively.
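A minimal sketch of this recursive strategy follows. It assumes the SD-JWT convention of collecting digests in an `_sd` array at each level; the helper names `disclose` and `hideAll` and the sample address are invented for the example, and arrays are treated as ordinary values for simplicity.

```javascript
const crypto = require('crypto');

const b64 = value => Buffer.from(JSON.stringify(value)).toString('base64url');
const sha256 = text => crypto.createHash('sha256').update(text).digest('base64url');

// Create one disclosure and return it together with its digest
function disclose(name, value) {
    const salt = crypto.randomBytes(16).toString('base64url');
    const disclosure = b64([salt, name, value]);
    return { disclosure, digest: sha256(disclosure) };
}

// Recursively replace every field with a digest. Nested objects are processed first,
// so a parent disclosure contains only the digests of its children.
function hideAll(obj, disclosures = []) {
    const digests = Object.entries(obj).map(([name, value]) => {
        const isObject = value !== null && typeof value === 'object' && !Array.isArray(value);
        const inner = isObject ? hideAll(value, disclosures).node : value;
        const { disclosure, digest } = disclose(name, inner);
        disclosures.push(disclosure);
        return digest;
    });
    return { node: { _sd: digests.sort() }, disclosures };
}

const { node, disclosures } = hideAll({
    given_name: 'Satoshi',
    address: { street: '1 Example Street', city: 'Wellington', postal_code: '6011' },
});

console.log(node._sd.length);    // 2 top-level digests: given_name and address
console.log(disclosures.length); // 5 disclosures: 3 address fields, address itself, given_name
```

To reveal only the city under this scheme, the holder would send the `address` disclosure (which exposes just the child digests) together with the `city` disclosure, leaving the street and postal code hidden.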

Verifying Proofs and Managing Cryptographic Salts

The verification process for a selective disclosure proof is more involved than a standard signature check. The verifier must first validate the authenticity of the token signature to ensure it originated from a trusted issuer. Once the signature is verified, the verifier must iterate through the provided disclosures and map them to the digests found in the credential body.

A critical security check during verification is ensuring that no digest appears more than once and that every provided disclosure corresponds to a digest in the signed payload. If a holder provides a valid-looking disclosure that isn't referenced in the signed part of the token, the presentation must be rejected to prevent injection attacks. This rigorous mapping ensures the integrity of the revealed data.

Verification Logic for Selective Disclosure

```javascript
const crypto = require('crypto');

function verifyDisclosedClaims(signedDigests, disclosures) {
    const verifiedClaims = {};
    const seenDigests = new Set();

    disclosures.forEach(disclosureStr => {
        // 1. Recompute the digest from the disclosure string
        const hash = crypto.createHash('sha256').update(disclosureStr).digest('base64url');

        // 2. Reject any disclosure whose digest is not in the signed digests from the issuer
        if (!signedDigests.includes(hash)) {
            // Potential tampering or invalid disclosure provided
            throw new Error('Invalid disclosure: digest mismatch');
        }

        // 3. Reject a disclosure that is presented more than once
        if (seenDigests.has(hash)) {
            throw new Error('Invalid disclosure: duplicate digest');
        }
        seenDigests.add(hash);

        // 4. Decode [salt, key, value]; the salt was only needed to recompute the digest
        const [, key, value] = JSON.parse(Buffer.from(disclosureStr, 'base64url').toString());
        verifiedClaims[key] = value;
    });

    return verifiedClaims;
}

// Example usage after signature validation
const rawDigests = ['_h7F...', 'k9L2...']; // From the decoded JWT payload
const userDisclosures = ['WyJzYWx0IiwgIm5hbWUiLCAiSm9obiJd']; // From the presentation

// Note: the digests above are truncated placeholders, so this call reports a digest mismatch
try {
    const claims = verifyDisclosedClaims(rawDigests, userDisclosures);
    console.log('Verified attributes:', claims);
} catch (e) {
    console.error('Verification failed:', e.message);
}
```

Developers should also implement logic to handle the expiration of credentials and the revocation status of the issuer's keys. Selective disclosure provides privacy, but it does not replace the need for traditional identity lifecycle management. Ensuring that the underlying Decentralized Identifier is still valid on the verifiable data registry is a mandatory step for any production-grade implementation.
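The expiration part of that lifecycle check can be sketched with the standard JWT time claims. The function name `checkValidityWindow` and the 60-second clock-skew allowance are assumptions for the example; revocation and registry lookups are left out because they depend on the specific ecosystem.

```javascript
// Check the standard JWT time claims. `exp` and `nbf` are NumericDate values
// (seconds since the Unix epoch), so the clock is compared in seconds too.
function checkValidityWindow(payload, nowSeconds = Math.floor(Date.now() / 1000), skewSeconds = 60) {
    if (typeof payload.exp === 'number' && nowSeconds > payload.exp + skewSeconds) {
        return { valid: false, reason: 'credential expired' };
    }
    if (typeof payload.nbf === 'number' && nowSeconds < payload.nbf - skewSeconds) {
        return { valid: false, reason: 'credential not yet valid' };
    }
    return { valid: true, reason: null };
}

// A credential that expired 500 seconds before the supplied clock value
console.log(checkValidityWindow({ exp: 1700000000 }, 1700000500).reason); // credential expired

// The same credential checked 1000 seconds before its expiry
console.log(checkValidityWindow({ exp: 1700000000 }, 1699999000).valid); // true
```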

Salt Management Best Practices

Salts must never be reused across different claims or different credentials issued to the same person. Reuse of a salt could allow a verifier who has seen one credential to correlate it with another, defeating the purpose of privacy-preserving protocols. The entropy of the salt should be at least 128 bits to ensure resistance against modern computational attacks.

Issuers are responsible for generating these salts during the credential issuance phase. The salts are then securely transmitted to the holder as part of the SD-JWT structure, and the holder stores them in their digital wallet until a disclosure is required.
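An issuer could back these rules with a small guard that runs before a credential is signed. The sketch below is one possible shape, assuming disclosures are base64url-encoded `[salt, name, value]` arrays; `newSalt` and `assertUniqueSalts` are hypothetical helper names.

```javascript
const crypto = require('crypto');

// 16 random bytes from the operating system's CSPRNG give 128 bits of entropy
const newSalt = () => crypto.randomBytes(16).toString('base64url');

// Hypothetical issuer-side guard: refuse to issue if any salt repeats
function assertUniqueSalts(disclosures) {
    const seen = new Set();
    for (const disclosure of disclosures) {
        const [salt] = JSON.parse(Buffer.from(disclosure, 'base64url').toString('utf8'));
        if (seen.has(salt)) throw new Error('Salt reused across disclosures');
        seen.add(salt);
    }
}

const encode = arr => Buffer.from(JSON.stringify(arr)).toString('base64url');

// Fresh salt per claim: the guard passes silently
const good = [encode([newSalt(), 'given_name', 'Satoshi']), encode([newSalt(), 'birthdate', '1990-04-12'])];
assertUniqueSalts(good);

// Same salt used twice: the guard throws
const reused = newSalt();
const bad = [encode([reused, 'given_name', 'Satoshi']), encode([reused, 'birthdate', '1990-04-12'])];
try {
    assertUniqueSalts(bad);
} catch (e) {
    console.log(e.message); // Salt reused across disclosures
}
```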

Architectural Trade-offs and Best Practices

While selective disclosure offers immense privacy benefits, it introduces additional complexity in both the implementation and the user experience. Developers must weigh the overhead of managing multiple hashes and salts against the privacy requirements of their specific application. For low-risk applications, standard JWTs might suffice, but for sensitive identity data, the investment in SD-JWT is justified.

One significant trade-off is the increase in token size; a credential with many selectively disclosable claims will have a much larger payload than a standard token. This can impact network latency and storage requirements for mobile wallets. Optimizing the number of disclosable fields and using efficient encoding schemes can help mitigate these performance concerns.
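A quick way to get a feel for the overhead is to compare the encoded size of a plain payload with the digest list plus disclosures that would replace it. The sketch below is a rough estimate only: `estimateSizes` is a hypothetical helper, the sample claims are invented, and header and signature bytes are ignored because they are similar in both cases.

```javascript
const crypto = require('crypto');

const b64 = value => Buffer.from(JSON.stringify(value)).toString('base64url');

// Compare a plain JSON payload with the digest list plus disclosures that replace it
function estimateSizes(claims) {
    const plainBytes = b64(claims).length;

    const disclosures = Object.entries(claims).map(([name, value]) =>
        b64([crypto.randomBytes(16).toString('base64url'), name, value]));
    const digests = disclosures.map(d => crypto.createHash('sha256').update(d).digest('base64url'));

    const payloadBytes = b64({ _sd: digests, _sd_alg: 'sha-256' }).length;
    // One extra byte per disclosure for the "~" separator
    const disclosureBytes = disclosures.reduce((sum, d) => sum + d.length + 1, 0);

    return { plainBytes, sdJwtBytes: payloadBytes + disclosureBytes };
}

const sizes = estimateSizes({
    given_name: 'Satoshi',
    birthdate: '1990-04-12',
    nationality: 'NZ',
    postal_code: '6011',
});
console.log(sizes); // sdJwtBytes is several times larger than plainBytes
```

Each selectively disclosable claim adds a 43-character SHA-256 digest to the payload plus a disclosure that carries the salt, name, and value, so the growth is roughly linear in the number of disclosable fields.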

The most secure data is the data you never collect. Selective disclosure is the technical implementation of that philosophy, turning privacy from a policy into a hard-coded reality.

Finally, always consider the verifier's perspective when designing these systems. If the verification process is too cumbersome or requires specialized libraries that are hard to integrate, adoption will suffer. Providing clear documentation and standardizing on formats like W3C Verifiable Credentials and SD-JWT ensures maximum interoperability across the decentralized identity ecosystem.

Unlinkability vs. Auditability

There is often a tension between the user's desire for total anonymity and the regulatory requirement for auditability in certain industries. In financial services, for example, a verifier may need a way to link a transaction back to a verified identity if fraud is suspected. Selective disclosure can be combined with 'blinded' identifiers that can only be de-anonymized by a trusted third party under specific legal conditions.
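One simple way to approximate such a blinded identifier is to encrypt the real identifier to a public key held by the trusted third party, so the verifier stores only an opaque value. This is a swapped-in illustration using RSA-OAEP from Node's `crypto` module, not a scheme taken from any identity standard; the names `blindIdentifier` and `unblindIdentifier` and the sample DID are hypothetical.

```javascript
const crypto = require('crypto');

// The auditor (trusted third party) holds the only key that can open the identifier
const auditor = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });

const oaep = key => ({ key, padding: crypto.constants.RSA_PKCS1_OAEP_PADDING, oaepHash: 'sha256' });

// Encrypt the real identifier to the auditor's public key.
// OAEP padding is randomized, so two blinded values for the same user differ.
function blindIdentifier(userId, auditorPublicKey) {
    return crypto.publicEncrypt(oaep(auditorPublicKey), Buffer.from(userId, 'utf8')).toString('base64url');
}

// Only the auditor, holding the private key, can recover the identifier
function unblindIdentifier(blinded, auditorPrivateKey) {
    return crypto.privateDecrypt(oaep(auditorPrivateKey), Buffer.from(blinded, 'base64url')).toString('utf8');
}

const first = blindIdentifier('did:example:alice', auditor.publicKey);
const second = blindIdentifier('did:example:alice', auditor.publicKey);

// Verifiers cannot link the two values to each other or to the user
console.log(first !== second); // true

// The auditor can de-anonymize when legally required
console.log(unblindIdentifier(first, auditor.privateKey)); // did:example:alice
```

A production design would also need the issuer to attest that the blinded value really contains the holder's identifier, which this sketch does not attempt.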

This balance ensures that users maintain their privacy during day-to-day interactions while providing the necessary safeguards for high-stakes environments. Choosing the right cryptographic primitives, such as BBS+ signatures for zero-knowledge proofs, allows for even more advanced scenarios where facts about attributes can be proven without revealing the underlying values.
