Verifying Machine Learning Models with Zero-Knowledge Proofs
Explore the emerging field of ZKML to prove that an AI model was executed correctly on private input data.
The Trust Gap in Modern AI Deployment
Artificial intelligence is increasingly integrated into high-stakes decision-making processes such as medical diagnosis, financial credit scoring, and legal risk assessment. In most current deployments, users must blindly trust that a service provider is running the specific model it claims to use. A client has no easy way to verify that a cloud provider hasn't substituted a cheaper, less accurate model for the advertised high-accuracy one to save on infrastructure costs.
This transparency problem is compounded when dealing with sensitive data that cannot be shared openly. Organizations often face a dilemma where they need the predictive power of advanced models but cannot risk exposing their proprietary datasets or model weights to third parties. This creates a bottleneck for the adoption of AI in regulated industries where auditability and privacy are non-negotiable requirements.
Zero-Knowledge Machine Learning (ZKML) addresses these challenges by combining the integrity of cryptographic proofs with the predictive capabilities of neural networks. By generating a zero-knowledge proof of inference, a prover can demonstrate that a model was executed correctly on a specific input without revealing the weights of the model or the underlying data. This shift moves the industry from a model of blind trust to one of mathematical verification.
The fundamental goal of ZKML is to decouple the execution of a model from the trust required to believe its output, ensuring that computation is both private and verifiable by any third party.
The Incentive for Verifiable Inference
In a decentralized ecosystem, verifiable inference allows smart contracts to consume AI outputs directly. Without ZKML, a blockchain cannot verify if an off-chain model output is legitimate or if it was simply fabricated by a malicious node. This capability opens the door to decentralized insurance, automated asset management, and privacy-preserving identity verification systems.
Furthermore, ZKML enables a new paradigm of model-as-a-service where model creators can monetize their intellectual property without exposing their weights. A buyer can verify that the model they are paying for is the one being executed, while the seller maintains the secrecy of their training results. This creates a secure marketplace for specialized algorithms that were previously too risky to deploy in shared environments.
Mapping Neural Networks to Arithmetic Circuits
The primary challenge in ZKML lies in the fact that modern neural networks and zero-knowledge proof systems operate in different mathematical domains. Neural networks typically use 32-bit or 16-bit floating-point numbers to represent weights and activations. Conversely, zero-knowledge proofs are built using arithmetic circuits that operate over large prime fields, which only support integer-like operations.
To bridge this gap, developers must perform quantization, which maps continuous floating-point values to a discrete set of integers. This process involves choosing a scale factor that preserves enough precision for the model to remain accurate while staying within the bounds of the prime field. If the scale factor is too small, the model loses accuracy; if it is too large, the circuit may suffer from overflow errors during intermediate calculations.
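As a rough illustration of this trade-off, here is a toy fixed-point scheme (the prime and scale factor are illustrative stand-ins, not values from a real ZKML system): a float x is stored as round(x · 2^s) and negatives wrap around modulo the field prime.

```python
# Toy fixed-point quantization into a prime field (illustrative only).
P = 2**31 - 1          # small Mersenne prime standing in for a real SNARK field
SCALE_BITS = 7         # scale factor s: values are stored as round(x * 2**s)

def quantize(x: float, scale_bits: int = SCALE_BITS) -> int:
    """Map a float to a field element; negatives wrap around mod P."""
    return round(x * (1 << scale_bits)) % P

def dequantize(q: int, scale_bits: int = SCALE_BITS) -> float:
    """Interpret field elements above P // 2 as negative values."""
    signed = q - P if q > P // 2 else q
    return signed / (1 << scale_bits)

w = quantize(0.8125)    # 0.8125 * 128 = 104, exactly representable at s = 7
x = quantize(-1.5)      # -192 wraps to P - 192
prod = (w * x) % P      # circuit multiplication happens mod P

# Multiplying two scale-s values yields a scale-2s value, so
# dequantize with the doubled scale:
result = dequantize(prod, 2 * SCALE_BITS)
print(result)           # -1.21875, which equals 0.8125 * -1.5
```

Note that each multiplication doubles the scale, so if SCALE_BITS is too large, a chain of multiplications can exceed P before any rescaling step runs, which is exactly the overflow failure mode described above.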
Standard neural network operations like matrix multiplication are relatively straightforward to implement as additions and multiplications within a circuit. However, non-linear activation functions such as ReLU, Sigmoid, or Softmax present a significant hurdle because they cannot be expressed as simple polynomials. These operations require advanced techniques like lookup tables or polynomial approximations to be represented efficiently in a cryptographic context.
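To see why the linear part is the easy part, the sketch below computes one layer's matrix-vector product entirely with field additions and multiplications, the operations a circuit natively supports (the prime and scale here are hypothetical toy values):

```python
# Toy demonstration: a linear layer evaluated purely mod a prime.
P = 2**31 - 1   # stand-in prime field
S = 1 << 7      # fixed-point scale factor 2**7

def q(x: float) -> int:
    """Quantize a float to a field element."""
    return round(x * S) % P

# A 2x3 weight matrix applied to a length-3 input vector.
W = [[0.5, -0.25, 1.0], [2.0, 0.5, -1.0]]
x = [1.0, 2.0, -0.5]

Wq = [[q(v) for v in row] for row in W]
xq = [q(v) for v in x]

# Dot products use only field addition and multiplication.
yq = [sum(wi * xi for wi, xi in zip(row, xq)) % P for row in Wq]

def deq(v: int, scale: int = S * S) -> float:
    """Products of two scale-S values carry scale S**2."""
    return (v - P if v > P // 2 else v) / scale

y = [deq(v) for v in yq]
print(y)   # [-0.5, 3.5], matching the float matmul exactly
```

Every step above is a polynomial in the inputs, so it maps directly onto circuit constraints; applying ReLU to y is what has no such polynomial form.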
The Complexity of Non-Linearity
Handling non-linearities efficiently is the key to minimizing the computational overhead of a ZK proof. Lookup tables allow the prover to commit to a table of pre-computed values, and the verifier then checks that the output of an activation function corresponds to an entry in that table. This approach significantly reduces the number of constraints in the circuit compared to approximating functions with high-degree polynomials.
- Quantization: The process of converting floating-point weights to fixed-point integers to fit within prime fields.
- Constraint Count: The number of algebraic equations required to represent a model, which directly impacts proving time.
- Lookup Tables: A technique used to handle non-linear operations like ReLU without expensive polynomial expansions.
- Prime Field Selection: Choosing a mathematical field that is large enough to prevent overflows during matrix operations.
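A minimal sketch of the lookup idea for ReLU over quantized values follows. In a real proof system such as Halo2, table membership is enforced with a lookup argument over polynomial commitments; here a plain dictionary stands in for the committed table, and the range bound is a hypothetical value:

```python
# Toy lookup-table ReLU over quantized field elements (illustrative only).
P = 2**31 - 1    # stand-in prime field
RANGE = 256      # assumed bound on quantized activation magnitude

# The prover commits to the table once: every supported input is paired
# with its ReLU output, negatives encoded as field elements near P.
relu_table = {}
for v in range(-RANGE, RANGE + 1):
    relu_table[v % P] = max(v, 0) % P

def relu_lookup(x_q: int) -> int:
    # In a real circuit the verifier checks that (x_q, out) is a table
    # row; an out-of-range value would make the proof fail to verify
    # rather than raising an exception here.
    return relu_table[x_q]

print(relu_lookup(100))          # 100: positive inputs pass through
print(relu_lookup(-37 % P))      # 0:   negative inputs clamp to zero
```

The table has a fixed, modest size (2 × RANGE + 1 rows here), whereas a polynomial tight enough to reproduce the hard kink at zero would need very high degree, which is why lookups dominate in practice.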
Practical Implementation with EZKL
Implementing ZKML from scratch is a massive undertaking that requires expertise in both machine learning and cryptography. To streamline this, frameworks like EZKL have emerged to automate the conversion of standard model formats into verifiable circuits. These tools allow developers to take a pre-trained model in the ONNX format and generate the necessary files for proof generation and verification.
The workflow generally begins by exporting a PyTorch or TensorFlow model to ONNX, which provides a graph representation of the computation. The EZKL library then parses this graph and performs a calibration step to determine the optimal quantization parameters. This step is critical because it ensures the model remains accurate while minimizing the proof size and the resources required to generate it.
Once calibrated, the developer generates a set of proving and verification keys. The proving key is used by the entity performing the inference to create the cryptographic proof, while the verification key is shared with anyone who needs to validate the result. This verification can happen on-chain via a Solidity contract or off-chain in a standard application environment.
Exporting and Calibrating the Circuit
Calibration is an iterative process where the tool analyzes a sample dataset to find the range of values each layer produces. This prevents the integer values from exceeding the maximum value allowed by the underlying curve, such as the BN254 curve used in many Ethereum-compatible systems. Getting the scale factor right is often the difference between a proof that takes seconds to generate and one that takes minutes.
import torch
import ezkl
import os

# Define a simple multi-layer perceptron for classification
class PrivateModel(torch.nn.Module):
    def __init__(self):
        super(PrivateModel, self).__init__()
        self.fc1 = torch.nn.Linear(10, 5)
        self.relu = torch.nn.ReLU()
        self.fc2 = torch.nn.Linear(5, 2)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        return self.fc2(x)

model = PrivateModel()
model.eval()

# Create a dummy input to trace the model graph
x = torch.randn(1, 10)
onnx_path = "model.onnx"

# Export the model to ONNX format
torch.onnx.export(model, x, onnx_path,
                  export_params=True,
                  opset_version=11,
                  do_constant_folding=True)

# Define settings for quantization and circuit generation
settings_path = "settings.json"
ezkl.gen_settings(onnx_path, settings_path)
# Calibrate to find optimal scale factors based on expected input distribution
ezkl.calibrate_settings("data.json", onnx_path, settings_path, "resources")

After generating the settings, the next step is the setup of the structured reference string, or SRS. This is a common requirement for SNARK-based systems, providing the public parameters needed to create short, fast-to-verify proofs. Developers must ensure they use a trusted setup or a publicly available SRS that matches the size of their specific circuit.
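Matching the SRS to the circuit typically means sizing it by the base-2 logarithm of the constraint count, since proof systems allocate rows in powers of two. The helper and constraint count below are hypothetical, purely to show the arithmetic:

```python
# Illustrative SRS sizing: find the smallest power of two that covers
# the circuit's constraint count (numbers are hypothetical).
import math

def required_logrows(num_constraints: int) -> int:
    """Smallest k with 2**k >= num_constraints, the usual SRS sizing unit."""
    return max(1, math.ceil(math.log2(num_constraints)))

# Suppose a small MLP compiles to roughly 90,000 constraints:
k = required_logrows(90_000)
print(k)                 # 17, so an SRS supporting 2**17 rows is needed
print(2**k >= 90_000)    # True: the circuit fits
```

An SRS that is too small makes setup fail outright, while one that is far too large wastes memory during proving, so tools generally pick the tightest power of two that fits.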
Architecture and Performance Trade-offs
While ZKML is powerful, it introduces substantial computational overhead compared to standard inference. Generating a proof for a neural network can be several orders of magnitude slower than simply running the model on a GPU. Developers must carefully balance the size of the model against the time and memory required to produce a proof, especially for real-time applications.
The choice of proof system, such as Halo2 or Plonky2, significantly impacts these trade-offs. Some systems offer faster proof generation times but result in larger proof sizes that are more expensive to verify on-chain. Others produce tiny proofs that are cheap to verify but require significant computational resources and time to generate on the prover side.
Memory consumption is another critical factor, as large models can require hundreds of gigabytes of RAM to generate a proof. This often necessitates the use of specialized hardware or distributed proving systems. For many use cases, it is more efficient to prove a smaller, distilled version of a model or only prove specific critical layers of the architecture.
SNARKs vs STARKs in Machine Learning
SNARKs are currently the preferred choice for ZKML when on-chain verification is required due to their small proof size and constant-time verification. However, they rely on a trusted setup and use elliptic curve cryptography, which can be slower to prove than hash-based systems. This makes them ideal for applications like verifiable credit scores where the proof is sent to a blockchain.
STARKs, on the other hand, do not require a trusted setup and are generally faster to generate because they rely on hash functions. The downside is that STARK proofs are much larger, often tens or hundreds of kilobytes, which makes them expensive to verify on networks like Ethereum. Developers must evaluate the cost of gas versus the speed of proof generation when choosing between these two architectures.
// Hypothetical Rust implementation for verifying a model proof
fn verify_model_inference(
    proof_data: Vec<u8>,
    public_inputs: Vec<Fr>,
    vk: VerificationKey,
) -> bool {
    // Initialize the verification context for the specific curve
    let params = load_srs_params();

    // The verification step is fast and constant-time:
    // it checks that the proof matches the expected circuit outputs
    match verify_proof(&params, &vk, &proof_data, &public_inputs) {
        Ok(_) => true,
        Err(_) => false,
    }
}