What’s the Current Best Practice for Aggregating Groth16 Proofs on Ethereum to Cut Gas Costs?

Short description: If you’re verifying many Groth16 proofs on Ethereum, the most cost‑effective pattern in 2026 is to aggregate off‑chain and verify one result on‑chain—either as a recursive aggregated proof (~300k gas) or a batched attestation (~113k gas) from a restaked verification layer—while minimizing public inputs and targeting the right curve/precompiles post‑Pectra.

TL;DR for decision‑makers

For existing BN254 Groth16 stacks that must settle on Ethereum L1: use a production aggregator (e.g., Nebra UPA) to recursively compress N proofs into one and verify that single proof on‑chain (~300k gas base; small per‑proof metadata). Expect 8–10×+ savings vs verifying individually. (docs.nebra.one)
For throughput‑heavy apps that can accept crypto‑economic finality: offload verification to a restaked proof‑verification layer (e.g., Aligned). You get millisecond‑scale verification latency and ~90–99% cost reductions; Ethereum sees one aggregated BLS attestation (~113k gas), with a reported per‑proof cost of ~2100 gas at current batch rates. (blog.alignedlayer.com)
For greenfield circuits: re‑evaluate curve choice after Pectra. Ethereum now has BLS12‑381 precompiles (EIP‑2537). Groth16 over BLS12‑381 enjoys cheaper pairing checks (32600·k + 37700 gas) than BN254—though with larger calldata per point—changing the end‑to‑end cost calculus. (blog.ethereum.org)
Always minimize public inputs (hash/pack them), and harden verifiers (field‑range checks, domain separation). Each extra public input is ~6–7k gas; sloppy range checks have led to real bugs. (medium.com)

Why this changed in 2025–2026

Pectra mainnet activation (May 7, 2025) added BLS12‑381 precompiles (EIP‑2537) and increased calldata costs for data‑heavy transactions (EIP‑7623). Aggregation benefits increase as calldata gets pricier, and BLS12‑381 makes more pairing‑based designs L1‑viable. (blog.ethereum.org)
Production “verification layers” matured. Aligned’s mainnet‑beta verifies thousands of proofs per second off‑chain and posts one on‑chain statement, enabling 90–99% savings and new latency/throughput profiles for ZK apps. (blog.alignedlayer.com)

The baseline: how much does a single Groth16 verify cost on L1?

A Groth16 verifier call on Ethereum breaks down roughly as:

Pairing precompile(s) + EVM scaffolding + calldata + MSM over public inputs.
With BN254 (alt_bn128), EIP‑1108 set pairing gas to 34,000·k + 45,000; most verifiers use 4 pairings, plus ~6–7k gas per public input for the MSM. Expect ~200k–230k gas with few public inputs; ~7k more per input. (eips.ethereum.org)

Post‑Pectra, BLS12‑381 (EIP‑2537) offers a pairing cost of 32,600·k + 37,700 (cheaper than BN254 per pairing), but encodes points as 128/256 bytes (G1/G2), increasing calldata. Whether BLS12‑381 beats BN254 for your use case depends on public input count, calldata patterns, and whether you can retarget circuits; many legacy Groth16 circuits remain BN254. (eips.ethereum.org)
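To make the arithmetic above concrete, here is a minimal sketch of the two pairing formulas as a Solidity library. The library and function names are illustrative and do not come from any generator, and the 7,000 gas per public input is the upper end of the rule of thumb quoted above, not a measured value.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Back-of-envelope gas model; not a substitute for measuring a real verifier.
library PairingGasModel {
    /// EIP-1108 pairing check on BN254: 34,000 * k + 45,000 gas for k pairs.
    function bn254PairingGas(uint256 k) internal pure returns (uint256) {
        return 34_000 * k + 45_000;
    }

    /// EIP-2537 pairing check on BLS12-381: 32,600 * k + 37,700 gas for k pairs.
    function bls12381PairingGas(uint256 k) internal pure returns (uint256) {
        return 32_600 * k + 37_700;
    }

    /// Assumption: 4 pairings plus ~7,000 gas per public input.
    /// Calldata and EVM scaffolding are deliberately left out.
    function bn254VerifyEstimate(uint256 publicInputs) internal pure returns (uint256) {
        return bn254PairingGas(4) + 7_000 * publicInputs;
    }
}
```

Under these assumptions, bn254PairingGas(4) evaluates to 181,000 and bls12381PairingGas(4) to 168,100. bn254VerifyEstimate(2) gives 195,000; the remaining gap to the ~207k end‑to‑end figure is what the model leaves out (calldata, EVM scaffolding).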

Key takeaway: verifying N individual Groth16 proofs on L1 scales linearly and quickly becomes prohibitive.

The three patterns that actually save you money in 2026

1) Recursive aggregation: one proof to rule them all (full L1 finality)

What it is:

Off‑chain, an aggregator verifies N Groth16 proofs and produces one “aggregated” proof (often using recursion) that attests to all N. On‑chain, you verify just this one proof.
Practical services like Nebra UPA do this today for Groth16 (and other systems), typically charging ~300k gas per aggregated proof plus small per‑proof bookkeeping. (docs.nebra.one)

Why it’s the current best default for L1 settlement:

You amortize L1 cost across many proofs, turning O(N) verifications into O(1) on‑chain work.
Real‑world numbers: teams report ~300–350k gas baseline for the aggregated proof, then low thousands of gas per included proof (for inclusion metadata), yielding 8–10×+ savings once batches are even modestly large. (docs.nebra.one)

How to ship it quickly:

Register your Groth16 verifying key (VK) with the aggregator once.
Submit proofs off‑chain (or via their contracts).
Your app contract consumes a single on‑chain verification result (e.g., checkProof or isVerified mapping). See Nebra docs for the exact flow. (docs.nebra.one)

Where recursion comes from:

Research papers and production systems compress many SNARK verifications via recursion or SNARK‑friendly inner‑product arguments (e.g., SnarkPack for Groth16). Filecoin pioneered this at scale; services today package it into SDKs. Note: on‑chain SnarkPack verifiers are not yet widely standardized on Ethereum; most teams use recursive SNARKs (e.g., Halo2‑KZG) to wrap many Groth16 proofs into one proof that the EVM can verify efficiently. (eprint.iacr.org)

When to prefer it:

You need L1 cryptographic finality (the aggregated proof itself is verified on L1).
You have tens to thousands of Groth16 proofs per settlement window and can tolerate extra seconds for off‑chain aggregation.

What to expect:

Gas: ~300k base + a few k gas per included proof for bookkeeping.
Latency: seconds to minutes to build large recursive proofs; grows with batch size and parallelism. (blog.alignedlayer.com)

2) Off‑chain verification + on‑chain BLS attestation (crypto‑economic finality, extreme throughput)

What it is:

A decentralized operator set (secured by restaked ETH) runs the actual verifier code natively. Operators co‑sign the decision with BLS, and you verify one aggregated signature or batch receipt on Ethereum. Aligned’s “Proof Verification Layer” is the most visible example. (blog.alignedlayer.com)

Why it’s compelling:

Aligned’s mainnet beta verifies ~200 proofs/sec at ~2100 gas/proof and plans to scale to thousands per second; the on‑chain component is an aggregated BLS signature check (~113k gas), amortized across the batch. This cuts verification cost by 90–99% and reduces latency to milliseconds at the verifier layer. (blog.alignedlayer.com)

Trade‑off:

You inherit EigenLayer‑style crypto‑economic security (with slashing now live) rather than direct L1 cryptographic verification for each proof. Many use cases (e.g., feeds, oracles, “soft finality” flows) are fine with this; others (e.g., settlement‑critical rollups) may still want a periodically‑posted recursive aggregated proof for hard finality. (coindesk.com)

When to prefer it:

You want the absolute lowest per‑proof gas cost and best latency, and crypto‑economic guarantees suffice.
You can optionally combine with recursive aggregation (Aligned’s Aggregation Service) when batches need L1 finality. (blog.alignedlayer.com)

3) “Do nothing clever” (verify N Groth16 proofs one by one)

Only reasonable if:

N is very small and you can keep public inputs to 1–2 elements.
Otherwise, you’re paying ~200k–230k gas per proof plus ~6–7k per public input—quickly exceeding block gas budgets for modest N. (medium.com)

Concrete cost picture (with up‑to‑date numbers)

BN254 Groth16 verification on Ethereum:
- Pairings: 34,000·k + 45,000 gas (EIP‑1108). Common verifiers use 4 pairings → 181k gas just for pairings. (eips.ethereum.org)
- MSM over public inputs: ≈6,150–7,160 gas per input.
- Typical end‑to‑end: ~207–230k gas for 2–3 public inputs; add ~7k per additional input. (medium.com)
BLS12‑381 Groth16 after Pectra:
- Pairings: 32,600·k + 37,700 gas; for 4 pairings, ~168k gas; for 3 pairings, ~135.5k. Calldata is larger per point (64‑byte limbs), so total cost depends on your encoding and public input strategy. (eips.ethereum.org)
Aggregation services:
- Recursive aggregation (Nebra UPA): ~300k gas per aggregated proof + small per‑proof overhead; ~10×+ savings at modest batch sizes; supports Groth16 today. (docs.nebra.one)
- Verification layer (Aligned): ~2100 gas/proof at current batch rates; aggregated BLS signature ~113k gas; supports gnark Groth16 BN254 among others. (blog.alignedlayer.com)
Calldata got more expensive for data‑heavy transactions (EIP‑7623), which makes posting many raw proofs even less attractive vs aggregation. (eips.ethereum.org)

Implementation playbooks (what to do this quarter)

A) You already produce BN254 Groth16 (Circom/gnark) and need L1 finality

Integrate a recursive aggregator (e.g., Nebra):
1. Register your VK once; get a circuit/app ID.
2. Submit proofs off‑chain; watch for “verified” events.
3. In your app contract, read the aggregated verification result (mapping by proof ID) and proceed. (docs.nebra.one)
Engineer for batching: decide batch windows by SLA (e.g., 10–60s) to amortize gas without breaching latency SLOs.

What to expect: ~300–350k gas per aggregation + small per‑proof overhead. If you would have paid ~220k × N, aggregation already breaks even at N = 2, even with ~7k gas of per‑proof overhead (350k + 7k × 2 = 364k vs 440k). (docs.nebra.one)
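Step 3 of the integration above can be sketched as follows. This is a minimal illustration, not Nebra's actual API: the IProofAggregator interface, the isVerified function, and the proof ID scheme (a hash of circuit ID and public inputs) are all assumptions, so check the aggregator's documentation for the real names and ID derivation.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical interface; real aggregators expose their own names and ID schemes.
interface IProofAggregator {
    function isVerified(bytes32 proofId) external view returns (bool);
}

contract AggregatedSettlement {
    IProofAggregator public immutable aggregator;
    bytes32 public immutable circuitId; // assumed: ID obtained when the VK was registered
    mapping(bytes32 => bool) public consumed; // replay protection per proof ID

    event Settled(bytes32 indexed proofId);

    constructor(IProofAggregator _aggregator, bytes32 _circuitId) {
        aggregator = _aggregator;
        circuitId = _circuitId;
    }

    function settle(uint256[] calldata publicInputs) external {
        // Assumed ID scheme: hash of the circuit ID and the public inputs.
        bytes32 proofId = keccak256(abi.encode(circuitId, publicInputs));
        require(!consumed[proofId], "already settled");
        // No pairing check here: the aggregated proof was verified once for the batch.
        require(aggregator.isVerified(proofId), "not verified");
        consumed[proofId] = true;
        emit Settled(proofId);
        // ... application logic keyed on publicInputs goes here ...
    }
}
```

The app contract pays for one external view call and one storage write per proof instead of a full pairing check, which is where the small per‑proof overhead comes from.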

B) You’re throughput/latency bound and can accept crypto‑economic guarantees

Use a verification layer (Aligned):
- Supports gnark Groth16 (BN254) and more; verify thousands of proofs/sec natively; Ethereum sees an aggregated BLS signature (~113k gas) and per‑proof charge ~2100 gas at current parameters. (blog.alignedlayer.com)
- Optionally route the same proofs later into an Aggregation Service to get a single recursive proof posted to L1 for hard finality. (blog.alignedlayer.com)
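One common way to consume a batched attestation on‑chain is to check that a proof's commitment is included in a batch root the operators have signed. The sketch below assumes that design; IBatchRegistry, isBatchAttested, and the sorted‑pair Merkle hashing are hypothetical stand‑ins, not Aligned's actual contract interface.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical registry that records batch roots covered by a valid BLS attestation.
interface IBatchRegistry {
    function isBatchAttested(bytes32 batchRoot) external view returns (bool);
}

contract AttestedProofConsumer {
    IBatchRegistry public immutable registry;

    constructor(IBatchRegistry _registry) {
        registry = _registry;
    }

    /// leaf is assumed to commit to the proof, VK ID, and public-input digest.
    function isProofAttested(
        bytes32 leaf,
        bytes32 batchRoot,
        bytes32[] calldata path
    ) public view returns (bool) {
        if (!registry.isBatchAttested(batchRoot)) return false;
        bytes32 node = leaf;
        for (uint256 i = 0; i < path.length; i++) {
            bytes32 sibling = path[i];
            // Sorted-pair hashing (an assumption): order siblings before hashing.
            node = node < sibling
                ? keccak256(abi.encodePacked(node, sibling))
                : keccak256(abi.encodePacked(sibling, node));
        }
        return node == batchRoot;
    }
}
```

The security of this check rests on the registry only recording roots that the operator set actually signed, which is the crypto‑economic assumption discussed above.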

C) Greenfield circuits (new builds)

Revisit your curve target post‑Pectra:
- BLS12‑381 precompiles are live; pairing gas is lower than BN254, and MSM precompiles exist. If you can generate Groth16 on BLS12‑381 end‑to‑end, L1 verification may now be cheaper and carry a higher security margin (BLS12‑381 targets roughly 128‑bit security, while BN254 is commonly estimated at around 100 bits). Benchmark total gas including calldata before committing. (eips.ethereum.org)
Still plan for aggregation: even with BLS12‑381, verifying N individual proofs scales linearly; aggregation is your lever for 10×+ savings.
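The calldata side of that benchmark comes from the EIP‑2537 encoding: each base‑field element is 64 bytes, big‑endian, with the top 16 bytes zero, so a G1 point takes 128 bytes. A minimal sketch of that layout, assuming the 48‑byte coordinate is supplied as a high 16‑byte part and a low 32‑byte part (the library and function names are illustrative):

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

library Bls12381Encoding {
    /// Encodes one Fp element as 64 bytes: 16 zero bytes, then the 48-byte value.
    /// hi holds the top 16 bytes of the value, lo the low 32 bytes.
    /// Does not check that the value is below the field modulus p.
    function encodeFp(uint128 hi, uint256 lo) internal pure returns (bytes memory) {
        // abi.encode left-pads hi to a full 32-byte word, giving the zero prefix.
        return abi.encode(uint256(hi), lo);
    }

    /// Encodes a G1 point as x || y, 128 bytes in total.
    function encodeG1(
        uint128 xHi,
        uint256 xLo,
        uint128 yHi,
        uint256 yLo
    ) internal pure returns (bytes memory) {
        return abi.encode(uint256(xHi), xLo, uint256(yHi), yLo);
    }
}
```

A BN254 G1 point is 64 bytes in the existing precompiles, so each G1 point costs twice the calldata on BLS12‑381. This is the overhead to weigh against the cheaper pairings.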

D) Cut your per‑proof gas today (with or without aggregation)

Minimize public inputs: hash/pack all public signals into one field element and pass that as the sole public input. Rule‑of‑thumb: ~6–7k gas saved per input removed. Example pattern: compute Poseidon/Keccak inside the circuit; verify only the digest on‑chain. (medium.com)
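Check your pairing count: the Groth16 equation has three proof‑dependent pairings plus the constant e(α, β), and some templates use 4 for simplicity. Audited 3‑pairing verifiers save ~34k gas on BN254, but the EVM pairing precompile only checks whether a product of pairings equals the identity, so the constant term cannot simply be dropped. Validate any such verifier against your generator and security checks before shipping. (xn--2-umb.com)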
Export optimized verifiers: gnark/snarkjs can export Solidity verifiers; keep VK points immutable/constant where possible to avoid SLOAD costs. (docs.gnark.consensys.io)
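The public‑input packing described in this list can be sketched on the contract side as follows. This is one possible scheme under stated assumptions: the three signals (user, amount, stateRoot) are hypothetical, Keccak is assumed to be the in‑circuit hash, and the circuit must apply the same 253‑bit truncation for the digests to match.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

library PublicInputPacking {
    /// Packs several application values into one BN254 field element.
    /// Keeping the top 253 bits of the hash guarantees the result is below
    /// the scalar field modulus r, which lies between 2^253 and 2^254.
    function packSignals(
        address user,      // hypothetical signal
        uint256 amount,    // hypothetical signal
        bytes32 stateRoot  // hypothetical signal
    ) internal pure returns (uint256) {
        bytes32 h = keccak256(abi.encode(user, amount, stateRoot));
        return uint256(h) >> 3;
    }
}
```

The contract recomputes the digest from values it already trusts and passes it as the only public input, so the verifier does one scalar multiplication instead of three. Keccak is cheap on‑chain but expensive in‑circuit; a Poseidon digest reverses that trade‑off.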

Practical, numbers‑first example

Assume you settle 256 Groth16 proofs every 60 seconds; each has 2 public inputs.

Naïve L1 verify (BN254):
- ~220k gas/proof × 256 ≈ 56.3M gas per batch → exceeds a 45M gas block; would span multiple blocks and inflate fees/latency. (medium.com)
Recursive aggregation (Nebra‑style):
- ~300k gas base for the aggregated proof + ~7k/proof metadata ≈ 300k + 1.8M = 2.1M gas. Fits comfortably in one 45M‑gas block with roughly 21× headroom, and uses about 27× less gas than the naïve path; reduces your fee exposure and confirmation time. (docs.nebra.one)
Verification layer (Aligned):
- On‑chain: one aggregated BLS check (~113k gas) covers the entire batch; off‑chain operators verified all 256 proofs. Your “per‑proof” cost is ~2100 gas at current batch rates with milliseconds‑level verification latency before the attestation is posted. (blog.alignedlayer.com)
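The comparison above can be re‑run for other batch sizes with a small model. The constants below are the article's round numbers (220k per naïve verify, 300k aggregation base, 7k per‑proof metadata), not measurements, and the library name is illustrative.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Rough batch-level gas model using the round numbers from the example.
library BatchGasModel {
    uint256 internal constant NAIVE_PER_PROOF = 220_000; // one BN254 verify
    uint256 internal constant AGG_BASE = 300_000;        // aggregated proof verify
    uint256 internal constant AGG_PER_PROOF = 7_000;     // per-proof metadata

    /// Gas to verify n proofs one by one.
    function naiveGas(uint256 n) internal pure returns (uint256) {
        return NAIVE_PER_PROOF * n;
    }

    /// Gas to verify one aggregated proof covering n proofs.
    function aggregatedGas(uint256 n) internal pure returns (uint256) {
        return AGG_BASE + AGG_PER_PROOF * n;
    }

    /// True once aggregation is strictly cheaper than verifying individually.
    function aggregationWins(uint256 n) internal pure returns (bool) {
        return aggregatedGas(n) < naiveGas(n);
    }
}
```

For n = 256 this gives 56,320,000 gas naïve versus 2,092,000 aggregated. aggregationWins is false at n = 1 (307,000 vs 220,000) and true from n = 2 onward (314,000 vs 440,000).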

Security and correctness checklist (don’t skip this)

Field‑range checks for public inputs: ensure every public signal is strictly less than the scalar field modulus r, so that x and x + r cannot alias to the same input. Old templates and some libraries had known issues; use the latest generators and add explicit checks if needed. (security.snyk.io)
Domain separation: bind your proof to a circuit ID, deployment chain ID, and application context to prevent cross‑circuit replay.
Immutable VK and circuits: pin VK parameters; if you need upgradeability, gate it behind timelocks and explicit governance with clear migration paths.
Aggregation‑specific:
- For verification layers: ensure the attested statement includes all necessary binding (proof IDs, VK IDs, public input digests, and your app domain).
- For recursive aggregation: audit the outer circuit’s verifier gadget and transcript; recursion bugs can break soundness.
Post‑Pectra encoding: BLS12‑381 precompiles expect 64‑byte limbs; verify ABI packing and endianness; mismatches are a common source of reverts. (eips.ethereum.org)
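The first two checklist items can be sketched together. This is a minimal illustration, assuming a BN254 verifier and a Keccak‑based binding that the circuit mirrors; the contract name, circuitId, and payloadHash are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract DomainBoundInputs {
    // BN254 scalar field modulus r.
    uint256 internal constant R =
        21888242871839275222246405745257275088548364400416034343698204186575808495617;

    bytes32 public immutable circuitId; // hypothetical per-circuit identifier

    constructor(bytes32 _circuitId) {
        circuitId = _circuitId;
    }

    /// Reverts unless every public input is a canonical field element (< r).
    function requireInField(uint256[] calldata inputs) public pure {
        for (uint256 i = 0; i < inputs.length; i++) {
            require(inputs[i] < R, "public input >= r");
        }
    }

    /// Binds an application payload to this chain, this contract, and this circuit.
    /// The top 253 bits of the hash are kept so the result is always below r.
    function boundDigest(bytes32 payloadHash) public view returns (uint256) {
        bytes32 h = keccak256(
            abi.encode(block.chainid, address(this), circuitId, payloadHash)
        );
        return uint256(h) >> 3;
    }
}
```

A proof generated for another chain, deployment, or circuit then produces a different digest and fails verification, even if the underlying payload is identical.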

Engineering tips we use in client projects

Pack/commit public inputs:
- Inside the circuit, compute h = Hash(publicSignals); expose only h as public. On‑chain, expect one uint256: the scalar fields of both BN254 and BLS12‑381 fit in 256 bits, and the 64‑byte limb encoding applies only to BLS12‑381 point coordinates. You save ~6–7k gas per original input. (medium.com)
Precompute calldata off‑chain:
- Use library helpers to produce verifier‑ready calldata; avoid loops that push array elements one‑by‑one.
Keep verifier calls pure and small:
- Avoid storage writes during verification; check and return a boolean; gate follow‑up writes behind branching to skip work on invalid proofs.
Parameterize batch windows:
- For aggregators, expose a config to adjust batch size/time window without redeploying contracts; tune for your SLA and gas market conditions.

Minimal Solidity scaffold (packing public inputs to a single field element and calling a verifier):

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Pseudo-interface; replace with the signature your generator exports.
interface Groth16Verifier {
    function verifyProof(
        uint256[2] calldata a,
        uint256[2][2] calldata b,
        uint256[2] calldata c,
        uint256[] calldata publicInputs
    ) external view returns (bool);
}

contract UsesProof {
    // BN254 scalar field modulus r; every public input must be below it.
    uint256 internal constant R =
        21888242871839275222246405745257275088548364400416034343698204186575808495617;

    // Immutable: fixed at deployment, so reads avoid SLOAD costs.
    Groth16Verifier public immutable verifier;
    bytes32 public immutable expectedDigest; // computed off-chain

    constructor(address _verifier, bytes32 _expectedDigest) {
        verifier = Groth16Verifier(_verifier);
        expectedDigest = _expectedDigest;
    }

    // Only one public input: the digest, already reduced into the field off-chain.
    function submit(
        uint256[2] calldata a,
        uint256[2][2] calldata b,
        uint256[2] calldata c,
        bytes32 digest
    ) external view returns (bool ok) {
        require(digest == expectedDigest, "bad digest");
        uint256 x = uint256(digest);
        // Explicit field-range check: rejects values that would alias mod r.
        require(x < R, "input not in field");
        uint256[] memory inputs = new uint256[](1);
        inputs[0] = x;
        ok = verifier.verifyProof(a, b, c, inputs);
    }
}

Tie this into an aggregator by storing proof IDs and checking the aggregator’s “verified” mapping instead of calling the verifier directly.

Emerging practices to watch

BLS12‑381 Groth16 end‑to‑end: with EIP‑2537 live, expect more teams to target BLS12‑381 for new circuits. Pairing gas is lower, MSMs have native precompiles, and security margin is higher than BN254. Your total cost will hinge on calldata vs compute; benchmark for your circuit sizes. (eips.ethereum.org)
Calldata economics are tightening: EIP‑7623 already increased costs for data‑heavy txs; proposals like EIP‑7976 suggest further floors. Aggregation becomes even more valuable as calldata gets pricier. (eips.ethereum.org)
Unified flows: many teams combine both modes—fast, cheap verification via a verification layer for UX, plus periodic recursive aggregated proofs for L1 hard finality (e.g., every M minutes or after K proofs). (blog.alignedlayer.com)

Bottom line (what we recommend at 7Block Labs)

If you need L1 finality for many Groth16 proofs: adopt recursive aggregation (e.g., Nebra UPA). You’ll land in the ~300k gas per batch regime and scale cleanly. (docs.nebra.one)
If you’re throughput/latency constrained and can accept crypto‑economic guarantees: use a verification layer (e.g., Aligned) for 90–99% savings and milliseconds‑level verification, optionally complemented by periodic recursive proofs. (blog.alignedlayer.com)
For new builds: consider generating Groth16 proofs over BLS12‑381 and verify with EIP‑2537 precompiles; aggregation still wins at scale. (blog.ethereum.org)
Regardless of path: minimize public inputs, enforce field‑range checks, and domain‑separate everything. These are low‑effort, high‑impact savings and risk reducers. (medium.com)

If you’d like a numbers‑backed feasibility study for your circuit sizes and traffic profile, we’ll model the gas/latency trade‑offs across BN254 vs BLS12‑381, recursive aggregation vs verification layers, and ship a hardened verifier/aggregator integration plan within two sprints.
