What Kind of Throughput Benchmarks Should I Expect From Leading Batch Verification SDKs Validating Thousands of Proofs Per Block?

Decision‑makers: this post turns block‑level capacity and SDK‑level microbenchmarks into concrete throughput you can actually budget for on Ethereum and EVM L2s, with hard numbers, emerging practices, and pitfalls to avoid. If you need to validate hundreds to thousands of proofs per block, here is what is feasible today and what to change to make it feasible tomorrow.

TL;DR

On Ethereum today, naïvely verifying Groth16 proofs tops out at roughly 150–225 proofs per 45M‑gas block; with aggregation or recursion, you compress thousands of proofs to a single ~200–300k‑gas verification. Benchmarked SDKs/libraries (arkworks, gnark‑crypto) sustain hundreds to thousands of verifications per second off‑chain, and specialized verification layers (Aligned, zkVerify) are already pushing >1,000 proofs/sec. (theblock.co)

1) The on‑chain ceiling you’re budgeting against

Before we talk SDK throughput, fix your mental model of what actually fits in an Ethereum block.

Current L1 block gas budget: Ethereum validators raised the block gas limit to 45M in July 2025 (from ~36M earlier that year). That’s your rough “per 12 seconds” compute envelope on mainnet today. (theblock.co)
Pairing precompiles and costs (L1):
- BN254 pairing (alt_bn128): 34,000·k + 45,000 gas (EIP‑1108). Most Groth16 L1 verifiers target BN254 because it’s natively precompiled. (eips.ethereum.org)
- BLS12‑381 precompiles (EIP‑2537) shipped in Pectra (May 7, 2025). Pairing check cost: 32,600·k + 37,700 gas. This is now live on mainnet alongside KZG’s point‑evaluation precompile from EIP‑4844. (blog.ethereum.org)
- KZG point evaluation precompile (EIP‑4844, 0x0A): fixed 50,000 gas per verification. (eips.ethereum.org)
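The precompile prices above are simple linear formulas, so they are easy to encode once and reuse. A minimal sketch, using only the constants quoted from EIP‑1108, EIP‑2537, and EIP‑4844; the function and constant names are illustrative, not part of any SDK:

```python
# Gas formulas for the pairing-related precompiles listed above.
# Constants are the ones quoted in the text; names are illustrative only.

KZG_POINT_EVAL_GAS = 50_000  # EIP-4844 point evaluation, fixed cost per call


def bn254_pairing_gas(k: int) -> int:
    """Gas for one BN254 pairing check over k pairs (EIP-1108)."""
    return 34_000 * k + 45_000


def bls12_381_pairing_gas(k: int) -> int:
    """Gas for one BLS12-381 pairing check over k pairs (EIP-2537)."""
    return 32_600 * k + 37_700


if __name__ == "__main__":
    for k in (2, 4):
        print(k, bn254_pairing_gas(k), bls12_381_pairing_gas(k))
```

These figures cover the precompile call only; calldata, memory expansion, and the surrounding contract logic come on top.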

What this implies for raw per‑block proof counts (rule‑of‑thumb):

Groth16 on BN254 (typical verifier ~200–230k gas depending on public inputs): 45,000,000 / 220,000 ≈ 204 proofs per block, ignoring calldata and other overhead. Horizen Labs’ breakdown pegs ~207,700 gas fixed plus ~7,160 per public input (about 222k gas with two public inputs), which tracks with field experience. (medium.com)
BLS signature verifies (often 2 pairings): 37,700 + 2×32,600 = 102,900 gas → ~437 verifications per 45M‑gas block, before calldata and logic. Aggregating BLS signatures keeps this constant even as signer count grows. (eips.ethereum.org)
KZG (EIP‑4844) point checks (50k each): Max theoretical ≈ 900 per block, again ignoring calldata and app logic. (eips.ethereum.org)
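The per‑block ceilings above are plain integer division of the block gas limit by the per‑check cost. A rough sketch of that arithmetic, assuming the 45M gas limit and the Horizen Labs cost model quoted in the text (fixed ~207,700 gas plus ~7,160 per public input); helper names are hypothetical, and calldata and app logic are ignored, as in the prose:

```python
BLOCK_GAS_LIMIT = 45_000_000  # L1 block gas limit cited in the text


def groth16_verify_gas(public_inputs: int,
                       fixed: int = 207_700,
                       per_input: int = 7_160) -> int:
    """Approximate Groth16 (BN254) verifier gas from the quoted breakdown."""
    return fixed + per_input * public_inputs


def max_per_block(gas_per_check: int, block_gas: int = BLOCK_GAS_LIMIT) -> int:
    """Upper bound on checks per block, ignoring calldata and other overhead."""
    return block_gas // gas_per_check


if __name__ == "__main__":
    print("Groth16, 2 public inputs:", max_per_block(groth16_verify_gas(2)))
    print("BLS 2-pairing verify:    ", max_per_block(37_700 + 2 * 32_600))
    print("KZG point evaluation:    ", max_per_block(50_000))
```

Treat the results as ceilings: a block is never entirely yours, so real budgets should assume a fraction of these counts.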

Takeaway: on L1, “thousands per block” requires you to compress many proofs to a single verification (aggregation/recursion) or shift verification off‑chain to a specialized layer and attest on‑chain.

2) What off‑chain batch verification SDKs actually deliver (CPU throughput you can trust)

Under the hood, proof verification bottlenecks are dominated by pairings and small multi‑scalar multiplications (MSMs). Good libraries running on modern CPUs hit sub‑millisecond pairings, which translates to hundreds of Groth16 verifies per second per core before batching optimizations.

Real pairing timings (BN254) from a cross‑library benchmark:
- gnark‑crypto: full pairing ≈ 0.589 ms (Miller loop 0.2795 ms + final exponentiation 0.3094 ms).
- herumi/mcl: ≈ 0.609 ms.
- Others range 0.90–1.47 ms. (hackmd.io)
Back‑of‑the‑envelope from those timings:
- A Groth16 verify uses 3 pairings plus light MSM/field ops. Baseline ≈ 3×0.589 ≈ 1.77 ms, or ~565 verifications/sec on a single core before overheads. On a 16‑core server that is ~9,000 verifies/sec at perfect scaling; budget for the ~5–8k range once per‑proof overheads are included. This is consistent with production experience when memory and I/O are not the bottleneck. (Derived from gnark‑crypto’s pairing timings; the math is ours.) (hackmd.io)
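The back‑of‑the‑envelope conversion from pairing time to verifies per second can be written down directly. A minimal sketch using the gnark‑crypto timings quoted above; the `efficiency` factor is our own assumption for modelling scheduling and I/O losses, not a measured value:

```python
MILLER_LOOP_MS = 0.2795   # quoted gnark-crypto Miller loop time (BN254)
FINAL_EXP_MS = 0.3094     # quoted gnark-crypto final exponentiation time
PAIRING_MS = MILLER_LOOP_MS + FINAL_EXP_MS  # about 0.589 ms per full pairing


def verifies_per_sec(pairings_per_proof: int = 3,
                     pairing_ms: float = PAIRING_MS,
                     cores: int = 1,
                     efficiency: float = 1.0) -> float:
    """Estimated verifications/sec, counting pairing time only.

    efficiency (0..1) is a hypothetical knob for multi-core scaling losses.
    """
    per_proof_ms = pairings_per_proof * pairing_ms
    return cores * efficiency * 1000.0 / per_proof_ms


if __name__ == "__main__":
    print(round(verifies_per_sec()))                           # single core
    print(round(verifies_per_sec(cores=16)))                   # ideal scaling
    print(round(verifies_per_sec(cores=16, efficiency=0.6)))   # pessimistic
```

With an efficiency between roughly 0.6 and 0.85, a 16‑core box lands in the 5–8k verifies/sec band; measure your own value rather than trusting the placeholder.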
Why real batch verification speeds you up:
- Batch verification combines the Miller loops from many proofs and pays one final exponentiation, saving roughly 40–50% for large batches. Using the gnark‑crypto breakdown above, asymptotically you approach (Miller share)/(Miller+FinalExp) ≈ 0.2795 / 0.589 ≈ 47% of the cost of verifying one by one, so the saving tops out near 53% before the extra work of combining proofs is counted. Arkworks exposes this explicitly via groth16::batch APIs. Expect ≈2× wall‑clock improvement for big batches on one machine, on top of multi‑core parallelization. (hackmd.io)
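To see how quickly the batch saving converges, the cost model behind the 47% figure can be sketched as follows. It is a deliberately simplified model that mirrors the prose: every pairing pays a Miller loop, the one‑by‑one path pays a final exponentiation per pairing, and the batched path pays it once. It ignores the scalar multiplications needed to randomize and combine proofs, so real gains will be somewhat smaller:

```python
def batch_cost_ratio(n_pairings: int,
                     miller_ms: float = 0.2795,
                     final_exp_ms: float = 0.3094) -> float:
    """Batched pairing cost divided by one-by-one pairing cost.

    Simplified model: one shared final exponentiation for the whole batch.
    """
    if n_pairings < 1:
        raise ValueError("n_pairings must be >= 1")
    one_by_one = n_pairings * (miller_ms + final_exp_ms)
    batched = n_pairings * miller_ms + final_exp_ms
    return batched / one_by_one


if __name__ == "__main__":
    # 3 pairings per proof, as in the Groth16 estimate above.
    for proofs in (1, 10, 100, 1000):
        print(proofs, round(batch_cost_ratio(3 * proofs), 3))
```

Under this model the ratio drops from about 0.65 for a single 3‑pairing proof toward the 0.47 asymptote within a few dozen proofs, which is why very large batches add little beyond that point.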
Library/SDK notes:
- arkworks (Rust): production‑grade Groth16 verifier with documented batch API. Good fit for Rust services ingesting many proofs/sec. (docs.rs)
- gnark + gnark‑crypto (Go): fast pairings/MSM; straight‑through verifiers, Solidity exporter for BN254 Groth16; GPU acceleration hooks for provers exist but not required for verification. (docs.gnark.consensys.io)
- Important caveat: avoid ad‑hoc protocol “extensions.” Zellic reported vulnerabilities in a non‑standard Groth16 extension used for batch flows until patched—stick to upstream, audited code paths. (zellic.io)

3) Aggregation and recursion are how you scale to “thousands per block”

Batch verification alone makes off‑chain services fast; on‑chain, you must compress into one cheap check.

SnarkPack (Groth16 aggregation): aggregates 8,192 proofs in 8.7 s and verifies the aggregate in 163 ms on commodity hardware. Verifier and proof size are O(log n). This is the simplest path to compress thousands of Groth16s into one verifier call per block. (eprint.iacr.org)
STARK/PLONK families: use recursion to wrap many leaf proofs into a single SNARK checked on‑chain. Modern stacks expose ready verifiers:
- Succinct SP1 zkVM: EVM verification ~275–300k gas per wrapped proof (deployed on Ethereum and major L2s). (succinct.xyz)
- RISC Zero (Bonsai): STARK→SNARK wrap; reported on‑chain verify around ~245k–300k gas in real deployments. (chaincatcher.com)
KZG world post‑4844: if your checks are “is this KZG commitment consistent with value y at point z,” each evaluation is a fixed 50k gas, so you either:
- Verify up to ~900 evaluations per block directly, or
- SNARK‑wrap the batch and pay ~200–300k once (often better when bundling many logical checks and app logic). (eips.ethereum.org)
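The choice between direct KZG evaluations and a SNARK wrap has a simple on‑chain break‑even. A small sketch, assuming the 50k gas per evaluation and the ~200–300k wrapped verify quoted above; it compares on‑chain gas only and leaves out the off‑chain proving cost of the wrap, which is usually the real deciding factor:

```python
KZG_EVAL_GAS = 50_000  # fixed cost per point evaluation (EIP-4844)


def direct_gas(n_evals: int) -> int:
    """On-chain gas for n separate point-evaluation precompile calls."""
    return n_evals * KZG_EVAL_GAS


def break_even_evals(wrapped_verify_gas: int) -> int:
    """Smallest n at which n direct evaluations cost more than one wrapped verify."""
    return wrapped_verify_gas // KZG_EVAL_GAS + 1


if __name__ == "__main__":
    for wrapped in (200_000, 300_000):
        n = break_even_evals(wrapped)
        print(wrapped, n, direct_gas(n))
```

On gas alone, wrapping wins after only a handful of evaluations; the ~900 per block ceiling matters mainly when each evaluation must be individually visible on‑chain.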
BLS12‑381 on L1 since Pectra: this matters for both KZG‑ish workflows and BLS signature aggregation. A 2‑pairing check is ~102,900 gas on BLS12‑381 vs ~113,000 on BN254, with higher security margins. Plan to standardize new on‑chain verification flows on BLS12‑381 unless a specific proof system forces BN254. Note: despite BLS12‑381 being live on L1, Groth16 over BLS12‑381 isn’t recommended—Groth16’s field requirements make BN254/BLS12‑377/BW6‑761 the usual picks. (eips.ethereum.org)

4) Concrete “proofs per block” scenarios you can explain to finance

Use these when setting OKRs or writing the PRD.

Scenario A — Naïve Groth16 on L1 (no aggregation):
- Budget ~220k gas per proof → ~200 proofs in a 45M‑gas block, best case. Any nontrivial calldata or app logic reduces that. If someone asks for “1,000 proofs per block on L1,” the answer is “not without aggregation/recursion.” (theblock.co)
Scenario B — Aggregated Groth16 (SnarkPack) on L1:
- Off‑chain: aggregate 5,000 proofs; SnarkPack benchmarks suggest O(log n) verify time, and the cited 8,192→163 ms verification provides a strong anchor.
- On‑chain: single Groth16 verify ~200–250k gas. Net: “thousands per block” is feasible with one transaction; your SLA becomes the off‑chain aggregation time (seconds). (eprint.iacr.org)
Scenario C — BLS signature aggregation (committees/oracles):
- Aggregate off‑chain to a single signature; verify on‑chain with 2 pairings (~102,900 gas). Scaling signers from 10 to 10,000 doesn’t change the on‑chain verifier cost—only calldata for the message/metadata moves. (eips.ethereum.org)
Scenario D — KZG point evaluations (post‑4844):
- If you truly need distinct on‑chain evaluations, ~900 checks per block is the ceiling at 50k gas each; otherwise SNARK‑wrap the entire batch to a single ~200–300k gas verify. (eips.ethereum.org)
Scenario E — zkVM receipts (SP1 / RISC0):
- One wrapped receipt ~275–300k gas; you can roll up many leaf proofs into one receipt via recursion.
- Plan one verify per batch per block (or per several blocks) rather than many discrete verifies. (succinct.xyz)
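For finance conversations, the scenarios above reduce to one number: amortized gas per proof. A minimal sketch using the post's own figures (about 220k gas for a naive Groth16 verify, about 250k for one aggregated or wrapped verify); `calldata_gas` is a hypothetical placeholder for whatever your batch commitment costs to post:

```python
def amortized_gas_per_proof(batch_size: int,
                            verify_gas: int = 250_000,
                            calldata_gas: int = 0) -> float:
    """On-chain gas per leaf proof when one verify covers the whole batch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return (verify_gas + calldata_gas) / batch_size


if __name__ == "__main__":
    naive = 220_000  # Scenario A: one verify per proof
    for batch in (100, 1_000, 5_000):
        per_proof = amortized_gas_per_proof(batch)
        print(batch, per_proof, round(naive / per_proof))
```

The last column is the reduction factor versus Scenario A; at 5,000 proofs per batch the on‑chain share is about 50 gas per proof, so off‑chain aggregation time dominates the budget.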

5) SDK‑level, system‑level, and chain‑level knobs that change your throughput by 10×+

Practical levers we deploy for clients to get from “hundreds” to “thousands.”

Keep public inputs small. Verification gas grows per public input (e.g., +~7,160 gas/input for a common Groth16 verifier implementation). Push data into commitments and keep on‑chain public IO minimal. (medium.com)
Prefer batch APIs and multi‑pairing internally.
- arkworks’ groth16::batch and gnark‑crypto multi‑pairing cuts ~40–50% off large batches because you pay one final exponentiation across the batch. Pin your perf goals to pairing timing data, not just end‑to‑end “verify()” microbenches. (docs.rs)
Standardize on BLS12‑381 for new on‑chain cryptography on L1.
- Post‑Pectra, BLS12‑381 precompiles exist for MSMs and pairings, and pairing gas/pair is slightly cheaper than BN254 while offering ~128‑bit security vs ~80‑bit for BN254. Use BLS12‑381 for signatures/KZG‑adjacent verification flows; keep Groth16 on BN254/BLS12‑377/BW6‑761 as appropriate. (blog.ethereum.org)
Use recursion trees for streaming workloads; use SnarkPack‑style aggregation for “many independent leaf proofs.”
- Recursion keeps per‑batch on‑chain costs constant and lets you trade latency for larger batch sizes; aggregation is great when leaf statements are heterogeneous and you want O(log n) verification. (eprint.iacr.org)
Consider specialized verification layers if you truly need sustained 1k–10k proofs/sec.
- Aligned (EigenLayer AVS) reports >1,000 proofs/sec in fast mode and targets ~40k gas amortized per proof at reasonable batch sizes when posting results to Ethereum. zkVerify (Substrate L1) has already processed 1M+ testnet verifications and supports Groth16/PLONK/SP1/RISC0/Plonky2. These cut your on‑chain verification costs and move throughput limits off the EVM. (blog.alignedlayer.com)
Validate your library choices with real pairings/MSM microbenchmarks.
- gnark‑crypto and herumi/mcl are both fast on BN254; per‑pairing timings in the 0.6–0.9 ms range are realistic anchors for capacity planning. (hackmd.io)
Don’t rely on assumptions about precompile gas across chains.
- EIP‑1108 costs are Ethereum‑specific; some L2s/L3s diverge. If you hardcode pairing precompile gas, you risk DoS on chains that price differently (e.g., zkSync’s historical deviations). Fetch or configure per‑chain values. (infsec.io)
Security hygiene for batch flows.
- Do not mix protocol variants ad‑hoc. Audit reports have found subtle bugs in non‑standard Groth16 “extensions” used in batched verifiers; stick to upstream primitives or well‑reviewed aggregators. (zellic.io)
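The advice above about not hardcoding precompile gas across chains can be sketched as a per‑chain table that fails loudly for unknown chains. Only the Ethereum mainnet entry (chain ID 1, EIP‑1108 values) comes from the text; the table layout, names, and lookup function are hypothetical, and entries for other chains must come from each chain's own documentation or on‑chain measurement:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairingGas:
    """Linear pricing for a pairing precompile: base + per_pair * k."""
    base: int
    per_pair: int

    def cost(self, k: int) -> int:
        return self.base + self.per_pair * k


# chain_id -> BN254 pairing pricing. Deliberately sparse: add a chain only
# after confirming its actual pricing, never by copying mainnet values.
BN254_PAIRING_GAS = {
    1: PairingGas(base=45_000, per_pair=34_000),  # Ethereum mainnet, EIP-1108
}


def pairing_gas(chain_id: int, k: int, table=BN254_PAIRING_GAS) -> int:
    """Look up pairing gas for a chain; refuse to guess for unknown chains."""
    params = table.get(chain_id)
    if params is None:
        raise LookupError(f"no BN254 pairing gas configured for chain {chain_id}")
    return params.cost(k)


if __name__ == "__main__":
    print(pairing_gas(1, 4))
```

Raising on an unknown chain is the design choice here: a missing entry becomes a deployment error instead of an under‑funded call that reverts in production.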

6) Worked numbers you can drop into your spreadsheet

Off‑chain batch verification capacity (one 16‑core CPU VM):
- From 0.589 ms/pairing, expect ≈565 Groth16 verifies/sec/core → ~9,000/sec across 16 cores before batching wins; with batch multi‑pairing, 2× speed‑ups are common → ~18k/sec. This is compute‑bound and assumes hot caches, pinned cores, and sane memory. (We derived these using gnark’s pairing timings; validate on your hardware.) (hackmd.io)
L1 “how many proofs per block” sanity checks at 45M gas:
- Groth16 BN254, 2 public inputs: ≈220k gas → ~204 proofs block‑max (no aggregation). (theblock.co)
- BLS aggregate signature verify: ~103k gas → ~437 verifies per block. (eips.ethereum.org)
- KZG point eval (4844): 50k gas → ~900 checks per block. (eips.ethereum.org)
- Aggregated/recursive SNARK (e.g., SP1 receipt): 275–300k gas → “one per batch per block,” with batch size limited by off‑chain aggregation time, not on‑chain gas. (succinct.xyz)
Time to aggregate thousands of Groth16s with SnarkPack:
- 8,192 proofs aggregated in 8.7 s; verify aggregate in 163 ms. Scaling linearly, 5,000 proofs ≈ 5.3 s of aggregation on a similar machine; on‑chain ≈ single Groth16 verify gas. (eprint.iacr.org)
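The 5,000‑proof estimate is a linear interpolation from the published 8,192‑proof benchmark. A tiny sketch of that assumption; linear scaling is our simplification, and an aggregator that pads batches to a power of two may cost closer to the full 8.7 s, so treat the result as a lower bound until you benchmark:

```python
def estimate_aggregation_s(n_proofs: int,
                           ref_proofs: int = 8_192,
                           ref_seconds: float = 8.7) -> float:
    """Linear extrapolation of aggregation time from one reference benchmark."""
    return ref_seconds * n_proofs / ref_proofs


if __name__ == "__main__":
    for n in (1_000, 5_000, 8_192):
        print(n, round(estimate_aggregation_s(n), 2))
```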

7) Emerging best practices we’re deploying in 2026

BLS12‑381 everywhere you can on L1. Pectra’s EIP‑2537 reduces gas/pair and unlocks native MSM/pairing precompiles; align new verifiers/signature schemes to it. Keep Groth16 on BN254 (or cycle‑friendly pairs like BLS12‑377/BW6‑761) where the protocol requires. (blog.ethereum.org)
Don’t pay the final exponentiation N times. If your SDK exposes multi‑pairing or batch verify, use it; it’s easy 1.5–2× throughput in the verifier hot path. (hackmd.io)
Shrink public inputs; bulk data goes in commitments. On EVM, public inputs are where many verifiers “leak” gas. Use Pedersen/Merkle/KZG to keep on‑chain IO tiny. (medium.com)
Choose aggregation vs recursion by workload:
- Leaf proofs of different statements (heterogeneous): SnarkPack‑style aggregation excels.
- Continuous streams (rollups, oracles): recursion trees smooth latency and amortize costs. (eprint.iacr.org)
Consider external verification layers when throughput matters more than strict L1 minimalism:
- Aligned (>1,000 proofs/sec fast mode) and zkVerify (multi‑prover support, >1M testnet proofs) are viable options today; you post succinct attestations to Ethereum. Budget some integration time but expect order‑of‑magnitude cost/throughput gains. (blog.alignedlayer.com)
Verify your assumptions as Ethereum evolves:
- Block gas moved from ~36M to 45M in 2025 and may move again; EIPs like 4844/2537 change verifier economics materially. Re‑estimate capacity whenever core protocol parameters or fee markets shift. (theblock.co)

8) Quick implementation checklist (copy/paste for your JIRA)

Architecture
- Decide aggregation (SnarkPack/aPlonk/etc.) vs recursion (Halo2‑KZG, zkVM receipts).
- Target BLS12‑381 precompiles for new on‑chain cryptography; keep Groth16 on BN254/BLS12‑377/BW6 as needed. (blog.ethereum.org)
SDKs
- Rust: arkworks groth16::batch for batch verify; profile with release + CPU affinity. (docs.rs)
- Go: gnark/gnark‑crypto; rely on upstream verifiers and exporter; avoid non‑standard extensions. (docs.gnark.consensys.io)
Capacity planning
- Use 0.59 ms/pairing (BN254) as a starting point; multiply by pairings/proof; apply ~2× gain for large batch multi‑pairing. Validate on your hardware. (hackmd.io)
- For on‑chain budgets, compute per‑block throughput with the current gas limit: Groth16 ≈200/block; BLS verify ≈437/block; KZG eval ≈900/block; recursion/aggregation is one verify per batch. (theblock.co)
Safety
- Pin library versions and audit reports; avoid ad‑hoc Groth16 modifications. (zellic.io)
Stretch goals
- Explore specialized verification layers (Aligned, zkVerify) if you need sustained >1,000 proofs/sec and lower amortized gas/verify. (blog.alignedlayer.com)

9) One realistic end‑to‑end example

Goal: validate 5,000 user‑generated Groth16 proofs every 12–24 seconds with an Ethereum L1 attestation.

Off‑chain:
- Accept proofs into a queue; verify locally using arkworks batch API while also aggregating with SnarkPack.
- With pairing ≈0.589 ms and batch multi‑pairing, a 16‑core node comfortably verifies >10k proofs/sec; SnarkPack aggregates the last 5,000 into one aggregate in ~5–6 s on a tuned box (8,192 took 8.7 s in published benchmarks). (hackmd.io)
On‑chain:
- Submit one Groth16 verification per batch (~200–250k gas) → trivial to fit in a 45M‑gas block alongside your app logic. (medium.com)
Latency budget:
- Batch fill (queue) + aggregation (seconds) + one L1 inclusion (12s median): P50 < 20 s with headroom. If latency matters more than absolute gas, switch to shallow recursion and post a receipt every block. (succinct.xyz)
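The latency budget is a sum of three terms, which makes it easy to see how much queue time the 20 s target leaves. A rough sketch, assuming the ~5.3 s aggregation estimate and the 12 s inclusion figure from the text; the 2 s queue fill in the example is a hypothetical input, not a measurement:

```python
def latency_budget_s(queue_fill_s: float,
                     aggregation_s: float,
                     inclusion_s: float = 12.0) -> float:
    """End-to-end latency: batch fill + aggregation + one L1 inclusion."""
    return queue_fill_s + aggregation_s + inclusion_s


def max_queue_fill_s(target_s: float,
                     aggregation_s: float,
                     inclusion_s: float = 12.0) -> float:
    """Queue time left over once aggregation and inclusion are paid for."""
    return target_s - aggregation_s - inclusion_s


if __name__ == "__main__":
    print(latency_budget_s(queue_fill_s=2.0, aggregation_s=5.3))
    print(max_queue_fill_s(target_s=20.0, aggregation_s=5.3))
```

With these inputs the 20 s target leaves under 3 s for filling the batch, so the queue policy, not the cryptography, is the term to watch.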

10) Final cautions

Groth16 on BLS12‑381: don’t. Use BN254/BLS12‑377/BW6‑761; BLS12‑381 doesn’t meet the field properties Groth16 wants. Use BLS12‑381 for signatures/KZG and for SNARKs that target it, now that precompiles exist on mainnet. (docs.gnark.consensys.io)
Don’t bake in gas constants across chains. Even for “standard” precompiles, costs differ on some L2s. Probe or parameterize. (infsec.io)
Ship with upstream verifiers and audited aggregation—avoid DIY variants in production unless you have a formal proof and an audit. (zellic.io)

References (selected)

Gas math and precompiles: EIP‑1108 (BN254), EIP‑2537 (BLS12‑381), EIP‑4844 (KZG point evaluation); Ethereum Pectra activation (May 7, 2025); 45M block gas limit reports. (eips.ethereum.org)
Pairing performance: gnark‑led cross‑library bench; herumi/mcl. (hackmd.io)
Verifier gas breakdowns: Horizen Labs analysis of Groth16 verifier costs. (medium.com)
Aggregation/recursion stacks: SnarkPack; SP1; RISC Zero Bonsai. (eprint.iacr.org)
Specialized verification layers: Aligned (>1,000 proofs/sec fast mode); zkVerify (1M+ testnet verifications, multi‑prover support). (blog.alignedlayer.com)

If you want a custom capacity plan for your circuits and chain mix, we’re happy to benchmark with your proving keys and real public input sizes.