Gas-Efficient Batching of Groth16 Proofs on Ethereum for High-Throughput Rollups: Actual Gas Savings Benchmarks

Why this matters now

Ethereum's 2025-2026 roadmap has changed the economics of on-chain proof verification. Istanbul's EIP-1108 had already cut the cost of the BN254 precompiles; Pectra then added BLS12-381 precompiles (EIP-2537) and a calldata cost floor (EIP-7623). After these changes, Groth16 remains the most cost-effective proof system to verify on L1 for rollups. The best way to batch those proofs, though, depends on your throughput goals, the number of public inputs, and how much latency you can tolerate. (eips.ethereum.org)

The baseline: what a single Groth16 verify actually costs on Ethereum

Pairings dominate verification cost. Since EIP-1108, the BN254 pairing check costs 45,000 gas base plus 34,000 gas per pairing. Most Solidity verifiers perform four pairings, which comes to 181,000 gas for the pairings alone. (eips.ethereum.org)
Public inputs add a linear cost because of the MSM over the verifying key's IC points. BN254 has no MSM precompile, so the usual approach is one ECMUL (6,000 gas) plus one ECADD (150 gas) per input, about 6,150 gas per public input. Adding scaffolding and calldata gives a handy rule of thumb: 207,700 + 7,160 × l gas, where l is the number of public inputs. A two-input proof lands around 222k gas and an eight-input proof around 265k gas. (hackmd.io)
Proof bytes in calldata are small but not free. A BN254 Groth16 proof is 256 bytes (G1: 64, G2: 128, G1: 64), roughly 4,096 gas at 16 gas per non-zero byte. For data-heavy transactions, EIP-7623 can raise the effective calldata price through its floor, which matters when you submit many proofs in a single transaction. (xn--2-umb.com)
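The arithmetic above can be captured in a few lines. This is a minimal sketch that only restates the figures quoted in this section; the function and constant names are illustrative, not taken from any library.

```python
# Back-of-the-envelope gas model for one BN254 Groth16 verification.
# Constants are the figures quoted in the text; names are illustrative.
PAIRING_BASE = 45_000        # EIP-1108 pairing check, fixed part
PAIRING_PER_PAIR = 34_000    # EIP-1108 pairing check, per pairing
ECMUL_GAS = 6_000
ECADD_GAS = 150
PROOF_BYTES = 64 + 128 + 64  # A in G1, B in G2, C in G1
GAS_PER_NONZERO_BYTE = 16    # before any EIP-7623 floor


def pairing_gas(num_pairings: int) -> int:
    """Gas for one pairing-check call with the given number of pairings."""
    return PAIRING_BASE + PAIRING_PER_PAIR * num_pairings


def input_msm_gas(num_inputs: int) -> int:
    """One ECMUL plus one ECADD per public input."""
    return (ECMUL_GAS + ECADD_GAS) * num_inputs


def single_verify_gas(num_inputs: int) -> int:
    """Rule of thumb from the text: 207,700 + 7,160 * l."""
    return 207_700 + 7_160 * num_inputs


for l in (2, 8):
    print(f"l = {l}: ~{single_verify_gas(l):,} gas")
```

The rule of thumb is higher than `pairing_gas(4) + input_msm_gas(l)` because it also folds in scaffolding and calldata, so treat it as a budget figure rather than an exact meter reading.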

What’s Next for BLS12‑381 After Pectra?

BLS12-381 is widely used for its efficiency and its higher security margin than BN254, and it already underpins Ethereum's consensus-layer signatures. Pectra made the curve practical to use from L1 contracts, which changes the cost comparison for on-chain verifiers.

Pectra Overview

Pectra (Prague/Electra) is the Ethereum network upgrade that activated on mainnet in May 2025. Two of its changes matter for proof verification: EIP-2537, which adds BLS12-381 precompiles, and EIP-7623, which sets a floor on calldata cost.

Implications for BLS12-381

After Pectra, the main effects on BLS12-381 usage are:

Performance Boosts:
Pairing checks and multi-scalar multiplications on BLS12-381 now run as precompiles instead of expensive EVM arithmetic.
Increased Adoption:
With native L1 support, more projects and platforms can reasonably adopt BLS12-381.
New Applications:
Contracts that verify BLS12-381 proofs or signatures directly on L1 become practical, including in decentralized finance (DeFi).

Looking Ahead

BLS12-381 is now a realistic choice for the outer curve of an L1 verifier. The concrete numbers follow.

EIP-2537 introduces BLS12-381 precompiles with a cheaper pairing schedule than BN254: 37,700 gas base plus 32,600 gas per pair. Four pairings therefore cost 168,100 gas, against 181,000 on BN254. On the other hand, each base-field element is encoded in 64 bytes, so a BLS12-381 Groth16 proof (G1: 128 bytes, G2: 256 bytes, G1: 128 bytes) totals 512 bytes, double the calldata. For public inputs, use the new G1MSM precompile, whose per-scalar cost falls with the built-in discount table; it is far cheaper than looping over individual multiply-and-add steps in Solidity. (eips.ethereum.org)

Takeaway for a Single Proof:

If you're already using BN254, it’s still a solid choice when you want to keep your calldata minimal.
BLS12‑381 has come into its own: it offers slightly cheaper pairings and a better security margin, plus an MSM precompile. However, it does require more calldata. Be sure to evaluate your specific l and data needs before making the switch. (eips.ethereum.org)
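To compare the two curves side by side, the pairing schedules and point sizes quoted above can be put in a small table. This is a sketch for budgeting only; the dictionary layout and function names are assumptions made for illustration.

```python
# Pairing schedule and uncompressed point sizes per curve, as quoted in the text.
CURVES = {
    "BN254": {"base": 45_000, "per_pair": 34_000, "g1_bytes": 64, "g2_bytes": 128},
    "BLS12-381": {"base": 37_700, "per_pair": 32_600, "g1_bytes": 128, "g2_bytes": 256},
}


def pairing_gas(curve: str, num_pairings: int) -> int:
    c = CURVES[curve]
    return c["base"] + c["per_pair"] * num_pairings


def proof_bytes(curve: str) -> int:
    """A Groth16 proof is two G1 points and one G2 point."""
    c = CURVES[curve]
    return 2 * c["g1_bytes"] + c["g2_bytes"]


def proof_calldata_gas(curve: str, gas_per_byte: int = 16) -> int:
    """Upper bound that treats every proof byte as non-zero."""
    return proof_bytes(curve) * gas_per_byte


for name in CURVES:
    print(name, pairing_gas(name, 4), proof_bytes(name), proof_calldata_gas(name))
```

With four pairings the BLS12-381 schedule is 12,900 gas cheaper, while the larger proof costs about 4,096 gas more in calldata, so the pairing saving outweighs the extra bytes for a single proof.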

3 vs 4 pairings: 32.6k-34k gas you don't have to spend

Groth16's verification equation needs only three variable pairings, because e(α, β) depends only on the verifying key and can be precomputed, yet many Solidity templates still compute four. That leaves about 34k gas (BN254) or 32.6k gas (BLS12-381) per call on the table. If you control your verifier, consider moving to a 3-pairing product check; note that the EVM pairing precompiles return only a pass/fail result for a product of pairings, so confirm that your 3-pairing formulation is expressible with them before budgeting for the saving. If you use a generator such as snarkjs, pick a template that emits the 3-pairing version and includes malleability checks. (docs.pantherprotocol.io)
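For reference, here is the standard Groth16 verification equation behind the pairing count. The notation is an assumption of this sketch: (A, B, C) is the proof, x_1..x_l are the public inputs, and IC_0..IC_l, α, β, γ, δ come from the verifying key.

```latex
% Groth16 verification for proof (A, B, C) and public inputs x_1, ..., x_l
\[
  \mathrm{vk}_x = \mathrm{IC}_0 + \sum_{i=1}^{l} x_i \,\mathrm{IC}_i ,
  \qquad
  e(A, B) = e(\alpha, \beta)\cdot e(\mathrm{vk}_x, \gamma)\cdot e(C, \delta)
\]
% Product form, as evaluated by a pairing-check call with four pairs
\[
  e(-A, B)\cdot e(\alpha, \beta)\cdot e(\mathrm{vk}_x, \gamma)\cdot e(C, \delta) = 1
\]
```

Only e(A, B), e(vk_x, γ), and e(C, δ) depend on the proof and inputs. The term e(α, β) is fixed per circuit, which is where the three-versus-four distinction comes from.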

Batching strategies you can actually deploy (and what they cost)

There are four practical ways to "batch" Groth16 proofs. Choose one based on how many proofs you plan to finalize in each L1 period and how much latency you're okay with.

1) Naive N× on-chain verification (don’t)

Cost: N × (207,700 + 7,160 × l) gas on BN254 with four pairings per proof, which quickly becomes untenable as N grows. Even with three pairings and code-level tweaks, cost still grows linearly in N. Reserve this for tiny N or one-off cases. (hackmd.io)

2) On-chain batch verification via random linear combination

You can batch-verify n proofs of the same circuit by combining their verification equations with random field coefficients. This cuts the number of pairings from about 3n or 4n to roughly n + 2. You still need an MSM over l inputs for each proof, so the constant factors shrink but the cost remains O(n), and it hits gas limits quickly as n increases.

This batching method is standard in off-chain libraries and well understood cryptographically. On-chain, it is most attractive for n in the low tens. (encrypt.a41.io)
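The random linear combination can be written out as follows. This is a sketch in assumed notation: proof i is (A_i, B_i, C_i), V_i is its public-input point IC_0 + Σ_j x_{i,j} IC_j, and r_1..r_n are random scalars chosen by the verifier.

```latex
% Batch check: raise each proof's equation to r_i and multiply them together
\[
  \prod_{i=1}^{n} e(r_i A_i,\, B_i)
  \;=\;
  e(\alpha, \beta)^{\sum_i r_i}
  \cdot e\!\Big(\sum_{i=1}^{n} r_i V_i,\ \gamma\Big)
  \cdot e\!\Big(\sum_{i=1}^{n} r_i C_i,\ \delta\Big)
\]
% The public-input term regroups by verifying-key point
\[
  \sum_{i=1}^{n} r_i V_i
  \;=\;
  \Big(\sum_{i} r_i\Big)\mathrm{IC}_0
  + \sum_{j=1}^{l}\Big(\sum_{i} r_i\, x_{i,j}\Big)\mathrm{IC}_j
\]
```

The left side needs one pairing per proof because each B_i differs. The right side collapses to two variable pairings plus the fixed e(α, β) term, which gives the count of roughly n + 2.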

Concrete BN254 example (n = 64, l = 3):

Pairings: 45,000 + 34,000 × (n + 2) = 45,000 + 34,000 × 66 = 2,289,000 gas.
MSM over public inputs: 6,150 × l × n = 6,150 × 3 × 64 = 1,180,800 gas.
Proof calldata: about 4,096 × 64 = 262,144 gas, not counting EIP-7623 floor effects.
Total: around 3.73M gas in a single transaction. That is feasible, but it leaves little room for extra logic. (eips.ethereum.org)
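The same breakdown can be reproduced for other batch sizes with a short helper. It is a sketch of this article's cost model only (n + 2 pairings, per-proof ECMUL/ECADD for inputs, 256-byte proofs) and ignores application logic and EIP-7623 floor effects; the function name is illustrative.

```python
def batch_verify_gas(n: int, l: int) -> dict:
    """BN254 on-chain batch verification estimate for n proofs with l inputs each."""
    pairings = 45_000 + 34_000 * (n + 2)  # n + 2 pairings in one check
    msm = 6_150 * l * n                   # ECMUL + ECADD per input, per proof
    calldata = 4_096 * n                  # 256 bytes at 16 gas/byte, per proof
    return {
        "pairings": pairings,
        "msm": msm,
        "calldata": calldata,
        "total": pairings + msm + calldata,
    }


estimate = batch_verify_gas(n=64, l=3)
for part, gas in estimate.items():
    print(f"{part:>9}: {gas:>10,}")
```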

3) Recursion/wrapping to a single Groth16/Plonk proof (today’s default for throughput)

You verify a batch of leaf proofs inside a recursion circuit off-chain and then post a single compact proof to L1. On Ethereum, the leading stacks fold large sets of proofs into one Groth16 or Plonk proof, which keeps on-chain verification cost roughly constant. In practice, the final on-chain verification usually costs about 200k-300k gas on BN254 for small l. You are trading a few seconds of off-chain proving for a fixed on-chain cost. (7blocklabs.com)

With Groth16 as the outer proof, expect 200k-300k gas to verify, with the exact cost depending on l. Keep l small: for example one root, one block number, and one domain separator. (hackmd.io)
With BLS12-381 Groth16/Plonk as the outer proof, you get cheaper pairings and a precompiled MSM, but the proof size doubles. Check your calldata exposure against the EIP-7623 floor. (eips.ethereum.org)

4) Proof aggregation systems and verification layers

SnarkPack, a Groth16 aggregation scheme, compresses n Groth16 proofs into a single aggregated object, with verification time and proof size of O(log n). Protocol Labs reports aggregating 8,192 proofs in around 8-9 seconds, with verification taking a few tens of milliseconds. On-chain, that translates to "a few" pairings plus modest multi-scalar multiplications (MSMs), so L1 gas grows only slowly even for very large n. It is a good fit if you are comfortable centralizing aggregation and accept the single-aggregator model, or if you are prepared to run a decentralized aggregator.
External verification layers such as Aligned Layer take verification off-chain. A restaked set of operators verifies the proofs and delivers a BLS-attested result to Ethereum. Published figures show roughly 350-380k gas per batch plus about 16k gas for each consumer inclusion check, so the per-proof cost is about 16k + 380k/n. This is attractive if you want near-constant L1 costs without building recursion yourself; just account for the trust assumptions and latency of an AVS.
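The amortization in the verification-layer figures is easy to tabulate. The sketch below uses the upper published base figure (380k) and the 16k inclusion check as defaults; both are taken from the text, and the function names are illustrative.

```python
def layer_total_gas(n: int, base: int = 380_000, per_check: int = 16_000) -> int:
    """Total L1 gas when all n proofs get an on-chain inclusion check."""
    return base + per_check * n


def layer_gas_per_proof(n: int, base: int = 380_000, per_check: int = 16_000) -> float:
    """Amortized cost per proof: 16k + 380k / n with the default figures."""
    return per_check + base / n


for n in (1, 16, 64, 256):
    print(f"n = {n:>3}: total {layer_total_gas(n):>9,}  per proof {layer_gas_per_proof(n):>10,.0f}")
```

The per-proof figure approaches the 16k inclusion check as n grows, so the base cost matters mostly for small batches.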

Real benchmarks you can budget against

Here are decision-grade comparisons for an L1 settlement that must handle N leaf Groth16 proofs of the same circuit with l public inputs each. The figures use BN254; BLS12-381 behaves similarly, with slightly cheaper pairings, double the calldata, and different MSM pricing.

Assumptions for BN254 unless stated:

Single verify: 207,700 + 7,160 × l gas.
Batch verify: Pairings scale to n + 2, while MSM stays at l per proof.
Calldata: Each proof takes about 256 bytes (roughly 4,096 gas), but this can be affected by EIP‑7623 floors when there’s a lot of data involved.

All numbers are rounded.

1) N = 64, l = 3

Naive approach (64 separate verifies): about 64 × (207,700 + 21,480) ≈ 14.7M gas, roughly half of a 30M-gas block. Not practical. (hackmd.io)
On-chain batch verification (random linear combination):
- Pairings: about 2.289M; MSM: about 1.181M; calldata: about 0.262M; total about 3.73M gas. (eips.ethereum.org)
Recursive wrapper to one Groth16 (outer l = 2-3):
- About 220k-235k gas. That saves about 98.5% against the naive approach and about 93-94% against the on-chain batch. (hackmd.io)
Verification layer (Aligned):
- About 380k base plus 64 × 16k, around 1.4M gas in total, or about 22k per proof. That saves roughly 62% against the on-chain batch. (docs.electron.dev)

2) N = 256, l = 2

Naive: roughly 56.8M gas, more than a full block.
On-chain batch verify:
- Pairings: 45k + 34k × 258 ≈ 8.817M; MSM: 6,150 × 2 × 256 ≈ 3.149M; calldata: about 1.05M; total about 13.01M gas. That is a lot for one transaction when blocks are in the 30-45M gas range.
Recursive wrapper:
- About 220k-230k gas (outer l = 2). Calldata stays small because only a single proof is posted, so the EIP-7623 floor is unlikely to bind. This is the go-to route for high throughput. (hackmd.io)
Verification layer:
- Roughly 380k + 256 × 16k ≈ 4.48M gas if you run per-consumer inclusion checks at that scale. If you post once to a hub and let consumers read the result off-chain, the on-chain gas stays roughly constant. (docs.electron.dev)

3) BLS12-381 variant (post-Pectra), N = 64, l = 3, aggregated into one BLS12-381 Groth16

Pairings: 37,700 + 32,600 × 4 = 168,100 gas.
MSM: with the G1MSM precompile the discount is small at l = 3 (around 9-12k gas per scalar), so expect about 30-36k gas.
Calldata: the proof is 512 bytes, roughly 8,192 gas, which is small next to the pairing and MSM costs.
Total verify: around 210-225k gas. That is on par with BN254 while offering stronger security and better scaling for larger MSMs. If the transaction is data-heavy, check it against the EIP-7623 floor. (eips.ethereum.org)
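The BN254 scenarios in this section can be recomputed with a short script, which is handy when plugging in your own N and l. It is a sketch that applies this article's formulas as-is (380k is the upper published base figure for the verification layer); the function names are illustrative.

```python
def naive_gas(n: int, l: int) -> int:
    return n * (207_700 + 7_160 * l)

def batch_gas(n: int, l: int) -> int:
    return 45_000 + 34_000 * (n + 2) + 6_150 * l * n + 4_096 * n

def wrapper_gas(outer_l: int) -> int:
    return 207_700 + 7_160 * outer_l

def layer_gas(n: int) -> int:
    return 380_000 + 16_000 * n

def saving(new: int, old: int) -> float:
    """Fractional saving of `new` relative to `old`."""
    return 1 - new / old

for n, l in ((64, 3), (256, 2)):
    rows = {
        "naive": naive_gas(n, l),
        "on-chain batch": batch_gas(n, l),
        "recursive wrapper": wrapper_gas(2),
        "verification layer": layer_gas(n),
    }
    print(f"N = {n}, l = {l}")
    for name, gas in rows.items():
        print(f"  {name:<19}{gas:>12,}  saves {saving(gas, rows['naive']):6.1%} vs naive")
```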

Emerging best practices for gas-efficient Groth16 batching in production

1) Go for recursion/wrapping when you need high throughput; batch-verify on-chain only for small n

If you handle dozens to thousands of leaf proofs per L1 interval, design a recursion tree and wrap it into a single Groth16 or Plonk proof. This is a proven approach that turns O(n) verifications into roughly O(1) gas. Keep the outer public inputs to a minimum, such as a Merkle root, an L2 state root, and a counter. (7blocklabs.com)

2) Trim public inputs (they are your gas multiplier)

Each public input costs about 6,150-7,160 gas on BN254, and roughly the same on BLS12-381 with MSM. Hash or commit any extra data off-chain and bring only the compact commitment on-chain. This is often the most effective single optimization. (hackmd.io)
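One common way to collapse many values into a single public input is to hash them and truncate the digest so it fits the scalar field. The sketch below assumes SHA-256 over 32-byte big-endian words and a 253-bit mask; the layout and the function name are hypothetical, and the circuit would have to recompute exactly the same hash, which adds constraints on the prover side.

```python
import hashlib

# BN254 scalar field modulus r (the field public inputs live in).
BN254_SCALAR_FIELD = 21888242871839275222246405745257275088548364400416034343698204186575808495617


def commit_public_inputs(values: list[int]) -> int:
    """Hash many 256-bit values into one field element (hypothetical layout)."""
    data = b"".join(v.to_bytes(32, "big") for v in values)
    digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
    # Keep the low 253 bits: 2**253 < r, so the result is always a valid scalar.
    return digest & ((1 << 253) - 1)


commitment = commit_public_inputs([1, 2, 3])
print(hex(commitment))
```

With this pattern the verifier sees one input regardless of how many values are committed, at the price of hashing the same data on-chain if the contract needs to check the commitment against calldata.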

3) Switch to the 3-pairing verifier

Audit your verifier so that it uses the 3-pairing product equation and rejects malleable encodings with < q checks. Moving from four pairings to three saves about 34k gas per verification on BN254 and about 32.6k on BLS12-381. If you use a code generator such as snarkjs, check that your template is current and includes the latest hardening changes.

4) Re-evaluate your curve post‑Pectra

BN254 still wins on calldata size and has a well-understood cost model. With EIP-2537, however, BLS12-381 pairings are slightly cheaper and MSM is natively precompiled with volume discounts, and you get a security level of roughly 120 bits or more. For very large circuits or large MSMs, BLS12-381 may break even or come out ahead on gas even after doubling the proof bytes. Run the numbers for your own workload. (eips.ethereum.org)

5) Account for calldata floors (EIP-7623)

For data-heavy transactions, such as one that posts many proofs, EIP-7623 sets a minimum charge of 10 gas per zero byte and 40 gas per non-zero byte of calldata. This favors recursion (a single small proof) or a verification layer (constant-size attestations) over packing many proofs into one call. (eips.ethereum.org)
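The floor rule can be checked numerically. The sketch below follows the EIP-7623 formula for a plain call (no contract creation): calldata is counted in tokens, with zero bytes as one token and non-zero bytes as four, and the transaction pays the larger of the standard cost and the floor. Function names are illustrative.

```python
TX_BASE_GAS = 21_000
STANDARD_TOKEN_COST = 4          # 4 gas per token, i.e. 16 gas per non-zero byte
TOTAL_COST_FLOOR_PER_TOKEN = 10  # 10 gas per token, i.e. 40 gas per non-zero byte


def calldata_tokens(data: bytes) -> int:
    zero_bytes = data.count(0)
    return zero_bytes + 4 * (len(data) - zero_bytes)


def tx_gas_used(data: bytes, execution_gas: int) -> int:
    """Gas charged for a call transaction under EIP-7623."""
    tokens = calldata_tokens(data)
    standard = STANDARD_TOKEN_COST * tokens + execution_gas
    floor = TOTAL_COST_FLOOR_PER_TOKEN * tokens
    return TX_BASE_GAS + max(standard, floor)


def floor_binds(data: bytes, execution_gas: int) -> bool:
    return execution_gas < 6 * calldata_tokens(data)  # 10t > 4t + E


proof = bytes([0xFF]) * 256  # worst case: every proof byte non-zero
print(tx_gas_used(proof, execution_gas=220_000), floor_binds(proof, 220_000))
```

The floor only takes over when execution gas is below 6 gas per calldata token, so a verifier call that spends hundreds of thousands of gas on pairings is normally charged at the standard rate; the floor matters mainly for transactions that post lots of data and do little computation.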

6) If you really need to batch on-chain, keep n small and l even smaller

With n in the low tens and l at 3 or less, on-chain batch verification is workable, and it keeps latency low because there is no wait for recursion. For a rough estimate, use:

gas ≈ (45k + 34k × (n + 2)) + n × l × 6,150 + 4,096 × n (BN254)

Add headroom for safety and for any application logic. See EIP-1108 for the underlying precompile prices.
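Inverting the estimate gives the largest batch that fits a given gas budget. This is a sketch built on the formula above; the budget value and function names are illustrative, and real deployments should leave headroom for application logic.

```python
FIXED_GAS = 45_000 + 2 * 34_000  # pairing base plus the two batch-wide pairings


def per_proof_gas(l: int) -> int:
    """Marginal gas per proof: one pairing, l inputs, 256 bytes of calldata."""
    return 34_000 + 6_150 * l + 4_096


def batch_gas(n: int, l: int) -> int:
    return FIXED_GAS + n * per_proof_gas(l)


def max_batch_size(gas_budget: int, l: int) -> int:
    """Largest n whose estimated batch cost stays within gas_budget."""
    if gas_budget < FIXED_GAS:
        return 0
    return (gas_budget - FIXED_GAS) // per_proof_gas(l)


budget = 3_000_000
n = max_batch_size(budget, l=3)
print(n, batch_gas(n, 3))
```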

7) Consider an external verification layer for "constant" L1 gas without the headache of recursion

If you want to keep things simple, consider handing verification to a restaked network such as Aligned, which returns a BLS-attested result. Budget around 350-380k gas for the base cost plus about 16k for each consumer inclusion check. L1 costs then stay nearly constant as your user base grows. Weigh the latency and security trade-offs against native recursion.

8) Prover-side advances are your friend

GPU-accelerated Groth16 provers such as ICICLE-Snark cut off-chain aggregation latency. Combined with recursion, they let you keep L1 costs stable while meeting ambitious throughput SLOs. (ingonyama.com)

Implementation notes your engineers will thank you for

Precompiles and Addresses:
- BN254:
  - ECADD: 0x06 (150 gas)
  - ECMUL: 0x07 (6,000 gas)
  - ECPAIRING: 0x08 (45,000 + 34,000·k gas)
- BLS12-381 (EIP-2537):
  - G1ADD: 0x0b (375 gas)
  - G1MSM: 0x0c (discounted)
  - G2ADD: 0x0d (600 gas)
  - G2MSM: 0x0e
  - PAIRING: 0x0f (37,700 + 32,600·k gas)
  - MAP FP→G1: 0x10
  - MAP FP2→G2: 0x11
  - Use MSM wherever you can.
Solidity Verifier Hygiene:
- Enforce < q checks and subgroup rules to avoid malleability issues. Recent verifier templates, including updated snarkjs ones, include these fixes.
- Use custom errors instead of string requires, compile via IR, and minimize memory copies when calling precompiles. Copying calldata straight into the precompile's input buffer, rather than through intermediate structs, saves gas. These tweaks are small individually but add up.
Calldata Planning:
- Element sizes: on BN254, G1 is 64 bytes and G2 is 128 bytes; on BLS12-381, G1 is 128 bytes and G2 is 256 bytes. Do not use point compression for precompile calls, since the precompiles on both curves expect uncompressed coordinates.
Public Input Layout:
- Hash or Merkle-accumulate down to one or two field elements. On BN254, each extra input costs about 7,160 gas; on BLS12-381 with MSM, the per-scalar cost depends on k through the discount table but is in the same ballpark for small l.
Three-Pairing Product:
- Review your pairing product equation and switch to a 3-pairing check if you still use four. We have seen about 15-20% savings on the pairing side from this change. Validate the result with formal tests and cross-implementation checks.
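The range checks mentioned under verifier hygiene can be prototyped off-chain before being ported to Solidity. The sketch below covers BN254 G1 points and scalars only and is an illustration, not a complete hardening: G2 points also need a subgroup check, and the function names are made up for this example.

```python
# BN254 base field modulus q (coordinates) and scalar field modulus r (public inputs).
BN254_Q = 21888242871839275222246405745257275088696311157297823662689037894645226208583
BN254_R = 21888242871839275222246405745257275088548364400416034343698204186575808495617


def is_valid_g1(x: int, y: int) -> bool:
    """Canonical coordinates (< q) that satisfy y^2 = x^3 + 3 over F_q.

    The (0, 0) encoding of the point at infinity is rejected by this check.
    """
    if not (0 <= x < BN254_Q and 0 <= y < BN254_Q):
        return False
    return (y * y - (x * x * x + 3)) % BN254_Q == 0


def is_canonical_scalar(s: int) -> bool:
    """Public inputs must be reduced: s and s + r would otherwise verify alike."""
    return 0 <= s < BN254_R


print(is_valid_g1(1, 2))             # generator of G1
print(is_valid_g1(1 + BN254_Q, 2))   # same point, non-canonical encoding
print(is_canonical_scalar(BN254_R))  # aliased zero
```

Rejecting non-canonical encodings matters because two different byte strings that decode to the same point or scalar let an attacker replay a proof under a different identifier.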

When to choose which approach (quick decision matrix)

If you settle 16 or fewer leaf proofs per L1 interval and need sub-second added latency, use on-chain batch verification with random linear combinations. Keep l at 3 or below and use a 3-pairing verifier. Re-check the design if EIP-7623 makes your transactions data-heavy. (encrypt.a41.io)
If you settle roughly 10 to 1,000 leaf proofs and can tolerate seconds of off-chain latency, use recursion with a single outer Groth16 proof on BN254 or BLS12-381. This is the default for production. (7blocklabs.com)
If you want constant L1 gas without building recursion infrastructure, use a verification layer such as Aligned. Budget around 350-380k gas base plus about 16k per consumer inclusion check, and document the trust and latency trade-offs. (blog.alignedlayer.com)
If you already produce Groth16 proofs in bulk and want to minimize L1 pairings, consider SnarkPack; it gives O(log n) verifier time and a very small on-chain footprint when used as the settlement object. (research.protocol.ai)

Putting it all together: a high-throughput rollup plan

Target architecture
- Leaf circuits generate Groth16 proofs for each batch of L2 transactions.
- A recursion/aggregation service creates one wrapper proof per L1 posting interval.
- The L1 verifier contract:
  - Runs a 3-pairing check.
  - Takes one or two public inputs (such as the state root and a domain separator).
  - Supports both BN254 and BLS12-381 verifiers behind a timed upgrade path, so you can switch if l becomes MSM-heavy or if you consolidate onto BLS12-381 infrastructure. (7blocklabs.com)
Budget
- BN254 wrapper (l = 2): around 220k gas per settlement.
- BLS12-381 wrapper (l = 2): around 210-225k gas per settlement, with 2× the proof calldata. Re-evaluate against EIP-7623 depending on the other calldata in the transaction. (eips.ethereum.org)
Scalability knobs
- If your L2 intervals start producing many more proofs, either deepen the recursion (on-chain cost unchanged) or move verification to an AVS and post a BLS-attested result for a constant 350-380k gas or so. (blog.alignedlayer.com)

Key takeaways for decision‑makers

Groth16 remains the cheapest option for settling proofs on Ethereum L1. Expect around 200-300k gas per batch if you wrap recursively or aggregate, regardless of how many leaf proofs go in. (7blocklabs.com)
The biggest gas lever you control is the number of public inputs. Removing one input saves around 6-8k gas on BN254 and about the same on BLS12-381 with MSM. (hackmd.io)
After Pectra, BLS12-381 is a serious option: slightly cheaper pairings, native MSM, and stronger security, at the price of about double the calldata per proof. Evaluate your own l and calldata needs before switching the outer curve. (eips.ethereum.org)
For very large batches where you do not want to build recursion infrastructure, verification layers provide nearly constant L1 gas and high throughput, at the cost of AVS trust assumptions and possible latency. (blog.alignedlayer.com)

If you're looking for a customized, data-driven strategy for your rollup's proof pipeline, 7Block Labs has got you covered. They can help you evaluate your circuits using both BN254 and BLS12-381, determine the size of the recursion tree, and create a production-ready verifier that features the 3-pairing optimization and malleability hardening.

Sources and further reading

EIP-1108: BN254 precompile repricing (ECADD, ECMUL, and pairing at 45,000 + 34,000·k gas).
EIP-2537: BLS12-381 precompiles (pairing at 37,700 + 32,600·k gas, MSM discounts, 64-byte field encoding).
EIP-7623: Raised calldata cost floor for data-heavy transactions.
Groth16 verification gas model on BN254 (207,700 + 7,160 × l) and sizing notes.
Groth16 "3-pairing" discussion and template pointers, including malleability checks in modern verifiers.
SnarkPack aggregation (O(log n) verifier) and its performance on 8,192 proofs.
External verification layers and measured gas usage.
Prover acceleration with ICICLE-Snark, a GPU-based Groth16 implementation.
