By AUJay
Would Rolling Up Thousands of Tiny Proofs into One Aggregated Proof Noticeably Cut Latency for Cross-Chain Oracle Updates?
Concise Summary
Bringing together multiple proofs into one can seriously reduce verification gas and blockspace, but it usually doesn't speed up the end-to-end latency for cross-chain oracle updates. In fact, in many setups, it can actually make things slower unless you opt for lightweight aggregation methods like Merkle or BLS, or go with a hybrid approach that combines "fast attestation + delayed ZK finality." We break down clear thresholds, architectures, and numbers to help you figure out when aggregation is beneficial and how to do it safely.
The latency question you actually need to answer
"Will aggregation cut down our cross-chain update latency?" basically asks: where's the holdup in your current process?
- Finality and batching in the source chain
- Transporting across chains (using relays or bridges)
- Verification costs on the destination chain and its block capacity
- Time taken to generate proofs (especially if you’re creating recursive or aggregated ZK proofs)
- Costs associated with calldata/DA and getting included in the mempool
If you're finding that destination‑chain verification is slowing things down, aggregation can often be a great solution. However, if you’re dealing with bottlenecks during proof generation or with source/destination finality, then heavy recursive aggregation might actually increase latency, even though it can help cut costs. According to Chainlink CCIP's documentation, the biggest factor affecting cross‑chain message latency is the time it takes to reach finality on the source chain. This can really vary--some chains like Avalanche and Solana can get there in less than a second, while others like Ethereum and many Layer 2s can take tens of minutes, especially when they’re using "finalized" tags.
CCIP then batches those finalized messages into a Merkle root per relay, so aggregation there is a cost-spreading mechanism, not a latency win. See the Chainlink docs for the details.
What “aggregation” really means (and why this distinction matters for latency)
There are three distinct aggregation techniques at work here:
1) Signature aggregation (BLS multi-signatures)
- What it does: A bunch of validators or oracles come together to sign a single message, and you just need to verify one aggregated signature on-chain.
- Why it’s fast: one pairing-product check validates N signers at once. With BLS on Ethereum after Pectra (EIP-2537), a 2-pairing verification costs about ~103k gas, replacing N separate ECDSA verifications; on-chain verification stays in the milliseconds, and no proving time is added. (eips.ethereum.org)
2) Merkle/Batch Commitments (no ZK)
- What it does: It lets you commit to thousands of items using a Merkle root. You only need to verify signatures once on the root and then check some pretty straightforward Merkle proofs for each item.
- Why it’s fast: the heavy lifting is the single signature verification on the root; verifying each item after that is just hashing. With Pyth's Perseus upgrade, moving to a 400 ms slot cadence lets Wormhole signatures be verified once per slot, cutting multi-feed update costs by 50-80% without adding any proof-generation delay. (pyth.network)
3) Recursive/ZK proof aggregation (SNARK/STARK recursion, SnarkPack-style)
- What it does: It combines thousands of proofs into a single, compact proof that can be verified on-chain.
- Why it’s tricky for latency: on-chain verification becomes cheap (logarithmic pairings, near-O(1) gas), but generating the aggregated proof takes anywhere from seconds to minutes depending on batch size and hardware. Aligned Layer reports that recursive aggregation can add several minutes of latency when targeting “hard finality” on Layer 1, while their off-chain verification path (aggregated BLS signatures) returns results in milliseconds. (blog.alignedlayer.com)
In a nutshell: if your latency service-level objective (SLO) is under a second or just a few seconds, go with signatures/Merkle batching. But if you’re looking for occasional high-security checkpoints that won’t break the bank on-chain, then recursive ZK aggregation is the way to go.
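To make option (2) concrete, here is a minimal Python sketch of the commit-once pattern: one root covers a thousand updates, only the root needs a signature (omitted here), and each item is checked with a short hash path. The leaf encoding and the SHA-256 tree are illustrative choices, not Pyth's or Wormhole's actual formats.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Build a Merkle root over hashed leaves, duplicating the last
    node on odd-sized levels."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves, index):
    """Collect the sibling hashes from one leaf up to the root."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        proof.append((level[index ^ 1], index % 2 == 0))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify_item(leaf, proof, root):
    """Per-item check on the destination chain: hashing only,
    no signature work."""
    node = h(leaf)
    for sibling, leaf_is_left in proof:
        node = h(node + sibling) if leaf_is_left else h(sibling + node)
    return node == root

# 1,000 updates, one root. Sign the root ONCE per slot; each consumer
# then pays only a short hash path per feed it reads.
updates = [f"feed-{i}:price={100 + i}".encode() for i in range(1000)]
root = merkle_root(updates)
proof = merkle_proof(updates, 42)
assert verify_item(updates[42], proof, root)
```

Note the shape of the cost: one signature check amortized over N items, plus log2(N) hashes per item read, which is why per-item verification stays cheap no matter how many feeds share the slot.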
Where aggregation clearly reduces latency
Aggregation helps cut down on latency, but only when the bottleneck is on-chain verification throughput:
- Blockspace Limits: Ethereum L1 can only verify so many pairing-heavy proofs per block. Off-chain verification with aggregated signatures, as in Aligned’s Proof Verification Layer, lifts practical throughput to thousands of proofs per second, compressing everything into one batched result for L1 and avoiding the multi-block queuing that makes downstream apps feel sluggish.
- Per-update Signature Storms: when an oracle update must verify many publisher signatures on each destination chain, BLS aggregation collapses N signature checks into a single precompile call, cutting on-chain time from several milliseconds to one operation. With EIP-2537 live since Pectra (April 23, 2025), the BLS12-381 precompiles are both cheaper and faster than their BN254 counterparts.
- Calldata-bound L2s Post-EIP-7623: EIP-7623 raises calldata fees for data-heavy transactions, so pushing many signatures or proofs as raw calldata gets expensive. Batching--N signatures into one BLS, N proofs into one SNARK--shrinks both bytes and pairing checks, sidestepping mempool delays and block-size contention.
Checking an aggregated BLS signature in the EVM costs about ~113k gas with the BN254 precompiles. With BLS12‑381 in Pectra, a single-pairing check drops to about ~70k gas and the typical two-pairing check to about ~103k, with stronger security parameters--still far quicker than checking a pile of ECDSAs one by one. (ethresear.ch)
Where aggregation increases latency (often by minutes)
ZK recursion moves the cost into prover time, and the increase can be substantial:
- Aligned Layer’s Aggregation Service (testnet) reports “minutes for recursive proving,” trading throughput and gas savings against added latency. Their off-chain verification is fast, but the on-chain aggregated proof--their “hard finality”--takes minutes to assemble. (blog.alignedlayer.com)
- Research and production systems show the same trade-off: SnarkPack can verify an aggregate of thousands of Groth16 proofs in about ~163 ms (native), but producing the aggregate takes seconds to tens of seconds depending on proof count and hardware--time you must wait before anything can be posted on-chain. (eprint.iacr.org)
- zkVM-based recursion is improving: newer zkVMs like SP1 Hypercube and RISC Zero R0VM target real-time proving under 12 seconds for typical Ethereum blocks on large GPU clusters. Outside those ideal setups, “tens of seconds to minutes” is still the norm, so if your oracle targets a 400 ms to 2 s p95, recursive aggregation will blow that budget today. (blog.succinct.xyz)
Bottom line: if you're struggling with issues like source/destination finality, network propagation, or L2 sequencing, then recursive aggregation probably won’t do much for your latency and might even make it worse. So, think of it as a tool for cost-saving or security improvements, rather than a speed boost.
Case study: Pyth’s cross‑chain pipeline--aggregation without ZK to keep latency low
Pyth gathers quotes from publishers on Pythnet and, every 400 ms, publishes a Merkle root covering all price updates. Wormhole guardians sign that root, and users who want a price on-chain submit the signed root plus Merkle proofs to the destination chain. Only one signature set is verified per slot, and checking each price is just Merkle hashing. This setup raised update frequency (from 1 s to 400 ms) and cut per-update verification costs by roughly 50-80% across ecosystems--without touching ZK proving delays. (pyth.network)
For latency-sensitive oracle updates, this is the approach we suggest: BLS/Merkle aggregation. It keeps destination-chain verification fast and constant-time and avoids long proof-generation waits. Pyth’s production numbers (hundreds of millions of updates per quarter) show that it scales. (messari.io)
Case study: CCIP and “aggregation by finality window”
Chainlink's CCIP waits for source-chain finality before relaying a Merkle root of the finalized messages to the destination. That is aggregation of a sort--it lowers per-message verification cost--but the dominant latency term is the finality wait itself. On Ethereum, the “finalized” tag takes around 15 minutes, and many Layer 2s are on similar timelines.
So if your service-level agreement (SLA) targets seconds, the bottleneck isn’t proof verification--it’s the finality policy. Tuning how you treat finality (block depth vs. the finalized tag) moves the needle far more than cryptographic aggregation. (docs.chain.link)
Case study: zk light clients and bridges--aggregation improves cost, not always speed
Polyhedra’s zkBridge uses a two-layer design: a distributed prover (deVirgo) produces a fast STARK-style proof of consensus, then a Groth16 “wrapper” compresses it for cheap on-chain verification--around 220-230k gas. In benchmarks they prove and verify a block header across Ethereum and Cosmos in roughly 12-20 seconds. That’s fast for validity bridges but still slower than signature/Merkle pipelines, and batching more headers per proof cuts cost at the price of extra waiting. (theblock.co)
The key takeaway here is that if you need to guarantee strict validity on every cross-chain update, you should brace yourself for latencies that could stretch from seconds to even a minute. Sure, you can batch your updates to save on costs, but unless the time it takes to prove things dips below your service-level objective (SLO), you’re not going to see any improvement in latency. (docs.zkbridge.com)
Hard numbers you can budget for
- Groth16 Verification on Ethereum: expect roughly 200k-250k gas per verification, plus ~7k gas per public input; STARK verification can exceed 1M gas. Aggregating thousands of proofs into one Groth16 proof cuts verification to a handful of pairings (O(1)/O(log n))--at the cost of aggregation latency. (hackmd.io)
- Signature Aggregation: With EIP‑1108, the gas cost for BN254 pairing has been set to 34k·k + 45k gas. A straightforward BLS check usually requires two pairings, which adds up to about 113k gas. But with EIP‑2537, the introduction of BLS12‑381 precompiles really helps cut down costs per pairing while boosting security--perfect for those oracle and bridge attestations you might be working with. (eips.ethereum.org)
- Off‑Chain Verification + BLS Attestation: Aligned’s Verification Layer reports ~350k gas for a batch containing one proof (from any proof system); at a batch size of 20, that works out to about 40k gas per proof, with verification in milliseconds (operators sign) and the on-chain read available one block later. (blog.alignedlayer.com)
- Electron Labs’ Groth16 “Super-Proof”: about ~380k gas base per batch plus ~16k gas per consumer-contract inclusion call, amortized across n included proofs--roughly 16k + 380k/n gas per proof. The cost advantage is clear, but end-to-end latency depends on how long you wait to accumulate those proofs. (docs.electron.dev)
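As a sanity check on the numbers above, the precompile fee formulas from EIP-1108 and EIP-2537 and the Electron-style amortization can be evaluated directly. A sketch: the pairing constants are the published precompile prices, and the base/inclusion figures are the ones quoted in the bullet.

```python
def bn254_pairing_gas(k: int) -> int:
    """EIP-1108 price for the BN254 pairing precompile with k pairs."""
    return 34_000 * k + 45_000

def bls12_381_pairing_gas(k: int) -> int:
    """EIP-2537 (Pectra) price for the BLS12-381 pairing precompile."""
    return 32_600 * k + 37_700

def electron_per_proof_gas(n: int, base: int = 380_000,
                           inclusion: int = 16_000) -> float:
    """Amortized cost per proof for a 'super-proof' batch of n proofs."""
    return inclusion + base / n

# A plain BLS signature check is a 2-pairing product check:
print(bn254_pairing_gas(2))       # 113000 -- the ~113k figure above
print(bls12_381_pairing_gas(2))   # 102900 -- the ~103k figure
print(bls12_381_pairing_gas(1))   # 70300  -- the ~70k single pairing

# Per-proof gas falls toward the 16k inclusion floor as batches grow:
for n in (1, 20, 100, 1000):
    print(n, electron_per_proof_gas(n))
```

The last loop makes the amortization limit visible: past a few hundred proofs per batch, almost all remaining cost is the fixed per-inclusion call, so bigger batches mostly buy waiting time rather than savings.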
The 3 patterns that work in production (and when to use them)
- Low-latency trading and risk engines (p95 < 1-2 s) → Merkle + BLS; no ZK on the hot path.
- Publish updates every slot (~400 ms).
- Sign one root per slot (BLS) and verify it once at the destination; prove each feed with a Merkle branch.
- Example: Pyth Perseus (400 ms slots, one signature set per slot, 50-80% lower multi-feed costs). (pyth.network)
- Mixed latency/stronger assurances → Hybrid “fast attestation now, validity checkpoint later”
- Immediate soft-finality: verify off-chain as soon as the AVS signs a result (milliseconds).
- Periodic hard-finality: Every T minutes or blocks, we aggregate proofs and post a single validity proof to L1 to keep trust in check. Check out Aligned’s two modes for an example. (blog.alignedlayer.com)
- “Every update needs to be validity-proven” (think bridges and security-critical ops) → zk light client with customizable batching
- Choose N (the number of headers/updates for each batch) to stay on track with your fee goals.
- Expect roughly 10-120 s of update latency with existing stacks unless you run large GPU clusters. For instance, zkBridge’s deVirgo plus a Groth16 wrapper verifies at about 220k gas, with 12-20 s cross-chain latency in their demos. (blog.polyhedra.network)
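The hybrid pattern reduces to a simple scheduler: attest every slot on the fast path, checkpoint every N slots on the slow one. A sketch with both cryptographic steps stubbed out--the cadence constants and event labels are illustrative, not from any real SDK:

```python
SLOT_MS = 400            # pattern 1: per-slot Merkle + BLS cadence
CHECKPOINT_EVERY = 150   # pattern 2: slots between ZK checkpoints (~60 s here)

def run_slots(n_slots):
    """Emit a 'soft' attestation every slot and a 'hard' ZK checkpoint
    every CHECKPOINT_EVERY slots (both stubbed for illustration)."""
    events, pending = [], []
    for slot in range(n_slots):
        root = ("root-%d" % slot).encode()  # stand-in for the slot's Merkle root
        events.append(("soft", slot))       # BLS-signed root: milliseconds
        pending.append(root)
        if (slot + 1) % CHECKPOINT_EVERY == 0:
            # Recursive aggregation over all pending roots: minutes of
            # prover time, but off the hot path.
            events.append(("hard", slot))
            pending.clear()
    return events

events = run_slots(300)
assert sum(1 for kind, _ in events if kind == "hard") == 2
```

The point of the structure: consumers that tolerate operator trust read the soft events immediately, while consumers that need validity wait for the next hard event--both from the same pipeline.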
How to decide: a simple break‑even model
Define:
- V: This is the on-chain cost or time it takes to verify a single "tiny" proof, like a 230k gas Groth16 proof.
- A(n): This represents the time needed to generate an aggregated proof that has a size of n, whether it's done through recursion or packing.
- G: This denotes the gas limit per block for proofs or your overall cost budget.
- F_src/F_dst: These refer to the finality waiting times at the source and destination, respectively.
Then:
- Non-aggregated latency is roughly the sum of queue_delay(V, G), F_src, relay, inclusion, and F_dst.
- Aggregated latency is about wait_to_fill(n), plus A(n), inclusion, and F_dst (usually, you’ve already accounted for F_src while at relay).
Aggregation helps latency only when queue_delay(V, G) > wait_to_fill(n) + A(n). In practice on L1, queue delay dominates mainly under congestion (many proofs competing per block). At modest volume, ZK aggregation is a gas saving, not a speedup.
Plugging the Numbers
Suppose you have 1,000 Groth16 verifications and the chain can process about 180 per block (gas-limited): the tail of the queue waits several blocks. An aggregated proof that verifies once clears the whole queue in a single transaction--if you can generate it fast enough.
If the aggregator needs 2-4 minutes for the recursive proof, that's a non-starter for a 1-second SLO, but it can win at larger batch sizes when gas pressure is high.
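The break-even model and the worked example can be evaluated directly. A sketch: the 1,000-proof, 180-per-block, and minutes-of-proving figures come from the example above; the 12 s block time, relay, inclusion, and wait-to-fill values are illustrative assumptions.

```python
import math

def queue_delay(n_proofs: int, per_block: int, block_time_s: float = 12.0) -> float:
    """Extra wait for the tail of an unaggregated proof queue to land
    on-chain; the first block carries no queue wait."""
    blocks = math.ceil(n_proofs / per_block)
    return (blocks - 1) * block_time_s

def unaggregated_latency(n, per_block, f_src, relay, inclusion, f_dst):
    return queue_delay(n, per_block) + f_src + relay + inclusion + f_dst

def aggregated_latency(wait_to_fill, proving_s, inclusion, f_dst):
    return wait_to_fill + proving_s + inclusion + f_dst

# 1,000 Groth16 proofs, ~180 verifiable per block, aggregator takes ~3 min:
plain = unaggregated_latency(1000, 180, f_src=0, relay=2, inclusion=12, f_dst=0)
agg = aggregated_latency(wait_to_fill=5, proving_s=180, inclusion=12, f_dst=0)
print(plain, agg)  # -> 74.0 197: here the 3-minute prover LOSES to a 74 s queue
```

Note what the numbers say: even a deep 1,000-proof queue only costs about a minute of queuing, so minutes of recursive proving still loses. Aggregation wins the latency race only when proving_s + wait_to_fill drops below the queue delay--which is exactly the inequality stated above.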
2025 updates that change the calculus
- Pectra shipped EIP‑2537, the BLS12‑381 precompiles: lower gas and stronger security for BLS aggregation--a clear win for signature-based batching in low-latency pipelines.
- EIP‑7623 raised calldata costs for data-heavy transactions. Keep payloads lean--Merkle trees plus a single aggregated signature or proof--and lean on blobs on L2 where available.
- zkVM proving speed has jumped (SP1 Hypercube, R0VM 2.0): sub-12-second proofs for most Ethereum blocks on large clusters. That enables “validity every block” for some bridges, but it still won’t match sub-second BLS/Merkle updates.
Engineering gotchas (learned the hard way)
- Batched verification is not automatically safe. Zellic found zero-knowledge and soundness issues in gnark’s Groth16 extension for specific batched scenarios, so custom aggregation needs audits and, ideally, formal verification--don’t assume “batch == safe.” (zellic.io)
- Keep an eye on those aggregation windows--they can create some unnecessary back-pressure. To combat that, try using dynamic batching with clear SLOs. For example, you might want to flush every X milliseconds or after Y items, whichever comes first. This way, you won’t be stuck waiting to fill the batch during slow periods.
- Account for fees and mempool dynamics. On rollups, blobs (EIP‑4844) make big payloads cheaper and more reliable; on L1, calldata can spike and EIP‑7623 penalizes data-heavy bundles, so size updates to the current fee market. (blocknative.com)
- Lastly, think about the balance between security and liveness. The choice between ZK “hard finality now” and BLS/Merkle “soft finality now + hard finality later” is an important product decision. If you can, try to offer both options to give integrators the flexibility they need.
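The “flush every X milliseconds or after Y items, whichever comes first” rule from the batching gotcha is small enough to sketch in full. This is illustrative, not a production queue; the class name and defaults are ours.

```python
import time

class DynamicBatcher:
    """Flush after max_items OR max_wait_s, whichever comes first, so slow
    periods never stall the pipeline waiting for a full batch."""

    def __init__(self, max_items=64, max_wait_s=0.25, flush_fn=print):
        self.max_items = max_items
        self.max_wait_s = max_wait_s
        self.flush_fn = flush_fn
        self.items = []
        self.first_at = None  # arrival time of the oldest buffered item

    def add(self, item):
        if not self.items:
            self.first_at = time.monotonic()
        self.items.append(item)
        self._maybe_flush()

    def tick(self):
        """Call periodically so time-based flushes fire without new items."""
        self._maybe_flush()

    def _maybe_flush(self):
        full = len(self.items) >= self.max_items
        stale = bool(self.items) and \
            time.monotonic() - self.first_at >= self.max_wait_s
        if full or stale:
            batch, self.items = self.items, []
            self.flush_fn(batch)

flushed = []
b = DynamicBatcher(max_items=3, max_wait_s=10.0, flush_fn=flushed.append)
for i in range(7):
    b.add(i)
assert flushed == [[0, 1, 2], [3, 4, 5]]  # item 6 still waits for the timer
```

The `tick()` hook matters: without an external timer nudging the batcher, a quiet feed would buffer its last items forever--exactly the back-pressure failure mode the gotcha describes.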
Implementation patterns with precise details
- Price-feed fanout (EVM L2s): publish every 400 ms (one slot), sign a single root, and have app transactions carry the latest update package (root plus Merkle branches). Expect 50-80% savings on multi-feed verification costs with little added latency. (pyth.network)
- Proof-heavy apps (ZK coprocessors, verifiable AI): verify proofs off-chain in an AVS, then post an aggregated BLS attestation (roughly one pairing check). The settled result is readable after one block; periodically--say hourly--aggregate recursively to L1 for hard finality. This balances UX with L1 security. (blog.alignedlayer.com)
- Bridge validity checkpoints: if every header must be proven, batch headers to hit roughly 220-300k gas per verification on the EVM (Groth16-wrapped recursion). Tune batch size N against your SLO; for sub-10-second targets you’ll likely need serious GPU capacity or a smaller N. (blog.polyhedra.network)
A quick decision checklist for teams
- Target p95 latency
- < 1 s: Go for Merkle + BLS. Avoid ZK on the hot path.
- 1-30 s: Consider a hybrid approach (AVS fast attest + periodic ZK checkpoints) or use small recursive batches if you've got GPU capacity.
- > 30 s: It's totally doable to have a full ZK light client for each update; just make sure to aggregate as much as you can to keep those fees down.
- Destination Chain Realities
- When finality is just minutes (like those “finalized” Ethereum blocks), don’t count on cryptographic aggregation to speed up end-to-end latency. Instead, focus on fine-tuning your finality policy and route selection. (docs.chain.link)
- Cost/capacity constraints
- When you hit the proof-verification gas saturation, using aggregation can help cut down queuing latency by bundling everything into a single transaction--just as long as you can create that aggregated proof within your service level objective (SLO). (blog.alignedlayer.com)
- Security posture
- If your stakeholders need to verify each update, be prepared for a bit of a delay, or consider investing in real-time proving clusters. Wherever you can, try to mix and match approaches.
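The checklist above compresses into a small helper. The thresholds come straight from the bullets; the return strings are just labels, not product names.

```python
def recommend(p95_target_s: float, per_update_validity_required: bool = False) -> str:
    """Map the decision checklist to a recommendation."""
    if per_update_validity_required:
        # Security posture dominates: every update must be validity-proven.
        return "zk light client; batch headers to amortize fees"
    if p95_target_s < 1:
        return "Merkle + BLS; no ZK on the hot path"
    if p95_target_s <= 30:
        return "hybrid: AVS fast attest + periodic ZK checkpoints"
    return "full ZK per update is feasible; aggregate to cut fees"

assert recommend(0.4) == "Merkle + BLS; no ZK on the hot path"
```

In practice you would also feed in destination-chain finality (from the second bullet group): if finality alone exceeds the target, no choice of proof system rescues the SLO.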
Practical next steps (what we deploy with clients)
- Track your pipeline end to end. Measure each stage’s p50/p95--source finality, relay, proving, verification, inclusion--and base decisions on data, not gut feeling.
- Kick things off with a Merkle + BLS setup, and make sure you have a ZK checkpoint path ready that you can activate later if needed.
- If you’re diving into recursion:
- Go for hardened stacks like SP1 and R0VM, and plan out your GPU capacity. It’s a good idea to test with smaller benchmarks using your actual circuits and public inputs.
- Tap into tried-and-tested aggregation methods (like SnarkPack) and keep those public inputs to a minimum to manage calldata effectively. (eprint.iacr.org)
- Don’t forget to audit your batched verification logic. Avoid making changes on the fly without getting a review first. (zellic.io)
- When it comes to EVM:
- Lean towards BLS12‑381 precompiles (check out EIP‑2537) for aggregation and to ensure future security.
- Keep an eye on how much calldata costs (EIP‑7623); try to aggregate to cut down on bytes; if you’re working with L2s, make use of blobs where it’s available. (blog.ethereum.org)
The verdict
- Yes, aggregation can cut latency--when the bottleneck is on-chain verification throughput or mempool inclusion of bulky payloads.
- On the flip side, aggregation doesn’t automatically cut down end-to-end latency for cross-chain oracle updates; in fact, heavy recursive proving can sometimes add a few seconds to a couple of minutes. To balance out speed and cost-effectiveness, consider using signature/Merkle aggregation for the quick hits and save the recursive ZK aggregation for those occasional hard-finality checkpoints.
If your business needs updates in less than a second and wants to stay in sync with the market, get a Merkle + BLS pipeline up and running now, just like Pyth does, and you can add ZK checkpoints down the line. If your situation requires validity proofs for every single message, be ready for some extra latency or consider building a real-time proving infrastructure. And remember to aggregate those proofs to keep the verification costs under control!
References and Further Reading:
- Chainlink CCIP: execution latency and finality by chain. (docs.chain.link)
- Pyth Perseus: 400 ms slots, single-root verification, 50-80% cost reduction. (pyth.network)
- Aligned Layer: off-chain verification in milliseconds vs. minutes for recursive aggregation; gas and throughput numbers. (blog.alignedlayer.com)
- Electron Labs: Groth16 super-proof gas math, ~380k base plus ~16k per-proof access. (docs.electron.dev)
- SnarkPack: aggregation and verification performance. (eprint.iacr.org)
- Pectra / EIP-2537: BLS12-381 precompiles live--cheaper, faster, stronger aggregation. (blog.ethereum.org)
- EIP-1108: pairing costs and implications for on-chain BLS. (eips.ethereum.org)
- zkBridge: two-layer proving (deVirgo + Groth16), ~220-230k gas verification, 12-20 s demo latencies, batching trade-offs. (blog.polyhedra.network)
- Zellic: batch-verification pitfalls in gnark’s Groth16 extension. (zellic.io)
7Block Labs has got your back when it comes to prototyping both options--a low-latency Merkle/BLS oracle pipeline and a gradual rollout of ZK checkpoints. We’ll run benchmarks using your actual feeds and chains, helping you figure out the best break-even points to aim for.