
By AUJay

How Do Modern Proof Aggregation Layers Keep Latency Low When Batching Mixed Plonk and STARK Proofs?

Summary: The quickest proof aggregation stacks manage to keep latency at bay by recursively compressing STARKs, using accumulation methods to push back the Plonk verifier tasks, and bundling everything in a compact, EVM-friendly SNARK. They also strategically engineer queues, hardware, and proof types to reduce tail delays. Check out the detailed blueprint below, complete with timings, gas calculations, and operational patterns you can put to use right away.

Who this is for

  • Decision-makers looking into zk rollups, zk coprocessors, or prover networks who want reliable p95/p99 latency--not just a low average and a low gas bill.
  • Teams that are already generating both Plonk(ish) and STARK proofs and finding it tough to batch them without introducing seconds or even minutes of lag.

The real latency budget in mixed-proof aggregation

When your aggregator processes a mix of proof systems--Plonk, STARK, and Plonkish/FRI variants--end-to-end latency is driven by four main factors:

1) Batch Fill Time

  • When arrivals are roughly Poisson with rate λ, collecting N proofs takes about N/λ on average. With a time cap of T, a proof arriving inside the window waits about T/2 on average. During slow periods that wait dominates, so set up adaptive batch windows instead of sticking to a fixed N. Check out more details here.
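For a quick sanity check, here is the arithmetic in a few lines of Python (the arrival rate, batch size, and time cap below are hypothetical placeholders, not recommendations):

```python
# Back-of-the-envelope batch-fill arithmetic. All numbers are hypothetical;
# substitute your own measured arrival rates.
arrival_rate = 120.0    # proofs per second (lambda), assumed roughly Poisson
batch_size_n = 1000     # fire on N proofs ...
time_cap_t = 2.0        # ... or after T seconds, whichever comes first

expected_fill = batch_size_n / arrival_rate    # ~8.3 s to fill N at this rate
typical_wait_under_cap = time_cap_t / 2.0      # ~1.0 s average wait if T fires
effective_trigger = min(expected_fill, time_cap_t)

print(f"fill N: {expected_fill:.1f}s | wait under cap: "
      f"{typical_wait_under_cap:.1f}s | effective: {effective_trigger:.1f}s")
```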

2) Leaf Proving Times

  • STARK VMs: Right now, the leading networks break execution into smaller pieces called shards, hook them up for parallel proving, and then compress everything down recursively. Plus, some SDKs can skip local simulation, which helps cut down the time by a few minutes. Check out more details here.
  • Plonk Circuits: The time it takes to prove each circuit really varies depending on the specifics of the circuit itself. Thanks to aggregation, we can push the verification process to the end, which means that leaves won't hold each other up. You can find more in-depth info here.

3) Aggregation and Wrapping Time

When it comes to STARK→SNARK wrapping, there's typically a set overhead you'll need to consider. For instance, you might see around ~6 seconds added on when wrapping to Groth16, and it can jump to about ~70 seconds for Plonk in one of the popular stacks out there. This added time doesn’t really change based on how big your guest program is, which means it can really add up for those “tiny” updates. So, if you're concerned about latency, make sure to choose your wrappers carefully. Check out more details here.

4) On-chain inclusion

  • When you aggregate, the verification gas tends to stay pretty steady per batch. Thanks to Ethereum’s EIP‑1108, bn128 pairings got repriced to about 45k + 34k·k gas. This keeps Groth16 on BN254 as the go-to low-latency default for L1. If you’re looking at Pectra-era chains, you’ll also find BLS12‑381 precompiles available. Check it out here!
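To turn that pricing into a rough per-batch verification bill, here is a small estimator; it assumes the standard 4-pairing Groth16 check plus one scalar-mul and one addition per public input, and it ignores calldata and contract bookkeeping, so treat it as a floor rather than a quote:

```python
# Rough Groth16 verification gas on BN254 after EIP-1108 (crypto only;
# calldata, hashing and contract bookkeeping come on top of this).
PAIRING_BASE = 45_000       # base cost of the bn128 pairing precompile
PAIRING_PER_PAIR = 34_000   # per pairing in the same call
ECMUL = 6_000               # bn128 scalar multiplication
ECADD = 150                 # bn128 point addition

def groth16_verify_gas(num_public_inputs: int) -> int:
    pairing_call = PAIRING_BASE + 4 * PAIRING_PER_PAIR   # standard 4-pair check
    vk_x_build = num_public_inputs * (ECMUL + ECADD)      # one mul + add per input
    return pairing_call + vk_x_build

# One aggregate proof with a single hashed public input:
print(groth16_verify_gas(1))   # 187,150 gas before calldata/overhead
```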

What “mixed Plonk + STARK batching” looks like in practice

Modern tech stacks tend to come together in a three-layer pattern:

  • Layer A -- Leaf Proofs:

    • STARK receipts generated from a zkVM (RISC‑V/EVM).
    • Plonk proofs created from application-specific circuits (think payments, Merkle checks, and custom logic).
  • Layer B -- compression/accumulation:

    • STARKs: This involves using recursive FRI verification within a recursion circuit. Plus, there's an optional “packing” feature that helps spread out the Merkle/FRI queries over multiple receipts. You can dive deeper into this here.
    • Plonk: Here, we're looking at accumulation and aggregation methods like aPlonK and SnarkFold. These techniques help push off pairings and reduce the verifier's workload down to about O(1) or O(log n). Check it out here.
  • Layer C -- final wrapper:

    • Generate a compact Groth16 or Plonk proof attesting that “the batch verifier accepted K Plonk instances and M STARK receipts.” This keeps on-chain gas manageable and predictable. (docs.succinct.xyz)
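As a mental model, the artifacts moving between the three layers can be written down roughly like this (a sketch only; every type name here is hypothetical rather than any particular SDK's API):

```python
# Hypothetical artifact types flowing between the three layers.
# Names are illustrative only, not any particular SDK's API.
from dataclasses import dataclass

@dataclass
class StarkReceipt:             # Layer A: zkVM execution receipt (per shard/segment)
    shards: list[bytes]

@dataclass
class PlonkProof:               # Layer A: application-circuit proof
    proof: bytes
    public_inputs: list[int]

@dataclass
class CompressedStark:          # Layer B: recursively compressed (packed-FRI) STARK
    proof: bytes

@dataclass
class PlonkAccumulatorState:    # Layer B: deferred-pairing accumulator
    commitments: list[bytes]

@dataclass
class OuterProof:               # Layer C: compact Groth16/Plonk wrapper for the EVM
    proof: bytes                # ~260 bytes for Groth16 on BN254
    public_input: int           # e.g. a hash of the whole batch statement
```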

Two solid examples highlight this trend:

  • RISC Zero: You start with segment receipts, then move to STARK recursion (which gives you a succinct receipt), followed by identity_p254, and finish off with a Groth16 “receipt” for EVM. Check it out here.
  • Succinct SP1: Here, you take a shard and go through STARK recursion, with an optional Groth16/Plonk wrapper. What’s cool is that their network can run proofs in parallel across different machines, which really helps keep tail latency low. More info is available here.

Cryptographic building blocks that cut latency (without cutting corners)

1) Recursive FRI for STARKs

These recursion circuits re-verify FRI proofs inside a compact arithmetic circuit, and most of the heavy lifting happens in the Poseidon-based Merkle checks. “Packing” FRI queries (think STARKPack) amortizes the extra work needed for each additional proof, which lets you quickly bundle a batch of small zkVM segments before wrapping them in any SNARK. Check it out for more details: proxima-one.github.io.
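To see why packing matters, here is a deliberately crude cost model; it is not STARKPack's actual accounting, and the query count, tree depth, and shared fraction are made-up knobs, but it shows the shape of the amortization described above:

```python
# Deliberately crude cost model for packed vs. unpacked FRI verification.
# NOT STARKPack's real accounting: the parameters below are illustrative
# knobs, chosen only to show how per-query Merkle hashing can be shared.

def hashes_per_receipt(num_queries: int = 80, tree_depth: int = 22) -> int:
    return num_queries * tree_depth                   # Merkle openings dominate

def unpacked_cost(n_receipts: int) -> int:
    return n_receipts * hashes_per_receipt()           # each receipt pays in full

def packed_cost(n_receipts: int, shared_fraction: float = 0.8) -> float:
    per_receipt = hashes_per_receipt()
    shared = shared_fraction * per_receipt              # paid once per batch
    marginal = (1.0 - shared_fraction) * per_receipt    # paid per extra receipt
    return shared + n_receipts * marginal

for n in (1, 10, 50):
    print(n, unpacked_cost(n), round(packed_cost(n)))
```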

2) Accumulation/Folding for Plonk

  • aPlonK combines several Plonk statements under a multi-polynomial commitment, so both proof size and verification scale as O(log n). SnarkFold takes “defer the expensive checks” further: it folds multiple proofs so the heavy pairing work happens only once at the end. Both approaches cut the per-leaf costs that would otherwise slow things down. You can check out more details here.
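The pattern is easier to see as code. The toy below captures only the control flow of “accumulate cheaply now, pay the pairing once at the end”; there is no real cryptography in it, and `pairing_check` is a named placeholder for the final multi-pairing:

```python
# Toy illustration of "defer the expensive check". No real cryptography:
# pairing_check is a placeholder for the single final BN254 multi-pairing.
import secrets

def pairing_check(claims) -> bool:
    return len(claims) > 0          # placeholder for the real multi-pairing

class PlonkAccumulator:
    def __init__(self):
        self.deferred = []          # folded opening claims, no pairings yet

    def add(self, proof: bytes) -> None:
        # Cheap per-proof step: fold the proof's claim into the running
        # state with a fresh random challenge.
        challenge = secrets.randbelow(2**128)
        self.deferred.append((challenge, proof))

    def finalize(self) -> bool:
        # One multi-pairing over the combined claims, instead of one per proof.
        return pairing_check(self.deferred)

acc = PlonkAccumulator()
for p in (b"proof-1", b"proof-2", b"proof-3"):
    acc.add(p)
assert acc.finalize()
```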

3) Cross-system recursion (Plonk + STARK together)

  • If you need to verify “wrong-field” artifacts (like BN254 pairings for a Plonk/KZG proof) within a STARK recursion circuit running on Goldilocks or Stark-prime, you'll want to use non-native arithmetic. Recent developments like WARPfold and the HyperNova/ProtoStar folding techniques offer some solid strategies for juggling multiple fields and arithmetizations in a single recursive pipeline. Check it out here: (eprint.iacr.org).
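A small, self-contained illustration of why wrong-field work is expensive: a BN254 base-field element split into 64-bit limbs turns a single multiplication into a grid of limb products plus carries, which is roughly what a non-native gadget has to constrain in-circuit (this runs outside a circuit and skips the range checks a real gadget would add):

```python
# Why "wrong-field" arithmetic costs cycles: a BN254 base-field element does
# not fit in a small native field (e.g. Goldilocks), so it is split into
# limbs, and one multiplication fans out into many limb products plus
# carries. Runs outside a circuit; omits the range checks a real gadget needs.

BN254_P = 21888242871839275222246405745257275088696311157297823662689037894645226208583
LIMB_BITS, NUM_LIMBS = 64, 4      # 4 x 64-bit limbs cover the 254-bit modulus

def to_limbs(x: int) -> list[int]:
    mask = (1 << LIMB_BITS) - 1
    return [(x >> (LIMB_BITS * i)) & mask for i in range(NUM_LIMBS)]

def nonnative_mul(a: int, b: int) -> int:
    acc = 0
    for i, ai in enumerate(to_limbs(a)):
        for j, bj in enumerate(to_limbs(b)):
            # NUM_LIMBS^2 = 16 native multiplications before any carry
            # propagation or the reduction modulo BN254_P.
            acc += (ai * bj) << (LIMB_BITS * (i + j))
    return acc % BN254_P

a, b = 12345678901234567890, 98765432109876543210
assert nonnative_mul(a, b) == (a * b) % BN254_P
```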
4) A small, speedy outer SNARK

  • When it comes to the EVM today, Groth16 on BN254 is still the go-to option for keeping latency and gas costs in check, thanks to EIP‑1108. With Pectra, BLS12‑381 precompiles are now an option, giving you more flexibility for final wrappers. Pick what works best based on your deployment needs and existing SRS. (eips.ethereum.org)

5) Vector-commitment Sidekicks for “Many Openings Now”

  • When your aggregator handles a pile of commitments in one go, modern vector commitments such as FlexProofs can process all those openings in O(N) time with a tunable batch parameter b. Early benchmarks report speedups of up to 6× over older methods at N=2^16, which matters once per-proof metadata starts to bog things down. Check it out here: (arxiv.org)

Engineering the queues so batches don’t add seconds

  • Adaptive Batch Windows

    • Set up two triggers, size Nmax and time Tmax, and cut the batch whenever either one fires. Tune Nmax and Tmax per input stream (Plonk vs STARK) so neither gets starved. Use a simple Poisson approximation to model arrivals; during low traffic, lean on Tmax to keep wait times in check (a minimal dual-trigger sketch follows this list). (7blocklabs.com)
  • Split queues by work profile

    • Set up separate K8s queues for: (i) leaf proving, (ii) recursion/folding, and (iii) final SNARK wrap. It's a good idea to place recursion alongside fast NVMe for those transcript shuffles, and make sure to route things to GPU MIG slices that have plenty of memory. Some prover-network vendors really suggest explicitly isolating those recursion layers. You can check out more about this here.
  • Skip the unnecessary pre‑steps

    • For SDKs that usually go for local simulation before hitting the network, turn on “skip simulation” for those latency-sensitive flows; this little tweak can save you minutes on big traces. Just make sure to add some admission checks in your CI. (docs.succinct.xyz)
  • Pick the right wrapper for latency

    • When p95 wall-clock is a big deal for you, go for Groth16 wrapping instead of Plonk. This is especially true if Plonk wrapping tacks on about 70 seconds of fixed latency. But if you’re after a universal SRS or certain compatibility, then Plonk wrapping is the way to go. (docs.succinct.xyz)
  • Outsource parallelism when you hit single-box limits

    • Distributed prover networks let you spread the workload across multiple machines, which means that latency doesn’t just shoot up as your trace size increases. To get accurate results, make sure to benchmark on a latency-optimized endpoint instead of the default one. (docs.succinct.xyz)
  • Save your capacity for what really matters

    • Decentralized AVS-backed prover networks (like Lagrange on EigenLayer) are stepping up with top-notch operators and dedicated subnetworks. This means your batches won’t be stuck waiting in the public queue. You can also set up contracts for SLOs when you need to keep your p95 below a specific threshold. Check it out here: (prnewswire.com)
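Tying the adaptive-window idea from the top of this list into code, here is a minimal dual-trigger batcher; the Nmax/Tmax defaults are placeholders, and a production setup would run one per proof-type stream feeding the recursion and wrap queues described above:

```python
# Minimal dual-trigger batcher: cut a batch when either Nmax proofs have
# queued or Tmax seconds have passed since the first proof in the window.
# Sketch only; run one of these per proof-type stream in practice.
import queue
import time

def run_batcher(inbox: queue.Queue, emit, n_max: int = 1000, t_max: float = 2.0):
    batch, deadline = [], None
    while True:
        timeout = None if deadline is None else max(0.0, deadline - time.monotonic())
        try:
            item = inbox.get(timeout=timeout)
            if deadline is None:
                deadline = time.monotonic() + t_max   # first proof opens the window
            batch.append(item)
        except queue.Empty:
            pass                                      # Tmax trigger fired
        if batch and (len(batch) >= n_max or time.monotonic() >= deadline):
            emit(batch)                               # hand off to recursion/wrapping
            batch, deadline = [], None
```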

Low‑latency blueprints for two common “mixed” scenarios

A) Oracle/co‑processor: 1,000 Plonk micro‑proofs + 50 zkVM STARK receipts per batch

Goal: Keep it under 10-12 seconds from when the “last proof is received” to when it’s “on-chain verified.”

  • Step 1 -- Plonk accumulation:

    • When those micro-proofs start rolling in, toss them into an aPlonK/SnarkFold accumulator. This way, you can save the KZG pairings for later; the cost for adding each new Plonk proof is pretty minimal and can be done in parallel. Trigger the accumulator every Tmax = 2 seconds or once you hit 1,000 proofs. (eprint.iacr.org)
  • Step 2 -- STARK recursion/packing:

    • Pack the 50 zkVM receipts with a recursion circuit and STARKPack-style packing, so the Merkle/FRI verification work is amortized across all the receipts. Keep this in a separate queue so it doesn’t hold up the Plonk accumulation. Check out more on this over at nethermind.io.
  • Step 3 -- Final wrapper:

    • Verify (i) the Plonk accumulator state and (ii) the compressed STARK proof inside a single outer Groth16, then post that one proof--around 260 bytes--to L1/L2. Verification gas typically lands around 200-300k on the EVM today. If you're targeting a Pectra-aligned chain and your stack already uses BLS12-381, consider a BLS-based wrapper instead. (eips.ethereum.org)
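Putting the three steps together, here is a minimal orchestration sketch; the three worker functions are hypothetical stubs standing in for your prover stack, and the point is simply that Steps 1 and 2 stay off each other's critical path before the single wrap in Step 3:

```python
# Hypothetical orchestration of blueprint A's critical path. The three
# worker functions are stubs standing in for your prover stack.
from concurrent.futures import ThreadPoolExecutor

def accumulate_plonks(proofs):          # Step 1: aPlonK/SnarkFold-style folding (stub)
    return {"kind": "plonk-accumulator", "count": len(proofs)}

def pack_and_recurse(receipts):         # Step 2: FRI recursion + packing (stub)
    return {"kind": "compressed-stark", "count": len(receipts)}

def wrap_groth16(accumulator, stark):   # Step 3: single outer wrap (stub)
    return {"kind": "outer-groth16", "inner": (accumulator, stark)}

def aggregate_batch(plonk_proofs, stark_receipts):
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Steps 1 and 2 run concurrently (separate queues in practice).
        plonk_future = pool.submit(accumulate_plonks, plonk_proofs)
        stark_future = pool.submit(pack_and_recurse, stark_receipts)
        accumulator = plonk_future.result()
        compressed = stark_future.result()
    # Only the slower leg plus the wrap sits on the critical path.
    return wrap_groth16(accumulator, compressed)

print(aggregate_batch([b"p"] * 1000, [b"s"] * 50)["kind"])
```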

What to expect:

  • The outer Groth16 wrap helps skip a bunch of on-chain transactions for each proof. Teams have noted that the base on-chain costs are usually in the ballpark of a few hundred thousand gas for similar “super-proof” setups. If you do need to include per-proof checks, they can end up costing around ~16k gas each. (docs.electron.dev)

B) Rollup checkpoint: STARK zkVM state transition + external Plonk proofs (side conditions)

Goal: Keep tail latency in check while maintaining predictable L1 gas usage.

  • Step 1 -- zkVM shards → recursion:

    • Prove block execution with STARK shards, then reduce them to a compact receipt via recursive joins. This is the approach RISC Zero and SP1 take to keep proof size constant and speed up the wrapping step. Check it out here: (dev.risczero.com)
  • Step 2 -- Pull in side-condition Plonk proofs:

    • Batch them with aPlonK or SnarkFold; if their fields don’t match the zkVM’s, keep them separate until the final wrap so you avoid wrong-field arithmetic inside the recursion circuit. (eprint.iacr.org)
  • Step 3 -- Wrap once:

    • Use an outer Groth16 over BN254 (or BLS12‑381 on Pectra chains) to check both the recursion result and the Plonk accumulator. Steer clear of multi-wrap patterns, like wrapping each receipt individually, since each wrap adds fixed seconds. (docs.succinct.xyz)

A note on real-time targets:

  • In 2025, a demo from a production prover network knocked out most Ethereum L1 blocks in about 10-12 seconds using a hefty GPU cluster. This really shows what you can pull off when you parallelize everything from start to finish. If you’re aiming for sub-12 seconds on the 95th percentile, make sure you plan for distributed proving right from the get-go. (theblock.co)

On‑chain verification design: gas and curves you should plan for

  • BN254 today, BLS12‑381 now an option

    • BN254 is still the go-to option for being budget-friendly, thanks to EIP‑1108. Essentially, the cost of your Groth16 verifier depends on how many pairings you need, plus a little extra fixed charge. But with Pectra rolling out on May 7, 2025, those BLS12‑381 precompiles are here, making BLS-based wrappers super practical--especially if you’re already using BLS curves in your setup. It’s smart to keep verifiers upgradable (with timelocks) so you can easily swap curves without having to redo your entire rollup. (eips.ethereum.org)
  • Keep public inputs small

    • Every public signal adds verifier gas, so hash large data (like Merkle roots or accumulators) into a single field element whenever you can. This tip works for both Groth16 and Plonk verifiers (a minimal sketch follows this list).
  • When Aggregation Stays in SNARK‑Land

    • If you’re just aggregating Groth16/Plonk user proofs on-chain, you can count on a steady base verification cost along with some minor inclusion checks for each proof, instead of diving into per-proof pairings. This approach tends to be cheaper overall and comes with lower latency compared to verifying each proof one by one. Check it out here: (docs.electron.dev)
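The “keep public inputs small” tip flagged above boils down to a few lines; the sketch below uses sha256 because it is in the Python standard library, whereas on-chain you would more likely keccak256 the same bytes and apply the same reduction inside the contract:

```python
# Collapse many public signals into one field element so the verifier pays
# for a single public input. sha256 stands in for whatever hash your
# contract actually uses; the reduction modulus is the BN254 scalar field.
import hashlib

BN254_SCALAR_FIELD = 21888242871839275222246405745257275088548364400416034343698204186575808495617

def public_input_digest(state_root: bytes, accumulator_commitment: bytes,
                        epoch: int) -> int:
    preimage = state_root + accumulator_commitment + epoch.to_bytes(8, "big")
    digest = hashlib.sha256(preimage).digest()
    return int.from_bytes(digest, "big") % BN254_SCALAR_FIELD  # fits one input slot

print(hex(public_input_digest(b"\x11" * 32, b"\x22" * 32, epoch=42)))
```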

Implementation details that matter for latency (and get missed)

  • “Wrong-field” arithmetic is a real latency issue

    • Checking BN254 pairings within a STARK recursion that's over a different field can really eat up cycles. If you find yourself in this situation, consider using dedicated non-native gadgets (like those “wrong-field” bigints in Halo2) or WARPfold-style folding to keep your costs manageable--and make sure to plan for the time it’ll take. (eprint.iacr.org)
  • Align hash and field choices whenever you can

    • Aligning hash functions and base fields across the Plonkish and STARK pipelines cuts down on translator costs in recursion circuits: fewer non-native operations and cheaper Merkle verifiers. The Plonky2/FRI recursion profile shows that Merkle hashing dominates the cost, so don’t add unnecessary differences. (proxima-one.github.io)
  • Pre-warm CRS/SRS and verifying keys

    • To keep cold-start spikes at bay, make sure to cache SRS and VKs across your aggregator instances. It’s a good idea to pin VK hashes in your on-chain verifiers. Most production setups suggest using version attestation and pinned verifiers to avoid those pesky accidental upgrades that can really spike latency. (7blocklabs.com)
  • Go for “compressed/recursion‑friendly” proof flavors

    • Certain zkVMs need a specific compressed proof type for in‑circuit verification and aggregation. Sticking to the recommended flavor can help you sidestep any unexpected recursion hiccups and retries. (docs.succinct.xyz)

Operating with a prover network: keeping p95 low in the wild

  • Reserve capacity and co-locate queues

    • Reach out to vendors for reserved lanes and SLAs. Try to place recursion and wrapping jobs alongside high-bandwidth NVMe/GPU setups to cut down on queueing and I/O overheads. (docs.succinct.xyz)
  • Decentralize for liveness and burst absorption

    • Networks like Lagrange, backed by AVS, have a bunch of independent operators--think Coinbase, OKX, Nethermind, and others. They set up subnetworks with their own dedicated bandwidth, making it easy to handle sudden traffic spikes without cranking up latency. This really helps speed up batch completion times when things get busy. (prnewswire.com)
  • Adjust the SDK flags correctly

    • Go for “network” proving modes instead of local proving. Make sure to check the environment flags so that requests are spread out across different machines, rather than being confined to a single GPU. (docs.succinct.xyz)

Brief deep dive: how a single outer proof can attest to mixed Plonk + STARK batches

  • Inside the outer circuit (Groth16 or Plonk):
    • Verify a Plonk accumulator:

      • You start with an aPlonK/SnarkFold accumulator state, which includes commitments and linear-check witnesses. The goal is to prove that it corresponds to k underlying Plonk proofs without re-running all the pairings; the final multi-pairing check is performed only once. Check it out here: (eprint.iacr.org)
    • Verify a compressed STARK:

      • Here, you run a recursion verifier gadget that checks FRI or “packed” FRI openings (STARKPack-style), along with Merkle paths for m receipts, using the same hash and field choices tuned for recursion. You can read more about it at (nethermind.io).
    • Enforce cross-batch invariants:

      • This step involves checking that both the Plonk accumulator and the STARK batch are committed to the same global state root or epoch. After that, you emit one neat proof for on-chain verification. The gas costs are pretty stable, almost constant in m+k. For more details, check out (eips.ethereum.org).
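Condensing the three checks into one statement, here is a shape-only sketch of what the outer circuit proves; both inner verifiers are placeholders for real in-circuit gadgets, and only the structure of the checks is meant to be accurate:

```python
# Shape-only sketch of the outer-circuit statement. The two "verify_*"
# functions are placeholders for real in-circuit gadgets.

def verify_plonk_accumulator(acc) -> bool:
    return True          # placeholder: one final multi-pairing in reality

def verify_compressed_stark(batch) -> bool:
    return True          # placeholder: packed-FRI/Merkle checks in reality

def outer_statement(plonk_acc: dict, stark_batch: dict,
                    claimed_state_root: bytes) -> bool:
    # (1) the accumulated Plonk claims check out (pairings paid once)
    if not verify_plonk_accumulator(plonk_acc):
        return False
    # (2) the compressed STARK batch checks out (packed FRI + Merkle paths)
    if not verify_compressed_stark(stark_batch):
        return False
    # (3) cross-batch invariant: both sides commit to the same root/epoch
    return (plonk_acc["state_root"] == claimed_state_root
            and stark_batch["state_root"] == claimed_state_root)
```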

What “good” looks like in 2026: targets and sanity checks

  • Batch assembly: Get those adaptive N/T windows rolling for each stream, and keep that median wait time at or below T/2 when the load's light. (7blocklabs.com)
  • Recursion/packing time: You’re looking at sub‑second costs for each extra receipt if your FRI recursion is nicely tuned. Just keep in mind that the total is mainly influenced by Merkle hashing, so choose your hash and field combos wisely! (proxima-one.github.io)
  • Wrapper choice: Go for Groth16 when that p95 really counts; only consider Plonk if the SRS or compatibility story really calls for it. Just a heads-up--there’s a big difference in wrap overheads, with one major stack reporting ~6s versus ~70s. (docs.succinct.xyz)
  • On‑chain verify: Aim for about 200-300k gas per aggregate on EVM as of now. With Pectra chains, keep an eye on the BLS12‑381 options--they might change some of those exact numbers, so it's smart to prototype both. (eips.ethereum.org)
  • Prover network: Measure using a latency-optimized endpoint to get a better picture. Distributing proving should help flatten that size-latency curve. Real-time block-level proving demonstrations are hitting around 10-12s with big GPU clusters, which is a solid benchmark for your p95 design. (docs.succinct.xyz)

Emerging best practices we recommend to clients

  • Kick things off with STARK and wrap it up with SNARK (all on Ethereum): first, recurse and pack those STARK receipts, then give it a Groth16 wrap for that sweet predictable gas cost and compact calldata. Check it out here.
  • Keep those public inputs as light as possible: instead of dragging bulky statements around, hash them into a single field element before you verify; trust me, it saves on gas and reduces those pesky encoding delays.
  • Don’t go crazy SNARK-wrapping a bunch of medium receipts on their own: always recurse first! Wrapping adds a fixed penalty per proof that can really stack up. More details here.
  • Steer clear of wrong-field work unless you absolutely have to: if possible, verify those Plonk accumulators in the outer SNARK rather than inside the STARK recursion circuit. Only pull out WARPfold-style techniques if the requirements really make you mix fields. You can read more about it here.
  • Don’t forget to reserve some capacity: if you’re aiming for strict service level objectives, blend a latency-optimized private lane on a decentralized prover network with pinned verifiers and version attestation for the best results. Find out more here.

Final take

If you're looking to combine Plonk and STARK proofs and keep latency low, here's the winning strategy: start by recursing and packing your STARKs early. Then, you can fold and accumulate your Plonk proofs away from the critical path. Finally, wrap everything up with a compact Groth16 or Plonk outer proof. Don't forget to optimize your queues and hardware for that long tail!

With the tools coming out in 2025-2026--like recursive FRI, aPlonK/SnarkFold, STARKPack, Pectra’s BLS12‑381 precompiles, and well-established prover networks--you'll be able to hit a p95 latency in the low seconds to just under a dozen seconds for decent batch sizes, all without losing verifiability on L1. Check out more details here.


References (selected)

  • Succinct SP1 docs: recursion, wrappers, network usage, and latency figures (docs.succinct.xyz).
  • RISC Zero: recursion and the STARK‑to‑SNARK pipeline (dev.risczero.com).
  • STARKPack (Nethermind): FRI‑based proof aggregation (nethermind.io).
  • aPlonK and SnarkFold: Plonk proof aggregation (eprint.iacr.org).
  • EIP‑1108 (bn128 repricing) and the Pectra BLS12‑381 precompiles (eips.ethereum.org).
  • Lagrange ZK Prover Network: EigenLayer, operators, and subnetworks (prnewswire.com).
  • Plonky2 recursion profile: Merkle/FRI hashing dominance (proxima-one.github.io).
  • Real‑time proving milestone, for context on latency ceilings (theblock.co).

