By AUJay
7Block’s SLA Standards for Enterprise Maintenance Retainers
Enterprise
If your business needs to keep up with enterprise compliance expectations, there are a few frameworks and concepts worth getting familiar with before we talk SLAs. Here's the rundown:
SOC 2
SOC 2 is focused on how companies that handle customer data keep that data safe. To achieve SOC 2 compliance, you have to show that your systems meet the Trust Services Criteria: security, availability, processing integrity, confidentiality, and privacy. Enterprise customers routinely look for a SOC 2 report when deciding whether to trust your services.
ISO 27001
ISO 27001 takes a broader view of information security. It is an international standard for running an information security management system in an organized, repeatable way. Certification demonstrates that you have a solid system for protecting sensitive data and that you regularly review and improve your security practices.
RPO/RTO
Understanding RPO (Recovery Point Objective) and RTO (Recovery Time Objective) is essential for disaster-recovery planning.
RPO defines how much data you can afford to lose if something goes wrong. If your RPO is 4 hours, you need to back up or replicate your data at least every 4 hours. RTO defines how quickly you must restore systems after an incident. If your RTO is 2 hours, you need failover and recovery mechanisms that can get you back online within that window.
Balancing these two metrics is what keeps downtime and data loss within acceptable limits.
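To make those targets concrete, here is a minimal TypeScript sketch (the constants, dates, and function names are hypothetical) that turns an RPO into a backup-freshness check and an RTO into a recovery-time check:

```typescript
// Hypothetical disaster-recovery targets, expressed in minutes.
const RPO_MINUTES = 240; // at most 4 hours of data loss is tolerable
const RTO_MINUTES = 120; // systems must be restored within 2 hours

// How many minutes of writes would be lost if a failure happened right now?
function dataLossExposureMinutes(lastBackupAt: Date, now: Date = new Date()): number {
  return (now.getTime() - lastBackupAt.getTime()) / 60_000;
}

// Did an actual recovery stay inside the RTO?
function rtoMet(outageStart: Date, restoredAt: Date): boolean {
  return (restoredAt.getTime() - outageStart.getTime()) / 60_000 <= RTO_MINUTES;
}

const exposure = dataLossExposureMinutes(
  new Date("2024-06-01T08:00:00Z"),
  new Date("2024-06-01T13:30:00Z"),
);
if (exposure > RPO_MINUTES) {
  console.warn(`Backup is ${exposure.toFixed(0)} min old; RPO target is ${RPO_MINUTES} min`);
}
```

Wiring checks like these into monitoring is what turns RPO/RTO from a policy statement into something you can alert on.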
SIEM
A Security Information and Event Management (SIEM) system centralizes security events and incidents as they happen. SIEM tooling aggregates and correlates alerts from your applications and network devices, making it easier to spot threats and act on them before they escalate.
SLA Credits
Service Level Agreements (SLAs) usually include a section on SLA credits: compensation owed to you when the service misses the agreed performance targets. Understanding how these credits work, and when you can actually claim them, matters when you negotiate a retainer.
Procurement
When procuring services or tools against these standards, ask the right questions: how well does the vendor meet your compliance requirements, and does that fit your organization's needs? A disciplined procurement process is what keeps you working with reliable vendors.
Audit Readiness
Being audit-ready matters. Review your processes and documentation regularly so you are prepared when audits come around. Keeping records tidy and current not only keeps you compliant, it strengthens your organization's reputation and makes audit season far smoother.
Keep these areas in view and you will strengthen your security posture and earn long-term customer trust.
The specific headache you already feel
- You are promising "99.95% uptime" on a production dApp, but your dependencies (AWS regions, RPC providers, L2 blob fees, ZK provers) all have different service models and can change quickly, especially now that EIP-4844 has shipped. Finance wants a simple ROI model, Security wants everything mapped to SOC 2, and Engineering wants SLOs that do not slow down deployment.
Your smart contracts are upgradeable (UUPS or Transparent proxies), so Operations needs a clear, documented playbook for P1 patches, hotfix timelocks, and pause-and-monitor controls that satisfies both change-management auditors and the legal team.
Procurement wants enforceable service credits, but your incident taxonomy (SEV1 through SEV3), error budgets, and vendor dependencies do not line up cleanly. If a blob-fee spike or an RPC outage hits during a launch window, who absorbs the cost, and what does your mean time to recovery (MTTR) look like?
What’s at risk if you don’t fix it
- Missed launch and renewal deadlines: if you are not tracking your error budget, or your severity levels are fuzzy, you can get stuck in freeze windows that delay revenue from new products and features. Google SRE uses error budgets to balance reliability against release velocity: if an incident burns more than roughly 20% of the budget within four weeks, releases pause until it is sorted out. Without a process like that, launching new features becomes guesswork. (sre.google)
- Cost and compliance risks: in 2024 the average cost of a U.S. data breach was $9.36M (global average: $4.88M). If audits flag gaps in patching or change controls, that hits both your wallet and your renewal risk. (cfo.com)
- PCI DSS 4.0 patch SLAs: critical patches must be applied within 30 days, while high-severity patches, previously under a blanket 30-day deadline, are now governed by risk ranking. That will catch out anyone with weak DevSecOps processes: if you cannot map a CVE to its patch within 30 days for critical issues, your assessments will suffer. (secureframe.com)
- Actively exploited vulnerabilities: under U.S. federal guidance (CISA's KEV catalog), post-2021 CVEs must be remediated within two weeks. Even if you are not federally regulated, your board or insurers may expect the same cadence. (cisa.gov)
- Multi-dependency reliability math does not add up: the AWS EC2 region-level SLA is 99.99%, and it only applies when you run across at least two Availability Zones; a single-instance deployment is covered at 99.5%. If your RPC and prover fleet are not accounted for, you can hit "four nines" on paper while operating at "two nines" in practice (see the sketch after this list for the math). (aws.amazon.com)
- Node providers tout "99.9%" uptime, but actual uptimes vary over a 90-day window; Infura's public status page shows component uptimes ranging from 99.58% to 100%. If you do not diversify providers and watch p95 RPC latency and error rates, your real SLOs can dip below what you have agreed to. (status.infura.io)
- Post-EIP-4844 volatility: blobs have sharply lowered L2 posting costs, but when non-L2 "blobscription" demand surges, blob base fees can spike quickly. In the early days, blob gas prices briefly reached hundreds of gwei, which upended cost assumptions. If your batcher cannot switch between calldata and blobs on demand, your "cost SLO" blows up exactly when traffic peaks. (blocknative.com)
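To make the multi-dependency reliability point concrete, here is an illustrative TypeScript calculation; the availability figures are examples rather than any vendor's actual SLA, and the model assumes independent, serial failures:

```typescript
// Serial dependencies: composite availability is roughly the product of each
// dependency's availability, assuming failures are independent.
const dependencies = {
  cloudMultiAz: 0.9999, // e.g. a region-level SLA met with >= 2 AZs
  rpcProvider: 0.999,   // illustrative node-provider figure
  zkProver: 0.999,      // illustrative prover-fleet figure
};

const composite = Object.values(dependencies).reduce((acc, a) => acc * a, 1);
const monthMinutes = 30.44 * 24 * 60; // average month, ~43,834 minutes

console.log(`Composite availability: ${(composite * 100).toFixed(3)}%`);
console.log(`Expected downtime: ${((1 - composite) * monthMinutes).toFixed(1)} min/month`);
// => roughly 99.79%, i.e. about 92 minutes/month, well short of "four nines".
```

Swap in the SLA figures for your actual critical path and the gap between paper and practice becomes obvious.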
The bottom line: without enforceable, dependency-aware SLOs tied to incident severity, patch windows, and cost controls, claims of "99.95% uptime" and "SOC 2-ready" can turn into significant legal and financial exposure.
7Block’s technical-but-pragmatic SLA methodology
We build SLAs that work in production and hold up under audit. In practice that means measurable SLOs, error budgets, incident rituals, and procurement clauses designed around your actual stack: Solidity contracts, rollup posting and batching, RPC providers, provers, and cloud. We map everything back to SOC 2 and ISO 27001 controls, track DORA metrics, and back it all with straightforward service credits.
1) SLOs that cascade from user experience to chain dependencies
- Availability SLOs (monthly):
- Tier A (customer-facing): 99.95% (about 21.9 minutes of downtime budget per month).
- Tier B (admin rails/analytics): 99.9% (about 43.8 minutes per month).
- Error budget policy: if more than 20% of the error budget burns in four weeks, we freeze feature rollouts and run a P0 postmortem; if more than 20% burns on the same class of outage within a quarter, it becomes a P0 item in the quarterly plan (see the sketch below). (sre.google)
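Here is a minimal sketch of the error-budget accounting behind that policy, assuming a 99.95% monthly SLO and hypothetical incident durations:

```typescript
// Error-budget accounting for a 99.95% monthly availability SLO.
const SLO = 0.9995;
const MONTH_MINUTES = 30.44 * 24 * 60;                // ~43,834 minutes
const errorBudgetMinutes = (1 - SLO) * MONTH_MINUTES; // ~21.9 minutes

// Hypothetical downtime (in minutes) logged over the trailing four weeks.
const incidentMinutes = [6, 3];
const burned = incidentMinutes.reduce((a, b) => a + b, 0);
const burnRatio = burned / errorBudgetMinutes;

if (burnRatio > 0.2) {
  console.warn(
    `Error budget ${Math.round(burnRatio * 100)}% consumed in 4 weeks: ` +
      `freeze feature rollouts and open a P0 postmortem.`,
  );
}
```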
- Performance SLOs:
We target p95 RPC latency of 300 ms or less for reads and 800 ms or less for writes, measured per chain and per provider. If block height falls more than 2 blocks behind L1/L2 for over 60 seconds, we automatically fail over to another provider (sketch below). We keep ZK prover queue time at or below 3 minutes at p95, backed by autoscaling and job-aging alerts.
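An illustrative sketch of that failover decision, using the thresholds above; the ProviderHealth shape and probe values are hypothetical stand-ins for whatever your health checker reports:

```typescript
// Snapshot produced by a (hypothetical) per-provider health probe.
interface ProviderHealth {
  name: string;
  p95ReadLatencyMs: number;
  p95WriteLatencyMs: number;
  blocksBehindHead: number;
  lagDurationSec: number;
}

// True when a provider violates the SLO thresholds and traffic should shift
// to the standby provider.
function shouldFailover(h: ProviderHealth): boolean {
  const latencyBreach = h.p95ReadLatencyMs > 300 || h.p95WriteLatencyMs > 800;
  const lagBreach = h.blocksBehindHead > 2 && h.lagDurationSec > 60;
  return latencyBreach || lagBreach;
}

const primary: ProviderHealth = {
  name: "primary-rpc",
  p95ReadLatencyMs: 280,
  p95WriteLatencyMs: 900, // write-latency breach
  blocksBehindHead: 1,
  lagDurationSec: 0,
};
if (shouldFailover(primary)) console.log(`Routing traffic away from ${primary.name}`);
```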
- Data integrity SLOs:
We set up state-divergence alarms using a multi-client setup (Geth and Nethermind) and perform quorum reads before critical writes. Metrics are collected with Prometheus and Grafana via the documented endpoints for both clients (sketch below). (geth.ethereum.org)
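A minimal quorum-read sketch using ethers v6 against two hypothetical client endpoints; a production version would compare more state than a single balance and feed divergences into alerting rather than simply throwing:

```typescript
import { ethers } from "ethers";

// Two independent execution clients (placeholder URLs for Geth and Nethermind).
const providers = [
  new ethers.JsonRpcProvider("http://geth:8545"),
  new ethers.JsonRpcProvider("http://nethermind:8545"),
];

// Quorum read: only trust a value when both clients agree; otherwise raise a
// state-divergence alarm and block writes that depend on it.
async function quorumGetBalance(address: string): Promise<bigint> {
  const [a, b] = await Promise.all(providers.map((p) => p.getBalance(address)));
  if (a !== b) {
    throw new Error(`State divergence for ${address}: ${a} vs ${b}`);
  }
  return a;
}
```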
- Cost SLOs (post‑4844):
Our batchers prefer blob posting, but if the blob base fee spikes to roughly 10x the execution base fee, we switch to calldata so L1 posting costs stay bounded during blob congestion (sketch below). (blocknative.com)
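A simplified sketch of that switching rule; the FeeSnapshot shape, fee values, and the 10x multiple are illustrative rather than the exact production logic:

```typescript
// Fee snapshot pulled from the batcher's fee oracle (hypothetical shape).
interface FeeSnapshot {
  blobBaseFee: bigint;      // wei per blob-gas
  executionBaseFee: bigint; // wei per execution-gas
}

type PostingMode = "blob" | "calldata";

// Switch to calldata once blob gas is ~10x more expensive than execution gas.
const SWITCH_MULTIPLE = 10n;

function choosePostingMode(fees: FeeSnapshot): PostingMode {
  return fees.blobBaseFee >= fees.executionBaseFee * SWITCH_MULTIPLE ? "calldata" : "blob";
}

// Example: a blobscription-style spike (650 gwei blob fee vs 20 gwei execution fee).
const mode = choosePostingMode({
  blobBaseFee: 650_000_000_000n,
  executionBaseFee: 20_000_000_000n,
});
console.log(mode); // "calldata"
```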
Making Metrics Observable
We scrape Geth and Nethermind metrics from the /debug/metrics/prometheus endpoint and surface them on our standard dashboards.
Custom panels track mempool health, peer counts, block import times, p95 RPC latency, and a side-by-side view of blob versus calldata spend (a sample query is sketched below).
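As an example of how one of those panels is fed, here is a hedged sketch that queries Prometheus' HTTP API for p95 RPC latency; the metric name rpc_request_duration_seconds_bucket and the provider label are placeholders for whatever your RPC proxy or exporter actually emits:

```typescript
// Query Prometheus for p95 RPC latency over the last 5 minutes, per provider.
const PROM_URL = "http://prometheus:9090/api/v1/query";
const promql =
  'histogram_quantile(0.95, sum(rate(rpc_request_duration_seconds_bucket[5m])) by (le, provider))';

async function p95RpcLatencySeconds(): Promise<Record<string, number>> {
  const res = await fetch(`${PROM_URL}?query=${encodeURIComponent(promql)}`);
  const body = await res.json();
  const out: Record<string, number> = {};
  for (const series of body.data.result) {
    out[series.metric.provider] = Number(series.value[1]); // value = [timestamp, "number"]
  }
  return out;
}
```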
2) Incident management that procurement and auditors accept
- A severity ladder that lines up with industry practice and accounts for vendor risk:
- SEV1: total outage, data loss, or a critical feature broken for all users.
- SEV2: significant degradation, or an outage affecting a subset of users.
- SEV3: minor issues with a known workaround. Paging rules, communication cadence (typically updates every 30-60 minutes), and RCA deadlines all scale with severity. (atlassian.com)
- SLA targets: for P1 incidents we acknowledge within 15 minutes, begin mitigation within one hour, and deliver the root cause analysis, with action items, within five business days (encoded below).
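Encoding the severity targets as data keeps paging rules, dashboards, and SLA reporting in sync. The SEV1 numbers below match the targets above; the SEV2/SEV3 values are illustrative placeholders:

```typescript
type Severity = "SEV1" | "SEV2" | "SEV3";

interface ResponseTargets {
  acknowledgeMinutes: number;     // time to acknowledge the page
  mitigationStartMinutes: number; // time to begin mitigation
  updateCadenceMinutes: number;   // stakeholder update interval (0 = as needed)
  rcaBusinessDays: number;        // root cause analysis deadline (0 = optional)
}

const SLA_TARGETS: Record<Severity, ResponseTargets> = {
  SEV1: { acknowledgeMinutes: 15, mitigationStartMinutes: 60, updateCadenceMinutes: 30, rcaBusinessDays: 5 },
  SEV2: { acknowledgeMinutes: 30, mitigationStartMinutes: 240, updateCadenceMinutes: 60, rcaBusinessDays: 10 },  // illustrative
  SEV3: { acknowledgeMinutes: 240, mitigationStartMinutes: 1440, updateCadenceMinutes: 0, rcaBusinessDays: 0 }, // illustrative
};
```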
- DORA-aligned ops metrics in the appendix: lead time, deployment frequency, failed-deployment recovery time, and change failure rate, plus the 2024 addition of deployment rework rate. These figures drive our quarterly reliability objectives and error-budget reviews. (dora.dev)
3) Security, change, and patch SLAs that pass SOC 2 and ISO 27001 scrutiny
- Change management for upgradeable contracts: for UUPS/Transparent proxies we enforce owner/ProxyAdmin governance, role-based access on _authorizeUpgrade, timelocks for non-urgent updates, and a "break-glass" Pausable flow for emergencies. We rely on the OpenZeppelin Upgrades plugins for upgrade-safety checks and contract registry management (a pre-upgrade gate is sketched below).
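A minimal sketch of that pre-upgrade gate as a Hardhat script, assuming the @openzeppelin/hardhat-upgrades plugin is configured; the proxy address and the TokenV2 contract name are placeholders:

```typescript
import { ethers, upgrades } from "hardhat";

async function main() {
  const PROXY = "0xYourProxyAddress"; // placeholder
  const NewImpl = await ethers.getContractFactory("TokenV2");

  // prepareUpgrade runs the plugin's storage-layout and proxy-safety checks and
  // deploys the new implementation without touching the proxy itself.
  const implAddress = await upgrades.prepareUpgrade(PROXY, NewImpl, { kind: "uups" });

  console.log(`Implementation ${implAddress} validated; queue upgradeTo() through the timelock.`);
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```

The actual proxy call then goes through the timelock (or the break-glass path), not from this script.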
- Patch Timelines:
For critical vulnerabilities (PCI DSS 4.0.1, requirement 6.3.3) we remediate within 30 days of release. High-risk issues get documented exceptions or compensating controls. KEV-listed exploited CVEs are patched within two weeks. We track CVSS, maintain an asset inventory that includes open-source components, and tie every fix to its change ticket (deadline logic sketched below).
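A small sketch of how those windows can be enforced in tooling; the classification names and the example date are illustrative:

```typescript
// Remediation windows, in days, for the policy described above.
type PatchClass = "critical" | "kev-exploited";

const DEADLINE_DAYS: Record<PatchClass, number> = {
  critical: 30,        // PCI DSS 4.0.1 req. 6.3.3 window for critical patches
  "kev-exploited": 14, // KEV-listed, actively exploited CVEs
};

function remediationDueDate(published: Date, cls: PatchClass): Date {
  const due = new Date(published);
  due.setDate(due.getDate() + DEADLINE_DAYS[cls]);
  return due;
}

const due = remediationDueDate(new Date("2024-05-01"), "kev-exploited");
console.log(`Patch due by ${due.toISOString().slice(0, 10)}`); // 2024-05-15
```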
- Audit Alignment:
Our processes map to SOC 2 Type II coverage of incident response, change management, logging, and access, and to ISO/IEC 27001:2022 controls, particularly the incident-management controls in Annex A 5 and event reporting in Annex A 6.8. We also assemble your evidence binder and run tabletop exercises so you are ready before the auditors arrive.
- RPO/RTO Policy:
Our Recovery Point Objective (RPO) is 15 minutes or less for critical transactional data, and for SEV1 incidents we target a Recovery Time Objective (RTO) of one hour or less, achieved through multi-AZ failover and hot-standby RPC/provider routing.
4) Multi-provider, multi-AZ architecture with automatic failover
- Cloud: we build on AWS against the 99.99% region-level SLA by running stateful components across at least two Availability Zones; single-instance workloads are kept off the critical path. (aws.amazon.com)
- RPC: we run two providers (for example Infura plus QuickNode or Alchemy), track their health and chain-head lag, and only route to providers meeting our p95 latency and error benchmarks; failover happens in seconds. Public status pages give a rough picture, but our own SLO probes are the source of truth. (status.infura.io)
- Keeping an Eye & Responding: We’ve got our backs covered with OpenZeppelin Monitor and Forta to keep track of everything on-chain. These tools send us alerts for stuff like admin changes, big approvals, and any unusual transfer activities. It's like having a watchful eye on our operations! We've hooked these up with Opsgenie and PagerDuty, plus we've added some cool auto-response playbooks. With these, we can pause things, limit the rate, or even kick in circuit breakers whenever necessary. It's all about keeping everything running smoothly! Hey, just a heads up! OpenZeppelin is planning to wrap up their Defender SaaS by July 1, 2026. But no worries--we'll be moving our clients over to their open-source Monitor/Relayer well before that deadline hit. If you want to dive into the details, just click here. Happy exploring!
5) Cost governance for post‑4844 rollup operations
We build batchers that choose between blobs and calldata based on the current blob base fee versus the execution base fee, with guardrails so we do not pay for nearly empty blobs during congestion. During the early "blobscription" peaks, blobs were usually still cheaper than calldata for L2s, but there were exceptions, especially for small, inefficient payloads, so we enforce payload thresholds and a switching policy. (blocknative.com)
- Governance: dashboards show L1 posting cost per batch and per L2, alongside savings versus pre-EIP-4844 baselines. If blob fees stay above target, we adjust the posting cadence to keep monthly costs in check. The Ethereum Foundation's Dencun/EIP-4844 announcement set expectations for L2 fee reductions; we layer our own alerts on top to track blob-market shifts. (blog.ethereum.org)
1) Tackling a SEV1 for L2 Posting During Blob Congestion
- Situation: at launch we were hit with heavy traffic and blob fees rose. The batcher detected blob base fees spiking above 10x the execution base fee and switched to calldata for about 30 minutes, keeping throughput steady without letting costs run away. Blocknative observed blob base fees reaching roughly 650 Gwei during the first blobscription event; our guardrail also prevents paying for a 128 KiB blob when the payload is only 1-2 KiB. (blocknative.com)
- Ops outcome: SEV1 communications went out within 15 minutes and mitigation landed in under an hour, so all SLOs held. Finance now gets a "max cost per batch" graph, so there are no surprise charges.
2) Emergency Upgrade Path for an Access-Control Bug
- Situation: a UUPS-upgradeable contract ships with a role misconfiguration on the mint() function.
- Response: a Forta-style detection bot spots the issue and pages the on-call team. We execute a pre-approved "pause and access fix" runbook, then propose an upgrade validated by the OpenZeppelin Upgrades checks. If it is not urgent, the change goes through the timelock; if funds are at risk, the break-glass policy applies. OpenZeppelin's UUPS pattern and proxy-admin guidance keep us clear of upgrade lockouts, with _authorizeUpgrade properly restricted (see the OpenZeppelin docs).
- Audit trail: change requests, test artifacts, sign-offs, and a 5-business-day RCA, covering SOC 2's change and incident controls.
3) PCI‑impacted Consumer Payments dApp (patch SLAs)
- Situation: a critical CVE in a transitive open-source dependency affects our PCI DSS 4.0 scope. Critical patches must land within 30 days, and CISA KEV flags active exploitation of a related CVE, so the real target is a two-week turnaround. (cisa.gov)
- Response: we map the SBOM to the affected repository, ship a hotfix through blue/green canary testing, and have Defender/Monitor verify that function invariants still hold after deployment. Procurement gets the timelines and supporting evidence it needs for the QSA review.
4) Observability Drill -- Proving SLOs Aren’t Just Aspirational
- Setup: Geth and Nethermind nodes export metrics, and Prometheus scrapes them from the /debug/metrics/prometheus endpoint. In Grafana we watch p95 RPC latency, block import time, peer count, and chain-head lag. We then run a simulated RPC latency spike and confirm that automatic failover kicks in within the SLO. For the flags and endpoints worth standardizing on, see the Geth documentation. (geth.ethereum.org)
What “good” looks like in your contract (ready for legal/procurement)
- Availability & performance
Tier A endpoints commit to 99.95% monthly uptime, measured in 15-minute windows; windows below target count toward service credits. We track p95 RPC latency thresholds per chain and provider, and chain-head lag of more than 2 blocks lasting over 60 seconds counts as a fault.
- Security & patching
Critical patches are applied within 30 days, and KEV-listed exploited CVEs within 14 days. We run quarterly vulnerability scans and maintain an up-to-date Software Bill of Materials (SBOM). (secureframe.com)
- Incident response
For SEV1 incidents we acknowledge within 15 minutes and post hourly updates until resolution, with a root cause analysis in 5 business days covering action items and error-budget impact.
- Change management (upgradeable contracts)
Upgrades run through UUPS/Transparent proxies under role-based access control, with OZ Upgrades safety checks and timelocks; emergency pause procedures are tested quarterly. (docs.openzeppelin.com)
- Service credits
Monthly availability below 99.95% but at or above 99.0% earns a 10% credit on monthly fees; below 99.0%, the credit rises to 30%. This mirrors familiar cloud credit schedules, so vendor finance teams can process it without friction (sketch below). (aws.amazon.com)
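For clarity, here is that credit schedule as a simple function; a sketch of the tiers above, not contract language:

```typescript
// Monthly service credit (as a percentage of monthly fees) for a given
// measured availability.
function serviceCreditPct(monthlyAvailability: number): number {
  if (monthlyAvailability >= 0.9995) return 0; // SLO met, no credit
  if (monthlyAvailability >= 0.99) return 10;  // below 99.95% but at least 99.0%
  return 30;                                   // below 99.0%
}

console.log(serviceCreditPct(0.9982)); // 10
console.log(serviceCreditPct(0.985));  // 30
```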
We fold these terms into your MSA/SOW, along with a RACI and a reporting cadence, so your vendor risk team can sign off quickly.
GTM metrics that translate to ROI
- Uptime reality check:
On AWS, EC2 with multi-AZ gives you the 99.99% region-level SLA versus 99.5% for a single instance. If any part of your critical path depends on a single instance, a "four nines" target is a stretch; that is why we design multi-AZ, multi-provider configurations that can actually support the SLOs we commit to. (aws.amazon.com)
- Cost wins after Dencun, with guardrails: the Ethereum Foundation activated Dencun/EIP-4844 on mainnet on March 13, 2024, moving L2 transactions to blobs and lowering fees. Our switching logic and payload thresholds have kept monthly L1 costs steady even through the big blob spikes. (blog.ethereum.org)
- DevOps performance evidence:
The 2024 DORA update added deployment rework rate. We connect change failure rate to failed-deployment recovery time and tie both into your SLA credits and quarterly error-budget reviews, which is a practical way to cut unplanned work and ship features faster. (dora.dev)
- Cutting risk and breach costs:
IBM's 2024 report puts the global average cost of a data breach at $4.88M, and it is even higher in the U.S. The upside: organizations using security AI and automation for prevention saved around $2.2M per breach. Our SOC 2-aligned monitoring and automation shorten mean time to detect (MTTD) and mean time to recover (MTTR), and produce audit-ready evidence along the way. (newsroom.ibm.com)
How we deliver (and where we plug in)
- Build & integrate
We develop and integrate upgrade-safe smart contracts with pausable controls and audit-friendly role models; see our smart contract development and dApp development services. We also go deep on L1/L2 architecture, cross-chain solutions, and blob-aware rollup posting strategies that keep costs low; see our cross-chain solutions development and blockchain integration services.
- Operate & secure
We've got you covered with ongoing monitoring, quick incident response, and pre-audit hardening. If you want to dive deeper into our security audit services, check it out! - Experience seamless delivery along with solid uptime and cost service goals, plus top-notch vendor management. Take a look at our web3 development services and explore our broader range of blockchain development services. We’ve got you covered!
- Funding and go-to-market support: we help programs that need financial backing to hit compliance and SLA milestones; see our fundraising support.
Implementation checklist you can run this quarter
- Establish SLAs with clear, measurable SLOs and error budgets across availability, latency, lag, and cost.
- Define a severity ladder, update cadence, and notification rules; connect your status page; and set RCA timelines. (atlassian.com)
- Align patch SLAs with PCI and SOC 2 policy: critical patches within 30 days, KEV-listed issues within 14, tracked through your ticketing system and SBOM. (secureframe.com)
- Use multiple RPC providers, set up quorum reads and lag alarms, and verify provider claims with your own probes. (status.infura.io)
- Instrument Geth or Nethermind and your ZK provers with Prometheus and Grafana; alert on p95 latency, lag, and job queue times. (geth.ethereum.org)
- Prepare for post-4844 volatility: refine batch-switching logic, set payload thresholds, and build dashboards comparing blob versus calldata spend. (blocknative.com)
- Map ISO 27001:2022 incident controls and event reporting to your SOC runbook, and rehearse it quarterly. (isms.online)
- Tie DORA metrics to quarterly reliability OKRs and present the error-budget burn-down in executive reviews. (dora.dev)
FAQ-level specifics your stakeholders will ask
- What does 99.95% uptime actually mean? Roughly 21.9 minutes of allowable downtime in a typical month; 99.9% is about 43.8 minutes, and 99.99% is about 4.4 minutes (the sketch below shows the math). We plan any downtime deliberately, across maintenance windows and risk tiers.
- Are we covered on compliance? With the ISO 27001:2022 transition deadline of October 31, 2025 behind most organizations, we map your current controls to the 2022 Annex A and the SOC 2 Trust Services Criteria, and close any gaps together during the pilot.
- How do we avoid vendor blamestorming? Cross-vendor SLOs (shared targets across the providers you use), neutral probes for unbiased measurement, and service credits aligned with cloud-standard schedules keep everyone accountable without finger-pointing, and make the agreements easy for your finance and legal teams to track.
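For the uptime question, here is the arithmetic as a quick TypeScript sketch (assuming an average month of about 30.44 days):

```typescript
// Convert an availability target into a monthly downtime allowance.
function downtimeMinutesPerMonth(target: number, monthDays = 30.44): number {
  return (1 - target) * monthDays * 24 * 60;
}

for (const target of [0.9995, 0.999, 0.9999]) {
  console.log(`${(target * 100).toFixed(2)}% -> ${downtimeMinutesPerMonth(target).toFixed(1)} min/month`);
}
// 99.95% -> ~21.9, 99.90% -> ~43.8, 99.99% -> ~4.4
```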
At 7Block Labs we build SLAs that work for everyone involved: practical for engineering, auditable for security, budget-friendly for finance, and enforceable by procurement, across Solidity, ZK, L1/L2, and everything in between. If you want something ready to run with clear ROI, we implement it and back it with service credits and straightforward dashboards.
Book a 90-Day Pilot Strategy Call
Ready to get started? A 90-Day Pilot Strategy Call is where we build a plan tailored to you. Booking takes three steps:
1. Pick a time: grab a slot that suits you with our scheduling tool.
2. Fill out the form: a few details so we arrive prepared.
3. Confirmation: you'll receive a confirmation email with everything you need.
The call is about you and what you're aiming for; let's make the most of it.
Like what you're reading? Let's build together.
Get a free 30-minute consultation with our engineering team.