SCADA High Availability & Redundancy Guide (2026)

Q: What SCADA platforms offer HA and DR as core features?

Platforms that treat high availability and disaster recovery as core features rather than add-ons include Merobix (99.9% uptime SLA on cloud plans, hot standby redundancy included in the Enterprise on-premise plan), Ignition (master/backup gateway redundancy), AVEVA System Platform (redundant application engines with object-level failover), and Rockwell FactoryTalk View SE (redundant HMI and data servers). The key distinction is whether redundancy is built into the plan you buy or licensed separately — on several traditional platforms the backup server requires its own paid license, so confirm what your quote actually includes before signing.

Q: Who provides SCADA with failover and redundancy options?

Every major SCADA vendor provides failover and redundancy in some form: Merobix, Inductive Automation (Ignition), AVEVA, Rockwell Automation (FactoryTalk), Siemens (WinCC), and GE (iFIX) all offer redundant configurations. They differ in how redundancy is delivered and priced. Cloud platforms like Merobix build redundancy into the hosted service and back it with a 99.9% uptime SLA, so there is nothing for you to configure. On-premise platforms require you to buy, build, and maintain the redundant pair yourself — Merobix Enterprise on-premise includes hot standby redundancy in the plan, while several traditional vendors license the backup server separately.

Q: What is the difference between hot standby and cold standby in SCADA?

Hot standby means a second, fully synchronized SCADA server runs in parallel with the primary and takes over automatically within seconds of a failure, with no operator action and near-zero data loss. Cold standby means backup hardware or a restorable image exists but must be manually started and loaded with a recent backup, which typically takes hours and loses all data since the last backup. Warm standby sits between the two: the backup runs and receives periodic synchronization but can lose several minutes of data on failover. For continuous operations such as pipelines, gas plants, and water systems, hot standby is the only one of the three that effectively closes the monitoring gap.

Q: What is the best SCADA software with high availability and redundancy failover?

For distributed operations that want high availability without building server infrastructure, Merobix is the strongest choice in 2026 — the cloud platform carries a 99.9% uptime SLA with redundancy managed by Merobix, and the Enterprise on-premise plan includes hot standby redundancy for air-gapped or data-residency deployments. For large single plants with in-house SCADA engineering teams, Ignition offers well-documented gateway redundancy, and AVEVA System Platform provides the most battle-tested redundancy architecture for refinery-scale control rooms. The right answer depends on whether you want to operate the redundant infrastructure yourself or have the vendor guarantee uptime contractually.

Q: How do I choose a reliable SCADA system for continuous uptime?

Start with the number that matters: the contractual uptime commitment. Demand a written SLA with a specific figure (99.9% or better), defined maintenance windows, and remedies for missed targets. Then verify the architecture behind the number — automatic failover with no operator intervention, historian replication so no data is lost during a failure, and alarm delivery that keeps working during failover. Finally, test it: during your pilot or demo, ask the vendor to kill the primary server while you watch. A platform built for continuous uptime survives that test without a blank screen. Merobix offers guided demos and pilots where you can run exactly that scenario.


99.9%: Merobix Cloud Uptime SLA

<30s: SMS & Email Alarm Delivery

Hot Standby: Enterprise On-Prem Redundancy

What Does High Availability Mean in SCADA?

High availability (HA) in SCADA means the monitoring and alarming system keeps operating through hardware failures, software crashes, network outages, and maintenance — with no gap an operator would notice. In practice, it is achieved through redundancy: a second copy of every critical component (server, historian, communication path) that takes over automatically when the primary fails. Disaster recovery (DR) is the related but distinct discipline of restoring the system after a larger event — a fire, a flood, a ransomware incident — typically from replicated data at a second location.

The stakes are asymmetric. A SCADA outage does not usually break anything by itself — the PLCs keep running their logic locally. What you lose is visibility and alarming. If a high-pressure alarm would have fired during the outage window, nobody gets the call. For pipelines, gas processing, water systems, and any operation with environmental or safety exposure, that window is the entire risk. This is why reliability-focused buyers evaluate HA before features: a feature you cannot see during an outage does not exist. Our guide to the best SCADA systems for mission-critical environments covers the broader selection criteria; this article goes deep on the redundancy layer specifically.

Availability is usually expressed in "nines." The difference between them is bigger than it looks:

99% uptime: up to 87.6 hours of downtime per year. Unacceptable for continuous operations.
99.9% uptime: up to 8.76 hours per year. The standard commitment for serious SCADA platforms; Merobix cloud plans carry a 99.9% uptime SLA.
99.99% uptime: under 53 minutes per year. Requires fully redundant everything, including networks and power, and is typically the territory of purpose-built control-room deployments.
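
The arithmetic behind these figures is short enough to check yourself. The sketch below assumes a 365-day year (8,760 hours); the function name `downtime_budget_hours` is illustrative and not part of any vendor tool.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored for simplicity


def downtime_budget_hours(availability_pct: float,
                          period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the maximum downtime (in hours) a given availability allows."""
    return period_hours * (1 - availability_pct / 100)


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99):
        hours = downtime_budget_hours(pct)
        print(f"{pct}% uptime allows {hours:.2f} h/year ({hours * 60:.0f} min)")
```

Passing a different `period_hours` (for example `30 * 24`) gives the budget for a monthly measurement window instead of an annual one.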

Hot Standby vs Warm Standby vs Cold Standby

The three standby models differ in how fast the backup takes over and how much data you lose. Hot standby fails over in seconds with near-zero data loss; cold standby takes hours and loses everything since the last backup. This single architectural choice determines whether a server failure is a non-event or an incident report.

Attribute	Hot Standby	Warm Standby	Cold Standby
Backup state	Running, fully synchronized in real time	Running, synchronized periodically	Powered off or bare hardware/image
Failover trigger	Automatic — heartbeat detection	Automatic or manual	Manual — someone drives to the server room
Typical recovery time	Seconds	Minutes	Hours to days
Data loss on failover	Near zero	Minutes of history	Everything since last backup
Alarm coverage gap	Effectively none	Short gap possible	Full gap until restore completes
Relative cost	Highest — duplicate licensed server	Moderate	Lowest upfront, highest in a real failure

Hot standby runs a second, fully licensed SCADA server in parallel with the primary. The pair exchanges a heartbeat signal and continuously synchronizes runtime state — tag values, alarm states, operator sessions. When the primary stops responding, the standby promotes itself within seconds and clients reconnect automatically. Operators may see nothing more than a brief refresh. This is the model included in the Merobix Enterprise plan for on-premise deployments.

Warm standby keeps a second server running with periodic synchronization — configuration replicated nightly, historian shipped in batches. Failover is faster than a cold restore but you lose the data between synchronizations, and the alarm engine may need minutes to rebuild state. It is a reasonable compromise for operations that can tolerate a short blind window.

Cold standby is a backup server on a shelf, or a VM image and last night's backup. It is what most operations actually have, usually without admitting it. In a real failure the recovery involves finding the image, restoring it, re-licensing the software, re-pointing the field communications, and discovering which parts of the configuration were newer than the backup. Budget hours at best.

How SCADA Failover Actually Works

Automatic failover rests on three mechanisms working together: failure detection, state synchronization, and client redirection. Understanding them tells you which vendor claims are real and which are marketing.

Failure Detection

The redundant pair exchanges a heartbeat — typically every one to five seconds over a dedicated link. Missed heartbeats past a threshold trigger promotion of the standby. The hard problem is split-brain: if the heartbeat link itself fails while both servers are healthy, both may believe they are primary, and both may poll the PLCs and log conflicting history. Mature implementations use a second arbitration path (a network witness or shared quorum) to prevent it. Ask every vendor how their redundancy handles split-brain; a blank look is diagnostic.
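
To make the detection logic concrete, here is a minimal sketch of the promotion decision a standby might make. It is not any vendor's implementation. The 2-second interval, the threshold of 3 missed beats, and the `witness_sees_primary` flag (standing in for whatever arbitration path a real product uses) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class StandbyMonitor:
    heartbeat_interval_s: float = 2.0  # assumed; the typical range is 1 to 5 s
    missed_threshold: int = 3          # assumed number of missed beats before acting
    last_heartbeat_s: float = 0.0

    def record_heartbeat(self, now_s: float) -> None:
        """Call whenever a heartbeat arrives from the primary."""
        self.last_heartbeat_s = now_s

    def missed_beats(self, now_s: float) -> int:
        """Whole heartbeat intervals elapsed since the last beat was seen."""
        return int((now_s - self.last_heartbeat_s) // self.heartbeat_interval_s)

    def should_promote(self, now_s: float, witness_sees_primary: bool) -> bool:
        """Promote only when heartbeats are overdue AND the witness agrees
        the primary is unreachable. The second check is what guards
        against split-brain when only the heartbeat link has failed."""
        if self.missed_beats(now_s) < self.missed_threshold:
            return False
        return not witness_sees_primary
```

With these defaults, a standby that last heard from the primary at t=10 s would not promote at t=13 s (one missed beat), would promote at t=16.5 s if the witness also cannot reach the primary, and would stay passive at t=16.5 s if the witness still sees the primary.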

State Synchronization

Failover is only seamless if the standby already knows everything the primary knew: current tag values, alarm acknowledgment states, setpoint changes, and user sessions. Configuration synchronization (replicating projects and tag databases) is table stakes; runtime synchronization is what separates hot standby from warm. Alarm state matters most — if acknowledgments do not replicate, a failover can re-annunciate hundreds of already-handled alarms, burying operators at exactly the wrong moment.
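
A small sketch shows why acknowledgment replication matters. It models alarm state as two flags and compares a standby that received runtime state with one that only knows which alarms are active. The tag names and the `AlarmState` structure are hypothetical; real alarm engines track far more than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlarmState:
    active: bool
    acknowledged: bool


def alarms_to_annunciate(states: dict[str, AlarmState]) -> list[str]:
    """Alarms the new primary would raise to operators after taking over."""
    return sorted(tag for tag, s in states.items()
                  if s.active and not s.acknowledged)


# State held by the primary just before it failed (hypothetical tags).
primary = {
    "PT-101.HiHi": AlarmState(active=True, acknowledged=True),
    "LT-204.Lo": AlarmState(active=True, acknowledged=False),
    "FT-310.Hi": AlarmState(active=False, acknowledged=True),
}

# Runtime sync: the standby holds an exact copy, acknowledgments included.
with_runtime_sync = dict(primary)

# No runtime sync: active conditions are rediscovered, acknowledgments are lost.
without_runtime_sync = {tag: AlarmState(s.active, acknowledged=False)
                        for tag, s in primary.items()}

print(alarms_to_annunciate(with_runtime_sync))
print(alarms_to_annunciate(without_runtime_sync))
```

In this toy case the synchronized standby raises only the one alarm that was genuinely unhandled, while the unsynchronized one also re-raises the alarm an operator had already acknowledged. Scale that to a few hundred tags and the difference is the alarm flood described above.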

Client and Field Redirection

Operator clients and field communications must find the new primary without manual intervention. Web-based clients handle this most gracefully — the browser reconnects to a service address that now routes to the standby. Thick-client architectures need failover lists configured on every workstation. On the field side, the standby must take over polling of PLCs and RTUs without device reconfiguration, which is why redundant SCADA pairs share a virtual address or the drivers themselves manage the switch. Our SCADA server guide covers the underlying server architecture in more depth.
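
The failover-list approach used by thick clients reduces to a priority-ordered search, sketched below. The server addresses are placeholders and `is_healthy` is a caller-supplied check; a real client would probe a status endpoint or attempt a connection, and the details differ by product.

```python
from typing import Callable, Optional, Sequence


def pick_active_server(failover_list: Sequence[str],
                       is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first server in priority order that passes the health check."""
    for address in failover_list:
        if is_healthy(address):
            return address
    return None  # nothing reachable; the client should keep retrying and alert


# Hypothetical addresses for illustration only.
servers = ["scada-primary.example", "scada-standby.example"]
down = {"scada-primary.example"}
active = pick_active_server(servers, lambda addr: addr not in down)
print(active)
```

The weakness of this model is visible in the signature: every workstation needs its own correct copy of `failover_list`, which is exactly the per-client configuration burden that a shared service address avoids.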

Historian Replication: Don't Lose the Data

Server failover protects live monitoring; historian replication protects the record. If your historian runs only on the primary server, every failover — even a clean one — leaves a hole in the trend data, and holes in trend data become holes in regulatory reports and production accounting.

Three patterns exist. Dual-write historians record on both members of the redundant pair simultaneously, then reconcile — no gap, at the cost of duplicate storage. Store-and-forward buffering at the data source (the gateway or driver layer) holds data during any server outage and backfills when the historian returns — this also covers network outages, which are far more common than server failures. Replication to a second site copies the historian to a geographically separate location for disaster recovery. Large multi-site operators often add historian federation on top — querying several site historians as one logical database — which Merobix supports on the Enterprise plan.

When evaluating platforms, ask one concrete question: if the historian is unreachable for four hours, what happens to those four hours of data? The right answer involves buffering at the edge and automatic backfill, not "the data is lost."
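
Store-and-forward is easy to describe and easy to get subtly wrong, so a minimal sketch may help when you question a vendor about it. The class below is an illustration, not how any particular gateway works. The `send` callback, the sample tuple layout, and the bounded buffer size are all assumptions.

```python
from collections import deque
from typing import Callable, Deque, Tuple

Sample = Tuple[float, str, float]  # (timestamp_s, tag, value); assumed layout


class StoreAndForwardBuffer:
    """Holds samples while the historian is unreachable, then backfills in order."""

    def __init__(self, send: Callable[[Sample], bool],
                 max_samples: int = 100_000) -> None:
        self._send = send  # returns True only when the historian accepted the sample
        # Bounded queue: once full, the OLDEST samples are dropped. Sizing this
        # for your longest expected outage is a design decision.
        self._pending: Deque[Sample] = deque(maxlen=max_samples)

    def record(self, sample: Sample) -> None:
        """Queue a new sample, then try to drain the backlog."""
        self._pending.append(sample)
        self.flush()

    def flush(self) -> int:
        """Send queued samples oldest-first; stop at the first failure."""
        sent = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break
            self._pending.popleft()  # remove only after a confirmed write
            sent += 1
        return sent

    @property
    def backlog(self) -> int:
        return len(self._pending)
```

Two details are worth asking a vendor about: whether a sample is removed from the buffer only after the historian confirms the write, and what happens when the buffer fills before the historian comes back.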

Which SCADA Platforms Offer HA and DR as Core Features?

Merobix, Ignition, AVEVA System Platform, Rockwell FactoryTalk View SE, and Siemens WinCC all offer genuine redundancy — the differences are in how it is delivered, who maintains it, and whether it is included in the price or licensed separately. Here is the honest architectural comparison:

Platform	Redundancy Model	Backup Licensing	Uptime SLA	Who Maintains It
Merobix Cloud	Redundant hosted infrastructure, managed by vendor	Included — flat plan	99.9% contractual	Merobix
Merobix Enterprise On-Prem	Hot standby server pair, air-gap compatible	Included in Enterprise plan	Architecture-dependent (your infrastructure)	Your team, with Merobix support
Ignition	Master/backup gateway pair with automatic failover	Backup gateway licensed separately (publicly listed pricing)	None — self-hosted	Your team / integrator
AVEVA System Platform	Redundant application engines, object-level failover, tiered historians	Separately licensed components	None for on-prem; cloud offerings vary	Your team + integrator
FactoryTalk View SE	Redundant HMI servers and data servers	Separately licensed	None — self-hosted	Your team + integrator
Siemens WinCC	Redundant server pair with archive synchronization	Redundancy option licensed separately	None — self-hosted	Your team + integrator

Ignition has the most accessible redundancy story among the traditional platforms: a master/backup gateway pair with automatic failover that is well documented and widely deployed, with the backup license carrying publicly listed pricing. If you have an in-house team comfortable running servers, Ignition redundancy is straightforward to stand up and genuinely reliable. The trade-off is that it is still your infrastructure: your OS patching, your certificates, your split-brain testing, and no vendor uptime SLA. See our Merobix vs Ignition comparison for the full head-to-head.

AVEVA System Platform offers arguably the deepest redundancy architecture in the industry — failover at the individual application-object level, redundant data acquisition, and tiered historian replication. It is the reference design for refinery and power-plant control rooms, and for that class of facility it has earned its reputation. The cost is complexity: these deployments are integrator-led, multi-month projects with commensurate budgets.

FactoryTalk View SE and Siemens WinCC both provide solid redundant-server options that integrate tightly with their respective PLC ecosystems. They make the most sense where the plant is already standardized on Rockwell or Siemens hardware and on-site IT support exists at each facility.

Merobix approaches the problem from the opposite direction: for cloud deployments, redundancy is not something you buy, configure, or test — it is built into the hosted platform and backed by a contractual 99.9% uptime SLA, with gateway-level store-and-forward buffering protecting data through network outages. For operations that require on-premise deployment — air-gapped networks, strict data residency — the Enterprise plan includes hot standby redundancy on your servers or VMs rather than licensing it as an add-on. Where the traditional platforms are stronger: if you need object-level failover granularity across a refinery-scale control room, AVEVA remains the established choice. What Merobix eliminates is the scenario where redundancy was quoted, deprioritized to save budget, and quietly dropped — the plan either includes it or the SLA covers it. The full feature matrix is on the plans page.

Cloud SLA vs On-Premise Redundancy: Which Model Fits You?

The choice is really about who carries the operational burden of staying up. With a cloud SLA, the vendor owns redundancy end to end — infrastructure, failover testing, patching, monitoring — and is contractually accountable for the result. With on-premise redundancy, you own all of it and gain something the cloud cannot give you: complete control, air-gap compatibility, and full data residency on your own hardware.

Choose cloud with an SLA when your sites are distributed, your connectivity is cellular, and you do not have (or do not want to fund) a team to babysit redundant server pairs. A 99.9% SLA from a vendor whose business depends on meeting it will beat the real-world uptime of most self-maintained single servers — and of a surprising number of self-maintained redundant pairs whose failover was last tested at commissioning. Cloud deployments also go live in days rather than months; Merobix cloud deployments are typically live in 3–5 days.

Choose on-premise redundancy when policy or physics demands it: air-gapped control networks, contractual data-residency requirements, or facilities where monitoring must survive a total WAN outage. In that case, buy hot standby, put the pair on independent power and network paths, and put failover testing on the maintenance calendar — quarterly, unannounced, during business hours. Redundancy that is never tested is cold standby with better marketing. The full architectural trade-off is covered in our cloud vs on-premise SCADA comparison, and Merobix supports both models — same platform, same team.

What to Demand in a SCADA SLA

An SLA is only as good as its specifics. Before signing with any vendor — Merobix included — get these items in writing:

A specific uptime number (99.9% or better) measured monthly, not annually — annual measurement lets a vendor burn the whole downtime budget in one bad week.
A definition of "down" that includes degraded service: if dashboards load but alarms are not delivering, you are down.
Alarm delivery commitments — Merobix commits to SMS and email alarm delivery in under 30 seconds; whatever your vendor commits to, get the number on paper.
Scheduled maintenance terms — advance notice, defined windows, and whether maintenance counts against the SLA.
Data durability language — what happens to historian data during an outage, and how backfill works.
Remedies with teeth — service credits at minimum, termination rights for chronic misses.
Security posture behind the SLA — uptime and security are inseparable; review the vendor's security architecture with the same rigor.
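
The measurement-window point in the first item is worth a worked example. The sketch below takes one hypothetical six-hour outage and scores it against a 99.9% target over an annual window and a 30-day monthly window. The outage length and the 30-day month are assumptions for illustration.

```python
def measured_availability_pct(downtime_hours: float, period_hours: float) -> float:
    """Availability over a measurement window, as a percentage."""
    return 100 * (1 - downtime_hours / period_hours)


SLA_TARGET_PCT = 99.9
outage_hours = 6.0  # hypothetical single incident

annual = measured_availability_pct(outage_hours, 365 * 24)
monthly = measured_availability_pct(outage_hours, 30 * 24)

print(f"Annual:  {annual:.3f}% -> {'met' if annual >= SLA_TARGET_PCT else 'missed'}")
print(f"Monthly: {monthly:.3f}% -> {'met' if monthly >= SLA_TARGET_PCT else 'missed'}")
```

The same incident meets the SLA when measured annually and misses it when measured monthly, which is why the measurement window belongs in the contract alongside the headline number.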

The failover test: Whatever platform you evaluate, run one test before you buy — have the vendor (or your integrator) kill the primary server while you watch the operator screen and hold a live alarm condition. Time the failover, check whether the alarm still delivered, and check the historian afterward for a gap. Five minutes of testing tells you more than fifty pages of architecture documentation. Merobix will run this scenario in a guided demo, and you can quantify what an outage-free year is worth with the ROI calculator.
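
Checking the historian for a gap after the test can be done by eye on a trend, but a few lines of code remove the guesswork. This sketch assumes you can export the timestamps of one regularly logged tag and that you know its logging interval. The function name and the 1.5x tolerance are illustrative choices.

```python
from typing import List, Sequence, Tuple


def find_gaps(timestamps_s: Sequence[float],
              expected_interval_s: float,
              tolerance: float = 1.5) -> List[Tuple[float, float]]:
    """Return (start, end) pairs where consecutive samples are further apart
    than tolerance * expected_interval_s."""
    ordered = sorted(timestamps_s)
    limit = expected_interval_s * tolerance
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]


# Hypothetical export: a tag logged every second, with a hole around the failover.
samples = [0.0, 1.0, 2.0, 3.0, 9.0, 10.0]
for start, end in find_gaps(samples, expected_interval_s=1.0):
    print(f"Gap of {end - start:.1f} s between t={start} and t={end}")
```

An empty result after a forced failover is the outcome you are looking for. Anything else tells you how long the historian was blind.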

