DR & Cloud Sites — Secondary Datacenter and Cloud Egress
A reduced-footprint secondary site serves as the primary DR and backup target. An optional cloud site provides burst, tertiary DR, or archival egress inside a FedRAMP authorization boundary. These sites are managed as part of the same fleet via ACM.
The Backup Conversation is Different on Kubernetes — and More Flexible
Veeam Kasten K10 protects at three scopes: an individual VM and its attached storage, a full namespace containing multiple VMs and applications, or the entire cluster. You define the policy — Kasten executes it consistently. This is a meaningful upgrade from agent-based backup, which requires per-VM configuration and breaks when VMs move. Alongside Kasten, SnapMirror handles block-level storage replication to the DR site independently — so your data is protected at both the application layer and the storage layer. These two mechanisms together give you something more reliable and more testable than most organizations have today.
Per-VM protection
Namespace-level protection
Full cluster backup
App-consistent restore
SnapMirror storage replication
Testable failover
Secondary Datacenter
Reduced footprint — DR and backup target
Passive / Active DR
NetApp: less performant by design, SnapMirror destination
Capacity Array, HA Pair: SAS
Capacity Array: SAS
Object / Archival Target: S3-compatible
Veeam Kasten K10: backup target and DR restore point
DR Modes
Passive: Data replicated, cluster powered down. Manual failover. Lowest cost.
Active: Cluster running, workloads live. Automated failover. Higher cost.
Cloud Egress Site
GovCloud or commercial — optional architecture component
Optional
Managed OpenShift
Managed OpenShift Cluster: cloud-hosted
HashiCorp Vault Enterprise: secrets sync across all sites
Cloud Storage
Object Storage: S3-compatible
Managed Block (CSI): cloud-native
Use Cases
Burst: Overflow compute for mission demand peaks.
Cloud DR: Third site for additional resiliency.
Archive: Long-term retention and compliance egress.
ACM provides single-pane fleet management across all three sites. The FedRAMP authorization boundary covers primary, DR, and cloud only when each site is configured to meet it.
How Recovery Actually Works — Two Workflows Side by Side
NetApp SnapMirror — Storage Failover
Block-level replication · Storage layer recovery · Used for site-wide DR
SnapMirror continuously replicates storage volumes from the primary datacenter to the secondary site at the block level. It is storage-layer protection — independent of what is running on top. In a failover scenario, the secondary volumes are promoted and become read/write. OpenShift at the DR site mounts those volumes and starts workloads.
Step 1
SnapMirror replication runs continuously. RPO depends on the replication schedule configured per volume or policy, and is typically minutes, not hours.
Step 2
Primary site incident detected. Decision to fail over is made. The SnapMirror relationship is broken and the secondary volumes are promoted to read/write.
Step 3
OpenShift cluster at DR site brought active. Trident CSI re-attaches the promoted volumes as PersistentVolumes.
Step 4
VMs start on DR cluster using replicated storage. Applications resume from last replication checkpoint. DNS/routing updated to DR site.
Step 5
Primary restored. Reverse replication syncs changes back. Planned failback executed. SnapMirror relationship re-established in original direction.
Best used for: Site-wide disaster, storage hardware failure, datacenter-level outage. Protects the data layer regardless of what caused the incident.
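Because Step 1 ties RPO to the replication schedule, a rough worst-case estimate can help when choosing that schedule. The sketch below is illustrative arithmetic only, not a NetApp tool or API: it assumes scheduled asynchronous replication, where a failure just before a transfer completes loses everything written since the previous snapshot, so the worst case is roughly one schedule interval plus one transfer duration. The function names and the example figures are hypothetical.

```python
from datetime import timedelta


def worst_case_rpo(schedule_interval: timedelta,
                   transfer_duration: timedelta) -> timedelta:
    """Estimate worst-case data loss for scheduled asynchronous replication.

    Assumption: if the primary fails just before an in-flight transfer
    completes, the newest usable copy at the DR site is the previous
    snapshot, taken one full interval before the in-flight one started.
    """
    if schedule_interval <= timedelta(0):
        raise ValueError("schedule_interval must be positive")
    if transfer_duration < timedelta(0):
        raise ValueError("transfer_duration cannot be negative")
    return schedule_interval + transfer_duration


def meets_target(schedule_interval: timedelta,
                 transfer_duration: timedelta,
                 rpo_target: timedelta) -> bool:
    """Return True if the estimated worst case fits within the RPO target."""
    return worst_case_rpo(schedule_interval, transfer_duration) <= rpo_target


if __name__ == "__main__":
    # Hypothetical figures: replicate every 5 minutes, transfers take 2 minutes.
    interval = timedelta(minutes=5)
    transfer = timedelta(minutes=2)
    print(worst_case_rpo(interval, transfer))                       # 0:07:00
    print(meets_target(interval, transfer, timedelta(minutes=15)))  # True
```

Measured transfer durations from the actual environment should replace the example figures before any RPO commitment is made.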
Veeam Kasten K10 — Application Restore
Application-layer recovery · Per-VM, namespace, or cluster scope · Any cluster in the fleet
Kasten K10 operates at the application layer. It captures a consistent point-in-time snapshot of a VM, a namespace, or the entire cluster — including storage volumes, configuration, and metadata. Restore can target any cluster in the fleet. This makes Kasten the right tool for accidental deletion, corruption, ransomware recovery, and cross-cluster workload mobility as well as DR.
Step 1
Kasten K10 policies run on a defined schedule. Backup scope is per-VM, per-namespace, or cluster-wide. Snapshots stored on-cluster or exported to S3-compatible object storage at the DR site.
Step 2
Recovery event occurs — deleted VM, corrupted namespace, ransomware, or full site failure. Administrator selects restore point and target cluster from the Kasten dashboard.
Step 3
Kasten restores the VM disk (PVC), VM definition, networking config, and associated secrets. The restore is application-consistent as of the backup point in time, not just a copy of the disk.
Step 4
VM comes online on the target cluster. If restoring to a different namespace or cluster, Kasten handles remapping of storage and network references automatically.
Step 5
Restore verified. Policy compliance and audit log available in Kasten dashboard. Immutable backup copies on object storage remain intact for compliance retention.
Best used for: Accidental deletion, ransomware, application corruption, cross-cluster workload mobility, and DR where application-layer consistency is required.
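The scheduled, namespace-scoped policy described in Step 1 can be sketched as a Kubernetes manifest built in Python. This is a minimal illustration under stated assumptions: the API group `config.kio.kasten.io/v1alpha1`, the `kasten-io` install namespace, and the `k10.kasten.io/appNamespace` selector key follow the Kasten Policy custom resource as commonly documented, and should be verified against the installed Kasten version. The policy name, namespace, and retention count are hypothetical. Export to object storage at the DR site would need an additional action that references a location profile, which is omitted here.

```python
import json


def build_backup_policy(name: str, app_namespace: str,
                        frequency: str = "@daily",
                        daily_retention: int = 7) -> dict:
    """Build a namespace-scoped backup policy manifest as a plain dict.

    Field names are assumptions based on the Kasten Policy CRD and are
    not validated against a live cluster.
    """
    return {
        "apiVersion": "config.kio.kasten.io/v1alpha1",
        "kind": "Policy",
        "metadata": {"name": name, "namespace": "kasten-io"},
        "spec": {
            "frequency": frequency,
            "retention": {"daily": daily_retention},
            # Snapshot-only action; an export action is intentionally omitted.
            "actions": [{"action": "backup"}],
            # Select the application by the namespace it runs in.
            "selector": {
                "matchExpressions": [{
                    "key": "k10.kasten.io/appNamespace",
                    "operator": "In",
                    "values": [app_namespace],
                }]
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical names for illustration only.
    policy = build_backup_policy("vm-finance-daily", "finance-vms")
    # The JSON output can be saved to a file and applied with kubectl.
    print(json.dumps(policy, indent=2))
```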
These two mechanisms are complementary, not duplicative. SnapMirror protects storage at the infrastructure layer continuously and is the primary tool for site-level failover. Kasten protects applications at the platform layer on a schedule and is the primary tool for precision recovery, mobility, and compliance. Most production environments run both.
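The division of labor between the two mechanisms can be captured as a simple lookup, for example in a runbook helper. The sketch below only encodes the two "Best used for" lists from this section; the function name and the incident labels as dictionary keys are hypothetical, and real incidents may call for both mechanisms.

```python
SNAPMIRROR = "SnapMirror storage failover"
KASTEN = "Kasten K10 application restore"

# Incident types taken from the "Best used for" lines of each workflow.
BEST_USED_FOR = {
    "site-wide disaster": SNAPMIRROR,
    "storage hardware failure": SNAPMIRROR,
    "datacenter-level outage": SNAPMIRROR,
    "accidental deletion": KASTEN,
    "ransomware": KASTEN,
    "application corruption": KASTEN,
    "cross-cluster workload mobility": KASTEN,
}


def recommend_mechanism(incident: str) -> str:
    """Return the primary recovery mechanism for a known incident type."""
    key = incident.strip().lower()
    if key not in BEST_USED_FOR:
        raise ValueError(f"unknown incident type: {incident!r}")
    return BEST_USED_FOR[key]


if __name__ == "__main__":
    print(recommend_mechanism("Ransomware"))               # Kasten K10 application restore
    print(recommend_mechanism("Datacenter-level outage"))  # SnapMirror storage failover
```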