Redundant Storage (Erl, Naserpour)
How can the reliability and availability of cloud storage devices survive failure conditions?
Problem: When cloud storage devices fail or become inaccessible, cloud consumers are unable to access data and cloud services relying on access to the device may also fail.
Solution: A failsafe system comprised of redundant cloud storage devices is established so that when the primary device fails, the redundant secondary device takes its place.
Application: Data is replicated from the primary storage to the secondary storage device. A storage service gateway is used to redirect data access requests to the secondary storage device, when necessary.
Compound Patterns: Burst In, Burst Out to Private Cloud, Burst Out to Public Cloud, Cloud Balancing, Elastic Environment, Infrastructure-as-a-Service (IaaS), Multitenant Environment, Platform-as-a-Service (PaaS), Private Cloud, Public Cloud, Resilient Environment, Software-as-a-Service (SaaS)
Cloud storage devices are subject to failure and disruption due to a variety of causes, including network connectivity issues, controller failures, general hardware failure, and security breaches.
When the reliability of a cloud storage device is compromised, it can have a ripple effect, causing failures across any cloud services, cloud-based applications, and cloud infrastructure programs and components that rely on its presence and availability.
Figure 1 - A sample scenario that demonstrates the effects of a failed cloud storage device.
- The cloud storage device is installed and configured.
- Four LUNs are created, one for each cloud consumer.
- Each cloud consumer sends a request to access its own LUN.
- The cloud storage device receives the requests and forwards them to the respective LUN.
- The cloud storage device fails and cloud consumers lose access to their LUNs. This may be due to the loss of the device controller (5.1) or loss of connectivity (5.2).
A secondary redundant cloud storage device is incorporated into a system that synchronizes its data with the data in the primary cloud storage device. When the primary device fails, a storage service gateway diverts requests to the secondary device.
Figure 2 - A simple scenario demonstrating the failover of redundant storage.
- The primary cloud storage device is replicated to the secondary cloud storage device on a regular basis.
- The primary storage becomes unavailable and the storage service gateway forwards the cloud consumer requests to the secondary storage device.
- The secondary storage forwards the requests to the LUNs, allowing cloud consumers to continue to access their data.
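The failover steps above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any real gateway product: the class names `CloudStorageDevice` and `StorageServiceGateway`, the `replicate` helper, and the in-memory dictionary standing in for LUNs are all assumptions made for the example.

```python
import copy


class StorageDeviceUnavailable(Exception):
    """Raised when a storage device cannot serve requests."""


class CloudStorageDevice:
    """Hypothetical storage device holding one dictionary per LUN."""

    def __init__(self, name):
        self.name = name
        self.available = True
        self.luns = {}  # LUN id -> {key: value}

    def write(self, lun_id, key, value):
        if not self.available:
            raise StorageDeviceUnavailable(self.name)
        self.luns.setdefault(lun_id, {})[key] = value

    def read(self, lun_id, key):
        if not self.available:
            raise StorageDeviceUnavailable(self.name)
        return self.luns[lun_id][key]


def replicate(primary, secondary):
    """Copy the primary's LUN contents to the secondary (full copy, for brevity)."""
    secondary.luns = copy.deepcopy(primary.luns)


class StorageServiceGateway:
    """Hypothetical gateway: tries the primary first, then the secondary."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def read(self, lun_id, key):
        try:
            return self.primary.read(lun_id, key)
        except StorageDeviceUnavailable:
            # Primary is down: divert the request to the redundant device.
            return self.secondary.read(lun_id, key)
```

In this sketch the cloud consumer only ever talks to the gateway, so a primary failure is invisible to it as long as the data was replicated before the failure occurred.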
This pattern fundamentally relies on the resource replication mechanism to keep the primary cloud storage device synchronized with any additional duplicate secondary cloud storage devices that comprise the failover system.
Figure 3 - Storage replication is used to keep the redundant storage device synchronized.
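One way to picture the synchronization is as a journal of writes that is periodically replayed on the secondary device. The sketch below assumes this journal-based approach purely for illustration; the pattern itself does not prescribe how the resource replication mechanism is implemented, and the names `PrimaryStore`, `SecondaryStore`, and `replicate` are hypothetical.

```python
class PrimaryStore:
    """Hypothetical primary device that records every write in a journal."""

    def __init__(self):
        self.data = {}
        self.journal = []  # ordered list of (lun_id, key, value) writes

    def write(self, lun_id, key, value):
        self.data[(lun_id, key)] = value
        self.journal.append((lun_id, key, value))


class SecondaryStore:
    """Hypothetical secondary device that tracks how much of the journal it has applied."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of journal entries already replayed


def replicate(primary, secondary):
    """Replay journal entries the secondary has not seen yet; return how many were applied."""
    pending = primary.journal[secondary.applied:]
    for lun_id, key, value in pending:
        secondary.data[(lun_id, key)] = value
    secondary.applied += len(pending)
    return len(pending)
```

Because replication in this sketch runs on a schedule rather than on every write, any writes made after the last run are missing from the secondary until the next run. That gap is the data a failover could lose.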
Cloud providers may locate secondary cloud storage devices in a different geographical region than the primary cloud storage device, usually for economic reasons. For some types of data, this may introduce legal concerns. The location of the secondary cloud storage device can dictate the protocol and method used for synchronization because some replication transport protocols have distance restrictions.
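The two placement constraints mentioned here, distance limits on replication protocols and legal limits on where data may reside, can be expressed as simple checks. The protocol names and the 100 km limit below are made-up placeholders, not figures for any real protocol, and the region names in the usage note are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationProtocol:
    name: str
    max_distance_km: float  # illustrative limit only, not a vendor figure


# Placeholder catalog: one distance-limited protocol, one without a limit.
PROTOCOLS = [
    ReplicationProtocol("synchronous-example", 100.0),
    ReplicationProtocol("asynchronous-example", float("inf")),
]


def usable_protocols(distance_km, protocols=PROTOCOLS):
    """Return the names of protocols whose distance limit covers the given distance."""
    return [p.name for p in protocols if distance_km <= p.max_distance_km]


def region_allowed(secondary_region, permitted_regions):
    """Check whether the secondary device's region is permitted for this data."""
    return secondary_region in permitted_regions
```

For example, `usable_protocols(2000)` leaves only the placeholder asynchronous protocol, and `region_allowed("region-b", {"region-a"})` flags a placement that the data's legal requirements would rule out.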
Some cloud providers use storage devices with dual array and storage controllers to improve device redundancy. They may place the secondary storage device in a different physical location for cloud balancing and disaster recovery purposes. In this case, cloud providers may need to lease a network connection via a third-party cloud provider to establish replication between the two devices.
NIST Reference Architecture Mapping
This pattern relates to the highlighted parts of the NIST reference architecture, as follows: