asynchronous replication
What is asynchronous replication?
Asynchronous replication is a store-and-forward approach to data backup and protection.
Asynchronous replication writes data to the primary storage array first, and then, depending on the implementation approach, commits data to be replicated to memory or a disk-based journal. It then copies the data in real time -- or more commonly, at scheduled intervals -- to one or more replication targets, known as replicas.
Asynchronous replication is one of the methods available to protect data and ensure its recovery in case of a disruptive event. It is a remote replication method in which data is written to a replica storage medium after it has already been written to the primary storage medium. The write is considered complete once the local storage acknowledges it. No acknowledgement from the secondary location is required, so applications do not wait on a network round trip and write latency stays low.
Usually, the replication process happens in a scheduled manner with data periodically transmitted to the replica in batches -- for example, every minute, every hour or every day. That said, the process can also occur in near real time, although this is rarer.
The chief goal of asynchronous replication is to enable data recovery in the event of a problem. However, some data can be lost when a failover event triggers an automatic switch to the replica, because the replica site may not yet contain the most recent changes made at the primary site.
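The write path described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not any vendor's implementation: the `Primary` class, its in-memory `journal` and the `replicate_batch` method are hypothetical names chosen for illustration, and the "replica" is just a second dictionary rather than a remote site.

```python
from collections import deque


class Primary:
    """Toy primary store that replicates asynchronously to one replica."""

    def __init__(self):
        self.data = {}          # primary storage
        self.journal = deque()  # changes waiting to be shipped
        self.replica = {}       # stands in for the remote replica

    def write(self, key, value):
        # The write is complete as soon as local storage has it;
        # no acknowledgement from the replica is awaited.
        self.data[key] = value
        self.journal.append((key, value))
        return "ack"

    def replicate_batch(self):
        # Meant to be called on a schedule: ship everything in the journal.
        shipped = 0
        while self.journal:
            key, value = self.journal.popleft()
            self.replica[key] = value
            shipped += 1
        return shipped

    def unreplicated_keys(self):
        # Keys whose latest change has not reached the replica yet,
        # i.e. what a failover at this moment would lose.
        return {key for key, _ in self.journal}


primary = Primary()
primary.write("a", 1)
primary.replicate_batch()
primary.write("b", 2)  # acknowledged locally, not yet shipped
lost_on_failover = primary.unreplicated_keys()
```

In this sketch, a failover right after the second write would find `"a"` on the replica but not `"b"`, which is the kind of loss the paragraph above describes.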
The benefits of asynchronous replication
Like synchronous replication, asynchronous replication is a method to protect data and minimize its loss. A reliable replication process ensures businesses can protect their data and restore mission-critical activities dependent on that data as soon as possible following a disruptive event.
Asynchronous replication also offers additional benefits and advantages over synchronous replication:
- It typically costs less than synchronous replication. Because changes can be batched and sent on a schedule, asynchronous replication requires less bandwidth than synchronous replication and is less likely to overload the network. It also doesn't usually require expensive specialized hardware for implementation.
- It is designed to work over long distances. Since the replication process does not usually occur in real time, asynchronous replication can work over long distances, albeit with some latency.
- It is resilient. The asynchronous replication process tolerates some degradation in connectivity and wide area network interruptions. A copy of the data can be stored locally until the service is restored.
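The resilience point in the list above, where data is held locally until the link returns, can be illustrated with a small sketch. It assumes a hypothetical `FlakyLink` transport that raises `ConnectionError` while the WAN is down; the `drain` helper is likewise an illustrative name, not part of any replication product.

```python
from collections import deque


class FlakyLink:
    """Hypothetical WAN link that rejects sends while it is down."""

    def __init__(self):
        self.up = False
        self.received = []  # what the replica has accepted so far

    def send(self, change):
        if not self.up:
            raise ConnectionError("WAN link unavailable")
        self.received.append(change)


def drain(journal, send):
    """Ship queued changes in order; on failure, keep the rest queued."""
    sent = 0
    while journal:
        try:
            send(journal[0])
        except ConnectionError:
            break          # link is down: leave the entry for a later attempt
        journal.popleft()  # remove only after a successful send
        sent += 1
    return sent


link = FlakyLink()
journal = deque(["w1", "w2", "w3"])

sent_during_outage = drain(journal, link.send)   # nothing leaves the queue
link.up = True
sent_after_recovery = drain(journal, link.send)  # backlog shipped in order
```

Removing an entry only after a successful send is the design choice that keeps the local copy intact through an outage.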
Synchronous vs. asynchronous replication
Synchronous replication is typically used to provide high availability of critical applications. In these scenarios, failover from the primary to secondary array is nearly instantaneous, which ensures little to no application downtime and minimal negative impact to users.
Asynchronous replication differs from synchronous replication mainly in how data is written to the replica site. In synchronous replication, data is written to the primary storage and the replica simultaneously -- hence the term synchronous -- and the write is not acknowledged until both copies are in place. Since the primary -- the source -- and replica copies are always synchronized, virtually no data is lost in a failover. It therefore enables more reliable disaster recovery for applications or processes where having access to the latest data is mission- or business-critical.
That said, the real-time nature of synchronous replication means that the process consumes more network bandwidth. Also, it increases latency as distances increase and cannot tolerate connectivity degradation without affecting performance and speed.
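A rough calculation shows why distance matters for synchronous writes. The sketch below assumes that light travels roughly 200 km per millisecond in optical fiber and ignores switching, queuing and replica-side processing, so it gives only a lower bound. The function names and the 0.5 ms local write time are illustrative assumptions.

```python
# Assumption: signals cover roughly 200 km per millisecond in optical fiber.
FIBER_KM_PER_MS = 200.0


def min_round_trip_ms(distance_km):
    """Lower bound on the network round trip to a replica site."""
    return 2 * distance_km / FIBER_KM_PER_MS


def write_latency_ms(local_ms, distance_km, synchronous):
    """Time until a write is acknowledged to the application."""
    if synchronous:
        # The write must reach the replica and be confirmed first.
        return local_ms + min_round_trip_ms(distance_km)
    # Asynchronous: acknowledged as soon as local storage has the data.
    return local_ms


sync_ms = write_latency_ms(0.5, 1000, synchronous=True)
async_ms = write_latency_ms(0.5, 1000, synchronous=False)
```

Under these assumptions, a replica 1,000 km away adds at least 10 ms to every synchronous write, while the asynchronous write is acknowledged in the local 0.5 ms regardless of distance.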
Asynchronous replication is so named because the data is first written to the primary storage and then copied to the replica. Unlike the synchronous method, the replication process doesn't occur in real time, but on a scheduled or periodic basis.
Typically, block-based storage arrays are used for synchronous replication to enable fast and efficient data transportation. Block storage also speeds up data retrieval. In contrast, asynchronous replication uses array-based, network-based or host-based replication products.
Low resilience is a drawback of synchronous replication. Even a single failure can result in loss of service. Also, if the source data is corrupted -- say, due to a virus -- the replicated data will also be corrupted, with almost no chance of fixing it pre-replication, since it gets copied to the replica in real time. Asynchronous replication offers higher resilience since the replication doesn't occur in real time, and because it takes two or more failures for loss of service to occur.
Synchronous replication is usually used with segmented systems such as local area networks or storage area networks and can require specialized hardware for implementation. It is suitable in the following situations:
- When multiple data repositories must be simultaneously updated.
- For high-end transactional applications.
- When instant failover is required in case the primary node fails.
Asynchronous replication, by contrast, is mainly used for data backups.
Both synchronous and asynchronous data replication offer a short recovery time objective (RTO), which refers to the maximum amount of time that's acceptable for regaining data access following an unexpected or unplanned disruption. However, the recovery point objective (RPO) -- the maximum acceptable amount of data loss after an unexpected or unplanned disruption -- differs. Where synchronous replication offers zero RPO, the RPO for asynchronous replication can vary from a few minutes to a few hours.
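For scheduled asynchronous replication, the achievable RPO depends largely on the replication interval. The sketch below uses a deliberately simplified model, which is an assumption rather than a vendor formula: a change made just after one batch is captured is not protected until the next batch has been captured and fully transferred, so the worst-case loss window is the interval plus the transfer time. The function names are illustrative.

```python
from datetime import timedelta


def worst_case_data_loss(interval, transfer_time):
    """Simplified worst-case window of unprotected changes."""
    return interval + transfer_time


def meets_rpo(interval, transfer_time, rpo):
    """True if the schedule keeps worst-case loss within the RPO."""
    return worst_case_data_loss(interval, transfer_time) <= rpo


rpo = timedelta(minutes=30)
transfer = timedelta(minutes=5)

every_15_min_ok = meets_rpo(timedelta(minutes=15), transfer, rpo)
hourly_ok = meets_rpo(timedelta(hours=1), transfer, rpo)
```

With a 30-minute RPO and a 5-minute transfer, this model accepts a 15-minute schedule and rejects an hourly one.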
Asynchronous replication use cases
Asynchronous replication is commonly preferred over synchronous replication in data backups, especially when the data is less sensitive or in situations where partial data loss can be tolerated. These backups can be local, but the technology is frequently used for off-site backups. Cloud backups also use asynchronous replication.
Some enterprise-grade hypervisors include an asynchronous replication feature that allows entire virtual machines to be replicated to a remote location so that VMs can fail over to that location in the event of a disaster. This is commonly referred to as instant recovery or recovery in place, and a number of backup software products support this functionality.
Asynchronous replication is also frequently used with storage snapshots for continuous data protection.