High Availability DRBD: Distributed Replicated Block Devices

Did you know that it is possible for your server to crash while your website remains online?

Highly reliable databases are critical for online services to keep functioning in the event of a catastrophe. Deploying dedicated High Availability databases therefore ensures they remain available even if one node crashes.

DRBD: A Highly Available Tool That Can Help

This is where tools such as Distributed Replicated Block Device (DRBD) come in, enabling automatic failover capabilities to prevent downtime.

With a Distributed Replicated Block Device, whenever new data is written to disk, the block device uses the network to replicate data to the second node.
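The write path described above can be pictured with a small toy model. This is only a sketch under loud assumptions: the `Node` and `ReplicatedDevice` classes are hypothetical names invented for illustration, the "network" is a direct function call, and none of this reflects DRBD's actual kernel implementation.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Node:
    """Hypothetical stand-in for one server's backing disk."""
    name: str
    blocks: Dict[int, bytes] = field(default_factory=dict)


class ReplicatedDevice:
    """Toy block device: every write lands locally and is mirrored to the peer."""

    def __init__(self, primary: Node, secondary: Node) -> None:
        self.primary = primary
        self.secondary = secondary

    def write(self, block_no: int, data: bytes) -> None:
        self.primary.blocks[block_no] = data   # local disk write
        self._send_to_peer(block_no, data)     # replication step

    def _send_to_peer(self, block_no: int, data: bytes) -> None:
        # Assumption: a real deployment would cross a network link here.
        self.secondary.blocks[block_no] = data


primary, secondary = Node("node-a"), Node("node-b")
device = ReplicatedDevice(primary, secondary)
device.write(0, b"hello")
device.write(7, b"world")

# Both nodes now hold the same blocks.
in_sync = primary.blocks == secondary.blocks
```

The point of the sketch is that the application only ever talks to the device; the copy to the second node happens underneath it.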

Through redundancy, businesses can protect themselves from downtime and financial loss, and get minimal to zero interruption during software and framework-related operations.

High Availability Database Hosting is ideal for mission-critical databases in sectors such as healthcare, government, eCommerce, big data, or SaaS. Complex infrastructures can be hard to manage, but DRBD improves resiliency and streamlines disaster recovery, which makes the investment worthwhile.

In traditional architectures, all it takes for hardware shutdown is for one component to crash.

In a High Availability environment, when a server crashes due to a hardware or software failure, the second server where all data has been replicated becomes active and takes over the workload. Thus, the hot spare ensures full redundancy and resilience.
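A rough sketch of that takeover decision is shown below. It is an illustrative model only: the `Server` class and `elect_active` function are hypothetical, and in a real cluster this decision is typically made by a cluster manager sitting alongside DRBD rather than by code like this.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Server:
    """Hypothetical view of one cluster member."""
    name: str
    healthy: bool = True
    role: str = "secondary"


def elect_active(servers: List[Server]) -> Server:
    """Return the server that should carry the workload.

    The current primary keeps its role while healthy; otherwise the
    first healthy standby is promoted.
    """
    current = next((s for s in servers if s.role == "primary"), None)
    if current is not None and current.healthy:
        return current
    if current is not None:
        current.role = "secondary"  # demote the failed primary
    for server in servers:
        if server.healthy:
            server.role = "primary"  # promote the hot spare
            return server
    raise RuntimeError("no healthy server left in the cluster")


node_a = Server("node-a", role="primary")
node_b = Server("node-b")

before = elect_active([node_a, node_b]).name  # node-a is still healthy
node_a.healthy = False                        # simulate a crash
after = elect_active([node_a, node_b]).name   # node-b takes over
```

Because the standby already holds the replicated data, promotion in this model is just a role change rather than a restore from backup.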

Different Types of Highly Available Storage

DRBD is open source software for Linux that runs at the kernel level and serves as a building block for High Availability clusters. It is a good fit for data clusters looking to replace low-capacity solid state drive (SSD) storage solutions.

Easily integrated into any infrastructure, including cloud, DRBD is used to mirror data, logical volumes, file systems, RAID devices (Redundant Array of Independent Disks), and block devices (HDD partitions) across the network to multiple servers in real time, through different types of replication. Each of the other hosts needs the same amount of free space on its disk partition as the primary node.

DRBD presents a virtual block device that keeps the two independent HDD partitions on the active and passive servers synchronized for read and write operations. When the hot standby takes over, there is little to no downtime because it already contains a copy of all data.

Remember, high availability is all about removing single points of failure from your infrastructure.

What is a Hot Standby or Secondary Node?

The hot standby is a backup server that removes a single point of failure by standing ready to take over. In active/passive mode, read and write operations (accessing or altering data) run only on the primary node.

An all-round tool, DRBD can add high availability to just about any application. DRBD can also work in an active/active environment, a popular approach to enabling load balancing in HA web cluster configurations. In this mode, both servers run simultaneously, so read and write operations are carried out on both at the same time. This setup is also known as shared-disk mode (DRBD's documentation calls it dual-primary mode) and typically requires a shared cluster file system.

DRBD is an enterprise-grade tool that simplifies the replacement of data storage and increases data availability.

DRBD supports both synchronous and asynchronous write operations, which will be further discussed below in relation to the three protocol setups.

In synchronous data replication, the application is notified that a write is complete only after it has been finalized on all hosts. In asynchronous replication, the application is notified as soon as the write is finalized locally, before it has propagated to the other hosts.

Primary and Secondary Nodes

Commonly, in a small-scale High Availability scenario with two redundant nodes, one is active (primary) and one is inactive (secondary). The secondary is also known as a hot standby, and it already has a copy of the data through the network mirroring and replication provided by DRBD.

Both nodes share a single IP configuration, so the hot spare can take over operations immediately in case of hardware failure. The switchover does not affect the High Availability databases, which remain available throughout.

How Does DRBD Replication Actually Work?

The DRBD architecture is made up of two separate parts that together provide high-availability storage: the kernel module, which implements DRBD's behavior, and the userspace administrative tools, which operate DRBD disks.

This architecture enables database mirroring and data replication through both synchronous and asynchronous write operations. DRBD is therefore a flexible virtual block device that can run one of three replication protocols, known as Protocol A, Protocol B, and Protocol C. Whichever protocol is used, replication is network transparent (invisible) to the applications using the device.

Protocol A is asynchronous replication, which can lose some data if a failover is forced. As previously explained, asynchronous replication means that a write on the primary node (in an active/passive setup) is considered complete once the local disk write has finished and the mirrored data has been placed in the TCP send buffer. This setup is most common when replicating stacked resources over a wide area network.
Protocol B is memory synchronous (semi-synchronous) replication. In this deployment, no data is usually lost in a failover. A write on the primary node is considered complete once the local disk write has finished and the replicated data has reached the second node. Completed writes may still be lost, however, if both nodes crash at the same time and the data storage on the primary node is destroyed. This protocol sits between Protocols A and C and is an example of how versatile DRBD's replication modes can be.
Protocol C is synchronous replication and is the most popular choice for production data replication. In this case, a write is considered complete only when both the local and the remote disk writes have been confirmed. DRBD is configured to use Protocol C by default; to change the protocol, the configuration file must be edited.
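The difference between the three protocols comes down to which events must have happened before a write is reported as complete. The sketch below encodes just that rule as described above. The `Protocol` enum, the `write_is_complete` function, and its flag names are all hypothetical and chosen for illustration; they are not part of DRBD.

```python
from enum import Enum


class Protocol(Enum):
    A = "asynchronous"
    B = "memory synchronous"
    C = "synchronous"


def write_is_complete(
    protocol: Protocol,
    local_disk_done: bool,
    in_send_buffer: bool,
    peer_received: bool,
    peer_disk_done: bool,
) -> bool:
    """Return True once the chosen protocol would report the write as done."""
    if protocol is Protocol.A:
        # Local write finished and data handed to the TCP send buffer.
        return local_disk_done and in_send_buffer
    if protocol is Protocol.B:
        # Local write finished and data has reached the peer node.
        return local_disk_done and peer_received
    # Protocol C: both local and remote disk writes are confirmed.
    return local_disk_done and peer_disk_done


# A write that is on the local disk and queued for sending,
# but has not yet arrived at the peer.
in_flight = dict(
    local_disk_done=True,
    in_send_buffer=True,
    peer_received=False,
    peer_disk_done=False,
)
results = {p.name: write_is_complete(p, **in_flight) for p in Protocol}
# results == {"A": True, "B": False, "C": False}
```

The same in-flight write counts as finished under Protocol A but not under B or C, which is exactly the window in which a forced failover could lose data.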

To confirm that the two hosts are indeed identical and that all data was replicated, DRBD exchanges hashes of the data rather than the data itself, which saves time and bandwidth.
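The bandwidth saving from comparing hashes can be illustrated with a short sketch. The block size, the choice of SHA-256, and the function names below are assumptions made for this example, not DRBD's actual settings or API. The sketch also assumes both devices are the same size.

```python
import hashlib
from typing import List

BLOCK_SIZE = 4096  # assumed block size for this example


def block_digests(device: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Hash each fixed-size block of a device image."""
    return [
        hashlib.sha256(device[i:i + block_size]).digest()
        for i in range(0, len(device), block_size)
    ]


def out_of_sync_blocks(local: bytes, remote_digests: List[bytes]) -> List[int]:
    """Compare local blocks against digests received from the peer."""
    local_digests = block_digests(local)
    return [
        index
        for index, (mine, theirs) in enumerate(zip(local_digests, remote_digests))
        if mine != theirs
    ]


# Three-block device on the local node; the peer differs in block 1.
local_image = bytes(BLOCK_SIZE * 3)
remote_image = bytearray(local_image)
remote_image[5000] = 1  # byte 5000 falls inside block 1

mismatched = out_of_sync_blocks(local_image, block_digests(bytes(remote_image)))
# mismatched == [1]
```

Each 4096-byte block is represented on the wire by a 32-byte digest in this model, so only blocks that actually differ would need to be resent.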

In “split brain” situations, in which node communication failures result in both hosts being mistakenly identified as the primary host, DRBD applies a recovery algorithm to keep storage from becoming inconsistent.
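To make the split-brain idea concrete, here is a deliberately simplified model of detection and of one possible recovery rule. Everything in it is an assumption for illustration: the `NodeState` fields, the function names, and the rule itself (keep the node that has changes when the other has none, otherwise defer to an administrator). It is not a description of DRBD's own algorithm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeState:
    """Hypothetical summary of what a node did while disconnected."""
    name: str
    was_primary_while_disconnected: bool
    blocks_changed: int


def is_split_brain(a: NodeState, b: NodeState) -> bool:
    """Both nodes acted as primary while they could not see each other."""
    return a.was_primary_while_disconnected and b.was_primary_while_disconnected


def pick_survivor(a: NodeState, b: NodeState) -> Optional[NodeState]:
    """Return the node whose data is kept, or None if a human must decide."""
    if a.blocks_changed == 0 and b.blocks_changed > 0:
        return b
    if b.blocks_changed == 0 and a.blocks_changed > 0:
        return a
    return None  # both changed (or neither did): no safe automatic choice


node_a = NodeState("node-a", True, 12)
node_b = NodeState("node-b", True, 0)

split = is_split_brain(node_a, node_b)    # True: both were primary
survivor = pick_survivor(node_a, node_b)  # node-a, the only one with changes
```

When both nodes have written data independently, this model returns `None` rather than guess, since discarding either side's writes automatically would risk data loss.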

Managed Hosting Can Help With Complex Infrastructures and DRBD High Availability

DRBD is workload agnostic and a great open-source tool, made up of a kernel module, several userspace management applications, and shell scripts. Because it is open source, organizations interested in DRBD virtual disks can alter the software to suit their needs and applications.

Managing a complex infrastructure is not a task many businesses want on their plate, but Liquid Web can build and manage custom hosting environments to ensure peak performance, reduce team effort spent on configuration, and achieve business objectives.

Not all companies have the proper resources to configure DRBD for their infrastructure; however, they can always rely on a managed service provider like Liquid Web to do the heavy lifting for High Availability, especially when its product offering includes enterprise-grade tools such as DRBD software and Heartbeat.