How to Sync Two CentOS 8 Servers Using File Replication

Introduction

All online businesses need to account for growth. As a business receives more visitors to its site, the underlying infrastructure needs to scale to provide the same level of performance that the visitors are accustomed to. Horizontal scaling, the addition of more servers rather than increasing the power of the existing servers, is an easy way to build our web servers’ ability to handle a more significant amount of traffic and protect us against hardware failure. Ensuring that the additional web servers have the same files and data is a potentially time-consuming and challenging task. Automating that task using free, open-source software, such as lsyncd, is a way to ensure that we have a safe, secure, and repeatable method of copying files from one server to another.

Prerequisites

This article assumes that we are utilizing two or more core-managed CentOS 8 servers running Apache in a default configuration. Installation of software may differ depending on the OS and default software configuration used. We also assume that there is a basic understanding of the functionality of Vim. Any text editor will work in place of Vim.

Software Used

This article explores a method of synchronizing data between two web servers running Apache utilizing open-source software called Lsyncd. To quote from the description on the GitHub page:

Lsyncd watches the local directory trees event monitor interface (inotify or fsevents). It aggregates and combines events for a few seconds and then spawns one (or more) process(es) to synchronize the changes. By default, this process uses rsync.

https://github.com/axkibe/lsyncd

This means that lsyncd runs in the background and tracks changes we make to files in a specified folder. It collects those changes over a short period and then processes them, using rsync over SSH to “push” the file changes to the remote servers. The main benefits for us are:

  • Lsyncd is free – it is free software that can be downloaded, configured, and used at no charge.
  • Setup is simple – we only need to install a single package, and although the configuration file is written in Lua, the syntax is straightforward.
  • Built on reliable technology – rsync and SSH are mature, widely used utilities that are readily available on virtually every Linux-based machine.
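
To make the rsync-over-SSH “push” described above more concrete, the following is a minimal sketch of a comparable manual command. It only prints the command rather than running it. The helper name push_cmd, the host alias dest_host, and the paths are placeholders chosen for illustration, and the flags Lsyncd actually passes to rsync may differ.

```shell
#!/usr/bin/env bash
# Illustrative only: print (do not run) an rsync-over-SSH command that is
# comparable to the push Lsyncd performs. push_cmd is a hypothetical helper;
# the host alias and paths are placeholders.

push_cmd() {
  local src="$1" host="$2" dest="$3"
  # -a        archive mode (recursive, keeps times, permissions, etc.)
  # -z        compress data in transit
  # --delete  remove files on the target that no longer exist in the source
  # -e ssh    use SSH as the transport
  printf 'rsync -az --delete -e ssh %s %s:%s\n' "$src" "$host" "$dest"
}

push_cmd /var/www/html/ dest_host /var/www/html/
```

Lsyncd essentially automates a command of this kind each time it has collected a batch of file system events.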

Because of the free usage rights and ease of setup, Lsyncd makes for a perfect utility to synchronize data across two or more hosts. Several examples of when we might want to synchronize data across hosts include:

  • Load-balancing incoming requests[1] – this works best when the traffic levels are relatively low (or intermittent), or new and modified content is not frequently accessed. 
  • High availability – keeping in mind that there are multiple aspects of high availability. Using lsyncd to push data to another host that can take over in the event of a hardware failure is an excellent use-case.
  • Live / Running backups – pushing changes to a second host as they happen is a great way to keep a running backup of the files and folders that have changed.

[1] – If we have a high traffic site, we are better off using a shared file system that our web nodes can access simultaneously.

Unfortunately, there are some drawbacks to using Lsyncd that we need to consider when determining if this is the best way to synchronize data across multiple servers. 

  • Lsyncd is not a real-time synchronization mechanism. The default timeframe for pushing changes is every 15 seconds (although this can be modified in the configuration settings if necessary).
  • Lsyncd is a one-way push-based utility. This means we have a master server where we can create or edit files, and then the master server “pushes” the changes to the attached slave nodes. 
  • Any changes made on a slave node are not picked up or shared with the master or other nodes. Additionally, not all changes are pushed out of the master. Files that are created, deleted, or have their content modified are pushed out; however, ownership and permission changes are not transmitted to the slave nodes.

Basic Configuration

Add EPEL Repo

To begin setting up Lsyncd, we need to add the software repository that contains the Lsyncd package. This is easily performed with the following command: 

[root@alt ~]# yum -y install epel-release

The installation will take a moment, and we will see a decent amount of output. Once it says “Complete!” we can move on.

[root@alt ~]# yum -y install epel-release
Loaded plugins: fastestmirror, langpacks, priorities
Loading mirror speeds from cached hostfile
1 packages excluded due to repository priority protections
Resolving Dependencies
--> Running transaction check
---> Package epel-release.noarch 0:7-11 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
=====================================================================
 Package Arch Version Repository Size
=====================================================================
Installing:
 epel-release noarch 7-11 system-extras 15 k
Transaction Summary
=====================================================================
Install 1 Package
Total download size: 15 k
Installed size: 24 k
Downloading packages:
epel-release-7-11.noarch.rpm | 15 kB 00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : epel-release-7-11.noarch 1/1
  Verifying : epel-release-7-11.noarch 1/1
Installed:
  epel-release.noarch 0:7-11
Complete!

Now, we need to make sure that the repository we just set up is enabled. To accomplish this, we want to review the repo file itself. 

[root@alt ~]# vim /etc/yum.repos.d/epel.repo

We then need to ensure that the repo is set to enabled=1.

[epel]
name=Extra Packages for Enterprise Linux 7 - $basearch
#baseurl=http://download.fedoraproject.org/pub/epel/7/$basearch
metalink=https://mirrors.fedoraproject.org/metalink?repo=epel-7&arch=$basearch
failovermethod=priority
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
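
As an alternative to opening the file in an editor, the same check can be scripted. The following is a rough sketch that reads the enabled value for one section of a repo file. The function name repo_enabled is hypothetical, and the sketch assumes the simple key=value layout shown above.

```shell
#!/usr/bin/env bash
# Sketch: succeed (exit 0) when the given section of a yum repo file has
# enabled=1. repo_enabled is a hypothetical helper name.

repo_enabled() {
  local file="$1" section="$2"
  awk -v want="[$section]" '
    # Track which [section] header we are currently under.
    /^\[/ { in_section = ($0 == want) }
    # Remember the value of the enabled= line inside the wanted section.
    in_section && /^enabled[ \t]*=/ {
      sub(/^[^=]*=[ \t]*/, "")
      value = $0
    }
    END { exit (value == "1" ? 0 : 1) }
  ' "$file"
}

repo_file="/etc/yum.repos.d/epel.repo"
if [ -f "$repo_file" ]; then
  if repo_enabled "$repo_file" epel; then
    echo "epel is enabled"
  else
    echo "epel is NOT enabled"
  fi
fi
```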

Install the Lsyncd Software

Now we can install the Lsyncd software using the following command.

[root@alt ~]# yum -y install lsyncd

This process will take a few moments to complete. Once we see “Complete!”, we can continue with the setup.

[root@alt ~]# yum -y install lsyncd
Loaded plugins: fastestmirror, langpacks, priorities
Loading mirror speeds from cached hostfile
epel/x86_64/metalink | 16 kB 00:00:00
 * epel: mirrors.liquidweb.com
epel | 5.4 kB 00:00:00
(1/3): epel/x86_64/group_gz | 90 kB 00:00:00
(2/3): epel/x86_64/updateinfo | 1.0 MB 00:00:00
(3/3): epel/x86_64/primary_db | 6.9 MB 00:00:00
166 packages excluded due to repository priority protections
Resolving Dependencies
--> Running transaction check
---> Package lsyncd.x86_64 0:2.2.2-1.el7 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
======================================================================
 Package Arch Version Repository Size
======================================================================
Installing:
 lsyncd x86_64 2.2.2-1.el7 epel 83 k
Transaction Summary
======================================================================
Install 1 Package
Total download size: 83 k
Installed size: 227 k
Downloading packages:
warning: /var/cache/yum/x86_64/7/epel/packages/lsyncd-2.2.2-1.el7.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID 352c64e5: NOKEY
Public key for lsyncd-2.2.2-1.el7.x86_64.rpm is not installed
lsyncd-2.2.2-1.el7.x86_64.rpm | 83 kB 00:00:00
Retrieving key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
Importing GPG key 0x352C64E5:
 Userid : "Fedora EPEL (7) <epel@fedoraproject.org>"
 Fingerprint: 91e9 7d7c 4a5e 96f1 7f3e 888f 6a2f aea2 352c 64e5
 Package : epel-release-7-11.noarch (@system-extras)
 From : /etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : lsyncd-2.2.2-1.el7.x86_64 1/1
  Verifying : lsyncd-2.2.2-1.el7.x86_64 1/1
Installed:
  lsyncd.x86_64 0:2.2.2-1.el7
Complete!

Configure SSH on Master

Now that the Lsyncd package is installed, we need to ensure the master host can push files to the slave hosts without requiring user intervention. We will accomplish this using SSH keys. For the purposes of this tutorial, we will assume that no other SSH keys are currently installed. To begin this process, we can create the SSH key with the following command.

root@alt [~]# ssh-keygen -t rsa

Alternatively, we can generate a stronger 4096-bit key and label it with the user, hostname, and creation time:

root@alt [~]# ssh-keygen -t rsa -b 4096 -C "$(whoami)@$(hostname)-$(date -u +%Y-%m-%d-%H:%M:%S%z)"

The SSH key creation process will ask several questions. For this tutorial, we will accept the defaults and leave the passphrase empty.

Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/username/.ssh/id_rsa.
Your public key has been saved in /home/username/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:rCwcRH+3vxiIsrkrwikDaE1UlTi8vr0g/wOfwSowCsw user@domain.com-2019-08-21-13:14:45+0000
The key's randomart image is:
+---[RSA 4096]----+
|. . . . |
| = o . . |
|o.= . . . . |
|oDo . . . . . |
|B. .. Y . |
|O+.. o . . |
|O++.o o . . . |
|=*. . ... . . o. |
|.o.=+.++. . . |
+----[SHA256]-----+
root@alt [~]#
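
If the key needs to be created from a script, the interactive prompts shown above can be answered on the command line. The following is a minimal sketch that creates the key pair only when it does not already exist. The helper name make_key is hypothetical, and the empty passphrase mirrors the choice made in this tutorial.

```shell
#!/usr/bin/env bash
# Sketch: create a 4096-bit RSA key pair non-interactively, but only if the
# key file is not already present. make_key is a hypothetical helper name.

make_key() {
  local keyfile="$1"
  if [ -f "$keyfile" ]; then
    echo "Key already exists: $keyfile"
    return 0
  fi
  # Create the parent directory (for example ~/.ssh) with tight permissions
  # if it is missing.
  mkdir -p -m 700 "$(dirname "$keyfile")"
  # -q     quiet
  # -N ""  empty passphrase, so Lsyncd can use the key unattended
  # -C     comment stored in the public key
  # -f     output file for the private key
  ssh-keygen -q -t rsa -b 4096 -N "" -C "$(whoami)@$(hostname)" -f "$keyfile"
}

# Example (uncomment to run as the user that will run Lsyncd):
# make_key "$HOME/.ssh/id_rsa"
```

Skipping the key generation when the file exists avoids overwriting a key that other hosts may already trust.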

After the SSH keys are generated, we will transfer the public key over to our slave host. This process will allow us to authenticate and access that host without needing to enter a password. We can transfer the key over with the following command.

root@alt [~]# ssh-copy-id root@opt.thisisnotadomain.com

Next, we will need to provide the password one time (since the SSH key is not yet in place). After that, we will be ready to use our new SSH key to access the slave.

[root@alt ~]# ssh-copy-id root@opt.thisisnotadomain.com
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'opt.thisisnotadomain.com (209.59.144.32)' can't be established.
ECDSA key fingerprint is SHA256:R+KfXlPf2mvWLCYs89sobGJZ/1IUsHvO9fne4/4EvJ0.
ECDSA key fingerprint is MD5:9d:bd:7d:d2:66:6d:cd:8b:d2:ba:dc:d5:bc:6a:02:71.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@opt.thisisnotadomain.com's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'root@opt.thisisnotadomain.com'"
and check to make sure that only the key(s) you wanted were added.
[root@alt ~]#

To ensure that lsyncd can utilize the SSH keys we created, we need to modify the SSH config file on the master to add a few snippets of information.

root@alt [~]# vim ~/.ssh/config

Edit the config file using vim, and add the following info.

Host dest_host
 Hostname 172.16.144.32
 User root
 IdentityFile ~/.ssh/id_rsa

As you can see, we created an entry that specifies the destination host, giving it a name (in this case, dest_host), the IP associated with the hostname (172.16.144.32), the user we’ll authenticate as (we are using root), and the location of the SSH private key on the master host (~/.ssh/id_rsa).
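
For repeatable setups, the same entry can be appended from a script instead of an editor. The sketch below adds the Host block only if an entry with that name is not already present. The function name add_ssh_host is hypothetical, and the demonstration writes to a temporary file rather than the real ~/.ssh/config.

```shell
#!/usr/bin/env bash
# Sketch: append a Host entry to an SSH config file unless one with the same
# alias already exists. add_ssh_host is a hypothetical helper name.

add_ssh_host() {
  local config="$1" alias="$2" ip="$3" user="$4" key="$5"
  mkdir -p "$(dirname "$config")"
  touch "$config"
  chmod 600 "$config"
  # Do nothing if the alias is already defined.
  if grep -qE "^Host[[:space:]]+${alias}\$" "$config"; then
    return 0
  fi
  printf 'Host %s\n  Hostname %s\n  User %s\n  IdentityFile %s\n' \
    "$alias" "$ip" "$user" "$key" >> "$config"
}

# Demonstration against a temporary file; point the first argument at
# ~/.ssh/config to change the real configuration.
demo_config="$(mktemp)"
add_ssh_host "$demo_config" dest_host 172.16.144.32 root '~/.ssh/id_rsa'
cat "$demo_config"
rm -f "$demo_config"
```

Once the entry is in place, running ssh -o BatchMode=yes dest_host true is one way to confirm that the key-based login works without a password prompt.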

Configure Lsyncd on Master

Next, we need to modify the Lsyncd configuration file. We will need to specify the following settings:

  • General log file location
  • Status log file location
  • Frequency to write status file

We also need to define the following specific settings to sync the data:

  • Method of synchronization
  • Source folder for files we want to sync (we are using /var/www/html)
  • Destination host (we are using dest_host, the name defined in our SSH config)
  • The target folder for files we want to sync (/var/www/html)

To begin, edit the config file using the vim command.

root@alt [~]# vim /etc/lsyncd.conf

From there, we modify the configs to match the parameters noted above.
Things to keep in mind:

  • We are using the defaults for most of these settings. We have also increased the statusInterval option to write more frequently. The default is 10 seconds, but we have chosen 1 second.
  • The host portion of the “target=” option (the part before the colon) should be the name we gave the host above when editing our SSH config file. In this case, it is dest_host.
  • The configuration file is written in Lua. Be mindful of spacing, commas, and quotation marks. That formatting is needed and serves a purpose.

settings {
    logfile = "/var/log/lsyncd/lsyncd.log",
    statusFile = "/var/log/lsyncd/lsyncd.status",
    statusInterval = 1
}

-- Slave server configuration

sync {
    default.rsync,
    source = "/var/www/html/",
    target = "dest_host:/var/www/html/",

    rsync = {
        compress = true,
        acls = true,
        verbose = true,
        owner = true,
        group = true,
        perms = true,
        rsh = "/usr/bin/ssh -p 22 -o StrictHostKeyChecking=no"
    }
}
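
Lsyncd accepts more than one sync block, so pushing to additional slave nodes is a matter of repeating the block once per target. As a rough sketch, the stanzas could be generated with a small shell helper. The name sync_stanza and the second alias dest_host2 are hypothetical, and the sketch assumes each target is an alias defined in the SSH config and that /var/www/html/ is the synced folder.

```shell
#!/usr/bin/env bash
# Sketch: print one Lsyncd "sync" stanza per target host. sync_stanza is a
# hypothetical helper; the host names are SSH config aliases (dest_host2 is
# invented for illustration).

sync_stanza() {
  local host="$1" dir="$2"
  cat <<EOF
sync {
    default.rsync,
    source = "${dir}",
    target = "${host}:${dir}",
    rsync = {
        compress = true,
        rsh = "/usr/bin/ssh -p 22 -o StrictHostKeyChecking=no"
    }
}
EOF
}

for host in dest_host dest_host2; do
  sync_stanza "$host" /var/www/html/
done
```

The printed stanzas can be reviewed and then pasted below the settings block in /etc/lsyncd.conf.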

Now that Lsyncd is installed and configured, and our SSH keys are in place to allow password-free authentication, we can run the following commands to start and enable the service.

[root@alt lsyncd]# systemctl start lsyncd
[root@alt lsyncd]# systemctl enable lsyncd
Created symlink from /etc/systemd/system/multi-user.target.wants/lsyncd.service to /usr/lib/systemd/system/lsyncd.service.
[root@alt lsyncd]#

Now that Lsyncd is running, we can check the status log to verify that it is monitoring the folder for changes.

root@alt [~]# cd /var/log/lsyncd
[root@alt lsyncd]# tail lsyncd.status
Lsyncd status report at Thu Feb 6 10:07:37 2020

Sync1 source=/var/www/html/
There are 0 delays
Excluding:
  Nothing.

Inotify watching 1 directories
  1: /var/www/html/
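
The status report above can also be checked from a script, for example in a monitoring job. The following sketch totals the “There are N delays” lines, which indicate changes still waiting to be pushed. The function name pending_delays is hypothetical, and the parsing assumes the report format shown above.

```shell
#!/usr/bin/env bash
# Sketch: sum the pending delays reported in an Lsyncd status file.
# pending_delays is a hypothetical helper; it assumes lines of the form
# "There are N delays" as in the report above.

pending_delays() {
  awk '/^There are [0-9]+ delays/ { total += $3 } END { print total + 0 }' "$1"
}

status_file="/var/log/lsyncd/lsyncd.status"
if [ -r "$status_file" ]; then
  echo "Pending delays: $(pending_delays "$status_file")"
fi
```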

Next, we can confirm that files are being synced from the master to the slave server. Prior to starting the lsyncd service, these were the contents of the /var/www/html folder on each server:

[root@alt ~]# cd /var/www/html
[root@alt html]# ll
total 20
drwxr-xr-x 2 root root 4096 Feb 6 09:45 .
drwxr-xr-x 4 root root 4096 Feb 1 03:20 ..
-rw-r--r-- 1 root root 420 Feb 6 09:45 index.html
-rw-r--r-- 1 root root 7528 Feb 6 09:45 styles.css
[root@alt html]#

[root@opt ~]# cd /var/www/html
[root@opt html]# ll
total 0
[root@opt html]#

After a few moments, the Lsyncd service will pick up the changes in the folder on alt and compare it to the destination folder on opt. If differences are found, it will push those file modifications to the slave node. We can review the primary Lsyncd log to verify that the transfer has occurred and which files were transferred.

[root@alt ~]# cd /var/log/lsyncd
[root@alt lsyncd]# cat lsyncd.log

Tue Feb 11 08:07:28 2020 Normal: Rsyncing list
/
/index.html
/styles.css
Tue Feb 11 08:07:28 2020 Normal: Finished (list): 0
[root@alt lsyncd]#
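
To check the result of the most recent transfer without reading the whole log, the trailing number on the “Finished (list)” line can be extracted; in the example above it is 0. This is only a sketch: the function name last_rsync_exit is hypothetical, and it assumes the log line format shown above, which may differ between Lsyncd versions.

```shell
#!/usr/bin/env bash
# Sketch: print the number at the end of the last "Finished (list): N" line
# in an Lsyncd log. last_rsync_exit is a hypothetical helper and assumes the
# line format shown in the log excerpt above.

last_rsync_exit() {
  awk '
    /Finished \(list\): [0-9]+$/ { code = $NF }
    END {
      if (code == "") exit 1   # no matching line found
      print code
    }
  ' "$1"
}

log_file="/var/log/lsyncd/lsyncd.log"
if [ -r "$log_file" ]; then
  if code="$(last_rsync_exit "$log_file")"; then
    echo "Last transfer finished with code: $code"
  else
    echo "No finished transfers found in $log_file"
  fi
fi
```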

In reviewing the folder on our slave node, we can see the transferred files are now in the destination directory.

[root@opt ~]# hostname
opt.thisisnotadomain.com
[root@opt ~]# ll /var/www/html
total 12
-rw-r--r-- 1 root root 420 Feb 6 09:45 index.html
-rw-r--r-- 1 root root 7528 Feb 6 09:45 styles.css
[root@opt ~]#
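
Beyond comparing directory listings by eye, the contents on both hosts can be compared by checksum. The following sketch builds a sorted list of SHA-256 checksums for a folder; running it on each server and comparing the output shows whether the files match. The function name dir_manifest is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print a sorted checksum manifest for every regular file under a
# directory. dir_manifest is a hypothetical helper name.

dir_manifest() {
  # Run in a subshell so the caller's working directory is unchanged.
  ( cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2 )
}

if [ -d /var/www/html ]; then
  dir_manifest /var/www/html
fi

# One possible comparison from the master, assuming the dest_host alias from
# the SSH config and bash on both hosts:
# diff <(dir_manifest /var/www/html) \
#      <(ssh dest_host "$(declare -f dir_manifest); dir_manifest /var/www/html")
```

An empty diff means that every file on the slave has the same path and content as on the master.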

Conclusion

Now that the service is running and enabled, Lsyncd will start whenever the system is rebooted, continuously monitor for changes, and push them across to the slave node. This method is simple, easily configured, and works well for specific use cases. Because changes are pushed after a short delay rather than instantly, it is not suited to workloads that depend on real-time replication. However, for most people who are not running a high-traffic load-balanced setup, or who simply need a secondary copy of files on another machine, the benefits far outweigh the disadvantages.