UniConfig High Availability Cluster

To make this setup work, the following three components are needed:

  • Haproxy

  • Keepalived

  • UniConfig

Note

You should be able to find the first two in the package manager of your Linux distribution. This setup was tested on Ubuntu 18.04. If you use an older distribution, you might find that some of the required features are not available in the package manager's version of Keepalived and/or Haproxy. In that case, you might need to build these from source; make sure you use the latest version.
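
To check which version your distribution has installed, you can ask both tools directly:

haproxy -v
keepalived --version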

To see how to download, install and run UniConfig, follow these instructions.

We’ll cover two scenarios.

For both cases, the assumption will be that each Haproxy instance and each UniConfig node runs in a separate VM, meaning that we’ll have 3 VMs in the first scenario and 4 VMs in the second.

1 Haproxy and 2 UniConfig nodes scenario

In this first scenario, we’ll have one Haproxy instance connected to two UniConfig nodes. The Haproxy instance will continuously check the availability of the UniConfig nodes, and if one of them stops responding, it will start sending the requests to the other UniConfig node.

1 Haproxy

The Haproxy configuration is going to be the following:

defaults
 mode http
 option http-server-close
 timeout client 20s
 timeout server 20s
 timeout connect 4s

frontend frontend_app
 bind HAPROXY_IP:HAPROXY_PORT name app
 default_backend backend_app

backend backend_app
 stick-table type ip size 1
 stick on src
 server s1 UNICONFIG_1_IP:8181 check
 server s2 UNICONFIG_2_IP:8181 check backup

Populate the fields HAPROXY_IP, HAPROXY_PORT, UNICONFIG_1_IP and UNICONFIG_2_IP with the appropriate values from your infrastructure. Save this to a file called /etc/haproxy/haproxy.cfg.
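
Before starting the service, you can let Haproxy validate the configuration file with its check mode:

haproxy -c -f /etc/haproxy/haproxy.cfg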

After that, when both of your UniConfig nodes are running, start Haproxy with service haproxy start or systemctl start haproxy.service, depending on your service manager.
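
To verify that Haproxy is up and listening on the frontend, you can check the service status and the listening sockets (substitute the HAPROXY_PORT value you chose above):

systemctl status haproxy.service
ss -ltn | grep HAPROXY_PORT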

Now, by default all requests will be forwarded to server s1, which is the UniConfig at the UNICONFIG_1_IP address. If you stop it or block the connection to this node by other means, Haproxy will start forwarding all requests to the UNICONFIG_2_IP address and will stick to sending requests to this UniConfig node even if the UniConfig on UNICONFIG_1_IP comes back to life again.
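
You can observe this behaviour with any HTTP client. The sketch below only checks that requests are answered through Haproxy; the exact URL path and any required credentials depend on your UniConfig setup:

# send a request through Haproxy; by default it is served by s1 (UNICONFIG_1_IP)
curl -s -o /dev/null -w "%{http_code}\n" http://HAPROXY_IP:HAPROXY_PORT/

# stop UniConfig on UNICONFIG_1_IP (or block the connection) and repeat;
# the request should now be served by s2 (UNICONFIG_2_IP)
curl -s -o /dev/null -w "%{http_code}\n" http://HAPROXY_IP:HAPROXY_PORT/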

2 Haproxy with floating IP and 2 UniConfig nodes scenario

In this second scenario, we’ll have an additional Haproxy instance. Both of the Haproxy instances are going to be connected to both UniConfig instances. The information about which UniConfig node is in active use is shared among the Haproxy instances via stick tables. There will always be only one active Haproxy instance, and in case it fails, the other one takes over. This is managed by Keepalived, which uses the VRRP protocol for failover management.

2 Haproxy

The steps are going to be similar to the first scenario, but the two Haproxy instances will have a slightly modified configuration, and we’ll also need to configure Keepalived on the VMs that run Haproxy. In addition, Haproxy will be reachable through a virtual IP address. In case of a failover, when, for example, Haproxy instance 1 becomes unavailable, Haproxy instance 2 assigns the virtual IP address to itself and announces this via a Gratuitous ARP reply.

First, the Haproxy configuration:

peers LB
 peer LB1 HAPROXY_1_IP:10001
 peer LB2 HAPROXY_2_IP:10002

defaults
 mode http
 option http-server-close
 timeout client 20s
 timeout server 20s
 timeout connect 4s

frontend frontend_app
 bind HAPROXY_{1|2}_IP:HAPROXY_PORT name app
 default_backend backend_app

backend backend_app
 stick-table type ip size 1 peers LB
 stick on src
 server s1 UNICONFIG_1_IP:8181 check
 server s2 UNICONFIG_2_IP:8181 check backup

The HAPROXY_1_IP and HAPROXY_2_IP addresses should be populated with the private IP addresses of these two VMs. The UNICONFIG_1_IP and the UNICONFIG_2_IP should be populated as in the first scenario, with the IP addresses of the VMs that run UniConfig.

The HAPROXY_{1|2}_IP field should be populated with HAPROXY_1_IP on the VM where the first Haproxy instance is running and with HAPROXY_2_IP on the VM where the second one is running. The HAPROXY_PORT can be any port of your choosing on which Haproxy will be reachable.

The virtual IP address, on which the active Haproxy instance is going to be reachable, appears in the Keepalived configuration below as HAPROXY_PUB_IP. This is the IP address that the BACKUP Haproxy instance assigns to itself in case of a failover.
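
To verify that the two Haproxy instances have connected to each other and are synchronizing the stick table defined in backend_app, you can query the table over the Haproxy runtime API. This is a minimal sketch and assumes you have added a stats socket line (not shown above) to the global section of haproxy.cfg:

# assumes e.g. "stats socket /var/run/haproxy/admin.sock mode 600 level admin" in the global section
echo "show table backend_app" | socat stdio UNIX-CONNECT:/var/run/haproxy/admin.sock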

Let’s move on to Keepalived. There will be two slightly different configuration files, one used for the MASTER Haproxy instance and the other one for the BACKUP.

This is what the Keepalived configuration file for the MASTER Haproxy instance looks like:

global_defs {
    script_user root            # run the check and notify scripts as root
    enable_script_security      # do not run scripts if any part of their path is writable by a non-root user
}

vrrp_script check_haproxy {
    script "killall -0 haproxy" # exits 0 as long as a haproxy process is running
    interval 2                  # run the check every 2 seconds
    weight 2                    # add 2 to the priority while the check succeeds
}

vrrp_instance VI_1 {
    interface eth0              # interface used for VRRP (adjust to your environment)
    state MASTER
    virtual_router_id 51        # must be the same on the MASTER and the BACKUP
    priority 101                # higher than the BACKUP's priority

    virtual_ipaddress {
        HAPROXY_PUB_IP          # the virtual (floating) IP address
    }

    track_script {
        check_haproxy
    }

    notify /run-on-status-change.sh

}

The Keepalived configuration file for the BACKUP Haproxy instance looks like this:

global_defs {
    script_user root
    enable_script_security
}

vrrp_script check_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
}

vrrp_instance VI_1 {
    interface eth0
    state BACKUP
    virtual_router_id 51
    priority 100

    virtual_ipaddress {
        HAPROXY_PUB_IP
    }

    track_script {
        check_haproxy
    }

    notify /run-on-status-change.sh

}

As you can see, the only differences are the values of the state and priority fields. The notify field is optional; we’ll see how it can be used a little later.

These config files should be saved as /etc/keepalived/keepalived.conf on both VMs that will run Haproxy.

Then, Haproxy needs to be started with an additional parameter on both VMs. On the VM with the first Haproxy instance, execute haproxy -L LB1, and on the other one, execute haproxy -L LB2. The -L parameter sets the local peer name, so that each instance can identify itself in the peers section of the configuration.
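
For example, if you start Haproxy manually with the configuration file from above, the full commands could look like this (if you start Haproxy through your service manager instead, the -L parameter has to be added to its start command):

# on the VM running the first Haproxy instance
haproxy -D -f /etc/haproxy/haproxy.cfg -L LB1

# on the VM running the second Haproxy instance
haproxy -D -f /etc/haproxy/haproxy.cfg -L LB2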

Now, to run Keepalived, execute /etc/init.d/keepalived start on both the MASTER and the BACKUP machine.
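
Once Keepalived is running, the virtual IP address should be assigned to the MASTER machine. A quick way to check, assuming the eth0 interface from the configuration above:

# run on the MASTER VM; the HAPROXY_PUB_IP address should be listed on eth0
ip addr show eth0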

Recap

Now everything should be working. The two VMs exchange stick table information to stay in sync about which UniConfig node requests should be sent to. The MASTER Keepalived instance checks whether Haproxy is running and informs the BACKUP instance about the status. As long as the connection between the Keepalived instances is intact and Haproxy is running fine, the BACKUP instance does nothing.

Should there be a problem with either the connection or the Haproxy process itself on the MASTER instance, the BACKUP instance kicks in, assigns the virtual IP address to itself and notifies the rest of the network about this fact via a Gratuitous ARP reply, which causes all requests to be forwarded to the BACKUP instance.

If the MASTER instance’s connection and/or Haproxy process comes back to life and the BACKUP node starts receiving status information from the MASTER instance again, it unassigns the virtual IP from itself and lets the MASTER take over. This works through the election process between the two instances and is resolved via VRRP through the value of the priority field in the config file: the instance with the higher priority value takes precedence.
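
To test the failover manually, you can stop Haproxy on the MASTER machine and watch the virtual IP move to the BACKUP machine. A rough sketch:

# on the MASTER VM: simulate a failure
systemctl stop haproxy.service

# on the BACKUP VM: after a few seconds the virtual IP should show up
ip addr show eth0

# on the MASTER VM: recover; the MASTER reclaims the virtual IP
systemctl start haproxy.service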

Notify script

Now for the notify field. Keepalived allows you to take action on a status change and lets you run an arbitrary script. As a simple example, you could define the script as follows:

#!/bin/bash

# record when the state transition happened and which state this node moved to
TIMESTAMP=$(date)
TRANSITION=$3   # third argument passed by Keepalived: the target state

echo "I'm $TRANSITION : $TIMESTAMP" > /status-change.txt

According to the documentation, the script defined in notify is executed with four additional arguments:

# arguments provided by Keepalived:
#   $(n-3) = "GROUP"|"INSTANCE"
#   $(n-2) = name of the group or instance
#   $(n-1) = target state of transition (stop only applies to instances)
#            ("MASTER"|"BACKUP"|"FAULT"|"STOP")
#   $(n)   = priority value
#   $(n-3) and $(n-1) are ALWAYS sent in uppercase

By this logic, you can save the above script as /run-on-status-change.sh and see how the status changes between the MASTER and the BACKUP node. Any time a status change occurs, you’ll be able to see that information in the /status-change.txt file.

The script /run-on-status-change.sh is placed in the root directory in this example, because Keepalived expects the script’s whole path to be writable only by the root user.
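
One way to satisfy that requirement is to make the script owned by root and writable (and executable) only by root:

chown root:root /run-on-status-change.sh
chmod 700 /run-on-status-change.sh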

Note

If you do not want to use such a script, delete the whole line from the config files.

To verify that the failover is working correctly without using the example script, you can take a look at the logs produced by Keepalived.
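
For example, on a systemd-based system you can follow the Keepalived logs while you stop and start Haproxy on the MASTER machine; the state transitions are logged there as well:

journalctl -u keepalived -f

# or, if Keepalived logs to syslog
grep -i keepalived /var/log/syslog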