Jenkins High Availability: Strengthened Active-Passive Architecture with HAProxy and NFS
Introduction
This guide sets up Jenkins in an active-passive configuration on Ubuntu 22.04. An NFS server provides shared storage for the Jenkins home directory, Jenkins runs as a service on both the active and the passive node, and HAProxy handles failover between them. The steps below cover installing and configuring NFS, installing Jenkins on both nodes, and configuring the active-passive failover in HAProxy.
Architecture
- 192.168.1.40 -> NFS Server
- 192.168.1.41 -> NFS Client
- 192.168.1.10 -> Jenkins Active Node
- 192.168.1.7 -> Jenkins Passive Node
- 192.168.1.6 -> HAProxy
- DNS name -> deniz.devopskings.com.tr
Step 1: Create User and Group for NFS Server
1. Create a User Group for NFS: First, create a group for managing NFS shares.
sudo groupadd nfsnobody
2. Create a User for NFS: Then, create a user associated with this group.
sudo useradd -g nfsnobody nfsnobody
3. Grant sudo rights to the NFS user (optional): Add the user to the sudo group only if it needs administrative access.
sudo usermod -aG sudo nfsnobody
4. Verify the user and group: To check that the user and group have been created and configured correctly, run:
id nfsnobody
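The same verification can be scripted. The following is a minimal sketch, not part of the original setup: the helper name user_in_group is hypothetical, and it only wraps the standard `id -nG` call.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if user $1 is a member of group $2.
user_in_group() {
    local user="$1" group="$2"
    # id -nG prints the user's group names separated by spaces.
    id -nG "$user" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$group"
}

# Check the account created above (names taken from this guide).
if user_in_group nfsnobody nfsnobody; then
    echo "nfsnobody is in group nfsnobody"
else
    echo "nfsnobody is missing or not in group nfsnobody"
fi
```

If step 3 was applied, `user_in_group nfsnobody sudo` should succeed as well.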
Step 2: NFS Server Setup (192.168.1.40)
1. Install the NFS server on the machine (192.168.1.40)
# update OS
sudo apt update && sudo apt upgrade -y
# Install
sudo apt install -y nfs-kernel-server
2. Create the shared directory that will be used for Jenkins data
# Create Directory: Creates the /srv/nfs/jenkins directory for NFS sharing.
sudo mkdir -p /srv/nfs/jenkins
# Set Ownership: Assigns the nfsnobody user and group to the directory.
sudo chown -R nfsnobody:nfsnobody /srv/nfs/jenkins
# Set Permissions: Gives the owner full access, and others read and execute access to the directory.
sudo chmod 755 /srv/nfs/jenkins
3. Configure exports by adding the following to /etc/exports
/srv/nfs/jenkins 192.168.1.41(rw,sync,no_root_squash) 192.168.1.10(rw,sync,no_root_squash) 192.168.1.7(rw,sync,no_root_squash)
4. Start and enable the NFS service, then check its status
# Start NFS: Starts the NFS server
sudo systemctl start nfs-server
# Enable NFS: Ensures NFS starts on boot
sudo systemctl enable nfs-server
# Check NFS: Shows the current status of the NFS server
sudo systemctl status nfs-server
5. Export the file system:
sudo exportfs -a
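The export line in step 3 repeats the same options for every client, which is easy to mistype. As a sketch only, the hypothetical helper build_exports_line below generates that line from a path, an option string, and a list of client IPs (all values are the ones used in this guide).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one /etc/exports line in the form
#   <path> <client>(<opts>) <client>(<opts>) ...
build_exports_line() {
    local export_path="$1" opts="$2" client line
    shift 2
    line="$export_path"
    for client in "$@"; do
        line+=" ${client}(${opts})"
    done
    printf '%s\n' "$line"
}

# Reproduces the line from step 3 for the three clients of this guide.
build_exports_line /srv/nfs/jenkins rw,sync,no_root_squash \
    192.168.1.41 192.168.1.10 192.168.1.7
```

The output can be piped to `sudo tee -a /etc/exports`, followed by `sudo exportfs -ra` to re-export everything.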
Step 3: NFS Client Setup (192.168.1.41, 192.168.1.10, 192.168.1.7)
1. Install the NFS client on each machine
# update OS
sudo apt update && sudo apt upgrade -y
# Install the client package (nfs-kernel-server is only needed on the NFS server)
sudo apt install -y nfs-common
2. Create the mount point on each machine
sudo mkdir -p /var/lib/jenkins
3. Give all users full permissions on /var/lib/jenkins (on 192.168.1.41, 192.168.1.10 and 192.168.1.7)
sudo chmod 777 /var/lib/jenkins
4. Mount the NFS share from the NFS server
sudo mount 192.168.1.40:/srv/nfs/jenkins /var/lib/jenkins
5. Add the NFS share to /etc/fstab for automatic mounting on reboot
# Add to /etc/fstab: Auto-mount NFS share to /var/lib/jenkins at boot
# (sudo tee already writes as root, so switching to the root user first is not needed)
echo "192.168.1.40:/srv/nfs/jenkins /var/lib/jenkins nfs defaults 0 0" | sudo tee -a /etc/fstab
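Running the append command twice leaves a duplicate entry in /etc/fstab. One way to guard against that, sketched here with a hypothetical helper named add_line_once, is to append the line only when an identical one is not already present. The demonstration writes to a scratch file rather than the real /etc/fstab.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append ENTRY to FILE only if an identical line
# is not already there.
add_line_once() {
    local file="$1" entry="$2"
    if grep -qxF -- "$entry" "$file" 2>/dev/null; then
        echo "already present"
    else
        printf '%s\n' "$entry" >> "$file"
        echo "added"
    fi
}

# Demonstration against a scratch file instead of /etc/fstab.
demo_fstab="$(mktemp)"
entry="192.168.1.40:/srv/nfs/jenkins /var/lib/jenkins nfs defaults 0 0"
add_line_once "$demo_fstab" "$entry"   # prints: added
add_line_once "$demo_fstab" "$entry"   # prints: already present
```

To use it on the real file, the function would have to run as root, for example from a script started with sudo.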
Step 4: Jenkins Installation on 192.168.1.10 (Active) and 192.168.1.7 (Passive)
The prerequisite is Java 17, so install Java before proceeding with the further steps.
1. Update Ubuntu 22.04 LTS: Begin by opening a terminal and updating your system's package list to ensure you have the latest package information.
# OS update
sudo apt update
2. Install Java 17 on Ubuntu 22.04 LTS: Java 17 may not be installed by default on Ubuntu 22.04 LTS, so install it from OpenJDK, the open-source Java Development Kit. Run the following command to install Java 17.
# Install
sudo apt install -y openjdk-17-jdk
3. Verify Java Installation on Ubuntu 22.04 LTS: Once the installation is complete, verify that Java 17 has been installed correctly by running the following command.
# check
java -version
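For an automated check, the major version can be extracted from the `java -version` text. This is a sketch under the assumption that the version appears as the first double-quoted token, as in `openjdk version "17.0.9"`; the function name java_major and the sample string are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the major Java version found in the text
# produced by `java -version`. Handles the modern scheme ("17.0.9")
# and the legacy one ("1.8.0_392"). Returns 1 if no version is found.
java_major() {
    local text="$1" ver
    # Take the first double-quoted token, e.g. 17.0.9
    ver="$(printf '%s\n' "$text" | sed -n 's/^[^"]*"\([^"]*\)".*/\1/p' | head -n 1)"
    case "$ver" in
        "")  return 1 ;;
        1.*) ver="${ver#1.}"; echo "${ver%%[._]*}" ;;
        *)   echo "${ver%%[.+_-]*}" ;;
    esac
}

# Sample text for illustration; on a real node pass "$(java -version 2>&1)"
# instead, because java prints the version on stderr.
sample='openjdk version "17.0.9" 2023-10-17'
if [ "$(java_major "$sample")" = "17" ]; then
    echo "Java 17 detected"
fi
```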
4. Install Jenkins on both machines
# Download Jenkins Key: Downloads the Jenkins key file to verify the package’s authenticity.
sudo wget -O /usr/share/keyrings/jenkins-keyring.asc \
https://pkg.jenkins.io/debian-stable/jenkins.io-2023.key
# Add Jenkins Repository: Adds the Jenkins repository to your system's package sources.
echo "deb [signed-by=/usr/share/keyrings/jenkins-keyring.asc]" \
https://pkg.jenkins.io/debian-stable binary/ | sudo tee \
/etc/apt/sources.list.d/jenkins.list > /dev/null
# Update Package List: Updates your system's package list to include the Jenkins repository
sudo apt-get update
# Install Jenkins: Installs Jenkins on your system.
sudo apt-get install -y jenkins
5. Give all users full permissions on /var/lib/jenkins (on 192.168.1.41, 192.168.1.10 and 192.168.1.7)
sudo chmod 777 /var/lib/jenkins
6. Start and enable Jenkins on both machines, then check its status
# Start Jenkins: Starts Jenkins
sudo systemctl start jenkins
# Enable Jenkins: Auto-start on boot
sudo systemctl enable jenkins
# Check Jenkins: Shows its status
sudo systemctl status jenkins
7. Check the most important part: the NFS server and clients.
Troubleshooting_1: On the active and passive machines where you installed Jenkins, check the directory with the "ls -ld /var/lib/jenkins" command. If it shows a user or group other than jenkins, change the ownership with the chown command.
sudo chown -R jenkins:jenkins /var/lib/jenkins
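The ownership check can also be done in a script before deciding whether chown is needed. The helpers owner_of and check_owner below are hypothetical names, and the sketch assumes GNU stat, which is what Ubuntu ships.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "user:group" for a path (GNU stat syntax).
owner_of() {
    stat -c '%U:%G' "$1"
}

# Hypothetical helper: return 0 when TARGET is owned by EXPECTED
# ("user:group"), otherwise report the mismatch and return 1.
check_owner() {
    local target="$1" expected="$2" actual
    actual="$(owner_of "$target")" || return 1
    if [ "$actual" = "$expected" ]; then
        echo "OK: $target is owned by $actual"
    else
        echo "MISMATCH: $target is owned by $actual, expected $expected"
        return 1
    fi
}

# Intended use on a Jenkins node:
#   check_owner /var/lib/jenkins jenkins:jenkins \
#       || sudo chown -R jenkins:jenkins /var/lib/jenkins
```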
Step 5: HAProxy Configuration (Active-Passive Setup)
1. Install HAProxy on a separate machine (192.168.1.6)
# OS update
sudo apt update
# Install
sudo apt install -y haproxy
2. Configure HAProxy to manage the active-passive setup by editing /etc/haproxy/haproxy.cfg
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon
    # Tune settings for performance and security
    maxconn 4096
    ssl-server-verify none

defaults
    log global
    # Run every proxy in HTTP mode so that "option httplog" applies to the jenkins frontend too
    mode http
    option httplog
    option dontlognull
    timeout connect 5s
    timeout client 30s
    timeout server 30s

frontend stats
    mode http
    bind *:7000
    stats enable
    stats uri /stats
    stats refresh 10s
    stats admin if LOCALHOST
    # Replace admin:admin with your own credentials
    stats auth admin:admin

frontend jenkins
    bind *:80
    default_backend jenkins_back

backend jenkins_back
    mode http
    balance first
    # Health check: a node counts as up when GET /login returns HTTP 200
    option httpchk GET /login
    http-check expect status 200
    server jenkins_active 192.168.1.10:8080 check
    server jenkins_passive 192.168.1.7:8080 check backup
balance first and backup: With balance first, traffic goes to the first available server in the list (jenkins_active). Because jenkins_passive is marked as backup, it receives traffic only when the active server fails its health check and is marked as down.
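The routing decision described above can be mimicked by hand when debugging. The sketch below is a simplification: HAProxy marks a server down only after several consecutive failed checks, while this code looks at a single probe. The names choose_backend and probe_login are hypothetical, and port 8080 and the /login path are the ones from the configuration in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given the HTTP status each node returned for
# GET /login, print which server would receive traffic.
choose_backend() {
    local active_status="$1" passive_status="$2"
    if [ "$active_status" = "200" ]; then
        echo "jenkins_active"
    elif [ "$passive_status" = "200" ]; then
        echo "jenkins_passive"
    else
        echo "none"
        return 1
    fi
}

# Hypothetical helper: probe a node the way the health check does and
# print the HTTP status code (curl reports 000 if the connection fails).
probe_login() {
    curl -s -o /dev/null -m 5 -w '%{http_code}' "http://$1:8080/login"
}

# Example with fixed status codes: active node down, passive node healthy.
choose_backend 503 200   # prints: jenkins_passive
```

On the HAProxy machine, the real decision could be approximated with `choose_backend "$(probe_login 192.168.1.10)" "$(probe_login 192.168.1.7)"`.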
3. Start and enable HAProxy, then check its status
# Start HAProxy: Starts HAProxy
sudo systemctl start haproxy
# Enable HAProxy: Auto-start on boot
sudo systemctl enable haproxy
# Check HAProxy: Shows its status
sudo systemctl status haproxy
4. Verifying Active-Passive Failover
- Test by stopping Jenkins on 192.168.1.10 to confirm that 192.168.1.7 takes over as the active instance.
- HAProxy will automatically route traffic to the active node, ensuring data integrity and high availability.
Now, let’s check the DNS and go to the address deniz.devopskings.com.tr in the browser.
Success!
- Now do the opposite. With Jenkins running again on 192.168.1.10, stop Jenkins on 192.168.1.7 to verify that 192.168.1.10 takes over as the active instance.
Let’s check the DNS again and go to deniz.devopskings.com.tr from the browser.
You can check which machine is handling the traffic from the HAProxy UI.
The traffic shows up on the active Jenkins machine.
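The same information is available without a browser through the stats socket defined in the global section. The following sketch parses the CSV that the HAProxy `show stat` command prints. The function name backend_status is hypothetical, and it assumes socat is installed on the HAProxy machine for talking to the socket.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "show stat" CSV on stdin and print
# "<server> <status>" for each server of the given backend. Columns are
# located by name from the header line rather than by fixed position.
backend_status() {
    awk -F, -v be="$1" '
        /^# / {
            sub(/^# /, "", $1)
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        $(col["pxname"]) == be && $(col["svname"]) != "FRONTEND" && $(col["svname"]) != "BACKEND" {
            print $(col["svname"]), $(col["status"])
        }
    '
}

# Intended use on the HAProxy machine (requires socat):
#   echo "show stat" | sudo socat /run/haproxy/admin.sock stdio \
#       | backend_status jenkins_back
```

After stopping Jenkins on the active node, jenkins_active should be reported as DOWN once the health checks have failed, while jenkins_passive stays UP.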
Useful commands while troubleshooting
- Verify the Mount: To verify that the NFS share is mounted correctly, use the following command.
mount | grep nfs
- Check ownership and permissions of /var/lib/jenkins on the passive machine
ls -ld /var/lib/jenkins
- Unmount and re-mount the NFS share on the passive machine
sudo umount /var/lib/jenkins
sudo mount -a
- Check for any locked files or processes holding onto Jenkins files
sudo lsof /var/lib/jenkins
- Check for .lock or .tmp files in /var/lib/jenkins
sudo find /var/lib/jenkins -name "*.lock" -o -name "*.tmp"
If you find any, they might be preventing the passive Jenkins instance from starting. Make sure no Jenkins instance is still using them, then remove these temporary or lock files.
sudo rm -f /var/lib/jenkins/*.lock /var/lib/jenkins/*.tmp
- Check the NFS mount status on the active and passive nodes.
df -h | grep /var/lib/jenkins
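For scripts, a yes/no answer is often more convenient than grepping the output by eye. The sketch below reads a mounts table in the /proc/mounts format; the function name is_nfs_mounted is hypothetical, and the optional second argument exists only so the check can be tried against a sample file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if MOUNTPOINT is listed as an nfs or nfs4
# filesystem in the mounts table (default: /proc/mounts).
is_nfs_mounted() {
    local mountpoint="$1" table="${2:-/proc/mounts}"
    # Fields per line: device mountpoint fstype options dump pass
    awk -v mp="$mountpoint" \
        '$2 == mp && $3 ~ /^nfs4?$/ { found = 1 } END { exit !found }' \
        "$table"
}

# Intended use on a Jenkins node:
#   is_nfs_mounted /var/lib/jenkins || sudo mount -a
```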
An optional Bash script that starts Jenkins on the passive node when the endpoint stops responding.
#!/bin/bash
# Configuration
endpoint="http://deniz.devopskings.com.tr" # Replace with your actual endpoint
retry_interval=60 # 1 minute retry interval
retry_attempts=5 # Retry 5 times
check_interval=600 # 10 minutes between each check
ssh_user="root" # SSH username for the remote machine
ssh_host="192.168.1.7" # Replace with the IP or hostname of the remote machine
start_service_command="sudo systemctl start jenkins.service" # Command to start the service on remote

# Function to check endpoint and retry if necessary
function check_endpoint {
    for ((attempt=1; attempt<=retry_attempts; attempt++))
    do
        # curl prints 000 as the status code when the request itself fails
        response=$(curl -s -o /dev/null -w "%{http_code}" "$endpoint")
        if [[ "$response" == "200" ]]; then
            echo "Request successful (HTTP 200). No action needed."
            return 0
        else
            echo "Attempt $attempt: Received HTTP $response. Retrying in $retry_interval seconds..."
            sleep "$retry_interval"
        fi
    done
    echo "All attempts failed. Initiating service start on backup machine..."
    return 1
}

# Main loop to run every 10 minutes
while true
do
    if ! check_endpoint; then
        # Report success only if the remote command actually succeeded
        if ssh "$ssh_user@$ssh_host" "$start_service_command"; then
            echo "Service started on backup machine."
        else
            echo "Failed to start the service on backup machine."
        fi
    fi
    echo "Waiting for $check_interval seconds before next check..."
    sleep "$check_interval"
done
Explanation of the Script:
- Configuration: Adjust the endpoint URL, SSH details, retry interval, and the command to start the service.
- Function check_endpoint: Sends a request to the endpoint. If it fails to get a 200 response after the specified retries, it returns failure.
- Main Loop: Runs every 10 minutes. If check_endpoint fails, it connects to the specified machine and starts the service.
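The retry behaviour of check_endpoint can be separated from the HTTP probe, which makes it possible to try the logic without a network. This is a sketch of one such split, not the script's own structure: the names retry and endpoint_ok are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command up to ATTEMPTS times, sleeping
# INTERVAL seconds between failures. Returns 0 on the first success and
# 1 if every attempt fails.
retry() {
    local attempts="$1" interval="$2" n
    shift 2
    for ((n = 1; n <= attempts; n++)); do
        if "$@"; then
            return 0
        fi
        echo "Attempt $n of $attempts failed." >&2
        if ((n < attempts)); then
            sleep "$interval"
        fi
    done
    return 1
}

# Hypothetical probe: succeeds only when the URL answers with HTTP 200.
endpoint_ok() {
    [ "$(curl -s -o /dev/null -m 10 -w '%{http_code}' "$1")" = "200" ]
}

# Intended use, with the values from the script above:
#   retry 5 60 endpoint_ok "http://deniz.devopskings.com.tr" \
#       || ssh root@192.168.1.7 "sudo systemctl start jenkins.service"
```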