Air-gapped Installations in Hopsworks

Tutorial: Testing an Air Gapped Installation
November 19, 2024
16 min read
Javier Cabrera
Software Engineer, Hopsworks

TL;DR

For Hopsworks, installing in air-gapped environments is essential: we must deliver air-gap guarantees to our customers. An air-gapped environment is a computer network or system that is physically isolated from external networks, including the internet. In the context of Hopsworks, it specifically means a Kubernetes cluster that is completely disconnected from outside networks, preventing any external communication or access. However, after hundreds of lines of code, dependencies, and scripting, it's normal to expect some air-gap leaks; for instance, an innocently forgotten wget command hidden somewhere in the Hopsworks code.

As coding enthusiasts, we recently asked ourselves: can we evaluate this automatically, and if so, can we turn it into a test? Each run of the Hopsworks CI pipeline creates a multinode Kubernetes cluster from Vagrant virtual machines. In this tutorial, we walk through how we extended this process to also test air-gapped installations.

After evaluating our options, we settled on iptables rules as the best and most generic way to test the air-gapped environment: we simulate the air gap with fine-grained iptables rules. The rest of this post describes how this approach works.

Air-gapped simulation

As mentioned before, the premise for this air-gapped test is to block everything from the Kubernetes nodes to the outside world. Therefore, the very first iptables rule should be:

sudo iptables -P OUTPUT DROP

That effectively blocks everything. In fact, if you run such a rule inside a node, your Vagrant SSH session will be blocked and you will probably lose control of the machine. Therefore, we need an exception for host-guest communication in the network.

HOST_IP=$(ip route | grep default | awk '{print $3}')
sudo iptables -A OUTPUT -d $HOST_IP -j ACCEPT

In a Vagrant node, the host IP is usually the default route. The first line extracts that IP; the second allows the guest to reach the host.
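
A quick way to sanity-check this from inside the guest is the sketch below; 8.8.8.8 simply stands in for any external IP.

# The host should still be reachable, while an arbitrary external IP should not be.
ping -c 1 -W 2 "$HOST_IP" && echo "host reachable"
ping -c 1 -W 2 8.8.8.8 || echo "external traffic blocked, as expected"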

Our Kubernetes cluster is a multinode environment: several Vagrant machines that communicate with each other over a fixed subnetwork. We allow traffic between them with the following rule.

sudo iptables -A OUTPUT -d $INTERNAL_NETWORK/16 -j ACCEPT

At this point, we have isolated the nodes: they can talk to each other and to the host, and everything else is blocked. Although we aim for an air-gapped cluster, our CI process still needs to pull several container images. While this isn't strictly air-gapped, blocking access to all registries except our trusted one provides a good simulation, since companies usually route such allowed traffic through a proxy to preserve the air gap. Thus, we need to add rules that allow access only to our "approved" image registry.

# Resolve $1 to a single IPv4 address; the result is left in the globals DOMAIN and IP
resolve_domain() {
    DOMAIN=$1
    echo "Resolving domain $DOMAIN..."
    IP=$(nslookup $DOMAIN -type=A | grep -A 1 "Name:" | grep "Address" | tail -n1 | awk '{print $2}')
    if [ -z "$IP" ]; then
        echo "Error: Unable to resolve domain $DOMAIN"
        exit 1
    fi
    echo "Domain $DOMAIN resolved to IP $IP"
}

for domain in $ALLOWED_DOMAINS
do
	resolve_domain $domain
	# Allow traffic to the specific domain's IP address
	echo "Allowing traffic to domain $DOMAIN (IP: $IP)..."
	sudo iptables -A OUTPUT -d $IP -j ACCEPT
done

However, if we set ALLOWED_DOMAINS="docker.hops.works" and then run ctr i pull docker.hops.works/busybox, it will fail. This is because DNS resolution for docker.hops.works is also blocked. To fix this, we need to add the following rules:

# Cloudflare, Google, systemd-resolved, Level 3 and OpenDNS resolvers
DNSs="1.1.1.1 8.8.8.8 127.0.0.53 4.2.2.1 4.2.2.2 208.67.220.220"

# We explicitly whitelist these DNS servers to avoid DNS tunneling, e.g. via YourFreedom
for DNS in $DNSs
do
    sudo iptables -A OUTPUT -p tcp --dport 53 -d $DNS  -j ACCEPT
    sudo iptables -A OUTPUT -p udp --dport 53 -d $DNS  -j ACCEPT
    ...    
done

We add a list of known DNS server IPs and allow communication with them, but only on port 53. This lets our cluster nodes resolve the IP address behind docker.hops.works, so ctr i pull docker.hops.works/busybox now works: the DNS resolution returns the same IP we previously added to our allow list.
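
As a quick check of the registry whitelist, something like the sketch below can be run on a node; the image names are only examples and assume docker.hops.works actually hosts a busybox image.

# Pull from the approved registry: should succeed.
sudo ctr i pull docker.hops.works/busybox:latest

# Pull from a non-whitelisted registry: DNS resolves, but the registry's IP is not
# in the ACCEPT list, so the pull should hang or time out.
sudo ctr i pull docker.io/library/busybox:latest && echo "LEAK: external pull succeeded"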

We also permit communication within the IP range 10.96.0.0/16. This range is used for Kubernetes Services, including the ClusterIP of the API server, and is crucial for the proper functioning of the cluster. By allowing this communication, we ensure that the nodes can interact with the control plane components. This is a necessary exception to our air-gapped setup: it keeps the internal workings of the Kubernetes cluster running while still maintaining isolation from external networks.

sudo iptables -A OUTPUT -d 10.96.0.0/16 -j ACCEPT

At this stage, we can effectively test if our Kubernetes deployments exclusively use images from docker.hops.works. However, we aim to dig deeper: once we've pulled the deployment images, are the deployment services truly air-gapped?

We install several required operators that dynamically pull additional images, so we can't simply assume we're air-gapped by checking for external registries. The same applies to Kubernetes CNI operators. For example, certain CNI rules, like the ones installed by Calico, can bypass the rules we wrote for Pods, potentially compromising our isolation efforts.

[A Pod is Kubernetes' basic unit of work—it groups containers that run on the same node and collectively form services.]

Pods have their own IPs, and the CNI handles the routing of Pod-to-Pod and Pod-to-external communication. The catch is that CNIs also modify node routing at a low level. Consequently, our rules may become ineffective for Pod-to-Pod and Pod-to-external traffic if they don't overlap with these modifications.
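
One way to see these modifications on a node is to list the chains the CNI installs; the sketch below assumes Calico, whose chains are prefixed with cali-, while other CNIs use different names.

# The CNI inserts its own chains into FORWARD (and other chains), ahead of or alongside our rules.
sudo iptables -L FORWARD -n --line-numbers | head -n 20
sudo iptables -S | grep -i cali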

To address this issue, we employ the same approach used for inter-node communication. When configuring a Kubernetes cluster, we can specify the Pod IP range. This allows us to preemptively block all communication by default, just as we did for node communication. In the snippet below, the Pod IP range is set to 10.244.0.0/16.

sudo iptables -A FORWARD -s 10.244.0.0/16 -d $INTERNAL_NETWORK/16 -j ACCEPT
sudo iptables -A OUTPUT -s 10.244.0.0/16 -d $INTERNAL_NETWORK/16 -j ACCEPT

# Pod to Pod 
sudo iptables -A FORWARD -s 10.244.0.0/16 -d 10.244.0.0/16 -j ACCEPT
sudo iptables -A OUTPUT -s 10.244.0.0/16 -d 10.244.0.0/16 -j ACCEPT

# Pod to Kube API allowed
sudo iptables -A OUTPUT -s 10.244.0.0/16 -d 10.96.0.0/16  -j ACCEPT
sudo iptables -A FORWARD -s 10.244.0.0/16 -d 10.96.0.0/16 -j ACCEPT

# We block everything else by default
sudo iptables -A FORWARD -s 10.244.0.0/16 -j DROP
sudo iptables -A OUTPUT -s 10.244.0.0/16 -j REJECT
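
For reference, the 10.244.0.0/16 range above is not arbitrary: it is the Pod CIDR we pin when bootstrapping the cluster. With kubeadm, for instance (a sketch; the actual bootstrap tooling in our CI may differ), it would look like this.

# Fix the Pod and Service CIDRs at init time so the iptables rules can safely assume them.
sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/16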

We also need to allow communication between the nodes and any pod.

sudo iptables -A OUTPUT -d 10.244.0.0/16 -j ACCEPT

Pods also potentially need to reach our whitelisted IPs and the DNS servers, so we extend the bash loops above with Pod-specific rules.

for domain in $ALLOWED_DOMAINS
do
    resolve_domain $domain
    # Allow traffic to the specific domain's IP address
    echo "Allowing traffic to domain $DOMAIN (IP: $IP)..."
    sudo iptables -A OUTPUT -d $IP -j ACCEPT
    # From inside Pods, whitelisted external domains can be accessed
    sudo iptables -A OUTPUT -s 10.244.0.0/16 -d $IP -j ACCEPT
    sudo iptables -A FORWARD -s 10.244.0.0/16 -d $IP -j ACCEPT
done
# Cloudflare, Google, systemd-resolved, Level 3 and OpenDNS resolvers
DNSs="1.1.1.1 8.8.8.8 127.0.0.53 4.2.2.1 4.2.2.2 208.67.220.220"

# We explicitly whitelist these DNS servers to avoid DNS tunneling, e.g. https://www.your-freedom.net/index.php?id=dns-tunneling
for DNS in $DNSs
do
    sudo iptables -A OUTPUT -p tcp --dport 53 -d $DNS  -j ACCEPT
    sudo iptables -A OUTPUT -p udp --dport 53 -d $DNS  -j ACCEPT
    

    sudo iptables -A OUTPUT -s  10.244.0.0/16 -p udp --dport 53 -d $DNS -j ACCEPT
    sudo iptables -A OUTPUT -s  10.244.0.0/16 -p tcp --dport 53 -d $DNS -j ACCEPT

    sudo iptables -A FORWARD -s 10.244.0.0/16 -p udp --dport 53 -d $DNS -j ACCEPT
    sudo iptables -A FORWARD -s 10.244.0.0/16 -p tcp --dport 53 -d $DNS -j ACCEPT  
done
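
With these rules in place, a rough end-to-end check from inside a Pod looks like the sketch below; the image reference assumes the whitelisted registry hosts a busybox image.

# DNS from the Pod should still resolve, but arbitrary egress should be blocked.
kubectl run airgap-check --rm -it --restart=Never \
  --image=docker.hops.works/busybox -- sh -c \
  'nslookup docker.hops.works; wget -T 5 -q -O /dev/null http://example.com && echo "LEAK" || echo "egress blocked"'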

Bonus (logging)

While we've effectively created an air-gapped environment with iptables rules, we'd like to know what's being blocked for debugging and auditing our services. Specifically, we want to identify which IP is blocked from which Pod. Since a Pod isolates a service, we could then say, "Aha! Webserver X is trying to access IP Y." This insight would be invaluable for troubleshooting and maintaining our air-gapped setup.

If you add the following LOG rules just before the DROP actions, dropped packets are caught and logged.

sudo iptables -A OUTPUT -j LOG --log-prefix "Dropped from VM " --log-level 4
sudo iptables -A FORWARD -s 10.244.0.0/16 -j LOG --log-prefix "Dropped from POD " --log-level 4
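
The LOG target writes to the kernel log, so blocked traffic can be inspected on each node; where exactly it ends up depends on the distribution's syslog setup, but the prefixes make it easy to grep.

# Kernel ring buffer
sudo dmesg | grep "Dropped from POD "
# Or, on systemd-based distros
sudo journalctl -k | grep "Dropped from VM "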

Wrapping it all up

iptables evaluates rules in order and the first match wins, so the ACCEPT exceptions must come before the catch-all LOG and DROP rules. Compiling everything mentioned above, we arrive at the following script, which must be executed on each cluster node.

#!/bin/bash
set +e

INTERNAL_NETWORK=$(cat /vagrant/internal_network.txt)
DOMAINS=$(cat /vagrant/allowed_domains.txt)

# Resolve $1 to a single IPv4 address; the result is left in the globals DOMAIN and IP
resolve_domain() {
    DOMAIN=$1
    echo "Resolving domain $DOMAIN..."
    IP=$(nslookup $DOMAIN -type=A | grep -A 1 "Name:" | grep "Address" | tail -n1 | awk '{print $2}')
    if [ -z "$IP" ]; then
        echo "Error: Unable to resolve domain $DOMAIN"
        exit 1
    fi
    echo "Domain $DOMAIN resolved to IP $IP"
}

# Flush existing iptables rules
echo "Flushing existing iptables rules..."
sudo iptables -F
sudo iptables -X

# Allow traffic to internal network (Vagrant VM cluster)
echo "Allowing traffic to internal network \\"$INTERNAL_NETWORK\\"..."
sudo iptables -A OUTPUT -d $INTERNAL_NETWORK/16 -j ACCEPT

# Allow loopback traffic
echo "Allowing loopback traffic..."
sudo iptables -A OUTPUT -o lo -j ACCEPT

# Allow DNS traffic
echo "Allowing DNS traffic to DNS servers..."
# Cloudflare, Google, systemd-resolved, Level 3 and OpenDNS resolvers
DNSs="1.1.1.1 8.8.8.8 127.0.0.53 4.2.2.1 4.2.2.2 208.67.220.220"

# We explicitly whitelist these DNS servers to avoid DNS tunneling, e.g. via YourFreedom
for DNS in $DNSs
do
    sudo iptables -A OUTPUT -p tcp --dport 53 -d $DNS  -j ACCEPT
    sudo iptables -A OUTPUT -p udp --dport 53 -d $DNS  -j ACCEPT
  
    sudo iptables -A OUTPUT -s  10.244.0.0/16 -p udp --dport 53 -d $DNS -j ACCEPT
    sudo iptables -A OUTPUT -s  10.244.0.0/16 -p tcp --dport 53 -d $DNS -j ACCEPT

    sudo iptables -A FORWARD -s 10.244.0.0/16 -p udp --dport 53 -d $DNS -j ACCEPT
    sudo iptables -A FORWARD -s 10.244.0.0/16 -p tcp --dport 53 -d $DNS -j ACCEPT
done

# Allow DNS traffic in the nodes by default
sudo iptables -A OUTPUT -p udp --dport 53 -j ACCEPT
sudo iptables -A OUTPUT -p tcp --dport 53 -j ACCEPT

# Resolve each allowed domain to an IP address and whitelist it
for domain in $DOMAINS
do
    resolve_domain $domain
    # Allow traffic to the specific domain's IP address
    echo "Allowing traffic to domain $DOMAIN (IP: $IP)..."
    sudo iptables -A OUTPUT -d $IP -j ACCEPT
    # From inside Pods, whitelisted external domains can be accessed
    sudo iptables -A OUTPUT -s 10.244.0.0/16 -d $IP -j ACCEPT
    sudo iptables -A FORWARD -s 10.244.0.0/16 -d $IP -j ACCEPT
done

# Allow traffic to the host, the Pod network, and the Kubernetes Service network (which includes the API server)
HOST_IP=$(ip route | grep default | awk '{print $3}')
sudo iptables -A OUTPUT -d $HOST_IP -j ACCEPT
sudo iptables -A OUTPUT -d 10.244.0.0/16 -j ACCEPT
sudo iptables -A OUTPUT -d 10.96.0.0/16 -j ACCEPT
# Pod to pod allowed
sudo iptables -A FORWARD -s 10.244.0.0/16 -d 10.244.0.0/16 -j ACCEPT
sudo iptables -A OUTPUT -s 10.244.0.0/16 -d 10.244.0.0/16 -j ACCEPT

# Pod to internal network, e.g. when the kube API is exposed as a NodePort
sudo iptables -A FORWARD -s 10.244.0.0/16 -d $INTERNAL_NETWORK/16 -j ACCEPT
sudo iptables -A OUTPUT -s 10.244.0.0/16 -d $INTERNAL_NETWORK/16 -j ACCEPT

# From kube api to pod
sudo iptables -A FORWARD -s 10.96.0.0/16 -d 10.244.0.0/16 -j ACCEPT
sudo iptables -A OUTPUT -s 10.96.0.0/16 -d 10.244.0.0/16 -j ACCEPT

# Pod to Kube API allowed
sudo iptables -A OUTPUT -s 10.244.0.0/16 -d 10.96.0.0/16  -j ACCEPT
sudo iptables -A FORWARD -s 10.244.0.0/16 -d 10.96.0.0/16 -j ACCEPT

# Now block all other traffic from pods to external networks
sudo iptables -A FORWARD -s 10.244.0.0/16 -j LOG --log-prefix "Dropped from POD " --log-level 4
sudo iptables -A FORWARD -s 10.244.0.0/16 -j DROP
sudo iptables -A OUTPUT -s 10.244.0.0/16 -j LOG --log-prefix "Dropped from POD " --log-level 4
sudo iptables -A OUTPUT -s 10.244.0.0/16 -j REJECT

# Set default policy to drop outgoing traffic
echo "Setting default OUTPUT policy to DROP..."
sudo iptables -A OUTPUT -j LOG --log-prefix "Dropped from VM " --log-level 4
sudo iptables -P OUTPUT DROP
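
One way to apply it across the Vagrant cluster from the host is sketched below; the node names and the script path are placeholders.

# /vagrant is the default synced folder, so the script is visible inside every node.
for node in node-1 node-2 node-3; do
    vagrant ssh "$node" -c "bash /vagrant/airgap_rules.sh"
done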

Conclusions

We have effectively created an air-gapped test that evaluates every change in Hopsworks. The test verifies that a Kubernetes deployment can be conducted in air-gapped environments. We achieve this by implementing fine-grained iptables rules that block all external access, except for a whitelist of domain names that includes our image registry.
