<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Fredrick Biering</title>
    <description></description>
    <link>https://fredrickb.com/</link>
    <atom:link href="https://fredrickb.com/feed.xml" rel="self" type="application/rss+xml"/>
    <pubDate>Tue, 24 Mar 2026 20:03:37 +0000</pubDate>
    <lastBuildDate>Tue, 24 Mar 2026 20:03:37 +0000</lastBuildDate>
    <generator>Jekyll v4.4.1</generator>
    
      <item>
        <title>Upgrading K3s</title>
        <description>&lt;p&gt;I haven’t upgraded the K3s cluster in the homelab for quite some time.
Now seemed like a good time to do a round of upgrades before starting
a new project.&lt;/p&gt;

&lt;p&gt;What has been upgraded:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th style=&quot;text-align: left&quot;&gt;Item&lt;/th&gt;
      &lt;th style=&quot;text-align: left&quot;&gt;New version&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;K3s cluster&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v1.35.1+k3s1&lt;/code&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Calico&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v3.31.x&lt;/code&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;ArgoCD&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;3.x&lt;/code&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Longhorn&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1.10.x&lt;/code&gt; (Helm chart version)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Sealed Secrets&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;2.18.x&lt;/code&gt; (Helm chart version)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;step-ca&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1.29.x&lt;/code&gt; (Helm chart version)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;step-issuer&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1.9.9&lt;/code&gt; (Helm chart version)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;MetalLB&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0.15.3&lt;/code&gt; (Helm chart version)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Prometheus&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;28.13.x&lt;/code&gt; (Helm chart version)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Loki&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;6.53.x&lt;/code&gt; (Helm chart version)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;opentelemetry-collector&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0.146.1&lt;/code&gt; (Helm chart version)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;cert-manager&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1.19.x&lt;/code&gt; (Helm chart version)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;pve-exporter&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;3.8.1&lt;/code&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;An excerpt of the upgrade process is detailed in the following sections.&lt;/p&gt;

&lt;h2 id=&quot;k3s&quot;&gt;K3s&lt;/h2&gt;

&lt;p&gt;For K3s, the upgrade procedure boils down to:&lt;/p&gt;

&lt;p&gt;Grab the latest version of the K3s installation script from
&lt;a href=&quot;https://get.k3s.io&quot;&gt;https://get.k3s.io&lt;/a&gt; and overwrite the
previous version stored in the Ansible role.&lt;/p&gt;
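
&lt;p&gt;Fetching the script is a one-liner; the destination below matches the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k3s_install_script_src&lt;/code&gt; path in the inventory:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Download the latest K3s installation script and overwrite
# the copy stored in the Ansible role
curl -sfL https://get.k3s.io -o installation_scripts/install_k3s.sh
&lt;/code&gt;&lt;/pre&gt;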

&lt;p&gt;Change the release channel and version in the inventory:&lt;/p&gt;

&lt;div class=&quot;language-diff highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gd&quot;&gt;--- a/inventory.ini
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+++ b/inventory.ini
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;@@ -1,8 +1,8 @@&lt;/span&gt;
[all:vars]
# Channels and versions for k3s is located at
# https://update.k3s.io/v1-release/channels
&lt;span class=&quot;gd&quot;&gt;-k3s_channel=v1.34
-k3s_version=v1.34.4+k3s1
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+k3s_channel=v1.35
+k3s_version=v1.35.0+k3s1
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;k3s_install_script_src=installation_scripts/install_k3s.sh
k3s_install_script_dest=/usr/local/bin/install_k3s.sh
reinstall=false
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Rerun the Ansible playbook with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-e reinstall=true&lt;/code&gt; to trigger a
rerun of the installation script using the new version.
The playbook uses &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;serial: 1&lt;/code&gt; to upgrade a single node at a time,
starting with the control plane nodes before proceeding to the
worker nodes.&lt;/p&gt;
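
&lt;p&gt;Kicking off the rolling upgrade then looks roughly like this (the playbook
name here is illustrative):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Rerun the installation script on every node, one node at a time
ansible-playbook -i inventory.ini k3s.yml -e reinstall=true
&lt;/code&gt;&lt;/pre&gt;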

&lt;p&gt;That’s it. I have yet to experience any issues with this approach
and I commend the maintainers of K3s for providing a great upgrade
experience.&lt;/p&gt;

&lt;p&gt;This screenshot was taken during an upgrade to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v1.32.0+k3s1&lt;/code&gt;. The control
plane nodes have been upgraded, and the worker nodes are in the process
of being upgraded:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/upgrading-k3s-march-2026/k3s_mid_upgrade_cluster.png&quot; title=&quot;K3s during the upgrade process&quot;&gt;
          &lt;img src=&quot;/img/posts/upgrading-k3s-march-2026/k3s_mid_upgrade_cluster.png&quot; alt=&quot;Grafana terminal showing K3s upgrade process&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;h2 id=&quot;calico&quot;&gt;Calico&lt;/h2&gt;

&lt;p&gt;When upgrading Calico, there was an issue with the CPU architecture
of the control plane node VMs in Proxmox. I had not switched the
architecture from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;qemu64&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x86-64-v2-AES&lt;/code&gt;, resulting in this error:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;csi-node-driver-registrar This program can only be run on AMD64 processors with v2 microarchitecture support.
calico-csi This program can only be run on AMD64 processors with v2 microarchitecture support.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;When it started to fail I tried rolling back the version of Calico to
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v3.30.x&lt;/code&gt;, but doing so proved more difficult (for several reasons)
than just going ahead with the upgrade. I drained the control plane nodes,
switched the VM CPU architecture, rebooted, and once all 3 nodes were
fixed I had version &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v3.31.x&lt;/code&gt; running.&lt;/p&gt;
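
&lt;p&gt;The per-node fix, sketched with kubectl (the node name is a placeholder):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Move workloads off the node; DaemonSet-managed pods stay put
kubectl drain k3s-control-1 --ignore-daemonsets --delete-emptydir-data

# Switch the CPU type of the VM in Proxmox and reboot it,
# then allow scheduling on the node again
kubectl uncordon k3s-control-1
&lt;/code&gt;&lt;/pre&gt;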

&lt;p&gt;Changes to the Terraform Proxmox VM config for control plane nodes:&lt;/p&gt;

&lt;div class=&quot;language-diff highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gd&quot;&gt;--- a/environments/production/terraform.tfvars
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+++ b/environments/production/terraform.tfvars
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;@@ -7,7 +7,7 @@&lt;/span&gt; vms = [
     ip                  = &quot;10.0.3.9/24&quot;
     memory              = 4096
     cores               = 4,
&lt;span class=&quot;gd&quot;&gt;-    cpu_architecture    = &quot;qemu64&quot;
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+    cpu_architecture    = &quot;x86-64-v2-AES&quot;
&lt;/span&gt;     bootdisk_size       = &quot;20&quot;
     disks               = []
     tags                = [&quot;k8s&quot;, &quot;control&quot;, &quot;ubuntu-24.04&quot;]
&lt;span class=&quot;p&quot;&gt;@@ -20,7 +20,7 @@&lt;/span&gt; vms = [
     ip                  = &quot;10.0.3.10/24&quot;
     memory              = 4096
     cores               = 4,
&lt;span class=&quot;gd&quot;&gt;-    cpu_architecture    = &quot;qemu64&quot;
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+    cpu_architecture    = &quot;x86-64-v2-AES&quot;
&lt;/span&gt;     bootdisk_size       = &quot;20&quot;
     disks               = []
     tags                = [&quot;k8s&quot;, &quot;control&quot;, &quot;ubuntu-24.04&quot;]
&lt;span class=&quot;p&quot;&gt;@@ -33,7 +33,7 @@&lt;/span&gt; vms = [
     ip                  = &quot;10.0.3.11/24&quot;
     memory              = 4096
     cores               = 4
&lt;span class=&quot;gd&quot;&gt;-    cpu_architecture    = &quot;qemu64&quot;
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+    cpu_architecture    = &quot;x86-64-v2-AES&quot;
&lt;/span&gt;     bootdisk_size       = &quot;20&quot;
     disks               = []
     tags                = [&quot;k8s&quot;, &quot;control&quot;, &quot;ubuntu-24.04&quot;]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;argocd&quot;&gt;ArgoCD&lt;/h2&gt;

&lt;p&gt;ArgoCD had a long chain of releases since my last upgrade, so reading changelogs
took most of the time. Not only was the major version of ArgoCD itself bumped
from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;2.x&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;3.x&lt;/code&gt; (&lt;a href=&quot;https://argo-cd.readthedocs.io/en/latest/operator-manual/upgrading/2.14-3.0/&quot;&gt;upgrade notes&lt;/a&gt;),
but the Helm chart had gone through a few major releases from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;7.x&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;9.x&lt;/code&gt;.
I had to follow the advice in &lt;a href=&quot;https://github.com/argoproj/argo-helm/issues/3272#issuecomment-2837476069&quot;&gt;this issue&lt;/a&gt;
and temporarily disable &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;redis-ha&lt;/code&gt; to be able to upgrade to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;8.x&lt;/code&gt;. The upgrade
from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;8.x&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;9.x&lt;/code&gt; did not cause any issues.&lt;/p&gt;
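
&lt;p&gt;If the chart is deployed with a values file, the temporary change from the
linked issue boils down to something like this (re-enable it again after
reaching &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;8.x&lt;/code&gt;):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;# Temporarily run a single Redis instead of the HA setup
redis-ha:
  enabled: false
&lt;/code&gt;&lt;/pre&gt;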

&lt;h2 id=&quot;longhorn&quot;&gt;Longhorn&lt;/h2&gt;

&lt;p&gt;Longhorn was time-consuming since each version upgrade was preceded by an offsite
backup of all volumes. I upgraded from version &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1.8.x&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1.10.x&lt;/code&gt;.
My Longhorn installation has a tendency to attach all volumes to a single worker node,
which sometimes overloads the node and causes instability. It does eventually stabilize
after each upgrade, but I haven’t found the root cause yet.&lt;/p&gt;

&lt;h2 id=&quot;prometheus&quot;&gt;Prometheus&lt;/h2&gt;

&lt;p&gt;I temporarily lost some cluster metrics due to a missing port in a NetworkPolicy
after upgrading the Prometheus chart from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;27.x&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;28.x&lt;/code&gt; (starting point was &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;26.x&lt;/code&gt;):&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/upgrading-k3s-march-2026/grafana_dashboard_prometheus_cluster_metrics.png&quot; title=&quot;K3s cluster metrics disappeared before the NetworkPolicy is updated&quot;&gt;
          &lt;img src=&quot;/img/posts/upgrading-k3s-march-2026/grafana_dashboard_prometheus_cluster_metrics.png&quot; alt=&quot;Screenshot of Grafana panel showing K3s cluster metrics disappearing before the NetworkPolicy is updated&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;I cross-checked the traffic in &lt;a href=&quot;/2025/10/25/replacing-flannel-with-calico-as-cni-in-the-k3s-cluster/&quot;&gt;Whisker&lt;/a&gt;
and &lt;a href=&quot;/2026/01/12/ingesting-calico-flow-logs-into-loki-in-the-homelab/&quot;&gt;calico-flow-logs-otlphttp-exporter&lt;/a&gt;
to identify the blocked port:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/upgrading-k3s-march-2026/calico_whisker_screenshot_blocked_network_traffic.png&quot; title=&quot;Whisker showing blocked network traffic&quot;&gt;
          &lt;img src=&quot;/img/posts/upgrading-k3s-march-2026/calico_whisker_screenshot_blocked_network_traffic.png&quot; alt=&quot;Screenshot of Whisker showing blocked network traffic&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/upgrading-k3s-march-2026/calico_flow_logs_otlphttp_exporter_grafana_panel.png&quot; title=&quot;Grafana panel showing blocked network traffic from calico-flow-logs-otlphttp-exporter&quot;&gt;
          &lt;img src=&quot;/img/posts/upgrading-k3s-march-2026/calico_flow_logs_otlphttp_exporter_grafana_panel.png&quot; alt=&quot;Screenshot of Grafana panel showing blocked network traffic from calico-flow-logs-otlphttp-exporter&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;Updating the ports in the NetworkPolicy for traffic from Prometheus to the K3s VLAN
fixed it:&lt;/p&gt;

&lt;div class=&quot;language-diff highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gd&quot;&gt;--- a/prometheus/allow_egress_to_k3s_vlan_netpol.yaml
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+++ b/prometheus/allow_egress_to_k3s_vlan_netpol.yaml
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;@@ -15,3 +15,4 @@&lt;/span&gt; spec:
         ports:
           - 80
           - 7472
&lt;span class=&quot;gi&quot;&gt;+          - 10250
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
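
&lt;p&gt;Port &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;10250&lt;/code&gt; is the kubelet port Prometheus scrapes for node and container
metrics. A quick way to verify the fix from a pod in the Prometheus namespace
(the IP is one of the control plane nodes):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Should succeed once the NetworkPolicy allows egress on 10250
nc -zv -w 2 10.0.3.9 10250
&lt;/code&gt;&lt;/pre&gt;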

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Using Ansible to manage cluster configuration and ArgoCD to manage applications
is one of the better investments I’ve made in the homelab. Most of my time spent
upgrading is reading changelogs, taking backups of persistent data and updating
config. I don’t really have to worry about versioning or how to apply changes.&lt;/p&gt;
</description>
        <pubDate>Tue, 24 Mar 2026 00:00:00 +0000</pubDate>
        <link>https://fredrickb.com/2026/03/24/upgrading-k3s/</link>
        <guid isPermaLink="true">https://fredrickb.com/2026/03/24/upgrading-k3s/</guid>
        
        <category>homelab</category>
        
        <category>k3s</category>
        
        <category>longhorn</category>
        
        <category>step-ca</category>
        
        <category>argocd</category>
        
        <category>sealed-secrets</category>
        
        <category>calico</category>
        
        <category>metallb</category>
        
        <category>prometheus</category>
        
        <category>loki</category>
        
        <category>opentelemetry-collector</category>
        
        <category>cert-manager</category>
        
        <category>pve-exporter</category>
        
        
      </item>
    
      <item>
        <title>Issue when customizing cloud-init images on Proxmox 9</title>
        <description>&lt;p&gt;I upgraded to &lt;a href=&quot;/2025/11/11/upgrade-proxmox-from-8-to-9/&quot;&gt;Proxmox 9&lt;/a&gt;
in the homelab last year without any noticeable issues. It was only when I
started updating my Ansible playbook for the Proxmox hosts
that I ran into problems. The role for creating cloud-init VM templates failed
during the task which installs the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;qemu-guest-agent&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Install qemu-guest-agent in all images in directory &quot;{{ pvesm_local_storage_path }}/{{ pvesm_local_storage_iso_subpath }}&quot;&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;loop&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;iso_images_to_download&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&quot;&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ansible.builtin.command&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;|&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;virt-customize \&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;-a &quot;{{ pvesm_local_storage_path }}/{{ pvesm_local_storage_iso_subpath }}/{{ item.filename }}&quot; \&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;--install qemu-guest-agent&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Error messages indicated DNS problems, similar to those in
&lt;a href=&quot;https://forum.proxmox.com/threads/virt-customize-install-broken.130473/&quot;&gt;this forum post&lt;/a&gt;. There is already a
&lt;a href=&quot;https://github.com/libguestfs/libguestfs/issues/211&quot;&gt;GitHub issue&lt;/a&gt; on this in
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;libguestfs&lt;/code&gt;. A suggested solution is to install &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dhcpcd-base&lt;/code&gt; prior to running
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;virt-customize&lt;/code&gt;. I added it to the playbook:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Install dhcpcd-base&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ansible.builtin.apt&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;dhcpcd-base&lt;/span&gt;

&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Install qemu-guest-agent in all images in directory &quot;{{ pvesm_local_storage_path }}/{{ pvesm_local_storage_iso_subpath }}&quot;&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;loop&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;iso_images_to_download&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&quot;&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ansible.builtin.command&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;|&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;virt-customize \&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;-a &quot;{{ pvesm_local_storage_path }}/{{ pvesm_local_storage_iso_subpath }}/{{ item.filename }}&quot; \&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;--install qemu-guest-agent&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And then I had a working VM template again:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/img/posts/issue-when-customizing-cloud-init-images-on-proxmox-9/ubuntu-24.04-cloud-init-vm-template.png&quot; alt=&quot;Screenshot of Ubuntu Server 24.04 cloud-init VM template&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Resources:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://forum.proxmox.com/threads/virt-customize-install-broken.130473/&quot;&gt;Proxmox forum thread #1&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://forum.proxmox.com/threads/proxmox-9-upgrade-virt-customize-no-longer-has-internet-access.169355/&quot;&gt;Proxmox forum thread #2&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/libguestfs/libguestfs/issues/211&quot;&gt;libguestfs issue&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
        <pubDate>Wed, 28 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://fredrickb.com/2026/01/28/proxmox-9-virt-customize-issues/</link>
        <guid isPermaLink="true">https://fredrickb.com/2026/01/28/proxmox-9-virt-customize-issues/</guid>
        
        <category>proxmox</category>
        
        <category>homelab</category>
        
        <category>cloud-init</category>
        
        <category>virt-customize</category>
        
        <category>vm-templates</category>
        
        
      </item>
    
      <item>
        <title>Ingesting Calico flow logs into Loki in the homelab</title>
        <description>&lt;p&gt;I’ve written about &lt;a href=&quot;https://www.tigera.io/blog/calico-whisker-your-new-ally-in-network-observability/&quot;&gt;Whisker&lt;/a&gt; previously in
&lt;a href=&quot;/2025/10/25/replacing-flannel-with-calico-as-cni-in-the-k3s-cluster/&quot;&gt;migrating from Flannel to Calico&lt;/a&gt;. While I like
being able to debug network traffic in real time,
Whisker does not give me a historical overview:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Which services have the most traffic, measured by
packets or bandwidth, in the last 30 minutes?&lt;/li&gt;
  &lt;li&gt;What is the distribution of traffic by protocol
over the last 30 minutes?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Version &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;3.30&lt;/code&gt; of Calico OSS also introduced &lt;a href=&quot;https://www.tigera.io/blog/calico-open-source-3-30-exploring-the-goldmane-api-for-custom-kubernetes-network-observability/&quot;&gt;Goldmane&lt;/a&gt;, the gRPC-API
powering Whisker. Goldmane uses &lt;a href=&quot;https://github.com/projectcalico/calico/blob/07e10a5c28fa756165e5a3578d15d7a3db148cd5/goldmane/proto/api.proto&quot;&gt;Protobufs&lt;/a&gt; and exposes a
&lt;a href=&quot;https://github.com/projectcalico/calico/blob/07e10a5c28fa756165e5a3578d15d7a3db148cd5/goldmane/proto/api.proto#L16&quot;&gt;streaming endpoint&lt;/a&gt; for receiving flow logs in real time.&lt;/p&gt;

&lt;p&gt;I use &lt;a href=&quot;https://grafana.com/docs/loki/latest/&quot;&gt;Loki&lt;/a&gt; to store logs for everything I run in my homelab, including
firewall logs from OPNsense. It only makes sense to ingest flow logs
into Loki as well. The protobuf payload from Goldmane contains
the number of bytes and packets for each network flow.
I can use this info with &lt;a href=&quot;https://grafana.com/docs/loki/latest/query/metric_queries/&quot;&gt;Metric Queries&lt;/a&gt; to turn the flow logs into
metrics.&lt;/p&gt;
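
&lt;p&gt;As an example of the kind of metric query this enables, assuming the flow
logs are ingested as JSON and carry a bytes field (the label and field names
here are hypothetical):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-plaintext&quot;&gt;sum by (dest_namespace) (
  sum_over_time(
    {service_name=&quot;calico-flow-logs-otlphttp-exporter&quot;}
      | json
      | unwrap bytes_in [30m]
  )
)
&lt;/code&gt;&lt;/pre&gt;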

&lt;p&gt;While researching the HTTP API for ingesting logs into Loki,
I stumbled upon the documentation for the &lt;a href=&quot;https://grafana.com/docs/loki/latest/send-data/otel/&quot;&gt;OTLP endpoint&lt;/a&gt;. Having
no experience with &lt;a href=&quot;https://opentelemetry.io/docs/specs/otlp/&quot;&gt;OTLP&lt;/a&gt;, I went down the rabbit hole and learned
about &lt;a href=&quot;https://opentelemetry.io/docs/specs/otlp/#otlphttp&quot;&gt;OTLP/HTTP&lt;/a&gt;, &lt;a href=&quot;https://opentelemetry.io/docs/specs/otel/logs/#direct-to-collector&quot;&gt;Direct to collector&lt;/a&gt; logging, &lt;a href=&quot;https://opentelemetry.io/docs/specs/otel/logs/sdk/&quot;&gt;Logs SDK&lt;/a&gt; and the
&lt;a href=&quot;https://pkg.go.dev/go.opentelemetry.io/contrib/bridges/otelslog&quot;&gt;log/slog Logging bridge&lt;/a&gt;. Using an open standard like OTLP/HTTP
instead of targeting Loki specifically sounds like a much better idea.&lt;/p&gt;

&lt;p&gt;This led me to write &lt;a href=&quot;https://github.com/FredrickB/calico-flow-logs-otlphttp-exporter&quot;&gt;calico-flow-logs-otlphttp-exporter&lt;/a&gt;, which streams
flow logs from Goldmane to anything that’s compatible with OTLP/HTTP.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Note: calico-flow-logs-otlphttp-exporter is still in development,
use at your own risk.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is my setup to ingest flow logs from Goldmane into Loki in the homelab:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;The OpenTelemetry Collector is not strictly necessary; it’s just
there in case I want to add processors later.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;pre&gt;&lt;code class=&quot;language-mermaid&quot;&gt;graph TB
  goldmane[Goldmane]
  exporter[calico-flow-logs-otlphttp-exporter]
  otel-collector[OpenTelemetry Collector]
  loki[Loki]
  grafana[Grafana]
  exporter--&amp;gt;|Subscribe to flow logs streaming endpoint|goldmane
  goldmane--&amp;gt;|Stream flow logs|exporter
  exporter--&amp;gt;|Push logs using OTLP/HTTP|otel-collector
  otel-collector--&amp;gt;|Push logs using OTLP/HTTP|loki
  grafana--&amp;gt;|Consume Datasource|loki
&lt;/code&gt;&lt;/pre&gt;
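
&lt;p&gt;A minimal sketch of the collector config for this pipeline, assuming Loki’s
OTLP endpoint is enabled (the Loki service name is specific to my setup):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318

exporters:
  otlphttp:
    # Loki ingests OTLP logs under the /otlp path
    endpoint: http://loki-gateway.loki.svc.cluster.local/otlp

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [otlphttp]
&lt;/code&gt;&lt;/pre&gt;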

&lt;p&gt;Using Grafana to query flow logs ingested into Loki:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/ingesting-calico-flow-logs-into-loki-in-the-homelab/otel_collector_loki_otlp_demo.gif&quot;&gt;
          &lt;img src=&quot;/img/posts/ingesting-calico-flow-logs-into-loki-in-the-homelab/otel_collector_loki_otlp_demo.gif&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;I also created a &lt;a href=&quot;https://github.com/FredrickB/calico-flow-logs-otlphttp-exporter/blob/main/docs/monitoring/loki-grafana-dashboard.json&quot;&gt;Grafana dashboard&lt;/a&gt; to see the distribution of traffic based
on protocol/bandwidth over a specific time period:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/ingesting-calico-flow-logs-into-loki-in-the-homelab/grafana_dashboards.png&quot;&gt;
          &lt;img src=&quot;/img/posts/ingesting-calico-flow-logs-into-loki-in-the-homelab/grafana_dashboards.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;I now have historical data for all network flows in the K3s cluster.
The flow logs can be analyzed and visualized, and I retain all the data
in my own infrastructure.&lt;/p&gt;


</description>
        <pubDate>Mon, 12 Jan 2026 00:00:00 +0000</pubDate>
        <link>https://fredrickb.com/2026/01/12/ingesting-calico-flow-logs-into-loki-in-the-homelab/</link>
        <guid isPermaLink="true">https://fredrickb.com/2026/01/12/ingesting-calico-flow-logs-into-loki-in-the-homelab/</guid>
        
        <category>calico</category>
        
        <category>kubernetes</category>
        
        <category>goldmane</category>
        
        <category>otlp</category>
        
        <category>otlphttp</category>
        
        <category>opentelemetry</category>
        
        <category>homelab</category>
        
        <category>loki</category>
        
        <category>grafana</category>
        
        
      </item>
    
      <item>
        <title>Upgrading the Proxmox cluster from 8 to 9 in the homelab</title>
        <description>&lt;p&gt;&lt;a href=&quot;https://www.proxmox.com/en/about/company-details/press-releases/proxmox-virtual-environment-9-0&quot;&gt;Proxmox 9 was released in August&lt;/a&gt;. I’ve focused the
past few weeks on &lt;a href=&quot;/2025/10/25/replacing-flannel-with-calico-as-cni-in-the-k3s-cluster/&quot;&gt;migrating from Flannel to Calico&lt;/a&gt;,
and with the CNI switch in K3s out of the way I was
able to dedicate time to upgrading Proxmox.&lt;/p&gt;

&lt;p&gt;Proxmox has a pretty nice &lt;a href=&quot;https://pve.proxmox.com/wiki/Upgrade_from_8_to_9&quot;&gt;guide&lt;/a&gt; for upgrading from 8
to 9. I opted for an in-place upgrade this time
instead of reinstalling the entire OS. I did a mix
of one-off commands and running a temporary Ansible
playbook against each host.&lt;/p&gt;

&lt;p&gt;The Proxmox cluster as it stands currently:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-mermaid&quot;&gt;graph LR
  subgraph cluster[Datacenter: pve-cluster-1]
    pve2
    pve3
    pve4
  end
  subgraph pve2[Node: pve2]
    pve2_vms[VMs]
  end
  subgraph pve3[Node: pve3]
    pve3_vms[VMs]
  end
  subgraph pve4[Node: pve4]
    pve4_vms[VMs]
  end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Upgrades were done in the following order:&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pve2&lt;/code&gt; -&amp;gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pve3&lt;/code&gt; -&amp;gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pve4&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Upgrade process performed on each node:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Migrate all VMs to another node in the cluster&lt;/li&gt;
  &lt;li&gt;Upgrade the node to the latest &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;8.4.x&lt;/code&gt; version&lt;/li&gt;
  &lt;li&gt;Run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pve8to9 --full&lt;/code&gt; and fix reported errors&lt;/li&gt;
  &lt;li&gt;Perform the upgrade from 8 to 9&lt;/li&gt;
  &lt;li&gt;Migrate VMs back to the upgraded node&lt;/li&gt;
&lt;/ol&gt;
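
&lt;p&gt;Step 4 boils down to pointing apt at the new Debian release and
dist-upgrading, per the official guide (an abbreviated sketch; check the
guide for the exact repository files on your hosts):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Proxmox 8 is based on Debian Bookworm, Proxmox 9 on Trixie
sed -i &apos;s/bookworm/trixie/g&apos; /etc/apt/sources.list /etc/apt/sources.list.d/*.list

apt update
apt dist-upgrade
&lt;/code&gt;&lt;/pre&gt;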

&lt;p&gt;Then the Terraform provider in all configs
for Proxmox VMs was updated to the latest version.&lt;/p&gt;

&lt;h2 id=&quot;preparing-for-the-upgrade&quot;&gt;Preparing for the upgrade&lt;/h2&gt;

&lt;p&gt;I had to fix this prior to starting (&lt;a href=&quot;https://forum.proxmox.com/threads/problem-with-removable-bootloader.163555/&quot;&gt;more info&lt;/a&gt;):&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Removable bootloader found at &lt;span class=&quot;s1&quot;&gt;&apos;/boot/efi/EFI/BOOT/BOOTX64.efi&apos;&lt;/span&gt;, but GRUB packages not &lt;span class=&quot;nb&quot;&gt;set &lt;/span&gt;up to update it!
Run the following &lt;span class=&quot;nb&quot;&gt;command&lt;/span&gt;:

&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;grub-efi-amd64 grub2/force_efi_extra_removable boolean true&apos;&lt;/span&gt; | debconf-set-selections &lt;span class=&quot;nt&quot;&gt;-v&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-u&lt;/span&gt;

Then reinstall GRUB with &lt;span class=&quot;s1&quot;&gt;&apos;apt install --reinstall grub-efi-amd64&apos;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;A reboot later it was fixed.&lt;/p&gt;

&lt;h2 id=&quot;performing-the-upgrade&quot;&gt;Performing the upgrade&lt;/h2&gt;

&lt;p&gt;Migrating VMs takes a long time for the K3s nodes
dedicated to storage. Each of those VMs has a large
disk reserved specifically for Longhorn:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/upgrading-the-proxmox-cluster-from-8-to-9/migrating_vms_between_nodes.png&quot;&gt;
          &lt;img src=&quot;/img/posts/upgrading-the-proxmox-cluster-from-8-to-9/migrating_vms_between_nodes.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
    &lt;figcaption&gt;Migrating a K3s worker VM running Longhorn
before upgrading the underlying Proxmox host
&lt;/figcaption&gt;
  
&lt;/figure&gt;

&lt;p&gt;I ran &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pve8to9 --full&lt;/code&gt; to identify and fix issues
before starting the upgrade:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;pve8to9 &lt;span class=&quot;nt&quot;&gt;--full&lt;/span&gt;
...
&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; SUMMARY &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;

TOTAL:    48
PASSED:   39
SKIPPED:  5
WARNINGS: 1
FAILURES: 0

ATTENTION: Please check the output &lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;detailed information!
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;systemd-boot&lt;/code&gt; package had to be removed:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;systemd-boot meta-package installed. This will cause
problems on upgrades of other boot-related packages.
Remove ‘systemd-boot’.
See https://pve.proxmox.com/wiki/Upgrade_from_8_to_9#sd-boot-warning
for more information.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For nodes &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pve3&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pve4&lt;/code&gt; I opted to remove the package
using an Ansible role, but for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pve2&lt;/code&gt; (the first
host I upgraded) I did it manually:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;apt remove systemd-boot
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Updated all of the apt sources from Bookworm to Trixie:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nb&quot;&gt;sed&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-i&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;s/bookworm/trixie/g&apos;&lt;/span&gt; /etc/apt/sources.list
&lt;span class=&quot;nb&quot;&gt;sed&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-i&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;s/bookworm/trixie/g&apos;&lt;/span&gt; /etc/apt/sources.list.d/&lt;span class=&quot;k&quot;&gt;*&lt;/span&gt;.list&lt;span class=&quot;k&quot;&gt;*&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
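
&lt;p&gt;To sanity-check the rewrite before touching the real files, the
same &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sed&lt;/code&gt; can be rehearsed on a scratch copy
(a minimal sketch; the file content is illustrative):&lt;/p&gt;

```shell
# Rehearse the bookworm -> trixie rewrite on a scratch copy
# before running it against the files under /etc/apt.
tmpdir=$(mktemp -d)
printf 'deb http://deb.debian.org/debian bookworm main\n' > "$tmpdir/sources.list"
sed -i 's/bookworm/trixie/g' "$tmpdir/sources.list"
cat "$tmpdir/sources.list"   # every suite name should now read trixie
```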

&lt;p&gt;Cleaned up the Grafana apt repository sources used to install
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;promtail&lt;/code&gt; for log collection:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;wget &lt;span class=&quot;nt&quot;&gt;-q&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-O&lt;/span&gt; - https://apt.grafana.com/gpg.key | gpg &lt;span class=&quot;nt&quot;&gt;--dearmor&lt;/span&gt; | &lt;span class=&quot;nb&quot;&gt;tee&lt;/span&gt; /etc/apt/keyrings/grafana.gpg &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; /dev/null
&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main&quot;&lt;/span&gt; | &lt;span class=&quot;nb&quot;&gt;tee&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-a&lt;/span&gt; /etc/apt/sources.list.d/grafana.list
&lt;span class=&quot;nb&quot;&gt;rm&lt;/span&gt; /etc/apt/sources.list.d/apt_grafana_com.list
&lt;span class=&quot;nb&quot;&gt;rm&lt;/span&gt; /etc/apt/trusted.gpg.d/grafana.asc
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Replaced privilege &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;VM.Monitor&lt;/code&gt; with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Sys.Audit&lt;/code&gt; in the Terraform
provisioning user role since &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;VM.Monitor&lt;/code&gt; is deprecated in Proxmox 9.
Also switched to privileges listed in the &lt;a href=&quot;https://registry.terraform.io/providers/bpg/proxmox/latest/docs#api-token-authentication&quot;&gt;bpg provider&lt;/a&gt; docs, even if
they are a bit excessive:&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;roles/create-terraform-provisioning-user/vars/main.yaml&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-diff highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gh&quot;&gt;diff --git a/roles/create-terraform-provisioning-user/vars/main.yaml b/roles/create-terraform-provisioning-user/vars/main.yaml
index 7a98193..8ca22be 100644
&lt;/span&gt;&lt;span class=&quot;gd&quot;&gt;--- a/roles/create-terraform-provisioning-user/vars/main.yaml
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+++ b/roles/create-terraform-provisioning-user/vars/main.yaml
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;@@ -2,4 +2,4 @@&lt;/span&gt; terraform_user_token_name: proxmox-kubernetes-terraform-setup
 terraform_provider_role:
&lt;span class=&quot;gd&quot;&gt;-terraform_user_token_role_privileges: &quot;Datastore.AllocateSpace Datastore.Audit Pool.Allocate Sys.Audit Sys.Console Sys.Modify VM.Allocate VM.Audit VM.Clone VM.Config.CDROM VM.Config.Cloudinit VM.Config.CPU VM.Config.Disk VM.Config.HWType VM.Config.Memory VM.Config.Network VM.Config.Options VM.Migrate VM.Monitor VM.PowerMgmt SDN.Use&quot;
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+terraform_user_token_role_privileges: &quot;Datastore.AllocateSpace Datastore.AllocateTemplate Datastore.Audit Pool.Allocate Sys.Audit Sys.Console Sys.Modify VM.Allocate VM.Audit VM.Clone VM.Config.CDROM VM.Config.Cloudinit VM.Config.CPU VM.Config.Disk VM.Config.HWType VM.Config.Memory VM.Config.Network VM.Config.Options VM.Migrate VM.PowerMgmt VM.GuestAgent.Audit SDN.Use&quot;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Made a temporary role to help prepare hosts for
the upgrade:&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;roles/upgrade-pve8-to-pve9/tasks/main.yaml&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Remove old systemd-boot package&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ansible.builtin.apt&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;systemd-boot&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;absent&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# https://pve.proxmox.com/wiki/Upgrade_from_8_to_9#LVM/LVM-thin_storage_has_guest_volumes_with_autoactivation_enabled&lt;/span&gt;
&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Fix LVM/LVM-thin storage has guest volumes with autoactivation enabled&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ansible.builtin.command&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;cmd&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;/usr/share/pve-manager/migrations/pve-lvm-disable-autoactivation --assume-yes&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The role for adding apt repositories was changed permanently;
it now uses the new DEB822 source format:&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;roles/update-apt-repositories/tasks/main.yaml&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-diff highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gh&quot;&gt;diff --git a/roles/update-apt-repositories/tasks/main.yaml b/roles/update-apt-repositories/tasks/main.yaml
index 8e0c240..8ce693c 100644
&lt;/span&gt;&lt;span class=&quot;gd&quot;&gt;--- a/roles/update-apt-repositories/tasks/main.yaml
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+++ b/roles/update-apt-repositories/tasks/main.yaml
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;@@ -1,30 +1,59 @@&lt;/span&gt;
 ---
&lt;span class=&quot;gd&quot;&gt;-# https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysadmin_package_repositories
-- name: Remove pve-enterprise repository from list
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+- name: Add Debian base repositories
&lt;/span&gt;   block:
&lt;span class=&quot;gd&quot;&gt;-  - apt_repository:
-      repo: deb https://enterprise.proxmox.com/debian/pve bookworm pve-enterprise
-      state: absent
-      filename: /etc/apt/sources.list.d/pve-enterprise.list
-      update_cache: false
-  - apt_repository:
-      repo: deb https://enterprise.proxmox.com/debian/ceph-quincy bookworm enterprise
-      state: absent
-      filename: /etc/apt/sources.list.d/ceph.list
-      update_cache: false
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+  - ansible.builtin.deb822_repository:
+      enabled: true
+      name: debian
+      types:
+      - deb
+      - deb-src
+      uris: http://deb.debian.org/debian/
+      suites:
+      - trixie
+      - trixie-updates
+      components:
+      - main
+      - non-free-firmware
+      signed_by: /usr/share/keyrings/debian-archive-keyring.gpg
+  - ansible.builtin.deb822_repository:
+      enabled: true
+      name: debian-security
+      types:
+      - deb
+      - deb-src
+      uris: http://security.debian.org/debian-security/
+      suites:
+      - trixie-security
+      components:
+      - main
+      - non-free-firmware
+      signed_by: /usr/share/keyrings/debian-archive-keyring.gpg
&lt;/span&gt; 
&lt;span class=&quot;gd&quot;&gt;-- name: Add pve-no-subscription repository to list
-  block:
-  - apt_repository:
-      repo: deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription
-      state: present
-      filename: /etc/apt/sources.list.d/pve-enterprise.list
-      update_cache: false
-  - apt_repository:
-      repo: deb http://download.proxmox.com/debian/ceph-quincy bookworm no-subscription
-      state: present
-      filename: /etc/apt/sources.list.d/ceph.list
-      update_cache: false
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+- name: Add Proxmox no-subscription repository
+  ansible.builtin.deb822_repository:
+    enabled: true
+    name: proxmox
+    types:
+    - deb
+    uris: http://download.proxmox.com/debian/pve
+    suites:
+    - trixie
+    components:
+    - pve-no-subscription
+    signed_by: /usr/share/keyrings/proxmox-archive-keyring.gpg
+
+- name: Add Ceph repositories
+  ansible.builtin.deb822_repository:
+    enabled: true
+    name: ceph
+    types:
+    - deb
+    uris: http://download.proxmox.com/debian/ceph-squid
+    suites:
+    - trixie
+    components:
+    - no-subscription
+    signed_by: /usr/share/keyrings/proxmox-archive-keyring.gpg
&lt;/span&gt;...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
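
&lt;p&gt;For reference, the Proxmox task above renders a
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.sources&lt;/code&gt; file roughly like the following (a sketch of
the DEB822 field layout; written to a temp file here instead of
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/etc/apt/sources.list.d/&lt;/code&gt;):&lt;/p&gt;

```shell
# Approximate DEB822 stanza produced by the deb822_repository task
# for the pve-no-subscription repository.
out=$(mktemp)
printf '%s\n' \
  'Types: deb' \
  'URIs: http://download.proxmox.com/debian/pve' \
  'Suites: trixie' \
  'Components: pve-no-subscription' \
  'Signed-By: /usr/share/keyrings/proxmox-archive-keyring.gpg' > "$out"
cat "$out"
```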

&lt;p&gt;Then I did the actual upgrade manually:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;apt dist-upgrade
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/upgrading-the-proxmox-cluster-from-8-to-9/performing_upgrade.png&quot;&gt;
          &lt;img src=&quot;/img/posts/upgrading-the-proxmox-cluster-from-8-to-9/performing_upgrade.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
    &lt;figcaption&gt;Upgrading pve3 in this case
&lt;/figcaption&gt;
  
&lt;/figure&gt;

&lt;p&gt;After rebooting I fixed the old apt sources:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;apt modernize-sources
The following files need modernizing:
  - /etc/apt/sources.list
  - /etc/apt/sources.list.d/grafana.list

Modernizing will replace .list files with the new .sources format,
add Signed-By values where they can be determined automatically,
and save the old files into .list.bak files.

This &lt;span class=&quot;nb&quot;&gt;command &lt;/span&gt;supports the &lt;span class=&quot;s1&quot;&gt;&apos;signed-by&apos;&lt;/span&gt; and &lt;span class=&quot;s1&quot;&gt;&apos;trusted&apos;&lt;/span&gt; options. If you
have specified other options inside &lt;span class=&quot;o&quot;&gt;[]&lt;/span&gt; brackets, please transfer them
manually to the output files&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; see sources.list&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;5&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;a mapping.

For a simulation, respond N &lt;span class=&quot;k&quot;&gt;in &lt;/span&gt;the following prompt.
Rewrite 2 sources? &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;Y/n] Y
Modernizing /etc/apt/sources.list...
- Writing /etc/apt/sources.list.d/debian.sources

Modernizing /etc/apt/sources.list.d/grafana.list...
- Writing /etc/apt/sources.list.d/grafana.sources
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Reran the playbook to remove the enterprise repo file again
and verified apt worked after the changes:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;apt update
Hit:1 http://deb.debian.org/debian trixie InRelease
Hit:2 http://deb.debian.org/debian trixie-updates InRelease
Hit:3 http://security.debian.org/debian-security trixie-security InRelease
Hit:4 https://apt.grafana.com stable InRelease
Hit:5 http://download.proxmox.com/debian/ceph-squid trixie InRelease
Hit:6 http://download.proxmox.com/debian/pve trixie InRelease
All packages are up to date.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I noticed “IO Pressure Stall” increasing drastically
when migrating VMs from a node still on version 8 to a
node already upgraded to version 9:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/upgrading-the-proxmox-cluster-from-8-to-9/io_pressure_stall_during_vm_migration_back_to_pve9_host.png&quot;&gt;
          &lt;img src=&quot;/img/posts/upgrading-the-proxmox-cluster-from-8-to-9/io_pressure_stall_during_vm_migration_back_to_pve9_host.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
    &lt;figcaption&gt;IO Pressure Stall while migrating VMs to pve2 running Proxmox 9
&lt;/figcaption&gt;
  
&lt;/figure&gt;

&lt;p&gt;This was reflected in some of the VMs running on the
affected Proxmox host, among them &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k8s-control-5&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl get nodes
NAME            STATUS                        ROLES                       AGE   VERSION
k8s-control-4   Ready                         control-plane,etcd,master   45h   v1.31.7+k3s1
k8s-control-5   NotReady                      control-plane,etcd,master   44h   v1.31.7+k3s1
k8s-control-6   Ready                         control-plane,etcd,master   44h   v1.31.7+k3s1
k8s-worker-1    Ready                         &amp;lt;none&amp;gt;                      44h   v1.31.7+k3s1
k8s-worker-2    NotReady,SchedulingDisabled   &amp;lt;none&amp;gt;                      44h   v1.31.7+k3s1
k8s-worker-3    Ready                         &amp;lt;none&amp;gt;                      44h   v1.31.7+k3s1
k8s-worker-4    Ready                         &amp;lt;none&amp;gt;                      44h   v1.31.7+k3s1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
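
&lt;p&gt;The pressure graphs are built on the kernel’s PSI interface,
which can be read directly on a host via
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/proc/pressure/io&lt;/code&gt;. A minimal sketch, parsing a captured
sample line so it runs without a Proxmox host:&lt;/p&gt;

```shell
# On a host: cat /proc/pressure/io
# prints "some"/"full" lines in the format sampled below.
sample='some avg10=12.34 avg60=8.00 avg300=2.50 total=123456789'
avg10=${sample#*avg10=}    # drop everything up to and including "avg10="
avg10=${avg10%% *}         # drop everything from the first space
echo "io some avg10: $avg10"
```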

&lt;p&gt;There is a &lt;a href=&quot;https://www.reddit.com/r/Proxmox/comments/1mkgmlj/io_pressure_stall_on_proxmox_9/&quot;&gt;Reddit thread&lt;/a&gt; describing a similar issue
even on fresh Proxmox 9 installations. The IO Pressure
Stall has since dropped back down. This is the one-month
maximum for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pve2&lt;/code&gt;:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/upgrading-the-proxmox-cluster-from-8-to-9/io_pressure_stall_today.png&quot;&gt;
          &lt;img src=&quot;/img/posts/upgrading-the-proxmox-cluster-from-8-to-9/io_pressure_stall_today.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
    &lt;figcaption&gt;The IO pressure stall maximum for pve2
last month
&lt;/figcaption&gt;
  
&lt;/figure&gt;

&lt;p&gt;VMs have been stable after the migration, so I’m
going to just keep monitoring for now.&lt;/p&gt;

&lt;h2 id=&quot;upgrading-terraform-provider&quot;&gt;Upgrading Terraform provider&lt;/h2&gt;

&lt;p&gt;I switched the Terraform provider for Proxmox VM configuration
to the &lt;a href=&quot;https://registry.terraform.io/providers/bpg/proxmox/latest/docs#api-token-authentication&quot;&gt;bpg provider&lt;/a&gt; a few years ago. Upgrading to
the latest version of the provider (&lt;a href=&quot;https://registry.terraform.io/providers/bpg/proxmox/0.86.0&quot;&gt;0.86.0&lt;/a&gt; as of this
writing) worked without any issues:&lt;/p&gt;
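
&lt;p&gt;Pinning the provider keeps future &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;terraform init&lt;/code&gt; runs
predictable; the pin looks roughly like this (a sketch, with the
version current as of this writing):&lt;/p&gt;

```hcl
terraform {
  required_providers {
    proxmox = {
      source  = "bpg/proxmox"
      version = "0.86.0"
    }
  }
}
```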

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/upgrading-the-proxmox-cluster-from-8-to-9/running_terraform_plan_and_apply_post_proxmox_upgrade.png&quot;&gt;
          &lt;img src=&quot;/img/posts/upgrading-the-proxmox-cluster-from-8-to-9/running_terraform_plan_and_apply_post_proxmox_upgrade.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
    &lt;figcaption&gt;The VM configuration after having upgraded the
provider
&lt;/figcaption&gt;
  
&lt;/figure&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;The tools and guides for preparing and performing in-place
upgrades of Proxmox are quite good. With the exception of
the IO Pressure Stall situation, everything went smoothly.&lt;/p&gt;

&lt;p&gt;Resources:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://pve.proxmox.com/wiki/Upgrade_from_8_to_9#In-place_upgrade&quot;&gt;Proxmox official upgrade guide&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.virtualizationhowto.com/2025/08/how-to-upgrade-from-proxmox-ve-8-to-9-fast-and-hassle-free/&quot;&gt;VirtualizationHowTo upgrade guide&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;!-- Links --&gt;

</description>
        <pubDate>Tue, 11 Nov 2025 00:00:00 +0000</pubDate>
        <link>https://fredrickb.com/2025/11/11/upgrade-proxmox-from-8-to-9/</link>
        <guid isPermaLink="true">https://fredrickb.com/2025/11/11/upgrade-proxmox-from-8-to-9/</guid>
        
        <category>homelab</category>
        
        <category>proxmox</category>
        
        
      </item>
    
      <item>
        <title>Replacing Flannel with Calico as CNI in the K3s cluster</title>
        <description>&lt;p&gt;Locking down network traffic in the K3s cluster in the homelab has
been in the backlog for some time. While not strictly required, I’d
rather have a setup more secure by default.&lt;/p&gt;

&lt;p&gt;Since &lt;a href=&quot;https://docs.k3s.io/networking/basic-network-options&quot;&gt;K3s ships with Flannel&lt;/a&gt; as its default
(which does not support &lt;a href=&quot;https://kubernetes.io/docs/concepts/services-networking/network-policies/&quot;&gt;NetworkPolicies&lt;/a&gt;),
I have to switch &lt;a href=&quot;https://www.tigera.io/learn/guides/kubernetes-networking/kubernetes-cni/&quot;&gt;CNI&lt;/a&gt;. After evaluating multiple options,
I ended up with &lt;a href=&quot;https://www.tigera.io/project-calico&quot;&gt;Calico&lt;/a&gt;, whose documentation has a guide &lt;a href=&quot;https://docs.tigera.io/calico/latest/getting-started/kubernetes/k3s/multi-node-install&quot;&gt;specifically for K3s&lt;/a&gt;.
One feature in particular that caught my attention was &lt;a href=&quot;https://www.tigera.io/blog/calico-whisker-your-new-ally-in-network-observability/&quot;&gt;Whisker&lt;/a&gt;, a UI
for network flow logs introduced in Calico v3.30.&lt;/p&gt;

&lt;p&gt;The plan:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Modify the existing Ansible playbook for K3s to install Calico&lt;/li&gt;
  &lt;li&gt;Expose Whisker using an Ingress&lt;/li&gt;
  &lt;li&gt;Identify existing network flows to target with NetworkPolicies&lt;/li&gt;
  &lt;li&gt;Add NetworkPolicies to the ArgoCD configuration&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;modifying-the-ansible-playbook&quot;&gt;Modifying the Ansible playbook&lt;/h2&gt;

&lt;p&gt;New additions to the main inventory file:&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;inventory.ini&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&quot;language-ini highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;k3s_server_install_args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;--disable servicelb --flannel-backend=none --disable-network-policy&quot;&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;pod_cidr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;10.42.0.0/16&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;blockquote&gt;
  &lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pod_cidr&lt;/code&gt; variable is now placed in the
inventory since it is needed for the Calico
configuration later.&lt;/p&gt;
&lt;/blockquote&gt;
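
&lt;p&gt;Those install args are ultimately passed through to the upstream
K3s install script; the equivalent one-liner for a server node looks
roughly like this (a sketch, not the playbook’s exact task):&lt;/p&gt;

```shell
# Server install with the built-in CNI pieces disabled so Calico
# can take over (runs the upstream installer; not for a sandbox).
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--disable servicelb --flannel-backend=none --disable-network-policy" sh -
```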

&lt;p&gt;The new role &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;install-calico&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;roles/install-calico/tasks/main.yaml&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Install Calico&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;become&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;yes&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ansible.builtin.shell&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;|&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;k3s kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/{{ calico_version }}/manifests/operator-crds.yaml&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;k3s kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/{{ calico_version }}/manifests/tigera-operator.yaml&lt;/span&gt;

&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Wait until Calico CRDs are installed&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;become&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;yes&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ansible.builtin.shell&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;k3s kubectl api-resources | grep operator.tigera.io/v1&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;register&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;result&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;until&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;result.stdout.find(&quot;Installation&quot;) != -1&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;retries&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;5&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;delay&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;

&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Template out Calico custom-resources.yaml&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ansible.builtin.template&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;templates/calico-custom-resources.yaml.j2&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;dest&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;calico_custom_resources_dest&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&quot;&lt;/span&gt;

&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Apply Calico custom-resources.yaml&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;become&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;yes&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ansible.builtin.command&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;k3s kubectl apply -f {{ calico_custom_resources_dest }}&lt;/span&gt;

&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Enable Felix metrics&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;become&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;yes&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ansible.builtin.command&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;|&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;k3s kubectl patch felixconfiguration default --type merge --patch &apos;{&quot;spec&quot;:{&quot;prometheusMetricsEnabled&quot;: true}}&apos;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
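
&lt;p&gt;After the role has run, the operator’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tigerastatus&lt;/code&gt;
resources are a quick way to confirm everything converged (run against
the cluster, so shown here only as a sketch):&lt;/p&gt;

```shell
# Each component should eventually report AVAILABLE=True.
k3s kubectl get tigerastatus
k3s kubectl get pods -n calico-system
```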

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;roles/install-calico/templates/calico-custom-resources.yaml.j2&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# This section includes base Calico installation configuration.&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;# For more information, see: https://docs.tigera.io/calico/latest/reference/installation/api#operator.tigera.io/v1.Installation&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;operator.tigera.io/v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Installation&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;default&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;# Configures Calico networking.&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;calicoNetwork&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;bgp&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Disabled&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;ipPools&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;default-ipv4-ippool&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;blockSize&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;26&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;cidr&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;{{&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;pod_cidr&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;}}&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;encapsulation&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;VXLANCrossSubnet&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;natOutgoing&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Enabled&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;nodeSelector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;all()&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;typhaMetricsPort&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;9093&lt;/span&gt;
&lt;span class=&quot;nn&quot;&gt;---&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# This section configures the Calico API server.&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;# For more information, see: https://docs.tigera.io/calico/latest/reference/installation/api#operator.tigera.io/v1.APIServer&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;operator.tigera.io/v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;APIServer&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;default&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;{}&lt;/span&gt;

&lt;span class=&quot;nn&quot;&gt;---&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# Configures the Calico Goldmane flow aggregator.&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;operator.tigera.io/v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Goldmane&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;default&lt;/span&gt;

&lt;span class=&quot;nn&quot;&gt;---&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# Configures the Calico Whisker observability UI.&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;operator.tigera.io/v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Whisker&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;default&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;roles/install-calico/vars/main.yaml&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;calico_version&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;v3.30.2&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;calico_custom_resources_dest&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;/tmp/calico-custom-resources.yaml&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Output after recreating the node VMs in Proxmox and
reinstalling K3s using the new playbook:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl get pods &lt;span class=&quot;nt&quot;&gt;-n&lt;/span&gt; calico-system
NAME                                      READY   STATUS    RESTARTS     AGE
calico-kube-controllers-f979cd594-pdwl7   1/1     Running   0            58d
calico-node-grhjn                         1/1     Running   0            51d
calico-node-hgw28                         1/1     Running   2 &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;8d ago&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;   51d
calico-node-nhtwk                         1/1     Running   0            51d
calico-node-qxm97                         1/1     Running   1 &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;8d ago&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;   51d
calico-node-tgf2v                         1/1     Running   0            51d
calico-node-tzxgc                         1/1     Running   0            51d
calico-node-vwczl                         1/1     Running   0            51d
calico-typha-588c846999-8jn4f             1/1     Running   0            53d
calico-typha-588c846999-mkzjb             1/1     Running   0            53d
calico-typha-588c846999-sfpmm             1/1     Running   0            8d
csi-node-driver-9jnqc                     2/2     Running   0            58d
csi-node-driver-nwwhs                     2/2     Running   0            58d
csi-node-driver-r72m4                     2/2     Running   2 &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;8d ago&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;   58d
csi-node-driver-s6rf4                     2/2     Running   0            58d
csi-node-driver-sd2g8                     2/2     Running   0            58d
csi-node-driver-vvndc                     2/2     Running   0            58d
csi-node-driver-z62lp                     2/2     Running   0            58d
goldmane-64584db4cf-26bdg                 1/1     Running   0            58d
whisker-98dff96db-t78nq                   2/2     Running   0            58d
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;exposing-whisker&quot;&gt;Exposing Whisker&lt;/h2&gt;

&lt;p&gt;With Calico up and running, I added an Ingress and a
NetworkPolicy to the ArgoCD configuration to expose
Whisker:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;networking.k8s.io/v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;NetworkPolicy&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;allow-traefik-to-whisker&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ingress&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;namespaceSelector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;matchLabels&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
          &lt;span class=&quot;na&quot;&gt;kubernetes.io/metadata.name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;kube-system&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;podSelector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;matchLabels&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
          &lt;span class=&quot;na&quot;&gt;app.kubernetes.io/name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;traefik&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;ports&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;port&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;8081&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;protocol&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;TCP&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;podSelector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;matchLabels&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;app.kubernetes.io/name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;whisker&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;policyTypes&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Ingress&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;networking.k8s.io/v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Ingress&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;annotations&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;cert-manager.io/issuer&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;step-issuer&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;cert-manager.io/issuer-group&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;certmanager.step.sm&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;cert-manager.io/issuer-kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;StepClusterIssuer&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;whisker&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;rules&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;whisker.homelab.fredrickb.com&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;http&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;paths&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;backend&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
          &lt;span class=&quot;na&quot;&gt;service&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;whisker&lt;/span&gt;
            &lt;span class=&quot;na&quot;&gt;port&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
              &lt;span class=&quot;na&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;8081&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;/&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;pathType&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Prefix&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;tls&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;hosts&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;whisker.homelab.fredrickb.com&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;secretName&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;whisker-tls-cert&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;After applying changes with ArgoCD:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/replacing-flannel-with-calico-as-cni-in-the-k3s-cluster/whisker-ui-showing-network-flows-logs-in-calico.png&quot; title=&quot;Whisker running in the homelab&quot;&gt;
          &lt;img src=&quot;/img/posts/replacing-flannel-with-calico-as-cni-in-the-k3s-cluster/whisker-ui-showing-network-flows-logs-in-calico.png&quot; alt=&quot;Whisker running in the homelab&quot; /&gt;
      &lt;/a&gt;
    
  
  
    &lt;figcaption&gt;Whisker running in the homelab
&lt;/figcaption&gt;
  
&lt;/figure&gt;

&lt;h2 id=&quot;locking-down-network-traffic&quot;&gt;Locking down network traffic&lt;/h2&gt;

&lt;p&gt;I repeated the following steps until all network traffic
was covered by a NetworkPolicy:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Identify traffic for NetworkPolicies using &lt;a href=&quot;https://www.tigera.io/blog/calico-whisker-your-new-ally-in-network-observability/&quot;&gt;Whisker&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Label known IP-ranges outside of K3s using &lt;a href=&quot;https://docs.tigera.io/calico/latest/reference/resources/globalnetworkset&quot;&gt;GlobalNetworkSets&lt;/a&gt;/&lt;a href=&quot;https://docs.tigera.io/calico/latest/reference/resources/networkset&quot;&gt;NetworkSets&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.tigera.io/blog/dry-run-your-kubernetes-network-policies-with-calico-staged-network-policies/&quot;&gt;Dry-run NetworkPolicies&lt;/a&gt; before enforcing them using
&lt;a href=&quot;https://docs.tigera.io/calico-cloud/reference/resources/stagedglobalnetworkpolicy&quot;&gt;StagedGlobalNetworkPolicies&lt;/a&gt;/
&lt;a href=&quot;https://docs.tigera.io/calico/latest/reference/resources/stagednetworkpolicy#stagednetworkpolicy&quot;&gt;StagedNetworkPolicies&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
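
&lt;p&gt;As an illustration of step 3, a policy can be dry-run by creating it with the kind
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;StagedNetworkPolicy&lt;/code&gt;; the spec is
otherwise identical to an enforced policy. A minimal sketch (the policy name, namespace
and selectors here are made up for illustration):&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;apiVersion: projectcalico.org/v3
kind: StagedNetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: grafana
spec:
  types:
  - Egress
  egress:
  - action: Allow
    protocol: UDP
    destination:
      namespaceSelector: kubernetes.io/metadata.name == &quot;kube-system&quot;
      selector: k8s-app == &quot;kube-dns&quot;
      ports:
      - 53
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Whisker then reports what the staged policy would have allowed or denied, and once the
flows look right, changing the kind to
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NetworkPolicy&lt;/code&gt; enforces it.&lt;/p&gt;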

&lt;p&gt;I created GlobalNetworkSets for systems external to the cluster,
such as the Proxmox hosts and the GitHub Actions self-hosted runner hosts:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;projectcalico.org/v3&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;GlobalNetworkSet&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;proxmox-hosts&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;labels&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;proxmox-hosts&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;node-exporter&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;true&quot;&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;process-exporter&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;true&quot;&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;promtail&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;true&quot;&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;nets&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;10.0.2.0/24&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;projectcalico.org/v3&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;GlobalNetworkSet&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;github-actions-runner-hosts&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;labels&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;node-exporter&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;true&quot;&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;process-exporter&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;true&quot;&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;promtail&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;true&quot;&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;nets&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;10.0.7.0/24&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The GlobalNetworkSets were then referenced by their labels in &lt;a href=&quot;https://docs.tigera.io/calico/latest/network-policy/get-started/calico-policy/calico-network-policy&quot;&gt;Calico NetworkPolicies&lt;/a&gt;
to allow traffic from Prometheus in the cluster to services such as &lt;a href=&quot;https://github.com/prometheus/node_exporter&quot;&gt;node_exporter&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;projectcalico.org/v3&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;NetworkPolicy&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;allow-egress-to-node-exporter-hosts&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;namespace&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;prometheus&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;types&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Egress&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;egress&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;action&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Allow&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;protocol&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;TCP&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;destination&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;namespaceSelector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;global()&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;selector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;node-exporter == &quot;true&quot;&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;ports&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;9100&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The names of the GlobalNetworkSets show up in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dest_name&lt;/code&gt; column in
Whisker when inspecting traffic from Prometheus in K3s to the Proxmox
and GitHub Actions self-hosted runner hosts on port 9100:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/replacing-flannel-with-calico-as-cni-in-the-k3s-cluster/whisker-showing-allowed-prometheus-node-exporter-traffic.png&quot; title=&quot;Whisker showing Prometheus scraping node_exporter service on Proxmox and GitHub Actions self-hosted runner hosts&quot;&gt;
          &lt;img src=&quot;/img/posts/replacing-flannel-with-calico-as-cni-in-the-k3s-cluster/whisker-showing-allowed-prometheus-node-exporter-traffic.png&quot; alt=&quot;Whisker showing Prometheus scraping node_exporter service on Proxmox and GitHub Actions self-hosted runner hosts&quot; /&gt;
      &lt;/a&gt;
    
  
  
    &lt;figcaption&gt;Whisker showing Prometheus scraping node_exporter service on Proxmox and GitHub
Actions self-hosted runner hosts
&lt;/figcaption&gt;
  
&lt;/figure&gt;

&lt;p&gt;One thing I was hoping to avoid was needing a NetworkPolicy in every namespace
just to allow traffic between workloads in the same namespace. I did not find a way to
cover this use-case with GlobalNetworkPolicies, so I had to duplicate the following
NetworkPolicy in every namespace to get the desired behaviour:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;projectcalico.org/v3&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;NetworkPolicy&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;allow-same-namespace&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;namespace&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;lt;namespace&amp;gt;&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;types&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Egress&lt;/span&gt;
    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Ingress&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;egress&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;action&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Allow&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;source&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;namespaceSelector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;kubernetes.io/metadata.name == &quot;&amp;lt;namespace&amp;gt;&quot;&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;destination&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;namespaceSelector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;kubernetes.io/metadata.name == &quot;&amp;lt;namespace&amp;gt;&quot;&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ingress&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;action&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Allow&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;source&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;namespaceSelector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;kubernetes.io/metadata.name == &quot;&amp;lt;namespace&amp;gt;&quot;&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;destination&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;namespaceSelector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;kubernetes.io/metadata.name == &quot;&amp;lt;namespace&amp;gt;&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Example from the Grafana namespace:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;projectcalico.org/v3&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;NetworkPolicy&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;allow-same-namespace&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;namespace&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;grafana&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;types&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Egress&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Ingress&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;egress&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;action&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Allow&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;source&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;namespaceSelector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;kubernetes.io/metadata.name == &quot;grafana&quot;&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;destination&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;namespaceSelector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;kubernetes.io/metadata.name == &quot;grafana&quot;&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ingress&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;action&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Allow&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;source&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;namespaceSelector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;kubernetes.io/metadata.name == &quot;grafana&quot;&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;destination&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;namespaceSelector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;kubernetes.io/metadata.name == &quot;grafana&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;There are a couple of open issues tracking this already, so it may
improve in the future:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/projectcalico/calico/issues/4751&quot;&gt;4751&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/projectcalico/calico/issues/6107&quot;&gt;6107&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;monitoring-calico&quot;&gt;Monitoring Calico&lt;/h2&gt;

&lt;p&gt;Calico ships with metrics that can be scraped using Prometheus
(&lt;a href=&quot;https://docs.tigera.io/calico/latest/operations/monitor/monitor-component-metrics&quot;&gt;more details on monitoring here&lt;/a&gt;).&lt;/p&gt;
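
&lt;p&gt;Felix metrics are disabled by default. If the playbook does not already enable them,
metrics reporting can be switched on in the default FelixConfiguration, for example with
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kubectl patch&lt;/code&gt; as shown in the
Calico monitoring docs:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl patch felixconfiguration default --type merge --patch &#39;{&quot;spec&quot;:{&quot;prometheusMetricsEnabled&quot;: true}}&#39;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Typha metrics are already exposed via the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;typhaMetricsPort&lt;/code&gt; set in the
Installation resource above.&lt;/p&gt;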

&lt;p&gt;I added headless Services for Felix and Typha so that Prometheus can discover and scrape their metrics:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Service&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;felix-metrics-svc&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;namespace&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;calico-system&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;clusterIP&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;None&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;selector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;k8s-app&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;calico-node&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ports&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;port&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;9091&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;targetPort&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;9091&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Service&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;typha-metrics-svc&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;namespace&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;calico-system&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;clusterIP&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;None&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;selector&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;k8s-app&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;calico-typha&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ports&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;port&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;9093&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;targetPort&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;9093&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The updated scrape config in Prometheus, copied from the &lt;a href=&quot;https://docs.tigera.io/calico/latest/operations/monitor/monitor-component-metrics#create-prometheus-config-file&quot;&gt;documentation&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nn&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;extraScrapeConfigs&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;|&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;...&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;- job_name: &apos;felix_metrics&apos;&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;scrape_interval: 5s&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;scheme: http&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;kubernetes_sd_configs:&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;- role: endpoints&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;relabel_configs:&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;- source_labels: [__meta_kubernetes_service_name]&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;regex: felix-metrics-svc&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;replacement: $1&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;action: keep&lt;/span&gt;

  &lt;span class=&quot;s&quot;&gt;- job_name: &apos;typha_metrics&apos;&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;scrape_interval: 5s&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;scheme: http&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;kubernetes_sd_configs:&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;- role: endpoints&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;relabel_configs:&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;- source_labels: [__meta_kubernetes_service_name]&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;regex: typha-metrics-svc&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;replacement: $1&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;action: keep&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;- source_labels: [__meta_kubernetes_pod_container_port_name]&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;regex: calico-typha&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;action: drop&lt;/span&gt;

  &lt;span class=&quot;s&quot;&gt;- job_name: &apos;kube_controllers_metrics&apos;&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;scrape_interval: 5s&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;scheme: http&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;kubernetes_sd_configs:&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;- role: endpoints&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;relabel_configs:&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;- source_labels: [__meta_kubernetes_service_name]&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;regex: calico-kube-controllers-metrics&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;replacement: $1&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;action: keep&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The documentation offers some premade Grafana dashboards nested in a ConfigMap &lt;a href=&quot;https://docs.tigera.io/calico/latest/operations/monitor/monitor-component-visual#2-provisioning-calico-dashboards&quot;&gt;here&lt;/a&gt;,
but they’ll need a small adjustment. I extracted the JSON payloads and replaced the
hardcoded value of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;datasource.uid&lt;/code&gt; with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$datasource&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;From this:&lt;/p&gt;

&lt;div class=&quot;language-json highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;err&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;datasource&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;prometheus&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;uid&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&amp;lt;Some-UID&amp;gt;&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;To this:&lt;/p&gt;

&lt;div class=&quot;language-json highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;err&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;datasource&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;prometheus&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;uid&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;$datasource&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
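&lt;p&gt;Rather than editing every panel by hand, the substitution can be scripted. A minimal sketch in Python, assuming the extracted dashboard JSON has been saved to disk (the recursive walk is there because Grafana nests &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;datasource&lt;/code&gt; objects at several depths):&lt;/p&gt;

```python
import json

def replace_datasource_uid(node, new_uid="$datasource"):
    """Recursively walk a Grafana dashboard JSON structure and replace
    every hardcoded Prometheus datasource uid with the template variable."""
    if isinstance(node, dict):
        ds = node.get("datasource")
        if isinstance(ds, dict) and ds.get("type") == "prometheus":
            ds["uid"] = new_uid
        for value in node.values():
            replace_datasource_uid(value, new_uid)
    elif isinstance(node, list):
        for item in node:
            replace_datasource_uid(item, new_uid)

# Usage sketch (filename is hypothetical):
# dashboard = json.load(open("felix-dashboard.json"))
# replace_datasource_uid(dashboard)
# json.dump(dashboard, open("felix-dashboard.json", "w"), indent=2)
```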

&lt;p&gt;Once that was done, I had Grafana dashboards showing the status of the Felix and Typha components:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/replacing-flannel-with-calico-as-cni-in-the-k3s-cluster/grafana-dashboard-felix.png&quot; title=&quot;Grafana dashboard for Felix in Calico&quot;&gt;
          &lt;img src=&quot;/img/posts/replacing-flannel-with-calico-as-cni-in-the-k3s-cluster/grafana-dashboard-felix.png&quot; alt=&quot;Grafana dashboard for Felix in Calico&quot; /&gt;
      &lt;/a&gt;
    
  
  
    &lt;figcaption&gt;Grafana dashboard for Felix in Calico
&lt;/figcaption&gt;
  
&lt;/figure&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/replacing-flannel-with-calico-as-cni-in-the-k3s-cluster/grafana-dashboard-typha.png&quot; title=&quot;Grafana dashboard for Typha in Calico&quot;&gt;
          &lt;img src=&quot;/img/posts/replacing-flannel-with-calico-as-cni-in-the-k3s-cluster/grafana-dashboard-typha.png&quot; alt=&quot;Grafana dashboard for Typha in Calico&quot; /&gt;
      &lt;/a&gt;
    
  
  
    &lt;figcaption&gt;Grafana dashboard for Typha in Calico
&lt;/figcaption&gt;
  
&lt;/figure&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;All network traffic within the cluster and to systems external to the
cluster is now locked down with NetworkPolicies.
While I initially planned to stick with standard NetworkPolicies, the
Calico NetworkPolicies provide more control and flexibility.&lt;/p&gt;

&lt;p&gt;I really like the combination of Whisker, (Global)NetworkSets and
Staged(Global)NetworkPolicies. You get tools to identify active network flows,
label specific IP-ranges/hosts and dry-run NetworkPolicies before enforcing them.
This helps a lot when retroactively locking down network traffic in a cluster that
already has a lot of services running.&lt;/p&gt;

&lt;p&gt;In terms of observability, there are Prometheus metrics and predefined Grafana
dashboards available for Felix and Typha.&lt;/p&gt;

&lt;p&gt;Network traffic in the cluster is now secure by default for current and future
workloads.&lt;/p&gt;

&lt;!-- Links --&gt;

</description>
        <pubDate>Sat, 25 Oct 2025 00:00:00 +0000</pubDate>
        <link>https://fredrickb.com/2025/10/25/replacing-flannel-with-calico-as-cni-in-the-k3s-cluster/</link>
        <guid isPermaLink="true">https://fredrickb.com/2025/10/25/replacing-flannel-with-calico-as-cni-in-the-k3s-cluster/</guid>
        
        <category>k3s</category>
        
        <category>flannel</category>
        
        <category>calico</category>
        
        <category>homelab</category>
        
        <category>whisker</category>
        
        <category>ansible</category>
        
        <category>argocd</category>
        
        <category>grafana</category>
        
        
      </item>
    
      <item>
        <title>Switching kanban board</title>
        <description>&lt;p&gt;I’ve used &lt;a href=&quot;https://trello.com/&quot;&gt;Trello&lt;/a&gt; for several years as my main kanban board for ideas and personal projects. While I like the tool, taking more control of my data and having deeper links between my notes and projects are more important going forward.&lt;/p&gt;

&lt;p&gt;I started using &lt;a href=&quot;https://obsidian.md/&quot;&gt;Obsidian&lt;/a&gt; for note-taking in 2020.
Everything is in a single directory, as files, interlinked, in git. It just works.&lt;/p&gt;

&lt;p&gt;Obsidian has a large ecosystem of &lt;a href=&quot;https://obsidian.md/plugins&quot;&gt;plugins&lt;/a&gt; available. The &lt;a href=&quot;https://publish.obsidian.md/kanban/Obsidian+Kanban+Plugin&quot;&gt;Obsidian Kanban Plugin&lt;/a&gt; is exactly what I need. It lets me create notes, represented as cards, in a board. I can then interlink these notes (cards) with any other note, such as my internal system documentation for the homelab.&lt;/p&gt;

&lt;p&gt;Trello provides a way to &lt;a href=&quot;https://support.atlassian.com/trello/docs/exporting-data-from-trello/#Export-data-from-a-board&quot;&gt;export a board to JSON&lt;/a&gt;. I extracted the information I wanted from the JSON export using a Python script:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Lists
    &lt;ul&gt;
      &lt;li&gt;Name&lt;/li&gt;
      &lt;li&gt;Cards&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Cards
    &lt;ul&gt;
      &lt;li&gt;Title&lt;/li&gt;
      &lt;li&gt;Description&lt;/li&gt;
      &lt;li&gt;Checklists&lt;/li&gt;
      &lt;li&gt;Comments&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The information is converted to Markdown files matching the format of the Obsidian Kanban Plugin using the same script:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Each list is created as a heading in the Markdown file for the board itself (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;Boardname&amp;gt;.md&lt;/code&gt;)&lt;/li&gt;
  &lt;li&gt;Each card is created as a file (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kanban/&amp;lt;board-name&amp;gt;/&amp;lt;card-name&amp;gt;.md&lt;/code&gt;) and added as a list item beneath the correct list heading in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;Boardname&amp;gt;.md&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
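&lt;p&gt;A condensed sketch of that conversion in Python. The key names (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lists&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cards&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;idList&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;desc&lt;/code&gt;) follow Trello’s board export format but should be verified against your own export; the frontmatter, checklist and comment handling of the real script are omitted here:&lt;/p&gt;

```python
import json
from collections import defaultdict

def trello_to_kanban(export_path, board_name):
    """Convert a Trello JSON board export into Obsidian Kanban Plugin
    markdown. Returns (board_markdown, {card_name: card_markdown}).
    Key names are assumptions based on Trello's export format."""
    with open(export_path) as f:
        data = json.load(f)

    # Group open cards by the list they belong to.
    cards_by_list = defaultdict(list)
    for card in data.get("cards", []):
        if not card.get("closed"):
            cards_by_list[card["idList"]].append(card)

    board_lines = ["---", "", "kanban-plugin: basic", "", "---", ""]
    card_files = {}
    for lst in data.get("lists", []):
        if lst.get("closed"):
            continue
        # Each list becomes a heading; each card a checklist item
        # linking to its own note file.
        board_lines.append(f"## {lst['name']}")
        for card in cards_by_list[lst["id"]]:
            board_lines.append(f"- [ ] [[{card['name']}]]")
            card_files[card["name"]] = (
                f"Board: [[{board_name}]]\n\n{card.get('desc', '')}\n"
            )
        board_lines.append("")
    return "\n".join(board_lines), card_files
```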

&lt;p&gt;An example card file (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kanban/&amp;lt;board-name&amp;gt;/PiKVM.md&lt;/code&gt;):&lt;/p&gt;

&lt;div class=&quot;language-md highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nn&quot;&gt;---&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;title&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;PiKVM&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;created&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;2022-11-26 - 12:46&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;board&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;lt;Boardname&amp;gt;&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;tags&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;kanban&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;raspberry-pi&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;nn&quot;&gt;---&lt;/span&gt;
Board: [[&lt;span class=&quot;nt&quot;&gt;&amp;lt;Boardname&amp;gt;&lt;/span&gt;]]
&lt;span class=&quot;gt&quot;&gt;
&amp;gt; This card was imported from Trello: &amp;lt;Trello-Card-URL&amp;gt;&lt;/span&gt;

&lt;span class=&quot;nt&quot;&gt;&amp;lt;Content&amp;gt;&lt;/span&gt;

&lt;span class=&quot;gh&quot;&gt;# Comments&lt;/span&gt;

&lt;span class=&quot;nt&quot;&gt;&amp;lt;date&amp;gt;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;&amp;lt;time&amp;gt;&lt;/span&gt;

&lt;span class=&quot;nt&quot;&gt;&amp;lt;Comment&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;nt&quot;&gt;&amp;gt;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;
---
&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;&amp;lt;date&amp;gt;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;&amp;lt;time&amp;gt;&lt;/span&gt;

&lt;span class=&quot;nt&quot;&gt;&amp;lt;Comment&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;nt&quot;&gt;&amp;gt;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;
---
&lt;/span&gt;
&lt;span class=&quot;gh&quot;&gt;#kanban #raspberry-pi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I kept the URL to the Trello card in the event I want to go back and
check the original.&lt;/p&gt;

&lt;p&gt;An example of a card list item in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Ideas/Backlog&lt;/code&gt; list of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;Boardname&amp;gt;.md&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-md highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nn&quot;&gt;---&lt;/span&gt;

&lt;span class=&quot;na&quot;&gt;kanban-plugin&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;basic&lt;/span&gt;
&lt;span class=&quot;nn&quot;&gt;...&lt;/span&gt;

&lt;span class=&quot;nn&quot;&gt;---&lt;/span&gt;

&lt;span class=&quot;gu&quot;&gt;## Ideas/Backlog&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;
-&lt;/span&gt; [ ] [[PiKVM]]
...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The end result is shown below.&lt;/p&gt;

&lt;p&gt;From this:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/switching-kanban-board/trello-list.png&quot;&gt;
          &lt;img src=&quot;/img/posts/switching-kanban-board/trello-list.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;To this:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/switching-kanban-board/obsidian-kanban-plugin-list.png&quot;&gt;
          &lt;img src=&quot;/img/posts/switching-kanban-board/obsidian-kanban-plugin-list.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;I’m satisfied with the result. I get the same functionality
I had before, and I own the data.&lt;/p&gt;
</description>
        <pubDate>Sun, 23 Mar 2025 00:00:00 +0000</pubDate>
        <link>https://fredrickb.com/2025/03/23/switching-kanban-board/</link>
        <guid isPermaLink="true">https://fredrickb.com/2025/03/23/switching-kanban-board/</guid>
        
        <category>trello</category>
        
        <category>obsidian</category>
        
        <category>cloud-services</category>
        
        <category>notes</category>
        
        
      </item>
    
      <item>
        <title>Making the K3s cluster control plane HA</title>
        <description>&lt;p&gt;I’ll admit, I haven’t been running the K3s control plane in HA. The architecture of 1 control plane node and 3 worker nodes is from a time when every node was running physically on &lt;a href=&quot;/2021/08/22/recreating-the-raspberry-pi-homelab-with-kubernetes/&quot;&gt;Raspberry Pis&lt;/a&gt;. I then &lt;a href=&quot;/2023/08/05/setting-up-k3s-nodes-in-proxmox-using-terraform/&quot;&gt;gradually migrated to VMs&lt;/a&gt; with no changes to the architecture.&lt;/p&gt;

&lt;p&gt;There are a few reasons this is starting to become important:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The nodes are long overdue for an update from Ubuntu 20.04 to Ubuntu 24.04&lt;/li&gt;
  &lt;li&gt;In Ubuntu 20.04, the kernel version (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;5.15.x&lt;/code&gt;) can cause issues when upgrading to &lt;a href=&quot;https://longhorn.io/docs/1.6.2/v2-data-engine/prerequisites/#prerequisites&quot;&gt;Longhorn version &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1.6.x&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Upgrading the cluster, while mostly automated, is still a risk without HA and requires more planning&lt;/li&gt;
  &lt;li&gt;Live migration of the control plane node VM between Proxmox hosts occasionally causes unpredictable behaviour&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The new Proxmox host (&lt;a href=&quot;/homelab#hardware&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Proxmox host 3&lt;/code&gt;&lt;/a&gt;) ensures I now have the physical resources required.&lt;/p&gt;

&lt;p&gt;So here’s the plan:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;https://docs.k3s.io/datastore/cluster-loadbalancer&quot;&gt;Add a load balancer in front of the control plane&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://docs.k3s.io/datastore/ha-embedded#existing-single-node-clusters&quot;&gt;Change the existing single-node cluster to HA with embedded etcd&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;IP address assignments:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th style=&quot;text-align: left&quot;&gt;IP address&lt;/th&gt;
      &lt;th style=&quot;text-align: left&quot;&gt;Host&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;10.0.3.100&lt;/code&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Load Balancer frontend VIP&lt;/code&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;10.0.3.2&lt;/code&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k8s-control-1&lt;/code&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;10.0.3.7&lt;/code&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k8s-control-2&lt;/code&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;10.0.3.8&lt;/code&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k8s-control-3&lt;/code&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h2 id=&quot;setting-up-the-haproxy-plugin-in-opnsense-as-the-load-balancer&quot;&gt;Setting up the HAProxy plugin in OPNsense as the load balancer&lt;/h2&gt;

&lt;p&gt;I could spin up 2 VMs to create an HAProxy load balancer setup with failover, as in &lt;a href=&quot;https://docs.k3s.io/datastore/cluster-loadbalancer&quot;&gt;the official documentation&lt;/a&gt;. Alternatively, I can use the OPNsense &lt;a href=&quot;https://github.com/opnsense/plugins&quot;&gt;HAProxy plugin&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;While I prefer IaC and configuration in code, I opted for using OPNsense. The physical host of OPNsense is currently underutilized, and it would be nice to get more capabilities out of it.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;I won’t go into the specifics of the HAProxy plugin itself; see the official documentation for more information. Some of the screenshots were taken after the setup was confirmed working, so the titles sometimes display “Edit” as opposed to “New”.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Installing the plugin was fairly straightforward:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/opnsense_haproxy_plugin.png&quot;&gt;
          &lt;img src=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/opnsense_haproxy_plugin.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;Once installed, the Services list is updated:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/opnsense_services_sidebar_updated.png&quot;&gt;
          &lt;img src=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/opnsense_services_sidebar_updated.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;The VIP is defined as an &lt;a href=&quot;https://docs.opnsense.org/manual/firewall_vip.html#ip-alias&quot;&gt;IP Alias&lt;/a&gt;. I added this to the existing &lt;a href=&quot;https://docs.opnsense.org/manual/other-interfaces.html#vlan&quot;&gt;VLAN&lt;/a&gt; interface of the K3s cluster:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/opnsense_k3s_vip.png&quot;&gt;
          &lt;img src=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/opnsense_k3s_vip.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;The VIP was added to a &lt;a href=&quot;https://docs.opnsense.org/manual/aliases.html&quot;&gt;firewall alias&lt;/a&gt; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;K3s_Control_Plane_Loadbalancer&lt;/code&gt;:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/opnsense_k3s_vip_alias.png&quot;&gt;
          &lt;img src=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/opnsense_k3s_vip_alias.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;A new firewall rule was created on the K3s interface to allow traffic to the VIP from the worker nodes:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/opnsense_k3s_firewall_rule.png&quot;&gt;
          &lt;img src=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/opnsense_k3s_firewall_rule.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;blockquote&gt;
  &lt;p&gt;What I’m doing from this point on is more or less copying the HAProxy configuration from the &lt;a href=&quot;https://docs.k3s.io/datastore/cluster-loadbalancer&quot;&gt;documentation&lt;/a&gt; into OPNsense. I also used information from &lt;a href=&quot;https://forum.opnsense.org/index.php?topic=22116.msg105660#msg105660&quot;&gt;this forum post&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Adding the existing control plane host as a server:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_add_initial_controlplane_host.png&quot;&gt;
          &lt;img src=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_add_initial_controlplane_host.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;Creating the health monitor:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_health_monitor.png&quot;&gt;
          &lt;img src=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_health_monitor.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;Creating the backend pool:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_k3s_backend_pool_initial_setup.png&quot;&gt;
          &lt;img src=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_k3s_backend_pool_initial_setup.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;Creating the public service (frontend):&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_k3s_public_service.png&quot;&gt;
          &lt;img src=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_k3s_public_service.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;Statistics page:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_statistics_after_adding_initial_server.png&quot;&gt;
          &lt;img src=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_statistics_after_adding_initial_server.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;The Config Export:&lt;/p&gt;

&lt;div class=&quot;language-conf highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# Frontend: k3s-controlplane (K3s controlplane hosts)
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;frontend&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k3s&lt;/span&gt;-&lt;span class=&quot;n&quot;&gt;controlplane&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bind&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;3&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;100&lt;/span&gt;:&lt;span class=&quot;m&quot;&gt;6443&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;3&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;100&lt;/span&gt;:&lt;span class=&quot;m&quot;&gt;6443&lt;/span&gt; 
    &lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tcp&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;default_backend&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k3s&lt;/span&gt;-&lt;span class=&quot;n&quot;&gt;controlplane&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# logging options
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;option&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tcplog&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# Backend: k3s-controlplane (Pool serving all K3s controlplane hosts)
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;backend&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k3s&lt;/span&gt;-&lt;span class=&quot;n&quot;&gt;controlplane&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;option&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;log&lt;/span&gt;-&lt;span class=&quot;n&quot;&gt;health&lt;/span&gt;-&lt;span class=&quot;n&quot;&gt;checks&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# health check: k3s-controlplane
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tcp&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;balance&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;roundrobin&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;server&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k8s&lt;/span&gt;-&lt;span class=&quot;n&quot;&gt;control&lt;/span&gt;-&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;3&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;:&lt;span class=&quot;m&quot;&gt;6443&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;check&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;inter&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This is everything that is required until the cluster is expanded
with more control plane nodes. I’ll do that in the next section.&lt;/p&gt;
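&lt;p&gt;Before pointing any agents at the VIP, it’s worth confirming that HAProxy is actually accepting connections on it. A quick TCP connect check, sketched in Python (the VIP and port are the ones from the tables above; this only proves the frontend is listening, not that the apiserver behind it is healthy):&lt;/p&gt;

```python
import socket

def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds,
    i.e. the load balancer frontend is accepting connections."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Usage sketch against the VIP from the table above:
# print(tcp_reachable("10.0.3.100", 6443))
```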

&lt;h2 id=&quot;changing-the-existing-single-node-cluster-to-ha-with-embedded-etcd&quot;&gt;Changing the existing single-node cluster to HA with embedded etcd&lt;/h2&gt;

&lt;p&gt;The &lt;a href=&quot;https://docs.k3s.io/datastore/ha-embedded#existing-single-node-clusters&quot;&gt;documentation&lt;/a&gt; on this is pretty good. The existing control plane node (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k8s-control-1&lt;/code&gt;) will be the node bootstrapping the cluster.&lt;/p&gt;

&lt;p&gt;Inventory of the Ansible playbook:&lt;/p&gt;

&lt;div class=&quot;language-ini highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nn&quot;&gt;[all:vars]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c&quot;&gt;# VIP
&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;k3s_control_plane_vip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;10.0.3.100&lt;/span&gt;
&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;c&quot;&gt;# There will only ever be one host designated to
# bootstrap the cluster
&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;[initial_k3s_server]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;10.0.3.2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;py&quot;&gt;node_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;k8s-control-1&lt;/span&gt;
&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;[remaining_k3s_servers]&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;10.0.3.7&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;py&quot;&gt;node_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;k8s-control-2&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;10.0.3.8&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;py&quot;&gt;node_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;k8s-control-3&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I added a new role: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k3s-initial-server&lt;/code&gt;, with the following tasks:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Check if k3s is already installed&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ansible.builtin.stat&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;/usr/local/bin/k3s&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;register&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;k3s&lt;/span&gt;

&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Bootstrap k3s cluster with etcd&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;become&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;yes&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;shell&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;k3s_install_script_dest&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&quot;&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;environment&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;INSTALL_K3S_CHANNEL&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;k3s_channel&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&quot;&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;INSTALL_K3S_VERSION&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;k3s_version&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&quot;&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# I extracted the existing token from /var/lib/rancher/k3s/server/token&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# and set it as a variable to be passed into the Ansible playbook&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;K3S_TOKEN&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;k3s_token&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&quot;&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;K3S_NODE_NAME&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;node_name&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&quot;&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Set the --tls-san to the VIP&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;INSTALL_K3S_EXEC&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;--disable&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;servicelb&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;--tls-san&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;k3s_control_plane_vip&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;--cluster-init&quot;&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;when&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;(reinstall | bool) or not k3s.stat.exists&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The role is delegated to the group &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;initial_k3s_server&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Setup initial k3s server&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;hosts&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;initial_k3s_server&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;roles&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;role&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;k3s-initial-server&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;tags&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;k3s-baseline&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Running the Ansible playbook with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-e reinstall=true&lt;/code&gt; changes the existing control plane installation from SQLite to etcd:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;NAME            STATUS                     ROLES                       AGE     VERSION
k8s-control-1   Ready                      control-plane,etcd,master   296d    v1.24.17+k3s1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
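
&lt;p&gt;For reference, the invocation looked roughly like this; the playbook and inventory filenames are placeholders for my own, and the token is the one extracted from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/var/lib/rancher/k3s/server/token&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;$ ansible-playbook -i inventory.ini site.yml \
    --tags k3s-baseline \
    -e k3s_token=... -e reinstall=true
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;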

&lt;blockquote&gt;
  &lt;p&gt;At this stage I had only reinstalled &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k8s-control-1&lt;/code&gt;; none of the worker nodes were reinstalled until after I had confirmed that the HAProxy setup was working.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The configuration file &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/etc/rancher/k3s/config.yaml&lt;/code&gt; now has the new VIP in it:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;k8s-control-1:~&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cat&lt;/span&gt; /etc/rancher/k3s/config.yaml
token: X
tls-san:
- 10.0.3.100
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This was not enough to update the existing certificates. Simply downloading the new kubeconfig from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k8s-control-1&lt;/code&gt; and changing the IP address to the VIP produced this:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;kubectl config view
...
clusters:
- cluster:
    ...
    server: https://10.0.3.100:6443
...

&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;kubectl get pods &lt;span class=&quot;nt&quot;&gt;-A&lt;/span&gt;
Unable to connect to the server: tls: failed to verify certificate: x509: certificate is valid &lt;span class=&quot;k&quot;&gt;for &lt;/span&gt;10.0.3.2, 10.43.0.1, 127.0.0.1, ::1, not 10.0.3.100
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
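
&lt;p&gt;A quick way to see which SANs the serving certificate actually contains is to inspect it from a workstation (the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-ext&lt;/code&gt; option assumes OpenSSL 1.1.1 or newer):&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;$ echo | openssl s_client -connect 10.0.3.100:6443 2&amp;gt;/dev/null | openssl x509 -noout -ext subjectAltName
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;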

&lt;p&gt;I had to delete the old certificates and generate new ones. The solution comes from &lt;a href=&quot;https://github.com/k3s-io/k3s/issues/2856#issuecomment-2345974361&quot;&gt;this comment&lt;/a&gt;:&lt;/p&gt;

&lt;p&gt;On &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k8s-control-1&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;k8s-control-1:~&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;k3s kubectl &lt;span class=&quot;nt&quot;&gt;-n&lt;/span&gt; kube-system delete secrets/k3s-serving
k8s-control-1:~&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;sudo mv&lt;/span&gt; /var/lib/rancher/k3s/server/tls/dynamic-cert.json /tmp/dynamic-cert.json
k8s-control-1:~&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;sudo &lt;/span&gt;systemctl restart k3s
&lt;span class=&quot;c&quot;&gt;# After verifying that it works&lt;/span&gt;
k8s-control-1:~&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;rm&lt;/span&gt; /tmp/dynamic-cert.json
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;After downloading the kubeconfig from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k8s-control-1&lt;/code&gt; again and changing the IP address to the VIP, I was able to access the control plane over TLS:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;kubectl get nodes
NAME            STATUS   ROLES                       AGE    VERSION
k8s-control-1   Ready    control-plane,etcd,master   293d   v1.24.17+k3s1
...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
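
&lt;p&gt;The kubeconfig edit itself is a one-liner. A sketch that rewrites whatever address the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;server&lt;/code&gt; field contains to the VIP; the paths and scp access are assumptions about your environment:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;$ scp k8s-control-1:/etc/rancher/k3s/k3s.yaml ~/.kube/config
$ sed -i &apos;s|server: https://.*:6443|server: https://10.0.3.100:6443|&apos; ~/.kube/config
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;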

&lt;p&gt;The new control plane nodes are initialized using the old role &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k3s-servers&lt;/code&gt; with some changes:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Check if k3s is already installed&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ansible.builtin.stat&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;/usr/local/bin/k3s&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;register&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;k3s&lt;/span&gt;

&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Install k3s on server&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;become&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;yes&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;shell&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;k3s_install_script_dest&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&quot;&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;environment&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;INSTALL_K3S_CHANNEL&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;k3s_channel&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&quot;&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;INSTALL_K3S_VERSION&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;k3s_version&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&quot;&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;K3S_TOKEN&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;k3s_token&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&quot;&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# There will only ever be one host in this group&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;K3S_URL&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;https://{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;groups[&apos;initial_k3s_server&apos;]&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;first&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}:6443&quot;&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;K3S_NODE_NAME&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;node_name&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&quot;&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# This needs to include `server` first so it does not get registered as an agent. Set the --tls-san to the VIP&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;INSTALL_K3S_EXEC&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;server&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;--disable&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;servicelb&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;--tls-san&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;k3s_control_plane_vip&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&quot;&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;when&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;(reinstall | bool) or not k3s.stat.exists&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The role is delegated to the group &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;remaining_k3s_servers&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Setup remaining k3s servers&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;hosts&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;remaining_k3s_servers&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;roles&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;role&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;k3s-servers&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;tags&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;k3s-baseline&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This installs K3s and joins the remaining control plane nodes to the cluster:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;kubectl get nodes
NAME            STATUS   ROLES                       AGE     VERSION
k8s-control-1   Ready    control-plane,etcd,master   297d    v1.24.17+k3s1
k8s-control-2   Ready    control-plane,etcd,master   40h     v1.24.17+k3s1
k8s-control-3   Ready    control-plane,etcd,master   40h     v1.24.17+k3s1
...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The new control plane nodes were then added to the HAProxy setup.&lt;/p&gt;

&lt;p&gt;Creating the hosts:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_create_k3s_controlplane_real_servers_for_remaining_servers.png&quot;&gt;
          &lt;img src=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_create_k3s_controlplane_real_servers_for_remaining_servers.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;Updating the backend pool:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_add_hosts_to_existing_k3s_backend_pool.png&quot;&gt;
          &lt;img src=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_add_hosts_to_existing_k3s_backend_pool.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;Statistics page:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_k3s_haproxy_statistics_after_adding_remaining_servers.png&quot;&gt;
          &lt;img src=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_k3s_haproxy_statistics_after_adding_remaining_servers.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;The Config Export:&lt;/p&gt;

&lt;div class=&quot;language-conf highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# Frontend: k3s-controlplane (K3s controlplane hosts)
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;frontend&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k3s&lt;/span&gt;-&lt;span class=&quot;n&quot;&gt;controlplane&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bind&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;3&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;100&lt;/span&gt;:&lt;span class=&quot;m&quot;&gt;6443&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;3&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;100&lt;/span&gt;:&lt;span class=&quot;m&quot;&gt;6443&lt;/span&gt; 
    &lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tcp&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;default_backend&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k3s&lt;/span&gt;-&lt;span class=&quot;n&quot;&gt;controlplane&lt;/span&gt;

    &lt;span class=&quot;c&quot;&gt;# logging options
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;option&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tcplog&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# Backend: k3s-controlplane (Pool serving all K3s controlplane hosts)
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;backend&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k3s&lt;/span&gt;-&lt;span class=&quot;n&quot;&gt;controlplane&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;option&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;log&lt;/span&gt;-&lt;span class=&quot;n&quot;&gt;health&lt;/span&gt;-&lt;span class=&quot;n&quot;&gt;checks&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# health check: k3s-controlplane
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tcp&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;balance&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;roundrobin&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;server&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k8s&lt;/span&gt;-&lt;span class=&quot;n&quot;&gt;control&lt;/span&gt;-&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;3&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt;:&lt;span class=&quot;m&quot;&gt;6443&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;check&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;inter&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt; 
    &lt;span class=&quot;n&quot;&gt;server&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k8s&lt;/span&gt;-&lt;span class=&quot;n&quot;&gt;control&lt;/span&gt;-&lt;span class=&quot;m&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;3&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;7&lt;/span&gt;:&lt;span class=&quot;m&quot;&gt;6443&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;check&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;inter&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt; 
    &lt;span class=&quot;n&quot;&gt;server&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k8s&lt;/span&gt;-&lt;span class=&quot;n&quot;&gt;control&lt;/span&gt;-&lt;span class=&quot;m&quot;&gt;3&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;0&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;3&lt;/span&gt;.&lt;span class=&quot;m&quot;&gt;8&lt;/span&gt;:&lt;span class=&quot;m&quot;&gt;6443&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;check&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;inter&lt;/span&gt; &lt;span class=&quot;m&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
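
&lt;p&gt;A simple way to confirm that the VIP answers regardless of which backend HAProxy picks is the K3s supervisor’s unauthenticated &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/ping&lt;/code&gt; endpoint, which the K3s docs also suggest for external load balancer health checks:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;$ curl -ks https://10.0.3.100:6443/ping
pong
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;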

&lt;p&gt;With the HAProxy setup confirmed working, I went on to reinstall the workers and join them to the cluster using the new VIP. Role &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k3s-agents&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Check if k3s is already installed&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ansible.builtin.stat&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;/usr/local/bin/k3s&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;register&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;k3s&lt;/span&gt;

&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Install k3s on agent&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;become&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;yes&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;shell&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;k3s_install_script_dest&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&quot;&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;environment&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;INSTALL_K3S_CHANNEL&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;k3s_channel&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&quot;&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;INSTALL_K3S_VERSION&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;k3s_version&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&quot;&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;K3S_TOKEN&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;k3s_token&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&quot;&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# The VIP is used instead of k8s-control-1 IP address&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;K3S_URL&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;https://{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;k3s_control_plane_vip&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}:6443&quot;&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;K3S_NODE_NAME&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;node_name&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&quot;&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;when&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;(reinstall | bool) or not k3s.stat.exists&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/etc/systemd/system/k3s-agent.service.env&lt;/code&gt; after reinstalling the workers:&lt;/p&gt;

&lt;div class=&quot;language-conf highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;...
&lt;span class=&quot;n&quot;&gt;K3S_URL&lt;/span&gt;=&lt;span class=&quot;s1&quot;&gt;&apos;https://10.0.3.100:6443&apos;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I was expecting a reinstall with a new &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;K3S_URL&lt;/code&gt; to break the existing setup in &lt;em&gt;some&lt;/em&gt; way, but it didn’t. Everything worked and none of the workloads seemed to have been impacted. This made me suspicious enough that I had to verify the new setup.&lt;/p&gt;

&lt;p&gt;I ended up:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Draining and shutting down &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k8s-control-1&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Performing operations that would trigger traffic in the cluster, such as deleting pods&lt;/li&gt;
  &lt;li&gt;Observing that workers still functioned as normal&lt;/li&gt;
&lt;/ol&gt;
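
&lt;p&gt;The drain in step 1 boils down to standard &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kubectl&lt;/code&gt; commands:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;$ kubectl drain k8s-control-1 --ignore-daemonsets --delete-emptydir-data
# The node now shows SchedulingDisabled and can be shut down
$ kubectl get nodes
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;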

&lt;p&gt;I considered the setup verified after this.&lt;/p&gt;

&lt;h2 id=&quot;enabling-the-haproxy-plugin-prometheus-exporter&quot;&gt;Enabling the HAProxy plugin Prometheus exporter&lt;/h2&gt;

&lt;p&gt;I enabled the Prometheus exporter to gather metrics from HAProxy:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_enabling_prometheus_exporter_haproxy.png&quot;&gt;
          &lt;img src=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_enabling_prometheus_exporter_haproxy.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;The Config Export:&lt;/p&gt;

&lt;div class=&quot;language-conf highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;frontend&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;prometheus_exporter&lt;/span&gt;
   &lt;span class=&quot;n&quot;&gt;bind&lt;/span&gt; *:&lt;span class=&quot;m&quot;&gt;8404&lt;/span&gt;
   &lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;http&lt;/span&gt;
   &lt;span class=&quot;n&quot;&gt;http&lt;/span&gt;-&lt;span class=&quot;n&quot;&gt;request&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;use&lt;/span&gt;-&lt;span class=&quot;n&quot;&gt;service&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;prometheus&lt;/span&gt;-&lt;span class=&quot;n&quot;&gt;exporter&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;if&lt;/span&gt; { &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt; /&lt;span class=&quot;n&quot;&gt;metrics&lt;/span&gt; }
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
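
&lt;p&gt;A quick way to sanity-check the exporter (assuming the frontend is reachable on the VIP used later in the scrape config) is to curl the metrics path:&lt;/p&gt;

```shell
# Should return HAProxy metrics in Prometheus exposition format
curl -s http://10.0.3.100:8404/metrics | head
```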

&lt;p&gt;Updating the Prometheus scrape config:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;job_name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;k3s-controlplane-haproxy&apos;&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;scrape_interval&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;15s&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;static_configs&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;targets&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;10.0.3.100:8404&apos;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I imported the &lt;a href=&quot;https://grafana.com/grafana/dashboards/12693-haproxy-2-full/&quot;&gt;HAProxy 2 Grafana dashboard&lt;/a&gt;, which displays metrics from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k3s-controlplane&lt;/code&gt; services:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_grafana_dashboard.png&quot;&gt;
          &lt;img src=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_grafana_dashboard.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;The amount of traffic passing through HAProxy seemed very low compared to what I was expecting:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_grafana_dashboard_no_traffic.png&quot;&gt;
          &lt;img src=&quot;/img/posts/making-the-k3s-cluster-control-plane-ha/haproxy_grafana_dashboard_no_traffic.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;I could not get this to compute. There is enough traffic between the worker nodes and the control plane that no traffic for such a long period of time means something isn’t right. I verified earlier that traffic was not exclusively being sent to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k8s-control-1&lt;/code&gt; after the reinstall.&lt;/p&gt;

&lt;p&gt;I checked the logs of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k8s-worker-1&lt;/code&gt; and discovered this:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Oct 27 08:45:36 k8s-worker-1 k3s[2406293]: time=&quot;2024-10-27T08:45:36Z&quot; level=info msg=&quot;Removing server from load balancer k3s-agent-load-balancer: 10.0.3.2:6443&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;That seemed strange: the worker nodes aren’t configured with any knowledge of the setup in HAProxy.&lt;/p&gt;

&lt;p&gt;It turns out that K3s agents don’t actually use the fixed registration address (the VIP in this case) for anything beyond initial registration, except as a failover if all control plane nodes are down: &lt;a href=&quot;https://docs.k3s.io/architecture#fixed-registration-address-for-agent-nodes&quot;&gt;https://docs.k3s.io/architecture#fixed-registration-address-for-agent-nodes&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;In the high-availability server configuration, each node can also register with the Kubernetes API by using a fixed registration address, as shown in the diagram below.&lt;/p&gt;

  &lt;p&gt;After registration, the agent nodes establish a connection directly to one of the server nodes.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That explains why there is close to no traffic going through the HAProxy frontend after the initial registration. A couple of GitHub discussions clarify this:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/k3s-io/k3s/discussions/4488#discussioncomment-1719009&quot;&gt;https://github.com/k3s-io/k3s/discussions/4488#discussioncomment-1719009&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/k3s-io/k3s/discussions/10991#discussioncomment-10848102&quot;&gt;https://github.com/k3s-io/k3s/discussions/10991#discussioncomment-10848102&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
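
&lt;p&gt;The behavior can be modeled as a small sketch: the agent prefers any healthy server node it learned about after registration, and only falls back to the fixed registration address when every server is unreachable. This is a toy illustration, not actual K3s code:&lt;/p&gt;

```python
# Toy model of the k3s agent-side load balancer behavior described above.
# Not K3s code: the server list and health checks are simulated.

def pick_server(servers, reachable, fixed_registration_address):
    """Return the address the agent would connect to."""
    # Prefer any healthy server learned after registration.
    for server in servers:
        if reachable.get(server, False):
            return server
    # Only when every server is down does the agent fall back to
    # the fixed registration address (the VIP).
    return fixed_registration_address

servers = ["10.0.3.2:6443", "10.0.3.7:6443", "10.0.3.8:6443"]
vip = "10.0.3.100:6443"

# Normal operation: traffic goes directly to a server node.
print(pick_server(servers, {s: True for s in servers}, vip))   # 10.0.3.2:6443
# All servers down: the agent falls back to the VIP.
print(pick_server(servers, {s: False for s in servers}, vip))  # 10.0.3.100:6443
```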

&lt;p&gt;And sure enough, a restart of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k3s-agent.service&lt;/code&gt; produced the following:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Oct 27 18:57:32 k8s-worker-1 k3s[327800]: time=&quot;2024-10-27T18:57:32Z&quot; level=info msg=&quot;Starting k3s agent v1.24.17+k3s1 (026bb0ec)&quot;
Oct 27 18:57:32 k8s-worker-1 k3s[327800]: time=&quot;2024-10-27T18:57:32Z&quot; level=info msg=&quot;Adding server to load balancer k3s-agent-load-balancer: 10.0.3.100:6443&quot;
Oct 27 18:57:32 k8s-worker-1 k3s[327800]: time=&quot;2024-10-27T18:57:32Z&quot; level=info msg=&quot;Adding server to load balancer k3s-agent-load-balancer: 10.0.3.7:6443&quot;
Oct 27 18:57:32 k8s-worker-1 k3s[327800]: time=&quot;2024-10-27T18:57:32Z&quot; level=info msg=&quot;Adding server to load balancer k3s-agent-load-balancer: 10.0.3.8:6443&quot;
Oct 27 18:57:32 k8s-worker-1 k3s[327800]: time=&quot;2024-10-27T18:57:32Z&quot; level=info msg=&quot;Adding server to load balancer k3s-agent-load-balancer: 10.0.3.2:6443&quot;
Oct 27 18:57:32 k8s-worker-1 k3s[327800]: time=&quot;2024-10-27T18:57:32Z&quot; level=info msg=&quot;Removing server from load balancer k3s-agent-load-balancer: 10.0.3.100:6443&quot;
Oct 27 18:57:32 k8s-worker-1 k3s[327800]: time=&quot;2024-10-27T18:57:32Z&quot; level=info msg=&quot;Running load balancer k3s-agent-load-balancer 127.0.0.1:6444 -&amp;gt; [10.0.3.2:6443 10.0.3.7:6443 10.0.3.8:6443] [default: 10.0.3.100:6443]&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With that cleared up, I merged the changes to the playbook and called it a day.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;One downside of the current setup is that I’m placing more responsibility on OPNsense by running HAProxy there. I don’t have a secondary instance, so there is no failover if OPNsense goes down. In most cases I opt for Terraform and Ansible; here I’m relying on UI configuration steps, with offsite backups in the event of failure.&lt;/p&gt;

&lt;p&gt;That being said, the certificates in the cluster are already configured with the VIP. If I ever change my mind and want to move from OPNsense to something else, I should be fine as long as whatever load balancer I end up using serves the frontend on the same IP address.&lt;/p&gt;

&lt;p&gt;It’s great to finally have a proper HA setup for the control plane.&lt;/p&gt;
</description>
        <pubDate>Tue, 29 Oct 2024 00:00:00 +0000</pubDate>
        <link>https://fredrickb.com/2024/10/29/making-the-k3s-cluster-ha/</link>
        <guid isPermaLink="true">https://fredrickb.com/2024/10/29/making-the-k3s-cluster-ha/</guid>
        
        <category>opnsense</category>
        
        <category>k3s</category>
        
        <category>haproxy</category>
        
        <category>ha</category>
        
        
      </item>
    
      <item>
        <title>Using Terraform, Ansible and GitHub Actions to automate provisioning and configuration of workloads in the homelab</title>
        <description>&lt;p&gt;Around six months ago I converted all of the internal cluster workloads
in k3s from manually installed Helm releases to &lt;a href=&quot;https://argo-cd.readthedocs.io/en/stable/&quot;&gt;ArgoCD&lt;/a&gt; applications.
This has made upgrading everything within the cluster a breeze. I’ve thought
about doing the same for the VMs used as cluster nodes since then. Now
seemed like a good time.&lt;/p&gt;

&lt;p&gt;I already use Terraform for provisioning VMs in Proxmox, with Ansible for installing
and configuring the cluster. All of this is run from a local machine. The task is
therefore to automate these steps from another machine using pipelines.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;In a perfect scenario I would have some tool with access to the Proxmox API
which could spawn clusters on-demand and do all of the lifecycle management. There are a
couple of options for other hypervisors, but I haven’t found anything for Proxmox at the
time of this writing.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Enter GitHub Actions and &lt;a href=&quot;https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/about-self-hosted-runners&quot;&gt;self-hosted runners&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;scope&quot;&gt;Scope&lt;/h2&gt;

&lt;p&gt;Since I’m using &lt;a href=&quot;https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/adding-self-hosted-runners#adding-a-self-hosted-runner-to-a-repository&quot;&gt;repository runners&lt;/a&gt;, spinning up new runners has to be an easy
process that scales without the need for manual steps.&lt;/p&gt;

&lt;p&gt;The criteria I have:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Provisioning and configuring runners should be done using IaC and workflows&lt;/li&gt;
  &lt;li&gt;The initial runner is the only runner which should be provisioned and configured manually&lt;/li&gt;
  &lt;li&gt;A runner connected to a repository should be able to do the following in a workflow:
    &lt;ul&gt;
      &lt;li&gt;Use Terraform to provision new VMs in Proxmox&lt;/li&gt;
      &lt;li&gt;Use Ansible to configure VMs&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;architecture&quot;&gt;Architecture&lt;/h2&gt;

&lt;h3 id=&quot;high-level&quot;&gt;High-level&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-mermaid&quot;&gt;graph TB
    subgraph github_actions_runner_spawner_vm[VM: GitHub Actions Runner Spawner]
        github_actions_runner_spawner[Runner: GitHub Actions Runner Spawner]
        repo_1_n_runner[Container: Repository 1..N Runner]
    end

    subgraph github[GitHub.com]
        github_actions_self_hosted_runners[GitHub Actions Self Hosted Runners Repository]
        repo_1_n[Repository 1..N]
    end

    github_actions_runner_spawner--&amp;gt;|Connects to repository|github_actions_self_hosted_runners
    github_actions_runner_spawner--&amp;gt;|Spawns|repo_1_n_runner
    repo_1_n_runner--&amp;gt;|Connects to repository|repo_1_n

    tf_runner_repo[GitHub Actions Runner Spawner Terraform Repository]--&amp;gt;|Manually provisions|github_actions_runner_spawner_vm
    ansible_runner_repo[GitHub Actions Runner Spawner Ansible Repository]--&amp;gt;|Manually configures|github_actions_runner_spawner_vm
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In simple terms: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GitHub Actions Runner Spawner&lt;/code&gt; is just a runner installed directly on a VM
to spawn other runners as containers on the same host. The runners in containers are connected to other
repositories holding either Terraform config or Ansible playbooks.&lt;/p&gt;
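
&lt;p&gt;The post doesn’t show the internals of the runner image. Sketched here under the assumption that it uses GitHub’s documented registration-token API together with the stock actions runner, a typical container entrypoint looks roughly like this (the exact script in the image may differ):&lt;/p&gt;

```shell
#!/bin/sh
# Hypothetical entrypoint sketch: exchange the PAT for a short-lived
# registration token, then register the runner and start it.
# URL, NAME and PAT come from the docker-compose environment.
set -eu

REPO_PATH=${URL#https://github.com/}

# Documented GitHub REST API endpoint:
# POST /repos/{owner}/{repo}/actions/runners/registration-token
TOKEN=$(curl -s -X POST \
  -H "Authorization: Bearer ${PAT}" \
  -H "Accept: application/vnd.github+json" \
  "https://api.github.com/repos/${REPO_PATH}/actions/runners/registration-token" \
  | jq -r .token)

./config.sh --url "${URL}" --token "${TOKEN}" --name "${NAME}" --unattended
exec ./run.sh
```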

&lt;h3 id=&quot;network&quot;&gt;Network&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-mermaid&quot;&gt;graph TB
    gha_vlan[VLAN - GitHub Actions Runners - 10.0.7.0/24]
    gha_vlan--&amp;gt;|Port 443|proxmox_vlan[VLAN - Proxmox - 10.0.2.0/24]
    gha_vlan--&amp;gt;|Port 22|k3s_vlan[VLAN - k3s - 10.0.3.0/24]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The firewall rules give access to other VLANs, specifically to the Proxmox
API and the SSH ports of the k3s cluster nodes.&lt;/p&gt;

&lt;h3 id=&quot;k3s-provisioning&quot;&gt;k3s provisioning&lt;/h3&gt;

&lt;pre&gt;&lt;code class=&quot;language-mermaid&quot;&gt;graph TB
    subgraph github_actions_runner_spawner_vm[VM: GitHub Actions Runner Spawner]
        k3s_tf_repo_runner[Container: k3s Terraform Repository Runner]
        k3s_ansible_repo_runner[Container: k3s Ansible Repository Runner]
    end

    subgraph github[GitHub.com]
        tf_repo[k3s Terraform Repository]
        ansible_repo[k3s Ansible Repository]
    end

    subgraph proxmox_vlan[Proxmox VLAN]
        pve_hosts[Proxmox hosts]
    end

    subgraph k3s_vlan[k3s VLAN]
        k3s_hosts[k3s hosts]
    end

    tf_repo--&amp;gt;|Delegate workflow|k3s_tf_repo_runner
    ansible_repo--&amp;gt;|Delegate workflow|k3s_ansible_repo_runner
    k3s_tf_repo_runner--&amp;gt;|Access|pve_hosts
    k3s_ansible_repo_runner--&amp;gt;|Configure|k3s_hosts
    pve_hosts--&amp;gt;|Provision|k3s_hosts
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Both k3s repositories have a runner connected to them.&lt;/p&gt;

&lt;h2 id=&quot;implementation&quot;&gt;Implementation&lt;/h2&gt;

&lt;p&gt;The repository &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GitHub Actions Self Hosted Runners Repository&lt;/code&gt; does the following:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Builds and uploads container images for runners to GitHub Packages&lt;/li&gt;
  &lt;li&gt;Spins up new runners as containers on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GitHub Actions Runner Spawner&lt;/code&gt; using docker compose&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The workflow for spinning up new runners:&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.github/workflows/spawn_runners.yaml&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;Spawn&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;runners&quot;&lt;/span&gt;

&lt;span class=&quot;na&quot;&gt;on&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;push&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;branches&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;paths&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;.github/workflows/spawn_runners.yaml&apos;&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;docker-compose.yaml&apos;&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;]&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;workflow_dispatch&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;

&lt;span class=&quot;na&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;REGISTRY&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;ghcr.io&lt;/span&gt;

&lt;span class=&quot;na&quot;&gt;jobs&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;spawn_runners&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Delegate the job to the self hosted runner&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# running on VM `GitHub Actions Runner Spawner`&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;runs-on&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;self-hosted&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Spawn runners&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;steps&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;uses&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class=&quot;c1&quot;&gt;# Login to the container image registry&lt;/span&gt;
      &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Login to the image registry&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;uses&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;docker/login-action@65b78e6e13532edd9afa3aa52ac7964289d1a9c1&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;with&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
          &lt;span class=&quot;na&quot;&gt;registry&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;${{ env.REGISTRY }}&lt;/span&gt;
          &lt;span class=&quot;na&quot;&gt;username&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;${{ github.actor }}&lt;/span&gt;
          &lt;span class=&quot;na&quot;&gt;password&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;${{ secrets.GITHUB_TOKEN }}&lt;/span&gt;
      &lt;span class=&quot;c1&quot;&gt;# Spawn runners as containers&lt;/span&gt;
      &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Spawn runners in containers&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;env&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
          &lt;span class=&quot;c1&quot;&gt;# Fine-Grained Personal Access Token with&lt;/span&gt;
          &lt;span class=&quot;c1&quot;&gt;# access to the repositories&lt;/span&gt;
          &lt;span class=&quot;na&quot;&gt;PAT&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;${{ secrets.SELF_HOSTED_RUNNERS_HOMELAB_PAT }}&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;docker compose up -d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;docker-compose.yaml&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;services&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;# Runner connected to the repository with the&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;# k3s terraform config&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;k3s_terraform_runner&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;container_name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;k3s-terraform-runner&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;image&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;ghcr.io/fredrickb/github-actions-self-hosted-runners:2.0.0&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;restart&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;always&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;environment&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;URL=&amp;lt;URL of repository holding k3s terraform config&amp;gt;&lt;/span&gt;
      &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;NAME=k3s-terraform-runner&lt;/span&gt;
      &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;PAT=${PAT}&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;# Runner connected to the repository with the&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;# k3s ansible playbook&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;k3s_ansible_runner&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;container_name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;k3s-ansible-runner&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;image&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;ghcr.io/fredrickb/github-actions-self-hosted-runners:2.0.0&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;restart&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;always&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;environment&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;URL=&amp;lt;URL of repository holding k3s ansible playbook&amp;gt;&lt;/span&gt;
      &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;NAME=k3s-ansible-runner&lt;/span&gt;
      &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;PAT=${PAT}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Running the workflow:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/github_actions_self_hosted_runner_workflow_runner_spawner.png&quot;&gt;
          &lt;img src=&quot;/img/posts/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/github_actions_self_hosted_runner_workflow_runner_spawner.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;The runners then register to the repositories:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/github_actions_self_hosted_ansible_runner_registered.png&quot;&gt;
          &lt;img src=&quot;/img/posts/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/github_actions_self_hosted_ansible_runner_registered.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/github_actions_self_hosted_terraform_runner_registered.png&quot;&gt;
          &lt;img src=&quot;/img/posts/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/github_actions_self_hosted_terraform_runner_registered.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;Using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;journald&lt;/code&gt; &lt;a href=&quot;https://docs.docker.com/engine/logging/drivers/journald/&quot;&gt;logging driver&lt;/a&gt; with Promtail, Loki and Grafana provides insight
into the logs of the runners:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/github_actions_self_hosted_runner_logs.png&quot;&gt;
          &lt;img src=&quot;/img/posts/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/github_actions_self_hosted_runner_logs.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;
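
&lt;p&gt;Wiring a container up to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;journald&lt;/code&gt; is a per-service setting in the compose file; a minimal sketch (the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tag&lt;/code&gt; template shown is one plausible choice, not necessarily what I use):&lt;/p&gt;

```yaml
services:
  k3s_ansible_runner:
    # ... image, environment, etc. as above ...
    logging:
      driver: journald
      options:
        tag: "{{.Name}}"   # label journal entries with the container name
```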

&lt;p&gt;Adding more runners boils down to:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Including more repositories in the Fine-Grained Personal Access Token&lt;/li&gt;
  &lt;li&gt;Extending &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;docker-compose.yaml&lt;/code&gt; with more containers&lt;/li&gt;
  &lt;li&gt;Running the workflow&lt;/li&gt;
&lt;/ol&gt;
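
&lt;p&gt;Step 2 is just a copy of an existing service with a new name and repository URL. For a hypothetical repository, the addition to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;docker-compose.yaml&lt;/code&gt; might look like:&lt;/p&gt;

```yaml
  # Hypothetical additional runner, following the same pattern
  my_new_repo_runner:
    container_name: my-new-repo-runner
    image: ghcr.io/fredrickb/github-actions-self-hosted-runners:2.0.0
    restart: always
    environment:
      - URL=&lt;URL of the new repository&gt;
      - NAME=my-new-repo-runner
      - PAT=${PAT}
```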

&lt;p&gt;Everything is defined in code and automated.&lt;/p&gt;

&lt;h2 id=&quot;using-runners-for-deploying-k3s-cluster&quot;&gt;Using runners for deploying k3s cluster&lt;/h2&gt;

&lt;p&gt;Setting up k3s can now be done using workflows.&lt;/p&gt;

&lt;h3 id=&quot;provision-k3s-nodes-using-terraform-and-workflow&quot;&gt;Provision k3s nodes using Terraform and workflow&lt;/h3&gt;

&lt;p&gt;Snippet from workflow:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Terraform workflow&lt;/span&gt;

&lt;span class=&quot;na&quot;&gt;on&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;pull_request&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;branches&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;]&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;push&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;branches&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;]&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;workflow_dispatch&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;

&lt;span class=&quot;na&quot;&gt;jobs&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;plan&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Perform Terraform plan and (only branch `main`) apply&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;runs-on&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;self-hosted&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;steps&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Workflow assigned to runner:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/github_actions_k3s_terraform_workflow_setup_job.png&quot;&gt;
          &lt;img src=&quot;/img/posts/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/github_actions_k3s_terraform_workflow_setup_job.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;Terraform plan being run:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/github_actions_k3s_terraform_workflow_step_output.png&quot;&gt;
          &lt;img src=&quot;/img/posts/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/github_actions_k3s_terraform_workflow_step_output.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;Workflow summary:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/github_actions_k3s_terraform_workflow_summary.png&quot;&gt;
          &lt;img src=&quot;/img/posts/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/github_actions_k3s_terraform_workflow_summary.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;h3 id=&quot;install-and-configure-k3s-cluster-using-ansible-and-workflow&quot;&gt;Install and configure k3s cluster using Ansible and workflow&lt;/h3&gt;

&lt;p&gt;Snippet from workflow:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Run Ansible playbook&lt;/span&gt;

&lt;span class=&quot;na&quot;&gt;on&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;push&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;branches&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;]&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;workflow_dispatch&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;

&lt;span class=&quot;na&quot;&gt;jobs&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;run_playbook&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Run playbook&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;runs-on&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;pi&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;self-hosted&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ansible&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;steps&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Workflow assigned to runner:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/github_actions_k3s_ansible_workflow_setup_job_step.png&quot;&gt;
          &lt;img src=&quot;/img/posts/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/github_actions_k3s_ansible_workflow_setup_job_step.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;Ansible playbook running:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/github_actions_k3s_ansible_workflow_step_output.png&quot;&gt;
          &lt;img src=&quot;/img/posts/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/github_actions_k3s_ansible_workflow_step_output.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;Workflow summary:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/github_actions_k3s_ansible_workflow_summary.png&quot;&gt;
          &lt;img src=&quot;/img/posts/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/github_actions_k3s_ansible_workflow_summary.png&quot; alt=&quot;&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;h2 id=&quot;summary&quot;&gt;Summary&lt;/h2&gt;

&lt;p&gt;I’m at the point where I can almost bootstrap the entire cluster using IaC and automation:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th style=&quot;text-align: left&quot;&gt;Order&lt;/th&gt;
      &lt;th style=&quot;text-align: left&quot;&gt;Task&lt;/th&gt;
      &lt;th style=&quot;text-align: left&quot;&gt;Method&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;1&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Provision VMs using Terraform&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Automated using workflow&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;2&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Configure VMs using Ansible&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Automated using workflow&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;3&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Install ArgoCD&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;em&gt;Manual step&lt;/em&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;4&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Configure internal cluster workloads&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Automated using ArgoCD&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;5&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;Restore PVCs from Longhorn offsite backups&lt;/td&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;em&gt;Manual step&lt;/em&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
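
&lt;p&gt;The remaining manual ArgoCD step could in principle be scripted as well. A minimal sketch using the upstream install manifest (the namespace and manifest URL are the upstream defaults, not necessarily what I run):&lt;/p&gt;

```shell
# Install ArgoCD into its default namespace using the upstream manifest
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

# Wait for the API server to come up before pointing it at the GitOps repos
kubectl -n argocd rollout status deployment argocd-server
```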

&lt;p&gt;Overall I’m pleased with the results. Changing the cluster configuration can be
done using pull requests and workflows. The setup also scales for future workloads.&lt;/p&gt;

</description>
        <pubDate>Mon, 14 Oct 2024 00:00:00 +0000</pubDate>
        <link>https://fredrickb.com/2024/10/14/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/</link>
        <guid isPermaLink="true">https://fredrickb.com/2024/10/14/using-terraform-ansible-and-github-actions-to-automate-provisioning-and-configuration-of-workloads-in-the-homelab/</guid>
        
        <category>proxmox</category>
        
        <category>homelab</category>
        
        <category>terraform</category>
        
        <category>github-actions</category>
        
        <category>ansible</category>
        
        <category>iac</category>
        
        <category>configuration-management</category>
        
        <category>netmaker</category>
        
        
      </item>
    
      <item>
        <title>Replacing all internal DNS records in the homelab</title>
        <description>&lt;p&gt;I’ve used &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.local&lt;/code&gt; as the TLD internally in the homelab since the
beginning. This is not a good idea &lt;a href=&quot;https://en.wikipedia.org/wiki/.local&quot;&gt;for several reasons&lt;/a&gt;, and I decided
it was about time to remove it and start using my own domain for internal
services.&lt;/p&gt;

&lt;h1 id=&quot;updating-the-step-ca-config&quot;&gt;Updating the step-ca config&lt;/h1&gt;

&lt;p&gt;I wanted to expose the step-ca service outside of the k3s cluster
using a domain name different from the current &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;step-ca.local&lt;/code&gt;.
This requires changing the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dnsNames&lt;/code&gt; property in the &lt;a href=&quot;https://smallstep.com/docs/step-ca/configuration/#basic-configuration-options&quot;&gt;configuration file&lt;/a&gt;
(this is the same as passing a new value to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--dns&lt;/code&gt; flag when &lt;a href=&quot;https://smallstep.com/docs/step-cli/reference/ca/init/#options&quot;&gt;bootstrapping the creation of a CA&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;The resulting section &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dnsNames&lt;/code&gt; in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ca.json&lt;/code&gt; then becomes:&lt;/p&gt;

&lt;div class=&quot;language-json highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;nl&quot;&gt;&quot;dnsNames&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;step-certificates.step-ca.svc.cluster.local&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;step-ca.homelab.fredrickb.com&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;localhost&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
  &lt;/span&gt;&lt;span class=&quot;err&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;blockquote&gt;
  &lt;p&gt;I had &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;step-ca.local&lt;/code&gt; as a temporary entry while migrating
the services over to the new domain. The config above is the final result after
completing the migration process.&lt;/p&gt;
&lt;/blockquote&gt;
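
&lt;p&gt;After editing ca.json the server has to be restarted to pick up the new dnsNames. A sketch of how that looks in Kubernetes, assuming the namespace and workload names from my setup (yours may differ):&lt;/p&gt;

```shell
# Restart step-certificates so it re-reads ca.json
kubectl -n step-ca rollout restart statefulset step-certificates
kubectl -n step-ca rollout status statefulset step-certificates
```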

&lt;h1 id=&quot;updating-the-opnsense-unbound-dns-config&quot;&gt;Updating the OPNsense Unbound DNS config&lt;/h1&gt;

&lt;p&gt;Adding the new domains to the OPNsense&lt;i class=&quot;fa fa-registered&quot;&gt;&lt;/i&gt;
Unbound DNS host overrides (kudos to &lt;a href=&quot;https://homenetworkguy.com/how-to/create-unbound-dns-override-aliases-in-opnsense/&quot;&gt;Home Network Guy&lt;/a&gt; for the tutorial):&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/replacing-all-internal-dns-records-in-the-homelab/opnsense_unbound_dns_host_overrides.png&quot; title=&quot;Screenshot of new OPNsense DNS overrides&quot;&gt;
          &lt;img src=&quot;/img/posts/replacing-all-internal-dns-records-in-the-homelab/opnsense_unbound_dns_host_overrides.png&quot; alt=&quot;Screenshot of new OPNsense DNS overrides&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;h1 id=&quot;updating-the-proxmox-config&quot;&gt;Updating the Proxmox config&lt;/h1&gt;

&lt;p&gt;In the Ansible playbook I use for my Proxmox hosts I changed the directory URL
used for the ACME account to the following:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# roles/add-acme-certificate/vars/main.yaml&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;step_ca_url&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;https://step-ca.homelab.fredrickb.com/acme/acme/directory&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And then the domain used for the hosts themselves:&lt;/p&gt;

&lt;div class=&quot;language-ini highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;; inventory.ini
&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;10.0.2.2&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;py&quot;&gt;domain&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;pve2.homelab.fredrickb.com name=pve2&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;10.0.2.3&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;py&quot;&gt;domain&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;pve3.homelab.fredrickb.com name=pve3&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Ordering a certificate for the Proxmox hosts now uses the new domain
name of the step-ca service:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/replacing-all-internal-dns-records-in-the-homelab/pve3_acme_certificate_provisioning.png&quot; title=&quot;Screenshot of TLS certificate being provisioned to Proxmox host&quot;&gt;
          &lt;img src=&quot;/img/posts/replacing-all-internal-dns-records-in-the-homelab/pve3_acme_certificate_provisioning.png&quot; alt=&quot;Screenshot of TLS certificate being provisioned to Proxmox host&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;
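
&lt;p&gt;Outside of Ansible, the same change can be done by hand with pvenode. A sketch assuming an ACME account named homelab and a placeholder mail address:&lt;/p&gt;

```shell
# Drop the old account and register against the new step-ca directory URL
pvenode acme account deactivate homelab
pvenode acme account register homelab mail@example.com --directory https://step-ca.homelab.fredrickb.com/acme/acme/directory

# Force a new order so the certificate is reissued via the new domain
pvenode acme cert order --force
```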

&lt;p&gt;Accessing one of the Proxmox hosts using a new domain with a valid cert:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/replacing-all-internal-dns-records-in-the-homelab/pve2_dashboard_new_dns.png&quot; title=&quot;Screenshot of one of the Proxmox hosts after updating DNS records and TLS certificate provisioning&quot;&gt;
          &lt;img src=&quot;/img/posts/replacing-all-internal-dns-records-in-the-homelab/pve2_dashboard_new_dns.png&quot; alt=&quot;Screenshot of one of the Proxmox hosts after updating DNS records and TLS certificate provisioning&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;h1 id=&quot;updating-the-ingress-config&quot;&gt;Updating the Ingress config&lt;/h1&gt;

&lt;p&gt;In Kubernetes it’s just a matter of updating the domain in the Ingress and letting
the existing &lt;a href=&quot;https://github.com/smallstep/step-issuer&quot;&gt;step-issuer&lt;/a&gt; provision new certificates.&lt;/p&gt;

&lt;p&gt;Below is a snippet from the Grafana ingress:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;apiVersion&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;networking.k8s.io/v1&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Ingress&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;metadata&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;annotations&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;cert-manager.io/issuer&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;step-issuer&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;cert-manager.io/issuer-group&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;certmanager.step.sm&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;cert-manager.io/issuer-kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;StepClusterIssuer&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;labels&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;...&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;grafana&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;namespace&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;grafana&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;spec&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;...&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;rules&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;host&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;grafana.homelab.fredrickb.com&lt;/span&gt;
      &lt;span class=&quot;s&quot;&gt;...&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;tls&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;hosts&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;grafana.homelab.fredrickb.com&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;secretName&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;grafana-tls-cert&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Accessing Grafana using the new domain with a valid cert:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/replacing-all-internal-dns-records-in-the-homelab/grafana_new_dns.png&quot; title=&quot;Screenshot of Grafana after updating DNS records and TLS certificate provisioning&quot;&gt;
          &lt;img src=&quot;/img/posts/replacing-all-internal-dns-records-in-the-homelab/grafana_new_dns.png&quot; alt=&quot;Screenshot of Grafana after updating DNS records and TLS certificate provisioning&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;
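
&lt;p&gt;To double-check that a service really serves a certificate chained to the internal root CA, openssl can be used from any machine (the host name and root_ca.crt path assume my setup):&lt;/p&gt;

```shell
# Fetch the served certificate and verify it against the internal root CA,
# then print who it was issued to and by
echo | openssl s_client -connect grafana.homelab.fredrickb.com:443 \
  -servername grafana.homelab.fredrickb.com -CAfile root_ca.crt 2>/dev/null \
  | openssl x509 -noout -subject -issuer
```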

&lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;

&lt;p&gt;Every service in the homelab now uses &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;*.homelab.fredrickb.com&lt;/code&gt; instead of
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;*.local&lt;/code&gt;.&lt;/p&gt;

</description>
        <pubDate>Mon, 15 Jul 2024 00:00:00 +0000</pubDate>
        <link>https://fredrickb.com/2024/07/15/replacing-all-internal-dns-records-in-the-homelab/</link>
        <guid isPermaLink="true">https://fredrickb.com/2024/07/15/replacing-all-internal-dns-records-in-the-homelab/</guid>
        
        <category>proxmox</category>
        
        <category>step-ca</category>
        
        <category>acme</category>
        
        <category>opnsense</category>
        
        <category>dns</category>
        
        <category>homelab</category>
        
        <category>k3s</category>
        
        <category>cert-manager</category>
        
        
      </item>
    
      <item>
        <title>Using step-ca for certificate provisioning in the homelab</title>
        <description>&lt;p&gt;So what is &lt;a href=&quot;https://github.com/smallstep/certificates&quot;&gt;step-ca&lt;/a&gt;? It is an open-source &lt;a href=&quot;https://en.wikipedia.org/wiki/Certificate_authority&quot;&gt;Certificate Authority (CA)&lt;/a&gt; that
can be self-hosted. Step-ca supports provisioning TLS certificates using the &lt;a href=&quot;https://en.wikipedia.org/wiki/Automatic_Certificate_Management_Environment&quot;&gt;ACME&lt;/a&gt; protocol.
It pairs well with &lt;a href=&quot;https://cert-manager.io/&quot;&gt;cert-manager&lt;/a&gt; using &lt;a href=&quot;https://github.com/smallstep/step-issuer&quot;&gt;step-issuer&lt;/a&gt;, meaning certificate provisioning
for services in Kubernetes is fully automated.&lt;/p&gt;

&lt;p&gt;Since any client supporting ACME is covered, Proxmox hosts can request certificates
from step-ca if you expose it outside of the Kubernetes cluster.&lt;/p&gt;

&lt;p&gt;In this post I’ll briefly go through some of my own setup, configuration and experience
of using step-ca with Kubernetes and Proxmox.&lt;/p&gt;

&lt;h1 id=&quot;installation&quot;&gt;Installation&lt;/h1&gt;

&lt;blockquote&gt;
  &lt;p&gt;Past this point, step-ca and step-certificates are used interchangeably; step-certificates is the name
of the server component.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Step-certificates and step-issuer can be installed into a Kubernetes cluster using their &lt;a href=&quot;https://smallstep.github.io/helm-charts/&quot;&gt;helm charts&lt;/a&gt;.
I won’t cover the process of installing and configuring step-certificates and step-issuer in full;
the documentation on GitHub already covers this. You use the &lt;a href=&quot;https://github.com/smallstep/cli&quot;&gt;step cli&lt;/a&gt; and the
&lt;a href=&quot;https://smallstep.com/docs/step-cli/reference/ca/#description&quot;&gt;step ca command&lt;/a&gt; to generate the initial configuration for step-certificates.&lt;/p&gt;
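
&lt;p&gt;As a rough idea of what the bootstrap looks like, a hedged step ca init invocation; every value below is a placeholder, not my actual configuration:&lt;/p&gt;

```shell
# Generate a root CA, an intermediate CA and a starting ca.json,
# with an ACME provisioner enabled from the start
step ca init \
  --name "Homelab" \
  --dns step-certificates.step-ca.svc.cluster.local \
  --address :9000 \
  --provisioner admin \
  --acme
```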

&lt;blockquote&gt;
  &lt;p&gt;This is probably the most time-consuming step of the process. Making the
installation repeatable and converting it to IaC that can be deployed using something
like ArgoCD is going to take some time. Read the docs thoroughly.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Past this point I’ll assume you have a functioning step-certificates installation in your
Kubernetes cluster.&lt;/p&gt;
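
&lt;p&gt;A quick way to confirm that is the health endpoint of the CA (the URL and root certificate path are assumed):&lt;/p&gt;

```shell
# Should report the CA as healthy when it is reachable and serving
step ca health --ca-url https://step-certificates.step-ca.svc.cluster.local --root root_ca.crt
```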

&lt;h1 id=&quot;certificate-provisioning&quot;&gt;Certificate provisioning&lt;/h1&gt;

&lt;p&gt;Download the Root CA certificate from step-certificates after the installation:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl get secrets &lt;span class=&quot;nt&quot;&gt;-n&lt;/span&gt; &amp;lt;namespace&amp;gt; &amp;lt;step-certificates-certs-secret&amp;gt; &lt;span class=&quot;nt&quot;&gt;-o&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;jsonpath&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;{.data.root_ca\.crt}&apos;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; base64 &lt;span class=&quot;nt&quot;&gt;-d&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; root_ca.crt
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Keep this handy; we’ll use it in the next sections.&lt;/p&gt;

&lt;h2 id=&quot;kubernetes&quot;&gt;Kubernetes&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;p&gt;This requires the chart &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;smallstep/step-certificates&lt;/code&gt; to be installed.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Once everything is installed and configured correctly, you can define a
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;StepClusterIssuer&lt;/code&gt;. In this case I have one predefined named &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;step-issuer&lt;/code&gt;.
This is an excerpt of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;values.yaml&lt;/code&gt; I use with the &lt;a href=&quot;https://github.com/smallstep/helm-charts/tree/master/step-issuer&quot;&gt;step-issuer Helm chart&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;stepClusterIssuer&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;create&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;caUrl&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;https://&amp;lt;step-certificates-service-name&amp;gt;.&amp;lt;namespace&amp;gt;.svc.cluster.local&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;# Base64 encoded root_ca.crt downloaded previously&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;caBundle&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;lt;Base64 encoded root_ca.crt&amp;gt;&lt;/span&gt;
  
  &lt;span class=&quot;na&quot;&gt;provisioner&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# To get these values, see: https://github.com/smallstep/step-issuer?tab=readme-ov-file#3-configure-step-issuer&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;kid&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;lt;kid&amp;gt;&quot;&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&amp;lt;mail&amp;gt;&quot;&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;passwordRef&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;c1&quot;&gt;# Defined in the step-certificates helm installation&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;namespace&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;lt;namespace&amp;gt;&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;lt;step-certificates-provisioner-password-secret-name&amp;gt;&lt;/span&gt;
      &lt;span class=&quot;na&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;lt;password&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;blockquote&gt;
  &lt;p&gt;There is an official sample for a &lt;a href=&quot;https://github.com/smallstep/step-issuer/blob/master/config/samples/stepclusterissuer.yaml&quot;&gt;StepClusterIssuer here&lt;/a&gt; showing more realistic values.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;With a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;StepClusterIssuer&lt;/code&gt; installed, you can now update the Ingress resources that are going
to be exposed over HTTPS.&lt;/p&gt;

&lt;p&gt;Below is an excerpt of the Ingress definition in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;values.yaml&lt;/code&gt; I use with the &lt;a href=&quot;https://github.com/grafana/helm-charts&quot;&gt;Grafana helm chart&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;I don’t recommend using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.local&lt;/code&gt; as a TLD, even internally. This is just what I used early
on and has not been changed since.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;ingress&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;enabled&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;hosts&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;grafana.local&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;annotations&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Sets the issuer of the TLS certificate to be the step-issuer&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;cert-manager.io/issuer-group&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;certmanager.step.sm&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;cert-manager.io/issuer-kind&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;StepClusterIssuer&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;cert-manager.io/issuer&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;step-issuer&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;# Enable TLS for the given domain and provide the name of&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;# the secret which will contain the certificate&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;tls&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;hosts&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;grafana.local&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;secretName&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;grafana-tls-cert&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Fetching the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;certificates.cert-manager.io&lt;/code&gt; resource for Grafana now shows the certificate
issued by step-certificates:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl get certificates &lt;span class=&quot;nt&quot;&gt;-n&lt;/span&gt; grafana &lt;span class=&quot;nt&quot;&gt;-o&lt;/span&gt; wide
NAME               READY   SECRET             ISSUER        STATUS                                          AGE
grafana-tls-cert   True    grafana-tls-cert   step-issuer   Certificate is up to &lt;span class=&quot;nb&quot;&gt;date &lt;/span&gt;and has not expired   7d5h
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Add the previously downloaded &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;root_ca.crt&lt;/code&gt; to the CA certificates of the devices
which are going to access your services using HTTPS.&lt;/p&gt;
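
&lt;p&gt;On Debian-based devices this boils down to the following (the file name is arbitrary, but the .crt extension is required for update-ca-certificates to pick it up):&lt;/p&gt;

```shell
# Install the internal root CA into the system trust store
sudo cp root_ca.crt /usr/local/share/ca-certificates/homelab_root_ca.crt
sudo update-ca-certificates
```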

&lt;p&gt;If everything is correct you’ll see that the certificate is one provisioned by your
own internal CA (in my case it’s named “Homelab”):&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/using-step-ca-for-internal-tls-certificate-provisioning-in-the-homelab/grafana_tls_certificate_information.png&quot; title=&quot;Showing the step-certificates provisioned certificate for the Grafana Ingress&quot;&gt;
          &lt;img src=&quot;/img/posts/using-step-ca-for-internal-tls-certificate-provisioning-in-the-homelab/grafana_tls_certificate_information.png&quot; alt=&quot;Screenshot Grafana TLS certificate information&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;h2 id=&quot;proxmox&quot;&gt;Proxmox&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;p&gt;This requires the step-certificates Service to be of type &lt;a href=&quot;https://kubernetes.io/docs/concepts/services-networking/service/#loadbalancer&quot;&gt;LoadBalancer&lt;/a&gt; and given a domain
name outside of Kubernetes; exposing the service using an Ingress will not work with ACME.
In practice this means running something like &lt;a href=&quot;https://metallb.org/&quot;&gt;MetalLB&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Proxmox also supports using &lt;a href=&quot;https://pve.proxmox.com/wiki/Certificate_Management#sysadmin_certs_api_gui&quot;&gt;ACME to request TLS certificates for its web UI
and REST API&lt;/a&gt; (the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pveproxy&lt;/code&gt; service). I’m using the &lt;a href=&quot;https://pve.proxmox.com/wiki/Certificate_Management#sysadmin_certs_acme_http_challenge&quot;&gt;HTTP-01 challenge plugin&lt;/a&gt; to
automate this. The steps detailed in the guide for Let’s Encrypt work for step-certificates
as well when using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;2) Custom&lt;/code&gt; option listed in the &lt;a href=&quot;https://pve.proxmox.com/wiki/Certificate_Management#_acme_examples_with_tt_span_class_monospaced_pvenode_span_tt&quot;&gt;ACME examples guide&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;prerequisites&quot;&gt;Prerequisites&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;The step-certificates root CA certificate is loaded into the CA bundle of the Proxmox host&lt;/li&gt;
  &lt;li&gt;Port &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;80&lt;/code&gt; is not in use on the Proxmox host and can be reached by the step-certificates deployment&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;process-visualized&quot;&gt;Process visualized&lt;/h3&gt;

&lt;p&gt;This is how the certificate provisioning will occur:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-mermaid&quot;&gt;sequenceDiagram
    pve-host-&amp;gt;&amp;gt;step-certificates: Request certificate
    step-certificates-&amp;gt;&amp;gt;pve-host: Return token as part of HTTP-01 challenge
    pve-host-&amp;gt;&amp;gt;pve-host: Start web-server on port 80 and serve token
    pve-host-&amp;gt;&amp;gt;step-certificates: HTTP-01 challenge ready to be verified
    step-certificates-&amp;gt;&amp;gt;pve-host: Verify the HTTP-01 challenge
    step-certificates-&amp;gt;&amp;gt;pve-host: Issue certificate
    pve-host-&amp;gt;&amp;gt;pve-host: Load TLS certificate
    pve-host-&amp;gt;&amp;gt;pve-host: Stop web-server on port 80
    pve-host-&amp;gt;&amp;gt;pve-host: Restart pveproxy service
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&quot;ansible&quot;&gt;Ansible&lt;/h3&gt;

&lt;p&gt;These are snippets from the Ansible playbook I use for automating the
process. Everything here is contained within a role.&lt;/p&gt;

&lt;p&gt;Import the homelab root CA certificate:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# tasks/install_step_ca_root_certificate.yaml&lt;/span&gt;
&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Copy CA root certificate into correct directory&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;template&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;templates/step_ca_root.crt.j2&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;dest&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;step_ca_root_certificate_path&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&quot;&lt;/span&gt;

&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Reload CA certificates&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;command&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;update-ca-certificates&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
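&lt;p&gt;To sanity-check that the root certificate was picked up, the same kind of inspection can be run by hand. The snippet below is a self-contained sketch: it generates a throwaway self-signed root purely for illustration, whereas on the PVE host you would point the same &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;openssl&lt;/code&gt; commands at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;homelab_ca.crt&lt;/code&gt;:&lt;/p&gt;

```shell
tmp=$(mktemp -d)
# Generate a throwaway self-signed "root CA" just to illustrate the check;
# on the PVE host you would inspect homelab_ca.crt instead.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=Homelab Test Root" \
  -keyout "$tmp/ca.key" -out "$tmp/ca.crt" 2>/dev/null
# Inspect subject and expiry of the certificate.
openssl x509 -in "$tmp/ca.crt" -noout -subject -enddate
# A root certificate verifies against itself.
verify_result=$(openssl verify -CAfile "$tmp/ca.crt" "$tmp/ca.crt")
echo "$verify_result"
rm -rf "$tmp"
```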

&lt;p&gt;Register the ACME account:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# tasks/setup_acme_account.yaml&lt;/span&gt;
&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Check if ACME account already exists&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;shell&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;pvenode acme account list | grep {{ step_ca_acme_name }}&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;ignore_errors&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;true&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;register&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;acme_exists_cmd&lt;/span&gt;

&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Add ACME account&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;command&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;pvenode acme account register {{ step_ca_acme_name }} {{ step_ca_contact_mail }} --directory {{ step_ca_url }}&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;when&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;acme_exists_cmd.rc == &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;1&lt;/span&gt;

&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Add ACME config&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;command&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;pvenode config set --acme domains={{ domain }}&lt;/span&gt;

&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Order a certificate&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;command&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;pvenode acme cert order --force&lt;/span&gt;

&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Restart pveproxy service&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;systemd&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;pveproxy&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;restarted&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
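&lt;p&gt;The idempotency of the account registration hinges on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;grep&lt;/code&gt; exit codes, which a quick sketch makes concrete (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;homelab-acme&lt;/code&gt; here is a placeholder for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;step_ca_acme_name&lt;/code&gt;):&lt;/p&gt;

```shell
# grep exits 0 on a match and 1 on no match; the playbook's
# "when: acme_exists_cmd.rc == 1" therefore registers the account
# only when the grep in the previous task found nothing.
printf 'homelab-acme\n' | grep -q homelab-acme; match_rc=$?
printf 'some-other-account\n' | grep -q homelab-acme; nomatch_rc=$?
echo "match rc=$match_rc, no match rc=$nomatch_rc"
```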

&lt;p&gt;Redirect incoming traffic from port 443 to port 8006
using the following iptables rule:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;This avoids having to append port &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;8006&lt;/code&gt; to the
URL. It only works when the Proxmox host is accessed via
the correct domain.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# tasks/add_port_mapping_for_web_ui.yaml&lt;/span&gt;
&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Add iptables-persistent package to ensure rules are persistent after reboot&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;apt&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;iptables-persistent&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;present&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# This port mapping ensures that Proxmox UI is exposed&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;# on port 443 by port forwarding to 8006. When ACME certs&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;# are set up, this means the URL no longer requires an explicit&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;# port at the end.&lt;/span&gt;
&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Add iptables port forwarding from port 443 to port &lt;/span&gt;&lt;span class=&quot;m&quot;&gt;8006&lt;/span&gt;
  &lt;span class=&quot;na&quot;&gt;iptables&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Ensures that only traffic destined&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# for the domain of the pve node is&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# handled by this rule, otherwise all&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# traffic out on port 443 for the VMs&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# sharing the interface vmbr0 will be&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# affected and no traffic over TLS will&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# resolve to the correct CA bundle.&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;destination&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;{{&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;domain&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s&quot;&gt;}}&quot;&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;nat&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;chain&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;PREROUTING&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;in_interface&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;vmbr0&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;protocol&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;tcp&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;match&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;tcp&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;destination_port&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;443&apos;&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;jump&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;REDIRECT&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;to_ports&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;8006&apos;&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;comment&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Redirect secure web traffic to default web-ui of Proxmox&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
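&lt;p&gt;For reference, the rule the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;iptables&lt;/code&gt; module renders looks roughly like the command below (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pve.example.lan&lt;/code&gt; stands in for the templated domain; this is a sketch, not the module&amp;#8217;s literal output):&lt;/p&gt;

```shell
# Redirect TLS traffic destined for the PVE node's domain on vmbr0
# from port 443 to the Proxmox web UI on port 8006. Note that the
# hostname given to -d is resolved to an IP when the rule is inserted.
iptables -t nat -A PREROUTING -i vmbr0 -p tcp -m tcp \
  -d pve.example.lan --dport 443 -j REDIRECT --to-ports 8006 \
  -m comment --comment "Redirect secure web traffic to default web-ui of Proxmox"
```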

&lt;p&gt;The vars used:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# vars/main.yaml&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;step_ca_root_certificate_path&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;/usr/local/share/ca-certificates/homelab_ca.crt&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;step_ca_url&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;https://&amp;lt;step-certificates-domain&amp;gt;/acme/acme/directory&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;step_ca_acme_name&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;lt;acme-account-name&amp;gt;&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;step_ca_contact_mail&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;lt;mail-address&amp;gt;&lt;/span&gt;
&lt;span class=&quot;na&quot;&gt;domain&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&amp;lt;domain&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Tying it all together:&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# tasks/main.yaml&lt;/span&gt;
&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;import_tasks&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;install_step_ca_root_certificate.yaml&lt;/span&gt;
&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;import_tasks&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;setup_acme_account.yaml&lt;/span&gt;
&lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;na&quot;&gt;import_tasks&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;add_port_mapping_for_web_ui.yaml&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
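&lt;p&gt;A minimal playbook applying the role could then look something like this (the role name and inventory group are illustrative, not taken from my actual setup):&lt;/p&gt;

```yaml
# site.yaml - apply the certificate role to all Proxmox hosts
- hosts: proxmox
  become: true
  roles:
    - pve_acme_certificates
```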

&lt;p&gt;Proxmox hosts will now do periodic certificate renewals automatically:&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/using-step-ca-for-internal-tls-certificate-provisioning-in-the-homelab/proxmox_certificate_renewal_item_entry.png&quot; title=&quot;Item entry in the task log of Proxmox host&quot;&gt;
          &lt;img src=&quot;/img/posts/using-step-ca-for-internal-tls-certificate-provisioning-in-the-homelab/proxmox_certificate_renewal_item_entry.png&quot; alt=&quot;Automated Proxmox certificate renewal item entry&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/using-step-ca-for-internal-tls-certificate-provisioning-in-the-homelab/proxmox_certificate_renewal_process.png&quot; title=&quot;Displaying the details of the certificate issuing process&quot;&gt;
          &lt;img src=&quot;/img/posts/using-step-ca-for-internal-tls-certificate-provisioning-in-the-homelab/proxmox_certificate_renewal_process.png&quot; alt=&quot;Screenshot Proxmox certificate renewal details&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;p&gt;Inspecting the certificate of the Proxmox host through the web UI now shows
a valid TLS certificate with the same CA issuer as before (“Homelab”):&lt;/p&gt;

&lt;figure class=&quot; &quot;&gt;
  
    
      &lt;a href=&quot;/img/posts/using-step-ca-for-internal-tls-certificate-provisioning-in-the-homelab/proxmox_tls_certificate_information.png&quot; title=&quot;Showing the step-certificates provisioned certificate for the Proxmox host&quot;&gt;
          &lt;img src=&quot;/img/posts/using-step-ca-for-internal-tls-certificate-provisioning-in-the-homelab/proxmox_tls_certificate_information.png&quot; alt=&quot;Screenshot Proxmox TLS certificate&quot; /&gt;
      &lt;/a&gt;
    
  
  
&lt;/figure&gt;

&lt;h1 id=&quot;summary&quot;&gt;Summary&lt;/h1&gt;

&lt;p&gt;I’ve been running step-ca for around 1.5 years at this point. It’s been
great once I got past the initial bootstrapping process. The greatest benefit is that
any internal service I run in the homelab that supports ACME can have its certificate
provisioning automated.&lt;/p&gt;

&lt;p&gt;I recently converted the Helm installation of step-certificates to an ArgoCD app.
While it took some time, extracting the configuration of the existing installation and
placing it in &lt;a href=&quot;https://sealed-secrets.netlify.app/&quot;&gt;Sealed Secrets&lt;/a&gt; worked quite well, since most of the configuration can
be loaded from secrets (see &lt;a href=&quot;https://github.com/smallstep/helm-charts/blob/920e05d1704c8d7ad9ca9addd16e86701b69476d/step-certificates/values.yaml#L33&quot;&gt;this line in the step-certificates helm chart&lt;/a&gt;).
Doing the same for step-issuer was fairly easy by comparison.&lt;/p&gt;

&lt;p&gt;All things considered, step-ca has been rock-solid for as long as I’ve run it, and I’m
quite pleased with it.&lt;/p&gt;

</description>
        <pubDate>Sat, 06 Apr 2024 00:00:00 +0000</pubDate>
        <link>https://fredrickb.com/2024/04/06/using-step-ca-for-certificate-provisioning-in-the-homelab/</link>
        <guid isPermaLink="true">https://fredrickb.com/2024/04/06/using-step-ca-for-certificate-provisioning-in-the-homelab/</guid>
        
        <category>homelab</category>
        
        <category>step-ca</category>
        
        <category>cert-manager</category>
        
        <category>kubernetes</category>
        
        <category>acme</category>
        
        <category>tls</category>
        
        <category>https</category>
        
        <category>grafana</category>
        
        <category>proxmox</category>
        
        
      </item>
    
  </channel>
</rss>
