adasauce

I was in the midst of implementing a little POC migration of a small deployment from OpenVPN to WireGuard for a pre-production network when I ran into an issue with 3 of the network clients being NanoPC-T4 devices. I didn't initially consider that this would be an issue, but FriendlyElec in their infinite wisdom does not provide a linux-headers package for their kernels, and WireGuard doesn't provide a wireguard-modules package for ARM. Other distributions such as DietPi still package the vendor kernel directly as well, so they also do not provide a headers package.

After banging my head against the wall looking into custom mainline kernels for the NanoPC-T4, I considered that there might be a userspace utility for WireGuard that doesn't require compiling a kernel module or installing wireguard-dkms.

A quick search turned up both wireguard-go and boringtun*. From a cursory glance and a few user stories, the boringtun project seemed to be a slightly more mature and less error-prone implementation that has already seen deployments on embedded & SBC devices.

*Editor's Note: Fuck CloudFlare. Get your FUCK CLOUDFLARE stickers today!

Installation & Configuration

I still need the wireguard-tools package in order to leverage the handy-dandy wg-quick util and related systemd integration so the VPN will connect easily on boot.
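The systemd integration in question is the wg-quick@ template unit shipped with wireguard-tools; enabling it for wg0 looks something like this (unit name per the packaged service file, instance name matching /etc/wireguard/wg0.conf):

```shell
# Enable the tunnel at boot via the wg-quick@ template unit;
# the "wg0" instance name maps to /etc/wireguard/wg0.conf.
sudo systemctl enable wg-quick@wg0
# Bring it up now without waiting for a reboot.
sudo systemctl start wg-quick@wg0
```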

Generate client-side keys for WireGuard, and set up a simple wg0.conf using the existing OpenVPN IP addresses so I don't have to reconfigure security groups.

$ cat /etc/apt/sources.list.d/unstable.list
> deb http://deb.debian.org/debian/ unstable main

$ cat /etc/apt/preferences.d/limit-unstable
> Package: *
> Pin: release a=unstable
> Pin-Priority: 90

$ apt install wireguard-tools

$ wg genkey | tee privatekey | wg pubkey > publickey
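Since the config below also uses a PresharedKey, wg can generate that too; tightening the umask first keeps the key material private (the presharedkey filename is just my choice):

```shell
# Restrict permissions on anything we create below.
umask 077
# Client keypair: private key to a file, derived public key alongside it.
wg genkey | tee privatekey | wg pubkey > publickey
# Preshared key; the same value must be configured on the server peer.
wg genpsk > presharedkey
```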

$ cat /etc/wireguard/wg0.conf

> [Interface]
> Address = 10.8.0.2/24
> PrivateKey = <privatekey> 
> DNS = <your favourite DNS server>
>
> [Peer]
> PublicKey = <server public key> 
> PresharedKey = <server PSK>
> AllowedIPs = 10.8.0.0/24
> Endpoint = <server IP>:51820 
> PersistentKeepalive = 25
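For completeness, the matching peer entry on the server side would look something like this sketch (keys are placeholders, not from my actual server config):

```
[Peer]
# This client's public key (contents of the client's "publickey" file)
PublicKey = <client public key>
PresharedKey = <server PSK>
# Only route this client's single address back through the tunnel
AllowedIPs = 10.8.0.2/32
```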

Install BoringTun & Rust

DietPi's package repo Rust was old and decrepit, so let's just install a fresh Rust from rustup. It should automagically install everything you need to build and compile your rusty applications (i.e. cargo & rustc) for whatever platform you're running. In this case: arm64/aarch64.

$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
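Once the installer finishes, sourcing rustup's env file and checking versions confirms the toolchain landed (paths are rustup's defaults):

```shell
# rustup drops an env file that puts ~/.cargo/bin on PATH.
source "$HOME/.cargo/env"
# Sanity-check that the compiler and cargo are usable.
rustc --version
cargo --version
```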

Next, I had to pull down BoringTun and fix a small compilation issue in the source before building and installing.

$ git clone https://github.com/cloudflare/boringtun.git
$ cd boringtun/
$ vim src/device/tun_linux.rs
# >> https://github.com/cloudflare/boringtun/issues/89#issuecomment-508962631
$ cargo install --path .

The fancy new boringtun binary installs to /root/.cargo/bin, which should be on your path if you've sourced $HOME/.cargo/env since installing Rust.
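Since wg-quick runs as root and won't necessarily have /root/.cargo/bin on its PATH, symlinking the binary somewhere standard is an easy fix (the target path here is my choice, not anything boringtun requires):

```shell
# Make boringtun visible system-wide for wg-quick and friends.
sudo ln -s /root/.cargo/bin/boringtun /usr/local/bin/boringtun
# Confirm the shell can now find it.
command -v boringtun
```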

Manual hax

Privilege de-escalation doesn't work properly for boringtun when logged in as root. Passing the undocumented environment variable suggested in the project's README, as in WG_QUICK_USERSPACE_IMPLEMENTATION="boringtun --disable-drop-privileges" wg-quick up /etc/wireguard/wg0.conf, was unsuccessful. So hacking wg-quick in place was needed to end up with a functional utility.

$ vim $(which wg-quick)
> 
> add_if() {
>         boringtun --disable-drop-privileges "$INTERFACE"
>         # local ret
>         # ...
>

Note: WG_QUICK_USERSPACE_IMPLEMENTATION is purposefully undocumented, added in commit b9b78f27399 to help enable userspace implementations of the WireGuard protocol.

It's Tunnel Time

Now that the hacking and slashing is behind us, we can finally bring the WireGuard VPN connection online to the server.

$ wg-quick up /etc/wireguard/wg0.conf
$ wg

interface: wg0
  public key: <redacted>
  private key: (hidden)
  listening port: 57870

peer: <server public key>
  preshared key: (hidden)
  endpoint: <server IP>:51820
  allowed ips: 10.8.0.0/24
  latest handshake: 1 minute, 16 seconds ago
  transfer: 3.79 KiB received, 119.36 KiB sent
  persistent keepalive: every 25 seconds

If you need to tunnel everything over your VPN vs. just the resources on the VPN network, you can change AllowedIPs in your configuration to 0.0.0.0/0, which generates a default route and marks all packets to be destined for the WireGuard connection.
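As a sketch, the full-tunnel variant of the peer section would look like this (wg-quick then sets up the routing policy for you):

```
[Peer]
PublicKey = <server public key>
PresharedKey = <server PSK>
# 0.0.0.0/0 routes all IPv4 traffic through the tunnel;
# add ::/0 as well if you want IPv6 covered too.
AllowedIPs = 0.0.0.0/0
Endpoint = <server IP>:51820
PersistentKeepalive = 25
```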

Caveats

It seems like when tearing down the connection, wg-quick doesn't actually kill the BoringTun process, and the next bring-up spawns a second one without complaint. All seems to be working fine, but after 3-4 restarts the NanoPC's CPU is pinned at 500% usage with all the userspace agents fighting each other.

Probably something I'll hack on later to clean up, but for now it works as well as can be expected. Even with a userspace agent instead of a kernel module, system load is down and overall latency has improved compared to the existing OpenVPN setup.
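Until I track down the teardown bug properly, a blunt workaround is a PostDown hook in wg0.conf that reaps any leftover boringtun process for the interface. This is entirely my own hack, not something from the boringtun docs:

```
[Interface]
# ... existing Address / PrivateKey / DNS lines ...
# Reap any boringtun process still attached to wg0 after teardown;
# "|| true" keeps wg-quick happy when there's nothing to kill.
PostDown = pkill -f 'boringtun.*wg0' || true
```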

Notes

For those looking for a great source of information on installing and configuring WireGuard, check out the Arch Linux Wiki page on WireGuard. Top notch as always.

I manage a number of Ubuntu servers, almost all of which have over time developed DNS resolution issues that traced back to something wrong with systemd-resolved. systemd-resolved has had a pretty horrible track record for actually working most of the time.

The latest wonky behaviour manifested as DNS resolution failing completely at random, then working again, then dropping out immediately after, then only resolving IPv6 addresses, then nothing resolving at all. After swapping out my nameservers and checking the stability of my connections to said nameservers, it all came back again to systemd-resolved.

Samples of the log output:

systemd-resolved[592]: Server returned error NXDOMAIN, mitigating potential DNS violation DVE-2018-0001, retrying transaction with reduced feature level UDP.
systemd-resolved[592]: Using degraded feature set (TCP) for DNS server x.x.x.x
systemd-resolved[592]: Using degraded feature set (UDP) for DNS server x.x.x.x

Then it would die. The only course for recovering connectivity was to kick the service with a systemctl restart systemd-resolved, then it would only work for another few minutes before getting into a bad state again.

I found an issue on launchpad.net, #1822416, which seemed to match the problem I was facing, but it remains open even though there is an upstream systemd fix in place for it on GitHub.

The final solution?

Kill it with fire and switch to unbound. Unbound has been cropping up more and more in deployments, and it's been relatively painless to run, requiring no additional configuration out of the box.

$ sudo apt install unbound resolvconf
$ sudo systemctl disable systemd-resolved
$ sudo systemctl stop systemd-resolved
$ sudo systemctl enable unbound-resolvconf
$ sudo systemctl enable unbound
$ sudo systemctl start unbound-resolvconf
$ sudo systemctl start unbound
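To sanity-check that unbound is actually the one answering, query it directly and make sure /etc/resolv.conf now points at localhost (127.0.0.1:53 is unbound's default listener on Debian/Ubuntu):

```shell
# Ask unbound directly; a short answer means it's resolving.
dig @127.0.0.1 debian.org +short
# resolvconf should have rewritten this to "nameserver 127.0.0.1".
cat /etc/resolv.conf
```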

We'll see how it goes, but I haven't had any more DNS instability since switching over.

I recently weathered a bloody battle with grub2 which ended with me pondering: Why don't I just boot this VM via UEFI? (Automating an Arch install to boot from UEFI post coming soon.) To hell with grub, its finicky configuration, and the massive pain in my side.

Step 1: Install OVMF on the VM host

There are a few different OVMF packages available via extra and the AUR. After trying the extra/ovmf package, and it not working immediately, I uninstalled it and jumped to the AUR with aur/ovmf-aarch64 & aur/ovmf-git.

This led me down an interesting path, as it seemed that the OVMF_CODE.fd and OVMF_VARS.fd were missing, and just a monolithic OVMF.fd was compiled in their place. One of the OVMF co-maintainers chimed in on the Debian bug tracker explaining essentially that it's up to distributions of the project to split the files, as shipping only OVMF.fd leads to confusion. It certainly led to my confusion.

Per the co-maintainer's suggestion, I went back to the distribution's ovmf package, and started debugging again from there.

$ yay -S extra/ovmf
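With the distro package back in place, a quick check that the split firmware images actually exist (this path is where Arch's extra/ovmf installs them):

```shell
# Both the code and vars images need to be present for pflash use.
ls -l /usr/share/ovmf/x64/OVMF_CODE.fd /usr/share/ovmf/x64/OVMF_VARS.fd
```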

Onto the extra configuration!

Step 2: Configure Libvirt to use OVMF

Libvirt claims that the nvram configuration option is obsolete, and that it will instead follow the QEMU firmware metadata specification to automatically locate firmware images.

I found this to be hopelessly false. Whether or not the package maintainer follows the expected format or locations, these values do need to be configured to point at the paths where ovmf installs the files.

/etc/libvirt/qemu.conf

...
nvram = [
   "/usr/share/ovmf/x64/OVMF_CODE.fd:/usr/share/ovmf/x64/OVMF_VARS.fd"
]
...

From there, simply restart the libvirtd service and create the new VM with the appropriate <os> values.

<os>
  <type arch="x86_64" machine="pc-q35-4.0">hvm</type>
  <loader readonly="yes" type="pflash">/usr/share/ovmf/x64/OVMF_CODE.fd</loader>
  <nvram>/var/lib/libvirt/qemu/nvram/archlinux_VARS.fd</nvram>
  <boot dev="hd"/>
</os>
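Tying it together: restart libvirtd so the qemu.conf change is picked up, then define or edit the domain (the domain name archlinux here is just my VM's name):

```shell
# Pick up the new nvram setting from /etc/libvirt/qemu.conf.
sudo systemctl restart libvirtd
# Edit the existing domain XML in place to add the UEFI <os> settings;
# libvirt copies OVMF_VARS.fd into the per-domain nvram path on first boot.
sudo virsh edit archlinux
```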

If you are using virt-manager to create your VMs through the wizard, you should be able to now select “UEFI x86_64” in the “Firmware” dropdown when you customize your machine.