Add BFD to the existing BGP session over WireGuard — BFD down in ~700 ms, route withdrawn fast enough for Happy Eyeballs. Pick an Ubuntu/BIRD, VyOS, or CHR relay implementation.
This is an optional enhancement to the
VPS path post, which is a prerequisite —
the WireGuard layer, the BGP session, the /48 aggregate, and the learned
::/0 default route all come from there. Adding BFD to that same BGP session
collapses dead-tunnel detection to about 700 ms instead of waiting
for BGP hold-time expiry.
This applies to the VPS path only (one of the two paths in the
RB5009 CGNAT series). If you took the
Route64 /56 path instead, skip this post:
that path has no self-operated BGP relay and ships its own netwatch-driven
fail-to-IPv4 (Route64 post §7).
The problem it solves: a WireGuard interface stays administratively UP even
when the path is dead — NAT mapping expired, peer rebooted, VPS null-routed —
so interface state alone is not a failure signal. Plain BGP will withdraw the
default route when the session dies, but only after its hold timer expires.
BFD on the same session cuts detection to under a second: BGP withdraws the
route as soon as BFD declares the path dead (about 1.5 s after the failure in
the measurements below), pings fail once, and clients are already on IPv4 by
the next attempt.
| Event | Measured |
| --- | --- |
| BFD packets blocked → BFD down | ~700 ms |
| BFD packets blocked → default route withdrawn | ~1.5 s |
| BFD packets restored → route reinstalled | ~10 s |
| Full WG service stop/start → route reinstalled | ~17 s |
| BFD bandwidth (200 ms × 3, bidirectional) | ~3.4 GB / mo |
| BFD cost at $2.50/TB | ~$0.0085 / mo |
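The timing and cost rows can be sanity-checked with a little arithmetic. The sketch below is only a back-of-the-envelope check, not the author's method: it assumes a 30-day month and 1 TB = 1000 GB, and the per-packet size it prints is derived backwards from the table's ~3.4 GB figure rather than measured.

```shell
#!/bin/sh
# Back-of-the-envelope check of the BFD numbers in the table.
# Assumptions (not from the original post): 30-day month, 1 TB = 1000 GB.

interval_ms=200   # min rx / min tx interval
multiplier=3      # missed packets before BFD declares the path down

# Detection time: the session drops after `multiplier` missed intervals.
detect_ms=$((interval_ms * multiplier))

# Packet rate: one control packet per interval, in each direction.
pps=$((1000 / interval_ms * 2))

# Seconds in the assumed 30-day month.
month_s=$((30 * 24 * 3600))

# Per-packet size implied by the ~3.4 GB/month figure (derived, not measured).
implied_bytes=$(awk -v gb=3.4 -v pps="$pps" -v s="$month_s" \
    'BEGIN { printf "%.0f", gb * 1e9 / (pps * s) }')

# Monthly cost at $2.50 per TB.
cost=$(awk -v gb=3.4 'BEGIN { printf "%.4f", gb / 1000 * 2.50 }')

echo "detection time:      ${detect_ms} ms"
echo "packets per second:  ${pps}"
echo "implied packet size: ${implied_bytes} bytes"
echo "monthly cost:        \$${cost}"
```

The configured floor of 200 ms × 3 = 600 ms is consistent with the measured ~700 ms once timer granularity and processing are added on top.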
Design decisions
BFD inserts between BGP (which it monitors) and RA/SLAAC (which consumes
the routes BGP exchanges), filling in the failure-detection layer the base
build's stack is missing:
```text
WireGuard = encrypted transport and peer authorization
BGP       = route exchange
BFD       = fast failure detection   ← added by this post
RA/SLAAC  = client addressing
```
The three layers around it — including the Table = off + AllowedIPs
cryptokey-routing rationale this depends on — are covered in the VPS
post's §3 Return routing.
This post adds only the BFD line and the two edits each side needs to
turn it on.
1. Conventions and placeholders
All placeholders are defined in the VPS post's §2
— <LAN_PREFIX>, <VPS_AS> / <HOME_AS>, <VPS_ROUTER_ID> / <HOME_ROUTER_ID>,
and the wg-host interface created in its §5. Substitute the same values
here before pasting.
2. VPS — add BFD to bird2 and nftables
Three surgical edits on the VPS: append a protocol bfd block to the
existing /etc/bird/bird.conf, add a bfd on; line inside its protocol bgp home
block, and add one wg0-only allow line to /etc/nftables.conf.
The commands are not idempotent: re-pasting duplicates all three edits, so
restore the .pre-bfd backups before re-running.
bird2 + nftables: BFD-only diffs
```bash
cp /etc/bird/bird.conf /etc/bird/bird.conf.pre-bfd
cp /etc/nftables.conf /etc/nftables.conf.pre-bfd

# 1. Append a BFD protocol to bird.
cat >> /etc/bird/bird.conf <<'EOF'

protocol bfd {
  interface "wg0" {
    min rx interval 200 ms;
    min tx interval 200 ms;
    idle tx interval 1 s;
    multiplier 3;
  };
  # Explicit neighbor so bird actively probes; passive-only stalls after a
  # flap because both sides wait for the other.
  neighbor <LAN_PREFIX>:0::2 dev "wg0";
}
EOF

# 2. Turn BFD on inside the existing BGP session.
sed -i '/^protocol bgp home {/a\  bfd on;' /etc/bird/bird.conf

# 3. Add the wg0-only BFD/3784 allow next to the existing BGP/179 line.
sed -i '/iifname "wg0" tcp dport 179 accept/a\    iifname "wg0" udp dport 3784 accept  # BFD from home, wg0 only' /etc/nftables.conf
systemctl reload nftables

# Restart bird so it loads the new BFD protocol and re-establishes BGP with
# BFD enabled. Restart-on-failure for resilience after a tunnel flap.
mkdir -p /etc/systemd/system/bird.service.d
printf '[Service]\nRestart=on-failure\nRestartSec=2s\n' \
  > /etc/systemd/system/bird.service.d/restart.conf
systemctl daemon-reload && systemctl restart bird
```
The explicit neighbor in protocol bfd matters. Without it, bird is
passive and only responds to probes; after a flap the home router waits for
BFD before re-establishing BGP, bird waits for BGP before initiating BFD,
and recovery needs a manual birdc restart.
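Because the §2 commands duplicate their edits when pasted twice, a guard that checks for a marker before inserting can make the single-line edits safe to repeat. The following is a minimal sketch, not part of the original procedure: insert_once is a hypothetical helper, and the demo runs against a temporary file with placeholder content rather than the real /etc/bird/bird.conf.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): insert TEXT after the
# first-matching ANCHOR lines only if FILE does not already contain MARKER.
insert_once() {
    file=$1 marker=$2 anchor=$3 text=$4
    if grep -qF -- "$marker" "$file"; then
        echo "skip: '$marker' already present in $file"
        return 0
    fi
    # awk copies every line and emits TEXT right after each line containing ANCHOR.
    awk -v anchor="$anchor" -v text="$text" '
        { print }
        index($0, anchor) > 0 { print text }
    ' "$file" > "$file.tmp" || return 1
    # Write back in place so the file keeps its owner and permissions.
    cat "$file.tmp" > "$file" && rm -f "$file.tmp"
}

# Demo on a throwaway file; the block content is a placeholder.
demo=$(mktemp)
printf 'protocol bgp home {\n  # existing session options\n}\n' > "$demo"

insert_once "$demo" 'bfd on;' 'protocol bgp home {' '  bfd on;'
insert_once "$demo" 'bfd on;' 'protocol bgp home {' '  bfd on;'   # second run is a no-op

count=$(grep -cF 'bfd on;' "$demo")
echo "bfd lines after two runs: $count"
```

On the VPS, the same call pattern would target /etc/bird/bird.conf with the marker bfd on; and /etc/nftables.conf with the marker udp dport 3784. The heredoc append would need its own guard, for example a grep for protocol bfd before running cat.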
3. Home router — enable BFD on the existing BGP session
RouterOS BGP + BFD
```routeros
/routing/bgp/template/set [find name=tpl-host] use-bfd=yes
/routing/bgp/connection/set [find name=host-vps] use-bfd=yes

/routing/bfd/configuration/add interfaces=wg-host \
    min-rx=200ms min-tx=200ms multiplier=3

# Skip the next add if a UDP/3784 accept rule on wg-host already exists
# from a previous BFD setup; RouterOS does not deduplicate filter rules.
/ipv6/firewall/filter add chain=input action=accept protocol=udp dst-port=3784 \
    in-interface=wg-host comment="BFD from VPS" \
    place-before=[find where chain=input and comment="defconf: drop everything else not coming from LAN"]
```
BFD's UDP/3784 control packets arrive unsolicited, so the defconf
established,related,untracked rule that lets the home-router-initiated
BGP session work does not cover them. The explicit allow above is the one
new input rule this companion adds on the home router. The matching
nftables line on the VPS side is in §2.
BFD is added on top of the BGP session the base build already created. Only
the template and connection get use-bfd=yes, the BFD configuration entry
defines the timing, and the firewall rule allows the control packets. On the
live home router, setting only the template did not alter the already-created
connection; the connection needed its own use-bfd=yes.
4. Verification
Confirm BGP/BFD and the failover
```bash
# On the VPS:
birdc show protocols                    # bgp + bfd both Established/Up
birdc show route <LAN_PREFIX>::/48      # learned from home
ip -6 route show <LAN_PREFIX>::/48      # proto bird, metric 32, via wg0
wg show wg0 allowed-ips                 # /48 is allowed, not route-owned

# On the home router:
/routing/bgp/session/print                    # established
/routing/bfd/session/print                    # state=up
/ipv6/route/print where dst-address=::/0      # bgp, distance 20
/ipv6/route/print where dst-address=<LAN_PREFIX>::/48

# Failover: stop WireGuard on the VPS and time it.
#   wg-quick down wg0
# A client's IPv6 should drop quickly enough for Happy-Eyeballs to move on
# to IPv4; on the measured live path, the route disappeared in ~1.5 s and
# reappeared about 10 s after BFD packets were restored.
```
A brief IPv6 miss and quick fallback to IPv4 — instead of waiting for BGP
hold-time expiry — is the whole point of this change.
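To put a number on the failover rather than eyeballing it, a small polling loop can time how long a route stays visible after the failure is triggered. This is a rough sketch under stated assumptions: time_until_empty is a hypothetical helper, it needs GNU date for the %N field and a sleep that accepts fractions, and the 50 ms polling step limits its resolution. The demo times a temporary file being emptied instead of a real route.

```shell
#!/bin/sh
# Hypothetical helper: run a command repeatedly until it prints nothing,
# then print the elapsed time in milliseconds.
time_until_empty() {
    start=$(date +%s%N)                       # nanoseconds, GNU date only
    while [ -n "$("$@" 2>/dev/null)" ]; do
        sleep 0.05                            # 50 ms polling step
    done
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# Demo: a file that is emptied after about 0.3 s stands in for the route.
flag=$(mktemp)
echo present > "$flag"
( sleep 0.3; : > "$flag" ) &

ms=$(time_until_empty cat "$flag")
wait
echo "output went empty after ${ms} ms"
rm -f "$flag"
```

On the VPS, the equivalent call would be time_until_empty ip -6 route show '<LAN_PREFIX>::/48', started just before the failure is triggered. That measures the VPS-side view of the /48 learned from home, so it is not the same measurement as the home-router default-route figure quoted above.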
If you'd rather have a second active upstream than a fast fall-to-IPv4
on the single one, see
Multi-homing IPv6 over CGNAT on RouterOS
— the series finale. It carries the same BFD layer on the VPS session,
adds a parallel Route64 BGP session under one announceable /48, and
lets RouterOS best-path selection pick the active default. It has two hard
prerequisites: a 32-bit ASN and an announceable /48.