Build log · MikroTik · BGP + BFD failover

Fast IPv6 failover on RouterOS

Bind BFD to the existing MikroTik/Ubuntu BIRD BGP session over WireGuard so the IPv6 default route withdraws quickly when the relay path dies.

Published 17 May 2026

RB5009 home-network series · pick a layer, or read in order

1. A small home network behind CGNAT: start here (overview, address plan, path-choice matrix)
2. Trusted, IoT, and Guest VLANs: foundation, everything else sits on this
3. Encrypted DNS with a stable resolver: independent layer, needs no IPv6 uplink
4. Routed IPv6 over CGNAT via a VPS: equal path A (self-operated /48, BGP, Ubuntu/BIRD or VyOS)
5. Routed IPv6 over CGNAT via Route64: equal path B (broker-operated /56, free, single uplink)
6. Per-VLAN IPv6 on RouterOS: GUA + ULA + RDNSS per VLAN, isolation, anti-spoof (after either path)
7. Fast IPv6 failover: optional, VPS path only, Ubuntu/BIRD or VyOS · you are here
8. UniFi controller on the router: optional add-on
9. Multi-homing IPv6 over CGNAT on RouterOS: series finale (own ASN + /48, both paths active under BGP best-path)

Overview

This is an optional enhancement to the VPS path post, which is a prerequisite — the WireGuard layer, the BGP session, the /48 aggregate, and the learned ::/0 default route all come from there. Adding BFD to that same BGP session collapses dead-tunnel detection to about 700 ms instead of waiting for BGP hold-time expiry.

This applies to the VPS path only (one of the two paths in the RB5009 CGNAT series). If you took the Route64 /56 path instead, skip this post: that path has no self-operated BGP relay and ships its own netwatch-driven fail-to-IPv4 (Route64 post §7).

The problem it solves: a WireGuard interface stays administratively up even when the path is dead (NAT mapping expired, peer rebooted, VPS null-routed), so interface state alone is not a failure signal. Plain BGP does withdraw the default route when the session dies, but only after its hold timer expires. BFD on the same session detects the dead path in under a second and takes the BGP session down with it: in the measurements below the default route was gone about 1.5 s after BFD packets stopped. Pings fail once, and clients are already on IPv4 by the next attempt.

Event	Measured
BFD packets blocked → BFD down	~700 ms
BFD packets blocked → default route withdrawn	~1.5 s
BFD packets restored → route reinstalled	~10 s
Full WG service stop/start → route reinstalled	~17 s
BFD bandwidth (200 ms × 3, bidirectional)	~3.4 GB / mo
BFD cost at $2.50/TB	~$0.0085 / mo
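
The bandwidth and cost rows follow from the timer settings with a little arithmetic. The sketch below reworks them under stated assumptions: a 30-day month, decimal units, and one BFD control packet per 200 ms in each direction. The helper name bfd_budget is hypothetical, and the per-packet size is back-derived from the table's ~3.4 GB figure rather than measured.

```bash
#!/usr/bin/env bash
# bfd_budget INTERVAL_MS MULTIPLIER GB_PER_MONTH USD_PER_TB
# Hypothetical helper: reworks the table arithmetic, assuming a 30-day
# month and decimal units (1 GB = 1e9 bytes, 1 TB = 1000 GB).
bfd_budget() {
  awk -v interval_ms="$1" -v mult="$2" -v gb_month="$3" -v usd_per_tb="$4" 'BEGIN {
    detect_ms     = interval_ms * mult            # detection time = interval x multiplier
    pps           = 2 * 1000 / interval_ms        # both directions together
    pkts_month    = pps * 30 * 86400
    bytes_per_pkt = gb_month * 1e9 / pkts_month   # on-the-wire size implied by the table
    cost_usd      = gb_month / 1000 * usd_per_tb
    printf "detect_ms=%d pps=%d bytes_per_pkt=%.0f cost_usd=%.4f\n",
           detect_ms, pps, bytes_per_pkt, cost_usd
  }'
}

bfd_budget 200 3 3.4 2.50
```

With the table's inputs this gives a 600 ms detection time, which the measured ~700 ms sits just above, and roughly 131 bytes per packet on the wire. That is a plausible size for a BFD control packet once IPv6, UDP, and WireGuard encapsulation are added, though the exact overhead depends on the outer transport.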

Design decisions

BFD slots in between BGP, which it monitors, and RA/SLAAC, which consumes the routes BGP exchanges. It supplies the failure-detection layer that the base build's stack lacks:

text

WireGuard = encrypted transport and peer authorization
BGP       = route exchange
BFD       = fast failure detection          ← added by this post
RA/SLAAC  = client addressing

The three layers around it — including the Table = off + AllowedIPs cryptokey-routing rationale this depends on — are covered in the VPS post's §3 Return routing. This post adds only the BFD line and the two edits each side needs to turn it on.

1. Conventions and placeholders

All placeholders are defined in the VPS post's §2 — <LAN_PREFIX>, <VPS_AS> / <HOME_AS>, <VPS_ROUTER_ID> / <HOME_ROUTER_ID>, and the wg-host interface created in its §5. Substitute the same values here before pasting.

2. VPS — add BFD to bird2 and nftables

Three surgical edits on the VPS: append a protocol bfd block to the existing /etc/bird/bird.conf, add bfd on; inside its existing protocol bgp home block, and add one wg0-only allow line to /etc/nftables.conf. None of the three is idempotent: re-pasting duplicates every edit, so restore the .pre-bfd backups before running them again.

bird2 + nftables: BFD-only diffs

bash

cp /etc/bird/bird.conf /etc/bird/bird.conf.pre-bfd
cp /etc/nftables.conf /etc/nftables.conf.pre-bfd

# 1. Append a BFD protocol to bird.
cat >>/etc/bird/bird.conf <<'EOF'

protocol bfd {
  interface "wg0" {
    min rx interval 200 ms;
    min tx interval 200 ms;
    idle tx interval 1 s;
    multiplier 3;
  };
  # Explicit neighbor so bird actively probes; passive-only stalls after a
  # flap because both sides wait for the other.
  neighbor <LAN_PREFIX>:0::2 dev "wg0";
}
EOF

# 2. Turn BFD on inside the existing BGP session.
sed -i '/^protocol bgp home {/a\  bfd on;' /etc/bird/bird.conf

# 3. Add the wg0-only BFD/3784 allow next to the existing BGP/179 line.
sed -i '/iifname "wg0" tcp dport 179 accept/a\    iifname "wg0" udp dport 3784 accept       # BFD from home, wg0 only' /etc/nftables.conf
# Dry-run parse first; reload only if the edited ruleset is valid.
nft -c -f /etc/nftables.conf && systemctl reload nftables

# Restart bird so it loads the new BFD protocol and re-establishes BGP with
# BFD enabled. Restart-on-failure for resilience after a tunnel flap.
mkdir -p /etc/systemd/system/bird.service.d
printf '[Service]\nRestart=on-failure\nRestartSec=2s\n' \
  > /etc/systemd/system/bird.service.d/restart.conf
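# Parse-check the edited config so a typo cannot take the BGP session down.
bird -p -c /etc/bird/bird.conf && systemctl daemon-reload && systemctl restart bird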

The explicit neighbor in protocol bfd matters. Without it, bird is passive and only responds to probes; after a flap the home router waits for BFD before re-establishing BGP, bird waits for BGP before initiating BFD, and recovery needs a manual birdc restart.
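
The edits in this section duplicate themselves when pasted twice. One way to make them safe to re-run is to guard each one with a marker check. The sketch below is not part of the original build: append_once and unless_present are hypothetical helper names, the markers are plain fixed strings, and the demo runs against a scratch file rather than anything under /etc. It assumes GNU sed, as shipped on Ubuntu.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not part of the original build. Both treat MARKER as
# a fixed string and do nothing when FILE already contains it.

# append_once FILE MARKER  (text to append arrives on stdin)
append_once() {
  local file=$1 marker=$2
  if grep -qF -- "$marker" "$file"; then
    echo "skip: '$marker' already in $file" >&2
    return 0
  fi
  cat >>"$file"
}

# unless_present FILE MARKER CMD...  (runs CMD only when MARKER is absent)
unless_present() {
  local file=$1 marker=$2
  shift 2
  grep -qF -- "$marker" "$file" || "$@"
}

# Demo on a scratch copy, so nothing under /etc is touched. GNU sed assumed.
conf=$(mktemp)
printf 'protocol bgp home {\n}\n' >"$conf"
for pass in 1 2; do
  append_once "$conf" 'protocol bfd {' <<'EOF'

protocol bfd {
}
EOF
  unless_present "$conf" 'bfd on;' \
    sed -i '/^protocol bgp home {/a\  bfd on;' "$conf"
done
grep -cF 'bfd on;' "$conf"   # counts matching lines after two passes
```

After two passes each marker should appear once in the scratch file. The same pattern would cover the nftables line with a marker such as 'udp dport 3784 accept'. The check is deliberately crude: a fixed-string match anywhere in the file counts, so a bird.conf that already enables BFD on a different protocol would cause the guard to skip the edit.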

3. Home router — enable BFD on the existing BGP session

RouterOS BGP + BFD

bash

/routing/bgp/template/set [find name=tpl-host] use-bfd=yes
/routing/bgp/connection/set [find name=host-vps] use-bfd=yes

/routing/bfd/configuration/add interfaces=wg-host \
    min-rx=200ms min-tx=200ms multiplier=3

# Skip the next add if a UDP/3784 accept rule on wg-host already exists
# from a previous BFD setup — RouterOS does not deduplicate filter rules.
/ipv6/firewall/filter add chain=input action=accept protocol=udp dst-port=3784 \
    in-interface=wg-host comment="BFD from VPS" \
    place-before=[find where chain=input and comment="defconf: drop everything else not coming from LAN"]

BFD's UDP/3784 control packets arrive unsolicited, so the defconf established,related,untracked rule that lets the home-router-initiated BGP session work does not cover them. The explicit allow above is the one new input rule this companion adds on the home router. The matching nftables line on the VPS side is in §2.

BFD is added on top of the BGP session the base build already created, and nothing else changes: the template and the connection each get use-bfd=yes, the BFD configuration entry sets the timers, and the firewall rule admits the control packets. Setting use-bfd=yes on the template alone is not enough. On the live home router it did not alter the already-created connection, which needed its own use-bfd=yes.

4. Verification

Confirm BGP/BFD and the failover

bash

# On the VPS:
birdc show protocols                       # bgp and bfd both up; BGP info reads Established
birdc show bfd sessions                    # <LAN_PREFIX>:0::2 on wg0, state Up
birdc show route <LAN_PREFIX>::/48         # learned from home
ip -6 route show <LAN_PREFIX>::/48         # proto bird, metric 32, via wg0
wg show wg0 allowed-ips                    # /48 is allowed, not route-owned

# On the home router:
/routing/bgp/session/print                 # established
/routing/bfd/session/print                 # state=up
/ipv6/route/print where dst-address=::/0   # bgp, distance 20
/ipv6/route/print where dst-address=<LAN_PREFIX>::/48

# Failover: stop WireGuard on the VPS and time it, then bring it back.
#   wg-quick down wg0
#   wg-quick up wg0
# A client's IPv6 should drop quickly enough for Happy Eyeballs to move on
# to IPv4. On the measured live path the route disappeared ~1.5 s after BFD
# packets were blocked, reappeared ~10 s after they were restored, and took
# ~17 s to return after a full WireGuard stop/start.

A brief IPv6 miss and quick fallback to IPv4 — instead of waiting for BGP hold-time expiry — is the whole point of this change.
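
To put numbers on the failover rather than eyeball it, one option is to sample the VPS kernel route for the home /48 and report when it changes state. The sketch below is an assumption-laden illustration, not the method behind the measured figures: route_state, sample_route, and transitions are hypothetical names, the sampler polls rather than subscribing to route events, and the millisecond timestamps rely on GNU date.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for timing the failover from the VPS side.

# route_state PREFIX: "present" if the kernel holds an IPv6 route for PREFIX.
route_state() {
  if [ -n "$(ip -6 route show "$1" 2>/dev/null)" ]; then
    echo present
  else
    echo absent
  fi
}

# sample_route PREFIX [INTERVAL_S]: log "<epoch_ms> <state>" until Ctrl-C.
# Relies on GNU date for the %3N millisecond field.
sample_route() {
  local prefix=$1 interval=${2:-0.1}
  while :; do
    printf '%s %s\n' "$(date +%s%3N)" "$(route_state "$prefix")"
    sleep "$interval"
  done
}

# transitions: read that log on stdin and print every state change with the
# milliseconds since the previous change (or since the first sample).
transitions() {
  awk '
    NR == 1    { prev = $2; t0 = $1; next }
    $2 != prev { printf "%s -> %s after %d ms\n", prev, $2, $1 - t0
                 prev = $2; t0 = $1 }
  '
}

# Synthetic log for illustration only; these are not measurements.
printf '%s\n' '1000 present' '2500 absent' '12500 present' | transitions
```

On the VPS, sample_route '<LAN_PREFIX>::/48' > /tmp/route.log would run while the path is cut and restored, followed by transitions < /tmp/route.log. The first delta counts from the first sample, so start the sampler immediately before blocking BFD. Resolution is limited by the polling interval plus the cost of forking ip and date on each pass.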

If you'd rather have a second active upstream than a fast fall-to-IPv4 on the single one, see Multi-homing IPv6 over CGNAT on RouterOS — the series finale. It carries the same BFD layer on the VPS session, adds a parallel Route64 BGP session under one announceable /48, and lets RouterOS best-path pick the active default. Requires a 32-bit ASN and an announceable /48 as hard prerequisites.

References

Related in the series

Per-VLAN IPv6 on RouterOS — the LAN-side layer whose default route this failover protects.
Trusted, IoT, and Guest VLANs on RouterOS — the bridge layout under that LAN-side IPv6.
Encrypted DNS with a stable resolver address on RouterOS — survives the IPv6 outage because resolver addressing is on the ULA, not the routed prefix.
Multi-homing IPv6 over CGNAT on RouterOS — series finale; same BFD shape on the VPS session, plus a parallel Route64 BGP session under one announceable /48.

Standards and tools
