The Problem: No Carrier, But Traffic is There
I was setting up a multi-VLAN DNS server on OpenBSD 7.8 running in Proxmox VE 7.4. The end goal is actually to run CoreDNS with different filtering policies for different VLANs - clean DNS for my server network, aggressive blocklists for the kids’ devices.
The setup seemed straightforward:
- vio0: Untagged interface on VLAN 60 (servers) - worked perfectly
- vio1: VLAN-tagged interface on VLAN 40 (kids) - permanently showed status: no carrier
vio1: flags=2008802<BROADCAST,SIMPLEX,MULTICAST,LRO> mtu 1500
lladdr c2:8a:93:59:9a:0a
index 2 priority 0 llprio 3
media: Ethernet autoselect
status: no carrier
The strange part: despite the "no carrier" status, traffic was arriving at the interface. Running tcpdump on the Proxmox host’s tap interface showed VLAN 40 packets flowing perfectly. The VM just couldn’t see them.
Symptoms: How to Know You’re Hitting This
If you’re experiencing this issue, you’ll see:
- OpenBSD ifconfig shows “no carrier” on the virtio interface
- dmesg shows “msix per-VQ” for the affected virtio device:
vio1 at virtio4: 1 queue, address c2:8a:93:59:9a:0a
virtio4: msix per-VQ
- The interface has a VLAN tag applied at the hypervisor level
- tcpdump on the host tap shows traffic arriving with correct VLAN tags
- Other virtio interfaces on the same VM work fine (typically ones without VLAN tags)
- Switching to e1000 NIC model makes it work immediately
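The first two symptoms can be checked in one pass. Here is a minimal sketch that scans dmesg output for virtio devices in per-VQ MSI-X mode — run against a captured sample from this article's VM rather than a live `dmesg`, so substitute your own output:

```shell
# Sample OpenBSD dmesg lines from the affected VM (replace with `dmesg` output)
dmesg_sample='vio0 at virtio3: 1 queue, address 7e:34:ff:40:b4:53
vio1 at virtio4: 1 queue, address c2:8a:93:59:9a:0a
virtio4: msix per-VQ'

# List virtio devices that negotiated per-VQ MSI-X mode
printf '%s\n' "$dmesg_sample" | grep -i 'msix per-vq'
```

If that grep matches the same virtio device that shows "no carrier", you are likely hitting this issue.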
The Diabolical Debugging Journey
First Suspect: The MikroTik Router
My network runs on a MikroTik RB5009 with bridge VLAN filtering. Naturally, I assumed I’d misconfigured the VLAN tagging. I spent ages combing through:
- Verifying ether6 was in the tagged list for VLAN 40
- Checking PVID settings (was PVID 60 stripping tags?)
- Comparing working ether7 config with non-working ether6
- Running /interface bridge vlan print until my eyes glazed over
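For reference, the checks above boil down to a couple of RouterOS commands (interface and VLAN numbers are from my network; adjust for yours):

```
# Is ether6 in the tagged list for VLAN 40?
/interface bridge vlan print where vlan-ids=40

# What PVID does the bridge port carry?
/interface bridge port print where interface=ether6
```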
Verdict: MikroTik config was perfect. tcpdump on the Proxmox side proved tagged VLAN 40 traffic was arriving.
Second Suspect: Proxmox VLAN-Aware Bridges
Proxmox 7.4 has a VLAN-aware bridge feature that handles VLAN tag stripping for VMs. Maybe that was broken?
I verified:
cat /sys/class/net/vmbr1/bridge/vlan_filtering # Returns 1 OK
bridge -compressvlans vlan show # Shows correct VLAN membership OK
The tap interface had the right VLAN configuration:
tap107i1 40 PVID Egress Untagged
Verdict: Proxmox bridge was correctly stripping VLAN 40 tags and delivering untagged frames to the VM.
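The tap-side check can be scripted too. A sketch that filters the `bridge` output for the tap device, run here against a captured sample (the exact column layout may vary by iproute2 version):

```shell
# Sample `bridge -compressvlans vlan show` output for the VM's tap interface
bridge_sample='port    vlan ids
tap107i1         40 PVID Egress Untagged'

# Confirm the tap port carries VLAN 40 as PVID with egress untagging
printf '%s\n' "$bridge_sample" | grep 'tap107i1' | grep -q '40 PVID Egress Untagged' \
  && echo 'tap VLAN config OK'
```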
Third Suspect: Proxmox Firewall Bridges
With firewall=1 enabled on the VM network device, Proxmox creates an intermediate bridge chain: VM → tap → fwbr → fwln/fwpr veth pair → vmbr1. Maybe this was breaking VLAN delivery?
Testing with firewall=0:
qm set 107 --net1 virtio=...,bridge=vmbr1,tag=40,firewall=0
Verdict: Firewall bridges weren’t the issue. Still no carrier.
The Breakthrough: Comparing dmesg Output
Looking at working vs. non-working interfaces:
vio0 (working):
vio0 at virtio3: 1 queue, address 7e:34:ff:40:b4:53
vio1 (broken):
vio1 at virtio4: 1 queue, address c2:8a:93:59:9a:0a
virtio4: msix per-VQ
That msix per-VQ line stood out, so I dug deeper. The two interfaces were using different MSI-X interrupt modes.
The Alternative: Switching to e1000
Out of desperation:
qm set 107 --net1 e1000=C2:8A:93:59:9A:0A,bridge=vmbr1,tag=40
qm shutdown 107
qm start 107
OpenBSD immediately showed:
em0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> mtu 1500
media: Ethernet autoselect (1000baseT full-duplex)
status: active
Carrier detected. DHCP worked. Everything just worked.
What I Learned: MSI-X and Virtio Internals
MSI-X: Modern PCI Interrupts
MSI-X (Message Signaled Interrupts - Extended) replaced the old physical interrupt pin model. Instead of asserting a wire, devices write to memory addresses:
- Each device has an MSI-X table (up to 2048 entries)
- Each entry maps to a specific interrupt vector
- When the device needs attention, it writes to the address in that entry
- The CPU receives the interrupt without needing to poll device registers
Virtio’s Two MSI-X Modes
Shared MSI-X: One vector for everything. The driver reads an ISR status register to determine what happened (virtqueue completion? config change?).
Per-VQ MSI-X: Dedicated vectors for each virtqueue plus a separate config change vector:
Entry 0: RX queue completions
Entry 1: TX queue completions
Entry 2: Configuration changes (link status, etc.)
Per-VQ mode is more efficient (less register polling), but requires correctly programming all vectors.
The Bug?: Lost Config Change Interrupts
In OpenBSD’s virtio driver (pre-7.6 fix), the MSI-X config change vector wasn’t being re-programmed after device reset. The virtio spec is clear: if config_msix_vector is set to 0xFFFF (NO_VECTOR), the device MUST NOT send config change interrupts.
The lifecycle:
- VM boots, virtio driver allocates MSI-X vectors
- ifconfig vio1 up triggers device reset
- Reset clears all MSI-X vector assignments (per spec)
- Driver re-programs queue vectors (entries 0, 1) ✓
- Driver forgets to re-program config vector (entry 2) ✗
- Config change vector remains at 0xFFFF
- QEMU cannot deliver link-state change interrupts
- OpenBSD never sees link come up
Why vio0 worked: It landed in shared MSI-X mode (fewer vectors allocated by QEMU for that PCI slot), which uses ISR register polling as a fallback. Config changes work without a dedicated vector.
Why e1000 worked: Intel e1000 emulation uses standard PCI register polling for link status - no virtio config change mechanism involved.
The Fix (That Didn’t Help Me)
I found a related fix that went into OpenBSD on September 2, 2024 (committed for 7.6). It introduces virtio_pci_setup_intrs(), which programs both the queue vectors and the config vector after reset.
A follow-up fix on December 20, 2024 addressed virtio 1.x queue_enable ordering.
But I was running OpenBSD 7.8 which should have had both fixes. Why did it still fail?
The Mystery: Why 7.8 Still Failed
Honestly? I don’t know for certain. Possibilities:
- Edge case in MSI-X vector allocation: Specific PCI topology in my VM triggered a code path where the fix doesn’t apply
- QEMU 7.2 quirk: Proxmox 7.4 runs QEMU 7.2 (from 2022), potentially with its own virtio 1.x bugs
- Timing issue: The config vector write might be happening before QEMU’s MSI-X table is ready
- Different bug entirely: Something that looks similar but has a different root cause
To properly debug this, I’d need to:
- Verify the fix is actually compiled into my kernel
- Capture MSI-X vector assignments with pcidump -v
- Test on Proxmox 8.x with newer QEMU
But at some point, pragmatism wins.
The Solution: Just Use e1000
For VLAN-tagged virtio-net interfaces on OpenBSD VMs in Proxmox 7.4:
qm set <VMID> --net1 e1000=<MAC>,bridge=vmbr1,tag=<VLAN_ID>
Trade-offs:
- Performance: e1000 has ~10-15% more overhead than virtio
- Compatibility: e1000 works everywhere, always
- Simplicity: No MSI-X vector debugging required
For my DNS server use case, the performance difference is negligible. CoreDNS CPU usage is measured in single-digit percentages.
The Bigger Picture: Multi-VLAN DNS Architecture
The goal that started this odyssey: differentiated DNS filtering across VLANs.
The final architecture looks something like this:
OpenBSD VM
└─ CoreDNS with ACL-based filtering
├─ VLAN 40 (Kids): Aggressive blocklists
├─ VLAN 60 (Servers): No filtering
└─ Other VLANs: Custom policies
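As a sketch of where this is headed, a Corefile along these lines could implement the split using CoreDNS's bind, hosts, and forward plugins — the subnets, listen addresses, upstreams, and blocklist path below are placeholders, not my final config:

```
# Kids VLAN (assumed 10.0.40.0/24, VM listens on 10.0.40.1): blocklist first
. {
    bind 10.0.40.1
    hosts /etc/coredns/blocklist.hosts {
        fallthrough
    }
    forward . 9.9.9.9
}

# Server VLAN (assumed listen address 10.0.60.1): clean forwarding, no filtering
. {
    bind 10.0.60.1
    forward . 1.1.1.1
}
```

Two server blocks for the same zone on the same port are fine as long as each binds a different address — which is exactly why both interfaces need working carrier.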
The e1000 workaround means both interfaces work perfectly, and I can finally get CoreDNS configured with per-VLAN policies.
Key Takeaways
- “No carrier” doesn’t mean no traffic: Always verify with tcpdump at multiple layers
- MSI-X per-VQ mode is more fragile: The config change vector must be explicitly programmed
- Driver bugs can persist across releases: Even with fixes committed, edge cases may remain
- Performance isn’t everything: e1000 works, virtio doesn’t - ship the working solution
- Document the weird stuff: Someone else will hit this
For OpenBSD Developers
If you’re investigating this, check:
# Does the broken interface use per-VQ MSI-X?
dmesg | grep -i "msix per-VQ"
# What's the config change vector set to?
pcidump -v | grep -A 30 "vendor.*Red Hat"
# Look for "Configuration Change: 0xFFFF" (broken) vs valid vector number
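Since full `pcidump -v` output is long, a filter like this narrows it to the suspect vector. It runs here against a hypothetical excerpt in the shape described above — I haven't confirmed the exact field label on every OpenBSD release, so adjust the pattern to what your pcidump actually prints:

```shell
# Hypothetical excerpt of `pcidump -v` output for the virtio NIC
pcidump_sample='0:5:0: Red Hat virtio network device
        Configuration Change: 0xFFFF'

# 0xFFFF (NO_VECTOR) means the device must not send config-change interrupts
if printf '%s\n' "$pcidump_sample" | grep -q 'Configuration Change: 0xFFFF'; then
    echo 'config vector unset: link-state interrupts cannot be delivered'
fi
```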