UDM Pro thoughts

UDM ProI finally broke down and added the UDM Pro to my Ubiquiti router lab. Here are the specs as provided in the Early Access store:

  • 8-Port gigabit switch with 10G SFP+ port
  • Dual WAN ports for redundancy and load balancing: 10G SFP+ and 1G RJ-45
  • Bluetooth connectivity for easy setup via UniFi app
  • Scalable UniFi Network Controller with advanced management capabilities
  • UniFi Protect video surveillance NVR with 3.5″ (or 2.5″) HDD support
  • Enterprise-class IPS/IDS and DPI capabilities
  • 1 x 1.3″ Touchscreen display for quick status information
  • Powered by fast 1.7 GHz quad-core processor

Not mentioned is that the UDM / UDM Pro use a new OS that is not derived from EdgeOS / VyOS / Vyatta. This is why I’d held off on buying one, when the UDM was first made available to Early Access back in March it was FAR from having feature-parity with the EdgeOS-based USGs. The present state of UbiOS is much closer to production-ready (by UniFi standards).

From the perspective of the USG Pro, this is a pretty serious upgrade in performance with a very minor bump in the expected MSRP. 10Gb/s inter-VLAN routing and WAN. Supposedly can hit 5Gb/s of IPS/IDS throughput — I’m underwhelmed by Ubiquiti slapping a pretty face over open source Suricata with publicly-available lists, but I seem to be a minority.

It’s also quiet. With it sitting on my desk and the LCD showing the fans at 50%, I struggle to hear it. The USG Pro, USG-XG-8, and their EdgeRouter siblings are not quiet-space-friendly.

I feel like they’ve missed the boat in a couple of areas with regards to Protect.

One, those LAN ports should have POE. As a router, those ports are of marginal value — where a $379 router is justified, it’s going to be attached to a larger switch. But as an NVR with a larger storage capacity than the Cloudkey Gen2 Plus, having 8 PoE ports would cover many deployment scenarios and would be quite valuable (12-16 ports would be better).

Two, it should have more drive bays. 2x LFF would have been nice. More LAN ports w/ PoE and 4x SFF bays could be better.

Three, it needs a USB host port for offloading footage directly to removable storage. As things stand now, pulling footage out of Protect is a pile of suck, but presumably it will get better, and being able to push footage directly to removable storage would be a great feature.

Ubiquiti has teased a larger, 4x LFF device. It’s not clear if that will be a NAS that Protect can use, or if it will run Protect directly, and they haven’t shown the back yet so we don’t know if it has PoE switch ports to act as a more traditional NVR appliance.

DNS Changes in UniFi 5.11.50

It’s not DNS.
There’s no way it’s DNS.

It was DNS.

Took a nasty hit yesterday from a change made in UniFi 5.11.50:

If you have any ‘service dns forwarding options’ configuration defined in config.gateway.json, it will overwrite the provisioning of statically defined name servers, leaving you with no DNS. Either remove the ‘service dns forwarding options’ portion of config.gateway.json, or add additional ‘options’ lines defining name servers, such as ‘server=1.1.1.1’, ‘server=8.8.8.8’, etc.

I’ve long used some config.gateway.json tweaks to add conditional forwarders for my Active Directory zones because Ubiquiti still hasn’t seen fit to put that functionality in the GUI.

So after upgrading to 5.11.50, the first full re-provision cycle of my USG Pro ended up with no catchall forwarder. The interesting bit is how this manifested itself. My client devices are pointed directly at a local PiHole so that its metrics don’t show everything as coming from the router, so they all kept humming.

What broke was failover load-balancing. By default, the USG’s WAN health check pings a target… by DNS name. So the USG thought both my WANs had failed, and due to the vagaries of how the routing tables are managed, that ends up sending all traffic out WAN2. Not a desirable condition when using WAN2 costs me money (LTE) and is significantly slower than WAN1.

This led to a couple hours of cursing at how badly failover load-balancing is messed up on Ubiquiti’s routers, but in the end, as it is so often, the problem was DNS.

More interesting is that this problem was brought about by some behind-the-scenes changes in how UniFi is managing DNS for the gateway device. Prior to this version of UniFi, the USG defaults to putting the WAN DNS value in /etc/resolv.conf. This is not ideal for two reasons:

  1. No DNS caching takes place for things running on the router itself. If you have IPS or load-balancing enabled this can result in an excessive number of DNS queries originating from the router, and creates a ton of noise if you have DNS metrics through PiHole, OpenDNS, etc.
  2. DNS servers are always queried one-by-one in the order they appear. This may or may not be desirable, but when the first DNS server is unavailable it will result in slow DNS responses.

I’ve worked around this on my USGs by configuring the WAN DNS servers manually and making the first entry 127.0.0.1. So the router will query the local dnsmasq first, and dnsmasq itself will ignore that entry when selecting an upstream DNS to forward to. dnsmasq is also “sticky” by default in choosing an upstream DNS server — it won’t keep sending queries to the first resolv.conf entry if it’s non-responsive.

What UniFi 5.11.50 has changed is that now they’re automatically placing just a 127.0.0.1 entry in resolv.conf and have moved the rest of the DNS configuration into /etc/dnsmasq.conf.

Another change they’ve made is setting all-servers. This causes dnsmasq to query all upstream DNS servers simultaneously and whichever answers first wins.

This is great, in that the USG’s DNS configuration is now made sensible by default. And it’s not great, in that if you have config.gateway.json changes to service​ dns forwarding options they will override what UniFi is doing to dnsmasq.conf and you will need to make adjustments to provide the DNS forwarders yourself.

And beware if you’re configuring multiple DNS servers with differing views of DNS! This has always been a bad idea but all-servers can make it much worse to troubleshoot.