Saturday, May 09, 2015

Yet another reason I don't like consumer networking gear

As those who know me in real life (and those who have been reading between the lines on this blog) know, I am a network professional. I develop software for commercial networking equipment. The really good stuff that phone companies use to make their networks work. So, in addition to having a healthy respect for how complicated a networking problem can get, I've also come to expect a certain level of quality in the equipment I use. Unfortunately, I can't afford to install commercial routing gear in my home. It costs too much, uses too much electricity, generates too much heat, etc.

So, for my home LAN, I use consumer gear like everybody else. Right now, this consists of a Zoom cable-modem/router, two Linksys routers in bridge mode (acting as Wi-Fi access points), and three powerline network adapters to connect them all.

But that's enough background material.

This evening, as I was getting ready for bed (problems like this always seem to happen just before bed), I was informed that the Wi-Fi in the house wasn't working. My phone and laptop were working fine, but my wife and daughter were not having any such luck. So I started to log on to the various routers to reset the Wi-Fi. The radios sometimes flake out, but I can log on to a router's management interface (via a wired network connection), disable the Wi-Fi, wait 10 seconds and then re-enable it, and that usually works. I did that on the Zoom modem/router and one of the Linksys routers with no problem. I then logged on to the second Linksys router and discovered that the entire management interface was different (different from what it was the last time I logged on, and different from the other router).

You see, Linksys gear, like so many other products these days, automatically updates its firmware unless you explicitly configure it not to. I must have missed this one, and it updated itself. Well, I didn't want that, so I proceeded to downgrade the firmware to the same version used by the other router (which has been working just fine). Install, reboot, and... all of a sudden, my laptop can't connect to its Wi-Fi anymore. It was connecting just fine before the downgrade, and it has always connected fine to the router that never auto-upgraded (and could connect to it just fine right then). Other devices (phones, laptops, tablets, etc.) are all refusing to connect to that base station.

After a lot of playing around, I discovered that the 5GHz band is the problem. Disable it and everything connects just fine via the 2.4GHz band. Re-enable it and anyone trying to connect via 5GHz fails.

WTF??? It's not like the firmware is invalid - my other router, which is the same make and model, is running it and nobody has any problem connecting to it at either 2.4GHz or 5GHz. Reboot the router - no luck. Power-cycle it - no luck. Reset all configuration - no luck, and spend 15 minutes putting my configuration back afterward.

I have no idea what is going on here, but I need to get the network back up and running before I go to bed or everybody's going to be really angry in the morning. So I tell the router to go and upgrade its firmware back to what it was. And whaddaya know - the 5GHz band is working just fine again, and the router's configuration is identical to what it was with the older firmware.

I'm sure there must be a logical explanation somewhere, but consumer equipment has garbage for diagnostic and analysis tools. Although there is a system log, it doesn't contain anything that might explain why one of its two radios is rejecting perfectly valid connection attempts.

So I'm left with a great big unexplained voodoo question: why does the 5GHz band work fine with the new firmware (so we know the hardware isn't defective) but fail with the old firmware, even though another identical router using the old firmware has no such problem, and this router itself ran just fine on the old firmware before it auto-upgraded?

If this were happening to me at work, with the good commercial equipment, I could log in to its console, crank up the debug/tracing level, and spend a few minutes drinking from the firehose of a debug-level system log. I would find out exactly where the problem is, and would then either be able to fix it myself or be able to provide a really useful bug report to the group responsible for fixing the bug. But with cheap consumer junk, there's no such ability - you just have to muddle through without any real understanding of what's going on. If that doesn't work, your only recourse is to throw it out and buy a new one, because the customer support people can't do any more than you already did, and they won't let you talk to anyone on the product's development team, where you might have a prayer of fixing the problem for real.

Oh well. But that's enough grumbling for tonight. It's now 1:30am, the Wi-Fi seems to be operational on the 5GHz band again, and I'm going to bed. Thanks for letting me vent my spleen at you. If you'd like to share your tales of frustration with consumer electronics, please do. If I like one, I'll share the link here or re-post your story.
