That wasn't it

I’m still not able to reach my Chromecast reliably.

I last lost connectivity two days ago, far less than 49.7 days after I last rebooted my WiFi router. I might have a problem with overflowing counters, but something much more immediate is also going wrong.
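For reference, 49.7 days is the figure you get from an unsigned 32-bit counter that ticks once per millisecond. Both the counter width and the tick rate are assumptions here, not something confirmed about this router's firmware, and the names `COUNTER_BITS`, `TICK`, `wrap_period`, and `days_until_wrap` are made up for illustration. A quick sketch of the arithmetic:

```python
from datetime import timedelta

# Assumption: the suspect counter is an unsigned 32-bit value.
COUNTER_BITS = 32
# Assumption: the counter advances once per millisecond.
TICK = timedelta(milliseconds=1)


def wrap_period(bits: int = COUNTER_BITS, tick: timedelta = TICK) -> timedelta:
    """Time for a free-running counter of the given width to wrap to zero."""
    return tick * (2 ** bits)


def days_until_wrap(bits: int = COUNTER_BITS, tick: timedelta = TICK) -> float:
    """The same wrap period, expressed in fractional days."""
    return wrap_period(bits, tick) / timedelta(days=1)


if __name__ == "__main__":
    uptime_at_failure = timedelta(days=2)
    print(f"counter wraps after {days_until_wrap():.1f} days")
    print(f"failure came before the wrap: {uptime_at_failure < wrap_period()}")
```

Two days of uptime is nowhere near the wrap point under those assumptions, which is why a counter overflow can't explain this particular outage.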

I’ve also backed out my IPv6 workaround, since it didn’t seem to be helping; rebooting the router likely had more to do with the Chromecast becoming reachable than the “workaround.”

A crashy network discovery daemon caused these symptoms at least once when I was at Google Fiber. That would be my next guess for what to look at, but I didn’t seem to be running one when I looked. I can’t reach umdns and never installed Avahi.

I really wish I had the syslogs from my WiFi router when this was happening, but they disappeared when I rebooted. D’oh!

This shouldn’t have surprised me; you can only write flash memory so many times before it wears out, and writing logs to flash memory would noticeably shorten the service life of a WiFi router, so routers usually don’t.

We dealt with this at Fiber by adding an option to the Linux kernel, PRINTK_PERSIST, that persisted syslogs in memory across warm reboots, and by using logupload to upload them to a service that provided reliable storage. It worked way better than hoping someone noticed the problem and called you before they and/or their roommate rebooted the router.

I’m missing this a lot now, and might dust some of it off and patch it to work with publicly available infrastructure (Stackdriver maybe?) if I still feel like I want it the next time this happens. No promises though, since no one’s paying me for this and it’s about a kiloreboot of time to do.

I don’t have a firm conclusion here – I’m going to watch some Netflix while my Chromecast is reachable, rather than waiting for things to stop working again.