It's been a while since I've written about my Big 12 Sports river. It's still there, churning away, updating every hour or so with all the latest headlines about all things Big 12. I still check it daily. Most days I just skim the headlines to get a sense of what's going on. Sometimes a story will jump out at me, and I'll click the headline to read the story. And once in a great while, a story is so interesting that I'll send it to Wendy.
When I created the river (way back here and here), it occurred to me I didn't have any way to monitor the various RSS feeds that are used to pull the headlines. If something stopped working, I would never know, unless I actively went looking for problems. At the time, I decided to let the issue slide. It was a lot of work to create the river, and I was ready to take a break from working on it.
A few months ago, as the college football season was getting into full swing, I noticed I wasn't getting as many headlines as I had in previous seasons. So one weekend I decided to write a program to help me monitor the feeds for problems. What I came up with was a program that sends me an email once a week, summarizing the daily number of posts from each RSS feed for the previous week. It looks like this:
I also added another table, which shows the most recent response code from each website's server:
Between the two tables, I can get a quick sense of what's going on. And having it emailed to me each week means I don't have to do anything! Just a quick glance at the email each week tells me if anything's wrong.
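For anyone who wants to build something similar, here is a minimal sketch of how the two tables could be put together in Python. It is not the program I actually run. The function names, the feed names, and the idea of keeping posts as a list of (feed, date) pairs are all assumptions made for illustration.

```python
import urllib.error
import urllib.request
from collections import Counter
from datetime import date, timedelta


def daily_counts(feeds, posts, week_end, days=7):
    """Count posts per feed per day for the `days` days ending on `week_end`.

    `feeds` is a list of feed names and `posts` is a list of
    (feed_name, date) pairs; both are hypothetical stand-ins for
    whatever the river actually stores.
    """
    week = [week_end - timedelta(days=offset) for offset in range(days - 1, -1, -1)]
    tally = Counter(posts)
    # Feeds with no posts still get a row of zeros, which is the whole point.
    return week, {feed: [tally[(feed, day)] for day in week] for feed in feeds}


def format_table(week, counts):
    """Render the counts as a plain-text table suitable for an email body."""
    header = f"{'Feed':<24}" + "".join(f"{day:%a}".rjust(5) for day in week)
    rows = [
        f"{feed:<24}" + "".join(f"{n:>5}" for n in row)
        for feed, row in counts.items()
    ]
    return "\n".join([header, *rows])


def latest_status(url, timeout=10):
    """Return the HTTP status code for a feed URL, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 when a feed URL has moved
    except OSError:
        return None  # DNS failure, timeout, refused connection, etc.
```

Calling `format_table(*daily_counts(feeds, posts, date.today()))` gives the text for the first table, and looping `latest_status` over the feed URLs gives the data for the second. Sending the result once a week is then a job for something like cron plus `smtplib`.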
When I ran the program for the first time, it was clear that several things were wrong. After a bit of investigation, I learned that some things had changed in the two years since I created the river:
- In West Virginia, the Charleston Gazette and the Charleston Daily Mail newspapers merged into the Charleston Gazette-Mail. For me, this meant two RSS feeds were combined into one new feed.
- The Pittsburgh Post-Gazette seems to have stopped covering West Virginia sports entirely.
- The Austin American-Statesman seems to have dropped support for RSS feeds. This was mildly disappointing, but I'm not too sad that there are fewer Texas stories in the river.
- Five or six other websites changed the URL for their RSS feed, probably because they redesigned their website.
I fixed all of these issues and immediately noticed a lot more content in the river. It felt much better, like it had when I first created it.
Writing this program paid off the very next week, when I got my first automated weekly email. I immediately noticed two things: the Topeka Capital-Journal had returned a "404 - Not Found" status for its Kansas and K-State feeds, and no posts had been recorded for either feed since the previous Tuesday. When I investigated the cause, I discovered the newspaper had changed the URL for both feeds the previous week. I updated the river with the new URLs, and everything was back to working.
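I spot problems like this by eye, but the same two signals (a bad response code and a run of days with no posts) could also be flagged automatically. Here is a rough sketch of that idea in Python. The function name, the three-day threshold, and the shape of the inputs are assumptions for illustration, not part of my actual program.

```python
def find_problems(statuses, counts, quiet_days=3):
    """Return {feed: [reasons]} for feeds that look broken.

    `statuses` maps feed name -> most recent HTTP status code.
    `counts` maps feed name -> list of daily post counts, oldest first.
    Both structures are hypothetical.
    """
    problems = {}
    for feed, status in statuses.items():
        reasons = []
        if status != 200:
            reasons.append(f"HTTP {status}")
        # Look only at the most recent `quiet_days` entries.
        recent = counts.get(feed, [])[-quiet_days:]
        if not any(recent):
            reasons.append(f"no posts in the last {quiet_days} days")
        if reasons:
            problems[feed] = reasons
    return problems
```

A feed whose URL has moved would typically show up with both reasons, which matches what I saw in the email. The right value for `quiet_days` depends on the feed, since some sources go quiet for days in the off-season without anything being wrong.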
I was extremely pleased that my new monitoring program paid off so quickly!
So I think the lesson here is that writing software is often not enough. It helps to write software that monitors your software and makes it almost effortless to keep tabs on how things are working.