The aftermath of the blog platform migration

Well that was one long, long week…

My plan following the upgrade was to post about how it went and to introduce everyone to the new features of the site. And now it’s 9 days later… oh well, you live, you learn, you try to get some sleep once in a while.

I’m going to share some details with you, but in return I’m also going to turn off anonymous commenting on this blog. If you want to say something, don’t hide behind anonymity. Step up and show your face rather than being a troll in the dark. I think it’s only fair that if I’m honest, you’re honest too.

 

D-Day… that’d be deployment day

The day started out really well. We had all the bits and pieces in place and were ready for our final deployment. Our bloggers had spent time over the weekend getting their updated designs in place and making sure their content was up to date. We began deploying the final bits around 12pm. Shortly after that we flipped the switch to let you, our blog readers, in on the new design… and that is when things didn’t go exactly according to Plan A. Well, Plan A for MSDN. The TechNet blogs didn’t really run into much trouble – we deployed, and given the scope of the content and the number of visitors, we were able to stay online…

However, the MSDN blogs were another story. Over the next 20-30 minutes we saw our SQL Server (a really sweet piece of hardware) start to get backed up with queries. That in turn backed up the web servers, and before we knew it we were locking people out of the site. We removed a few widgets that we knew could have an impact on performance and it helped a bit… but not enough.

We kicked into investigation mode while discussing in the background how long we had before we would roll back to the old version of the blogs. Yes, that was in our list of plans should things not get resolved quickly. We care about our internal bloggers and about you, our blog readers.

As we looked at the SQL traces and the site setup, we found a page within the site that contained a full site tag cloud. The page was part of the out-of-the-box platform deployment and just happened to be sitting at a URL the previous version of the blogs had registered with the search engines. Well, bad timing and all, the search engines decided that crawling that page right as we were going live was the best idea they’d had all day. Every tag in the site turns out to be a lot for the MSDN Blogs, and the resulting search crawl turned browsing the blogs into a crawl. Once we removed that page, things settled down for us.

After that we saw one more spike that afternoon. We have a full suite of analytics installed with the blogs, and we’d been running it for weeks without any issues. However, after the search engines began their crawl, the analytics system wanted to process all those page views and activity, and that sent the analytics into overdrive. While the blogs remained online, we were certainly suffering from some very slow page loads. We’ve since moved any batch processing of the analytics onto replicated databases to avoid this issue going forward. Lessons learned.

Well, after that we started to settle down. Well, sort of. When a new blogging platform comes online, not only does everyone outside Microsoft want to look at it, but the 15,000 or so bloggers we have internally all decide it’s time to post new articles. You just can’t script a movie storyline like this. Anyway, all those bloggers posting and updating their blog designs took a bit of a toll on the system as well.

Finally around 4am I went home. Exhausted but at least we had the blogs migrated and accessible…

 

The days that followed…

We spent the next few days working through additional performance investigations and listening to the literally hundreds of people telling us how much we sucked for moving their cheese. We heard complaints about the W3C validation. We heard complaints about the rolling blog post list no longer existing. Heck, I think I even saw a few demands for someone to be fired… eeek!

Clearly folks are passionate about reading the blogs, which is great. However, contrast that with what was going on outside the blogs and you come to realize this is all just very small. Oil was pouring into the Gulf of Mexico and BP couldn’t stop it – if I thought I was getting harassed, I can’t imagine what they are dealing with. A thousand Americans dead in the war. I even contrast it with my wife’s work… she’s a counselor and she is dealing with 16-year-olds who have no real home, no family support and are pregnant. Again, the blogs… small, really small in comparison to what’s going on in the rest of the world.

So over the past week we’ve helped our bloggers get used to the new system. We’ve also deployed some performance fixes to both blog sites. You should find both are now performing quite well, even under load. We aren’t done, but at least we are stable.

And today we’ve been able to bring back the rolling blog post list. No, it’s not on the home page – but it’s linked from there so it’s easy to get to. There were clearly some people very mad about this functionality not being online; however, it was one of the casualties of the early performance issues we experienced. We tweaked a few SQL indexes and now the control is back. For some of you this is the only thing you cared about on the entire blog site – but based on our metrics, you’re in the minority. Most people find the blog content through search (external and internal to the blogs) and by direct visits to a blog. Anyway, it’s back now so hopefully that’ll help you sleep at night.

 

What’s up with the home page?

The home page has been updated to move beyond just the rolling blog post list. As a landing page to the blogs, new visitors need to land softly. The old home page, while clearly useful for some, actually was one of the weirdest experiences for new users I’ve seen. It didn’t even look like MSDN or TechNet. The traffic to the home page told a clear story – yes, some folks landed there, but that was less than 1% of our traffic. We have many blogs that receive more traffic than that. And if we wanted to move the blogs on to 2010, then we needed to look at the data and adjust from there. Which we’ll continue to do…

So we have a few things on the home pages now…

  1. An editorial section. Right now there is a welcome message. As time goes on we’ll be updating that to align with spotlights and features from the corresponding MSDN and TechNet sites. We’ve clearly heard the feedback that we need to remove the silos of information, and this is one step forward to bring consistency between our applications and sites.
  2. Recent Blog Posts. The Recent Blog Posts list is a fast way to see what’s been posted recently and to expand the reach of the RSS feed for the blogs. Many, many people follow the blogs solely via RSS and that’s great. We want to encourage that. You’ll now find the link to the rolling blog post list at the bottom of the Recent Blog Posts widget.
  3. Recent Site Activity. This lets you know what’s going on in the site, with the people you’re following and the activity you’ve performed. Perhaps you commented on a blog but can’t remember which one? Just look at your activity and you’ll be quick to return. Again, as a landing page for the blogs this allows people to see the site is active and not be overwhelmed by a never-ending list of posts.

If you’re one of those who just want the old blog post list back, then use the new rolling blog post list page. There is nothing else on the page to distract you. Enjoy. :-)

 

What’s next?

So at this point we’ve got the new blog site where we expected it to be about a week ago. We’ll be updating the help pages and finally getting to talk about the new features. Our bloggers are mostly up to speed now and I’m seeing a lot of posts come out daily which is great.

So what’s next… well, we are working on the next update, which should be out in a couple of weeks. This should include:

  • Bug Fixes: Yep, there are bugs. We’ve been logging them and we are working on fixing them.
  • Performance Improvements: There is still more performance we can wring out of this platform. We’re analyzing load times, SQL traces, CSS and HTML to reduce what we can.
  • Search within a Blog: Right now if you search while viewing a blog, you’ll get search results from all blogs. That’s great if you’re just looking for an answer and don’t care where it comes from. But we also know searching within a blog is a great feature, so that’s on the schedule as well.
  • Default discovered RSS Feed on a Blog: Right now it’s broken. It points to a search RSS feed that doesn’t exist. This one slipped through. We’ll correct this to point to the feed of posts for that blog.
  • A few new widgets: Our bloggers will have a few new widgets to use including a standard code syntax highlighter along with the Microsoft Translation widget.
  • Validation of HTML: We have identified a few issues with the HTML being output. We’ll fix those over time as we can.
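
A rough illustration of the RSS item in the list above: feed readers and browsers typically discover a blog’s feed by looking in the page head for a link tag with rel="alternate" and type="application/rss+xml". The Python sketch below is not the blog platform’s code. It only shows that lookup using the standard library, and the sample markup and feed URL in it are made up for illustration.

```python
from html.parser import HTMLParser


class FeedLinkFinder(HTMLParser):
    """Collects the href of every <link> tag that advertises an RSS feed."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        mime = (attr.get("type") or "").lower()
        href = attr.get("href")
        # Autodiscovery convention: rel="alternate" plus the RSS MIME type.
        if "alternate" in rel and mime == "application/rss+xml" and href:
            self.feeds.append(href)


def discover_feeds(html):
    """Return the feed URLs advertised by the given HTML, in document order."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


# Hypothetical page head; the href is an invented example, not a real blog URL.
sample = """<html><head>
<link rel="alternate" type="application/rss+xml" title="Posts" href="/b/example/rss.aspx">
<link rel="stylesheet" href="/site.css">
</head><body></body></html>"""

print(discover_feeds(sample))
```

If the advertised href points at a feed that doesn’t exist, as described in the list above, every reader relying on autodiscovery ends up subscribing to a dead URL, which is why that one matters.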

 

Ok, with that I’m definitely late for dinner. Hopefully this post won’t cause a riot and we can have constructive conversations about the blogs, not just shouting, flaming abuse sent my way. I want to listen and I want the blogs to be a great place for all of us. Let’s keep the talk respectful and the comments on topic and we can move forward together…

Thanks,
Sean