Testing... 1... 2... Is this thing on? (Posted by Avi)

I wanted to kick off this new blog with a post describing the team, what we do, who we are, etc.

Internally, we're known as the "DDIT Web Team", as in "Developer Division IT Web Team". Developer Division IT is the group in DevDiv that encompasses the Build Lab, including all the cruft that's required to keep it up and running - build ops, build tools development, hardware maintenance, build testing, customer management, and the web team.

The Web Team consists of four full-time employees (Aaron, Arturo, Paul and myself [Avi]) and three contract employees (Hatim, Viktor and Joe [joining us on Monday!]). All but me are developers - I'm what they often refer to as "overhead", or "comic relief" if they're being nice :)

We started out much smaller a few years ago. The lab was running maybe 2-3 builds per day and was using a clunky old web page to report their status. Perhaps 20% of the time, the web page was wrong, which led customers to ignore it completely and just call the lab for status. This was our first lesson: if a status site is ever wrong, it may as well be always wrong. If people don't trust it, they'll ignore it and work around it. So, our first order of business was rewriting the status site. We wrote it from scratch, designing it to be solid and to cater to the various types of customers we had. We've not looked back since.

In the past two years, the build lab has grown immensely! They're now doing an average of over 70 builds per day, generating over 1.5 TB (yes, that's a "T") of data each day. With that kind of size come problems that just don't exist at a smaller scale, and the web team has needed to grow in order to keep things running smoothly (although we haven't scaled to nearly the same degree ;)).

We've reached a point where the build lab cannot function without the web tools. Some examples of tools we write/maintain for the lab:

  • Build status tracking.
  • Build break tracking/assigning/fixing.
  • Statistics on build breaks, what causes them, who causes them, etc.
  • Space management (how do you keep track of hundreds of builds, over 100 TB of space, who needs each build, until when, why, etc.?)
  • Customer build request page.
  • Kicking off builds and managing their resource requirements (we have over 1000 servers; which builds will kick off on which servers, and when?).

Very recently, a re-org placed the DDIT group within the DevDiv Engineering Excellence org, which puts us in a position to write tools for the whole division, reducing the duplication that might otherwise exist. Examples of tools we own that are used by the whole division (and sometimes by other divisions in the company):

  • Custom statistics tracking (track each team's stats for bugs, test results, etc).
  • Documentation management (online docs/FAQs).
  • Server/asset management and monitoring.
  • Product performance reporting.

There are plenty more examples, but you get the idea. We're in a position where the few of us are supporting over 80 applications, some in use by thousands of people, some critical to the whole division. It's exciting :)

This blog will contain entries from the devs about issues related to ASP.NET, C# and SQL; tips, tricks, the usual. They're doing some fantastic stuff, so there's a lot to learn from them. I'll also try to inject posts about running a web team and supporting a large volume of applications.

Stay tuned; the devs are itching to post stuff. We'd love to get your feedback!

Avi