I Nearly Found Nirvana In The Cloud

So it's been a week of semi-fruitful searching for lots of people. In China there's a team setting out on a million-pound expedition in the mountains and forests of Hubei province to find the Yeren or Yeti that's supposedly been sighted hundreds of times. In Geneva, scientists have revealed that they've probably found the Higgs boson particle they've been searching for over the last fifty years. Meanwhile, as I mentioned last week, I've been seeking a way to rid myself of the cost and hassle of maintaining my own web servers.

The Geneva team reckons there's only a one in 1.7 million chance that what they've found is not the so called "God particle" but they need to examine it in more detail to be absolutely sure. I just hope that the level of probability for the expedition team in China will be more binary in nature. I guess that being faced with a huge black hairy creature that's half man and half gorilla (and which hopefully, unlike the Higgs boson, exists for more than a fraction of a second) will prompt either a definite "yes it exists" or an "it was just a big bear" response.

Meanwhile, my own search for Windows Azure-based heaven has been only partially successful so far. A couple of days playing with Windows Azure technology have demonstrated that everything they say about it is pretty much true. It's easy, quick, and mostly works really well. But unfortunately, having overcome almost all of the issues I originally envisaged, I fell at the last fence.

The plan was to move five community-style websites to Windows Azure Web Sites, with all the data in a Windows Azure SQL Database server. Two of the sites consist of mainly static HTML pages, and these were easy to open in Web Matrix 2 and upload to a new Windows Azure Web Site using the Web Deploy feature in Web Matrix. They just worked. A third site is HTML, but the static pages and graph images are re-generated once an hour by my Cumulus weather station software. However, Cumulus can automatically deploy these generated resources using FTP, and it worked fine with the FTP publishing settings you can obtain from the Windows Azure Management Portal for your site.

The likely problem sites were the other two that use ASP.NET and data that is currently stored in SQL Server. Both use ASP.NET authentication, so I needed to create an ASPNETDB database in my Windows Azure SQL Database server and two other databases as well. However, my web server runs SQL Server 2005 and I couldn't get Management Studio to connect to my cloud database server. In the end I resorted to opening the databases in the Server Explorer window in Visual Web Developer and creating scripts to build the databases and populate them with the data from the existing tables. Then I could create the new databases in the Windows Azure SQL Database management portal and execute the script in the Query page. I had to do some modification to the script (such as removing the FILLFACTOR attributes for tables) but it was generally easy for the ASPNETDB and another small database.

However, problems arose when I looked at the generated script for the local village residents' group website. This is based on the old Club Starter Site, much modified to meet our requirements, and is fiendishly complicated. It also stores all the images in the database instead of as disk files. The result is that the SQL script was nearly 72 MB, which you won't be surprised to hear cannot be copied into the Query page of the management portal. However, I was able to break it up into smaller pieces and load it into a Visual Studio 2008 database query window, connect to the Windows Azure database, and execute each part separately. It was probably the most time-consuming part of the whole process.

Then, of course, comes testing time. Will the ASP.NET authentication work with the hashed passwords in the new ASPNETDB database, or is there some SALT key that is machine-specific? Thankfully it did work, so I don't have to regenerate accounts for all the site members and administrators. In fact, it turns out that almost everything worked just fine. A really good indication that the Web Sites feature does what it says on the tin.

However, there were three things left I needed to resolve. Firstly, I found that one site which generates RSS files to disk could no longer do so because you obviously can't set write permission on the folders in a Windows Azure Web Site. The solution was to change the code that generated the RSS file so it stored the result in a Windows Azure SQL Database table, and add an ASP.NET page that reads it and sends it back with ContentType = "text/xml". That works, but it means I need to change all the links to the original RSS file and the few people who may be subscribing to it won't find it - though I can leave an XML file with the same name in the site that redirects to the new ASP.NET page.

Secondly, I need to be able to send email from the two ASP.NET sites so users can get a password reset email, and administrators are advised of new members and changes made to the site content. There's no SMTP server in Windows Azure so I was faced with either paying for a Virtual Machine just to send email (in which case I could have set up all the websites and SQL Server on it), or finding a way to relay email through another provider. It turns out that you can use Hotmail for this, though you do need to log into the account you use before attempting to relay, and regularly afterwards. So that was another issue resolved.

The final issue to resolve was directing traffic from the domains we use now to the new Windows Azure Web Sites. Adding CNAME records for "www" to my own DNS server was the first step before I investigate moving DNS to an external provider. It's a part-fix because I really want to redirect all requests except for email (MX records) to the new site, but Windows DNS doesn't seem to allow that. However, there are DNS providers who will map a CNAME to a root domain, so that will be the answer.

Unfortunately, this was where it all fell apart. Windows Azure Web Sites obviously uses some host-header-style routing mechanism because requests using the redirected URL just produce a "404 Not Found" response. Checking the DNS settings showed that DNS resolution was working and that it was returning the correct IP address. But accessing the Azure-hosted sites using the resolved IP address also produced the "404 Not Found" response. Of course, thinking about this, I suppose I can't expect to get a unique IP address for my site when it's hosted on a shared server. The whole point of shared servers in Windows Azure is to provide a low cost environment where one IP address serves all of the hosted sites. Without the correct URL, the server cannot locate the site.

According to several blog posts from people intimately connected with the technology there will be a capability to use custom domain names with shared sites soon, though probably at extra cost. The only solution I can see at the moment is to set up a redirect page in each website on my own server that specifies the actual Windows Azure URL, so that routing within Windows Azure works properly. But that means I still need to maintain my own web server!

Meanwhile, here's a few gotcha's I came across that might save you some hassle if you go down the same route as I did:

  • When you configure a Windows Azure SQL Database server, don't use an email address that includes "@" as the user name. The actual user name for the database will be [your user name]@[database name] and the extra "@" seems to confuse the management portal.
  • Also avoid using characters that need to be HTML encoded (such as "&") in database passwords. It gets complicated when you need to specify the password in configuration files and in tools that connect to the database.
  • If you use Web Matrix, you can download the publishing settings for each site from the portal. However, once you import them into Web Matrix is seems to be impossible to edit them or re-import them if you need to change the settings. The only solution I found was to delete the site (but not the content) in the Web Matrix "My Sites" dialog then open the site again as a folder.
  • If you find that database connections are failing, or you get connection errors in your pages, check the connection string settings. I had a weird problem with one site where every time I deployed the Web.config file from Web Matrix it did a fancy transform on it and changed the connection strings. I had the databases configured as linked resources, but despite checking these and all other settings several times I couldn't find any reason. They were correct in the Web Matrix Database page and in the Configure page of Windows Azure management portal for the site. It only happened on one site, and only after I switched it from .NET Framework version 2.0 to version 4.0. The other site where I did the same never suffered this problem. The kludge solution I found was to open the site in Remote View in Web Matrix, open the Web.config file from there, correct the error, and save it again. But it means doing this every time I edit the file and redeploy it.
  • If you switch a .NET 2.0 site to .NET 4.0, and that site turns off request validation in any pages or in Web.config that must accept HTML input (such as blogs), you need to add <httpRuntime requestValidationMode="2.0"> to the <system.web> section in Web.config. If possible turn off request validation only in the specific pages that must accept HTML input, and ensure you always validate the content before saving or displaying it. 

So was the whole "migrate to Azure" exercise a waste of time and effort? No, because I know that I have a solution that will let me get rid of my web server and expensive fixed IP address, business-level ADSL connection in time. And in less than two days I learned a lot about Windows Azure as well. However, what's becoming obvious is that I probably need to go down the road of using reserved instead of shared instances, or even Cloud Services instead of Web Sites. But that just raises the question of cost all over again.

Though, just to cheer me up, a colleague I brainstormed with during the process did point out that what I was really doing was Yak shaving, so I don't feel so bad now...