SharePoint Performance Monitoring with Azure

Aside from epic scaling possibilities, another cool ability Azure gives us is HTTP endpoint monitoring – the ability to check that our SharePoint apps are responding nicely, even from various locations around the world if we want.

This is a fairly quick article, mainly thanks to the brilliant simplicity of how this works.

Add Endpoint Monitoring for Cloud Services URL

In my SharePoint/Azure environment, the farm is publicly exposed via the cloud-services URL. I want to monitor how quickly the default page loads as a rough metric of how stressed the system is (there are all sorts of other metrics available, of course).

So, I go to the SharePoint WFE (web front-end) and click “configure”:

clip_image001

At the bottom I can add endpoint monitoring. This is so simple it’s insane; all I have to do is add the URL for Azure to ping and then choose which parts of planet Earth I want to ping it from.

clip_image003

Save changes and that’s about it; Azure will now ping the URL every 5 minutes & log the response time so you can monitor it, along with other metrics.
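For a feel of what each of those pings boils down to, here is a minimal Python sketch of timing a single anonymous page request. It is only an illustration of the idea, not how Azure implements its monitoring; the function names and the example URL in the comment are my own assumptions.

```python
import time
import urllib.request
from typing import Callable, Tuple


def fetch_status(url: str, timeout: float = 30.0) -> int:
    """Make a plain anonymous GET and return the HTTP status code.

    Note: urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()  # pull down the whole body so it counts towards the time
        return response.status


def measure_response_time(
    url: str, fetch: Callable[[str], int] = fetch_status
) -> Tuple[int, float]:
    """Return (status_code, seconds_taken) for a single request to `url`."""
    started = time.perf_counter()
    status = fetch(url)
    elapsed = time.perf_counter() - started
    return status, elapsed


# Example (hypothetical URL, substitute your own cloud-services address):
#   status, seconds = measure_response_time("http://example.cloudapp.net/")
#   print(f"HTTP {status} in {seconds:.2f}s")
```

The `fetch` parameter is only there so the timing logic can be tried out without touching the network.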

clip_image005

Here you can see the response time from Amsterdam, taking a suspiciously long 10 seconds to send back the page.

Why so slow? Well the site in question has a quickly hacked together web-part that lets me slow down the page render by whatever I configure in the web-part properties; 10 seconds in this case:

clip_image007

It’s not pretty but it works, and it’s handy for the next demonstration…
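If you want to reproduce the effect without SharePoint, a rough stand-in is easy to knock together. The sketch below is not my web-part (that lives inside SharePoint); it is a hypothetical Python equivalent that sleeps for a configurable number of seconds before returning a page, using only the standard library.

```python
import time
from wsgiref.simple_server import make_server


def make_slow_app(delay_seconds: float):
    """Build a WSGI app that waits `delay_seconds` before returning a page."""

    def slow_app(environ, start_response):
        time.sleep(delay_seconds)  # the artificial slowdown
        body = b"<html><body>Deliberately slow page</body></html>"
        start_response("200 OK", [
            ("Content-Type", "text/html"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    return slow_app


def serve(delay_seconds: float, port: int = 8000) -> None:
    """Serve the slow page on localhost until interrupted."""
    with make_server("127.0.0.1", port, make_slow_app(delay_seconds)) as httpd:
        httpd.serve_forever()


# To try it locally: serve(10.0), then browse to http://127.0.0.1:8000/
```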

 

Add Rule to Monitoring Response

Even better, once you’ve got your endpoint monitors created you can now set up monitoring alerts on them. Add a rule for any of the metrics just by clicking on it & then “add rule”:

clip_image009

This will let you fill out a name/description, and then the conditions for the rule:

clip_image011

…and that’s about it. Now to what happens when alerts activate or resolve…
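To make the rule condition concrete, here is a small Python sketch of how a rule like mine could be evaluated. It assumes the rule means "average response time over the last 15 minutes is greater than 5 seconds" (the figures quoted later in this article); the function names are made up for illustration and say nothing about how Azure evaluates rules internally.

```python
from datetime import datetime, timedelta
from statistics import mean
from typing import List, Optional, Tuple

Sample = Tuple[datetime, float]  # (time of ping, response time in seconds)


def average_in_window(
    samples: List[Sample], now: datetime, window: timedelta
) -> Optional[float]:
    """Average response time of the pings taken within `window` of `now`."""
    recent = [secs for taken_at, secs in samples if now - window <= taken_at <= now]
    return mean(recent) if recent else None


def condition_met(
    samples: List[Sample],
    now: datetime,
    threshold_seconds: float = 5.0,          # assumed from my rule
    window: timedelta = timedelta(minutes=15),  # assumed from my rule
) -> bool:
    """True when the windowed average is above the threshold."""
    average = average_in_window(samples, now, window)
    return average is not None and average > threshold_seconds
```

With a ping every 5 minutes, a 15-minute window only covers a handful of samples, so a single very slow page load can noticeably move the average.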

 

Performance is Bad - Alerts Activated!

If the alert becomes active it means the condition for the rule you set up is being met, or worse. In my case, pages are taking 10 seconds to load so the condition is easily met and Azure sends me a nice email telling me the alert is “active”:

clip_image013

Thanks Azure! Now in the portal we can see more information…

clip_image015

Clearly we have a problem with that site; in our case, the problem is simply that we have a web-part that kills performance.

Resolving Alerts

Once the page loads in under 5 seconds, say because said web-part isn’t killing performance anymore:

clip_image017

…then the alert should become “resolved” again once the rule’s condition is no longer being met (in my case, once the average response time over 15 minutes drops back to 5 seconds or less).

clip_image019

Hurrah – we’re back to normal operation again.
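The active/resolved cycle is really just a two-state switch driven by the rule condition. A minimal sketch in Python, with a hypothetical `Alert` class of my own invention, might look like this:

```python
from typing import Optional


class Alert:
    """Tracks whether an alert is active and reports state changes."""

    def __init__(self) -> None:
        self.active = False

    def update(self, condition_met: bool) -> Optional[str]:
        """Feed in the latest rule evaluation.

        Returns "active" or "resolved" when the state flips, otherwise None.
        """
        if condition_met and not self.active:
            self.active = True
            return "active"
        if not condition_met and self.active:
            self.active = False
            return "resolved"
        return None


# Example: the condition is met for two evaluations, then clears.
alert = Alert()
events = [alert.update(met) for met in [False, True, True, False, False]]
# events == [None, "active", None, "resolved", None]
```

Only the transitions produce an event, which is why one email arrives when the alert goes active rather than one per slow ping.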

But My SharePoint Sites Aren’t Anonymous!

In case you’ve not guessed already, Azure can only make anonymous requests to an end-point. This means whatever SharePoint site/page you configure for it needs to be enabled for anonymous access in order for this to work.

For some that might sound unappetising at first, but all that’s really required is a test site that’s set up exclusively for this performance-pinging purpose: an entirely self-contained site with no confidential data (or even just with junk data) but with enough data to make a page-load as life-like as possible for a real user. The goal is to make the ping-tests load about the same amount of data from the site lists, so that they generate the same load on the farm as a real user would on a real site. And of course the site needs to be enabled for anonymous access so Azure doesn’t just get HTTP 401s back on each ping.
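Before pointing Azure at the test site, it is worth confirming that it really does answer anonymous requests. Here is a quick Python sketch that makes a request with no credentials and reports what came back; the function names are my own, and it only looks at the status code.

```python
import urllib.error
import urllib.request


def describe_status(status: int) -> str:
    """Turn an HTTP status code into a verdict on anonymous access."""
    if status in (401, 403):
        return "authentication required"
    if 200 <= status < 300:
        return "anonymous access OK"
    return f"unexpected status {status}"


def check_anonymous_access(url: str, timeout: float = 30.0) -> str:
    """Request `url` with no credentials and describe the outcome.

    Caveat: a site that redirects to a sign-in page may still end up
    returning 200, which this simple check would not notice.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as error:
        status = error.code  # urlopen raises for 4xx/5xx responses
    return describe_status(status)
```

If this reports "authentication required" for your monitoring URL, every ping from Azure would fail in the same way.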

At the end of the day, this is about gauging performance of how the whole system responds; any test site will share the same content-database, IIS configuration and hardware limitations as your non-public sites. We just want a dummy site that should give the servers the same load as a normal site, whatever & however that will be. Once done, Azure can provide the ping-tests & alerts, and we can figure out what to do about any problems that Azure flags.

 

Wrap-Up

That’s it! It’s pretty simple stuff really & very easy to set up, but quite powerful if you need to react quickly to slowness. As we have our farm in Azure, a simple stop-gap could be to resize various virtual machines until we can track down the culprit bit of code, although for our thread-sleeping web-part that wouldn’t have helped of course.

Incidentally, there’s nothing stopping you from monitoring an on-premises SharePoint farm the same way; Azure just needs a VM to add the HTTP endpoints to. But anyway, have a play – it’s all cool stuff!

Cheers,

Sam Betts