Almost to the day, Azure had another certificate-related outage. Last year's was more interesting, I think, and this year it was something different. My initial guess (remember, I don't work for Azure, nor do I know any details beyond what has been communicated to the public) was that a few years ago, when they were first creating Azure, they generated some certificates to be used with SSL. I assumed those certificates were valid for a long time, probably something like five years. Since the certificates were created long before Azure was available to customers, nobody thought about adding monitoring to detect when they were about to expire. It turns out I was wrong: there was good alerting in place, but the process had other flaws.
So how can you prevent this from happening in your project? I would suggest you do the following:
- During development, use certificates with a very short lifetime, just one or two months. This way you get used to your certificates expiring, and the monitoring you add to warn you when a certificate is about to expire gets tested on a regular basis.
- When you do a risk analysis of your system, remember to be detailed and consider the implications of your certificates expiring.
- Learn from others! When things like this happen to the big companies, think about whether it could happen to you.
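To make the monitoring idea concrete, here is a minimal sketch of an expiry check in Python. It is only an illustration of the kind of check I mean, not how Azure or anyone else does it. The names `days_until_expiry`, `classify`, `fetch_not_after` and the 30-day threshold `WARN_DAYS` are my own assumptions; the only library calls used are from the standard `ssl`, `socket` and `datetime` modules.

```python
import socket
import ssl
from datetime import datetime, timezone

# Hypothetical threshold: warn when fewer than 30 days remain.
WARN_DAYS = 30


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` and a certificate's notAfter timestamp.

    `not_after` uses the format returned by getpeercert(),
    for example "Jan 11 00:00:00 2024 GMT".
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def classify(not_after: str, now: datetime, warn_days: int = WARN_DAYS) -> str:
    """Return "EXPIRED", "WARNING" or "OK" for a notAfter timestamp."""
    days = days_until_expiry(not_after, now)
    if days < 0:
        return "EXPIRED"
    if days < warn_days:
        return "WARNING"
    return "OK"


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Read the notAfter field of the certificate a server presents.

    Note: with the default context the handshake itself fails if the
    certificate has already expired, so a failure here is also an alert.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


# Example use (needs network access; the host name is a placeholder):
# status = classify(fetch_not_after("example.com"), datetime.now(timezone.utc))
```

If you run something like this on a schedule against certificates that live only a month or two, the "WARNING" branch gets exercised regularly, which is exactly the point of the first suggestion above.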
The funniest thing about this outage is an article in Australia claiming it was caused by hackers... I guess I shouldn't be surprised that newspapers don't check their facts...