As I continue to apply more engineering rigor to the release process on my team, I hear engineers say they are hopeful, or that they hope things will go well. Hoping is not the correct way to ship software. I also hear a lot of statements like “we are confident this will work”. Confidence, although great to have as an engineer, cannot be the sole indicator that software is ready for release. What you need to ship high quality software is testing. It’s having the data that shows you ran appropriate tests and validated not only that your software works when it should, but also that it behaves correctly when things go wrong: when you take an erroneous path through it, it fails gracefully if necessary.
My team has incorporated an extra check to make sure we truly are ready to ship our software when we think we are. We call these extra checkpoints Release Review meetings or Go/No-Go meetings. Think twice before saying you are confident, because it may come across as trying to sell the idea that the software is ready to release. This is not the place for a sales pitch. The people who need to give the positive votes in a Release Review meeting don’t just need a statement of confidence. Along with it, they need to see the data that backs it up, data that proves the right items were tested and that the software works as expected. I see many confident and hopeful software engineers working late nights and weekends because their confidence and hope were short-lived and misplaced. Please don't be one of them.
When we were all learning how to program and how computers work, one of the first things we learned is that the computer, and specifically the software, only does what you tell it to do. If you tell it to do the wrong thing, it will do the wrong thing. Software can’t figure out your intentions. It doesn’t have a mind of its own and it doesn’t think “hey, I’m betting my programmer really wanted to do this and not that”. (Although some day it would be great if it could!) Hoping your software does something is a programmer's way of assuming the software understands their intentions. The only way to know if your software is doing what you want it to do is to test it. You can’t hope and you can’t push your confidence as a programmer at it. You can only test it. Walk through the customer scenarios. Walk through the places where things could fail and make sure the software either fails gracefully or recovers without failing. Figure out what you are missing, where your program can fail in ways you haven’t considered, and then test that and see what actually happens.
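To make that concrete, here is a minimal sketch in Python of testing both the path that should work and the paths that should fail. Everything in it is hypothetical and made up for illustration: `parse_quantity` stands in for whatever piece of your software takes input, and `fails_gracefully` is just a small helper for the checks.

```python
def parse_quantity(raw: str) -> int:
    """Parse an order quantity typed by a customer (hypothetical example)."""
    try:
        quantity = int(raw.strip())
    except ValueError:
        # Fail gracefully: raise a clear, specific error message.
        raise ValueError(f"quantity must be a whole number, got {raw!r}") from None
    if quantity <= 0:
        raise ValueError(f"quantity must be positive, got {quantity}")
    return quantity


def fails_gracefully(raw: str) -> bool:
    """Return True if the input is rejected with a ValueError."""
    try:
        parse_quantity(raw)
    except ValueError:
        return True
    return False


# The path where the software should work.
assert parse_quantity(" 3 ") == 3

# The erroneous paths: each one must be rejected, not silently accepted.
for bad_input in ["", "abc", "2.5", "0", "-1"]:
    assert fails_gracefully(bad_input), f"accepted bad input: {bad_input!r}"
```

The happy-path assertion is the part most of us write without thinking. The loop over bad inputs is the part that replaces hope with data: if any of those inputs were quietly accepted, the run would stop and tell you which one.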
Where is your program going to fail? You should always ask that question. If your software is large and complex, that can be a hard question to answer. So consider asking yourself instead: where are you taking risks? By this I mean, where are the areas in your software:
- That have dependencies on code outside what your team owns
- That have some code that is unstable and is known to produce a lot of defects
- That have some code that is a bit unknown due to it being legacy software, written by people who no longer work on the team and didn’t comment it well
- That have some code that is written in a complex way that makes it difficult to understand
- That don't have enough testing coverage
- That, once released, give you no way to roll back or fix forward your changes if problems occur
Communicating risks is hugely important in understanding the state of your software. Understanding the risks early leads to people taking action to mitigate them and that leads to better software overall. Communicating risks within your code is not a sign of weakness. It's a sign that you understand all aspects of your very complex software system and you have the confidence as an engineer to state where the gaps are. Good software engineers know how to test their code for quality and how to communicate the risks and gaps in their software ecosystem.
If you have read this far, I'm going to assume this topic interests you so let me ask you to do a little assignment. When you are at work, look at your feature, user story, or overall ecosystem and come up with 3 risky areas where your software may fail. Rate the areas as high, medium, or low. And then determine the best way to mitigate those risks. Do you need to refactor your code, remove your unnecessary dependencies, or do more testing? Now go talk to a coworker about the risky areas you uncovered and post your thoughts in the comment section of this blog. Is it difficult to find 3? Is it difficult to determine what a risky area in your software is? Did you find this exercise easy? I'd love to hear from you. Please add a comment letting me know if this assignment was helpful.