Tuesday, 23 November 2021

How to fail without fear?

Failing in production is bad. Many solutions have been tried, but none guarantees a 100% success rate. There is always a chance that things will go wrong when we deploy to production. What can we do to mitigate this risk? Most of the answers so far have been technical ones. Let's look at the different solutions we have come up with over the years and find out how we can minimize risk when we deploy.

Fix in production

The first thing we did when something failed was to fix it as soon as possible, and most of the time this happened directly in the production environment. Although this brings our service back up quickly, the uncertainty of such a deployment makes us wary of doing it more often than necessary, let alone on a regular basis.

DTAP

To tackle the problems we encountered with production fixes, we introduced separate environments (DTAP: Development, Testing, Acceptance, Production) to test every possible situation. I have seen up to seven different environments in a single case. Just imagine the cost, and the result is that many of those environments do absolutely nothing most of the time.

The next problem is that the environments are never exactly the same. There is always some minor difference that can bite us in the last step to production. I've seen differently sized clusters, different machine configurations, different firewalls and even different data connections.
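Differences like these are easier to catch when the settings of two environments are compared mechanically. Here is a minimal sketch in Python, assuming each environment can be described as a flat dictionary of settings. The function name config_drift and all keys and values are made up for illustration.

```python
def config_drift(reference, candidate):
    """Return {key: (reference_value, candidate_value)} for every setting that differs."""
    keys = reference.keys() | candidate.keys()
    return {
        key: (reference.get(key), candidate.get(key))
        for key in sorted(keys)
        if reference.get(key) != candidate.get(key)
    }


# Hypothetical settings; a real setup would load these from its own configuration source.
production = {"cluster_size": 12, "firewall": "strict", "data_connection": "primary"}
acceptance = {"cluster_size": 3, "firewall": "strict", "data_connection": "primary"}

print(config_drift(production, acceptance))  # {'cluster_size': (12, 3)}
```

A setting that is missing on one side shows up as None, so forgotten settings are reported as well.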

The strength of having multiple environments, each dedicated to one phase, is cancelled out by the extra maintenance, the longer process and the cost.

Cloud

To escape the DTAP trap we can spin up environments on demand for testing. This means we don't need idle DTAP environments. We can actually make do with one instance of a cloud platform. Just spin up an environment, run the necessary tests and then tear the instance down. We are billed only for the time we use the system, the system is identical to production and our process is shorter.
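The spin up, test, tear down cycle described above could look roughly like this in Python. This is only a sketch: FakeCloud is a hypothetical in-memory stand-in, not a real provider SDK, and create_environment and destroy_environment are invented names for whatever your platform offers.

```python
from contextlib import contextmanager


class FakeCloud:
    """Hypothetical stand-in for a cloud provider; it only tracks which environments run."""

    def __init__(self):
        self.running = set()
        self._counter = 0

    def create_environment(self, template):
        self._counter += 1
        env_id = f"{template}-{self._counter}"
        self.running.add(env_id)
        return env_id

    def destroy_environment(self, env_id):
        self.running.discard(env_id)


@contextmanager
def ephemeral_environment(cloud, template):
    """Create an environment and guarantee it is destroyed afterwards."""
    env_id = cloud.create_environment(template)
    try:
        yield env_id
    finally:
        # Tear down even when a test raises, so nothing is left idle and billed.
        cloud.destroy_environment(env_id)


def run_tests_in_fresh_environment(cloud, template, tests):
    """Run every test callable against a fresh environment; True if all pass."""
    with ephemeral_environment(cloud, template) as env_id:
        return all(test(env_id) for test in tests)
```

The try/finally is the important part: the environment disappears whether the tests pass, fail or crash, which is what keeps the bill tied to actual use.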

Pipeline

To facilitate a solid process of testing and deploying, an automated pipeline is mandatory. A pipeline can be seen as all the steps needed to get from development to production, combined in one automated process. When we hook this up to the cloud and make it as fast as possible, we can almost mimic fixes in production. And because we can implement safeguards throughout the whole process, we can be almost sure that our stuff works.
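To make the idea of safeguards concrete, here is a small Python sketch of a pipeline as an ordered list of steps, where the first failing step stops everything after it. The stage names and the run_pipeline function are assumptions for illustration. Real pipelines are normally defined in the configuration format of a CI/CD tool rather than in hand-written code.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a callable that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first one that fails.

    Returns (succeeded, names of the stages that completed).
    """
    completed = []
    for name, step in stages:
        if not step():
            # Safeguard: a failed stage prevents all later stages, including deploy.
            return False, completed
        completed.append(name)
    return True, completed


# Hypothetical stages; the lambdas stand in for real build, test and deploy steps.
stages = [
    ("build", lambda: True),
    ("test", lambda: False),   # a failing test run
    ("deploy", lambda: True),  # never reached in this example
]

print(run_pipeline(stages))  # (False, ['build'])
```

Because deploy is just the last stage in the list, nothing reaches production unless every step before it has succeeded.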

Experimentation

In order to fail without fear we need to move to pipelines and cloud solutions (either on or off premises) and we need to be comfortable with experimentation. Experimenting early in the process gives us the opportunity to learn new methods, processes and technical solutions. The more experience we get, the smaller the chance that anything will fail.

Conclusion

To deploy safely and with the lowest chance of failure, we need to focus on cloud-like delivery through automated pipelines.
