Wednesday, 1 March 2023

No time

There is never enough time. Every successful company has more work than it can handle. That is not a problem in itself; it is a luxury problem. Except when you can't handle it. A lot of work can lead to stress, and stress leads to suffering. You can work harder, make more time (time is relative) and try to go faster, but that depletes resources in no time. There are a couple of common pitfalls we see that lead to too much work and no time to do it. Let's have a look at those and find out how we can resolve this issue.

The loudest customer

First of all there is the situation of the loudest customer. That is the customer who screams the hardest and makes the most emotional impact on the introverted engineer. The engineer feels threatened and will try to make this go away by implementing the change as fast as possible. We can mitigate this by denying direct contact between engineer and customer and placing a product owner (PO) in between them. POs are more extravert, they manage the backlog and can do politics with customers, shielding the engineer in the process.

The most persistent customer

Besides a loud customer there can also be a very persistent customer, who doesn't play on emotion but will frequently contact the engineer or PO about a specific change. Because it is impossible to do all the things at once, prioritizing is the solution. The result of that prioritization is the backlog. The PO owns the backlog and plays all the games with the customers to order the items on it by importance or business value. With a persistent customer the PO can refer the customer to the backlog.

The helpful employee

A helpful employee sounds good, but too helpful isn't good at all. When employees help one customer and make them happy, they don't work on the backlog and are thus not working on the most important items. They dissatisfy other customers whose work has been planned and is now possibly delayed or not done. A team should refuse to work on anything other than what is on the backlog, forwarding the customer to the PO to negotiate a spot on the backlog.

The planning PO

On many occasions POs assign work to engineers in the planning session. By doing this they focus on efficiency instead of effectiveness: everybody is busy with individual work, forgetting about sharing and building knowledge, and most importantly they forget to put the team in control. Work should be ordered by the PO in the backlog. Engineers pick the most important item during the planning of the day and form a team to build a solution for that particular change. More than one engineer works on the same change, ensuring quality. This way of working focuses on effectiveness and getting things done.

The important manager

Oh, how often have I seen managers walk into a team room and single out one engineer to do something for a particular customer, completely skipping the product owner, any agreements and any priorities. The engineer will not say no to a manager (most of the time). This behaviour is very disruptive and undermines the position of all team members, the product owner and the scrum master. Modern leaders should facilitate, not delegate. Solve this by coaching managers into a role of servitude; great leaders serve the team.

Feature driven

Nowadays most teams are taking the step from agile to DevOps by incorporating run into their daily work. This means teams need to reserve time for maintenance, housekeeping and incidents. Too much focus on developing new features will result in an increase in technical debt. The more technical debt is acquired, the slower development of new features will get, ultimately grinding to a halt. Reserve time for maintenance and incidents and keep technical debt to a minimum to ensure the flexibility to build new features.

SPO(F/K)

A single point of knowledge is a huge problem. Engineers with a lot of knowledge in a team with less knowledge are busy doing all the work while some teammates sit idle. Forbid any engineering work for these engineers and have them teach and coach until the rest of the team is up to par.

In summary, get POs in place to handle the backlog and set priorities, fencing off customers and management. Forbid working on anything other than the backlog. Let the team plan. Share your knowledge.


Monday, 19 September 2022

About the importance of LCM

Life Cycle Management (LCM) is about keeping your assets up to date. When systems are not up to date they tend to create unplanned work in the form of incidents. Unplanned work means we cannot deliver the business value that we promised to our customer. Teams that don't have a grip on maintenance are going to be reactive and fix assets when they are broken. The cascading effect of not having systems up to date can be big. It can even result in so much technical debt that your team cannot work on improvements and is deadlocked in maintenance.

The first step in overcoming reactive maintenance is to plan it and reserve resources for it. Each new feature that a team develops requires extra capacity for LCM. Keeping a healthy balance between maintenance and new features is important to keep the quality of work at an acceptable level. Unbalanced LCM can lead to up to 90% of time being spent on maintenance, reducing the amount of time that can be spent on delivering new business value.

To get a grip on LCM it is advisable to create a year calendar that shows when assets need to be updated. This includes hardware, software, licenses and certificates. For all those assets you need to know both the version you are currently running and the latest available version. The version you run doesn't need to be the latest and greatest; it needs to be a recent stable version with all security updates. Alas, the newest version can sometimes break your systems, so decide deliberately whether to run the newest (n) or the one before the newest (n-1).

When you have the basic information, start extrapolating from the version history of your assets. Nine out of ten times you can find this information on the asset provider's website or in a changelog. Plot the upcoming versions according to that cadence.
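As a minimal sketch of that extrapolation, the Python snippet below projects upcoming release dates from the average gap between past releases. The function name predict_releases and the release history are hypothetical, and a real provider may not release at a steady cadence, so treat the output as a planning estimate only.

```python
from datetime import date, timedelta


def predict_releases(history, horizon_end):
    """Project future release dates from the average gap between past releases."""
    if len(history) < 2:
        raise ValueError("need at least two past release dates")
    ordered = sorted(history)
    # Days between each pair of consecutive releases.
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    cadence = timedelta(days=round(sum(gaps) / len(gaps)))
    if cadence.days < 1:
        raise ValueError("release dates must not all be identical")
    predictions = []
    upcoming = ordered[-1] + cadence
    while upcoming <= horizon_end:
        predictions.append(upcoming)
        upcoming += cadence
    return predictions


# Hypothetical release history, as it might be read from a changelog.
history = [date(2021, 9, 1), date(2021, 12, 1), date(2022, 3, 1), date(2022, 6, 1)]
for predicted in predict_releases(history, date(2023, 6, 30)):
    print(predicted.isoformat())
```

The predicted dates can then be placed on the year calendar next to the asset they belong to.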

Now we have a view of all the LCM work that needs to be done in the upcoming year. The next step is to calculate the workload of the items and get a grip on the time needed to maintain the assets. The calculation is just a sum of the effort you and your team predict is needed to perform this maintenance. You end up with a table of all the assets, the predicted maintenance dates and the manpower needed to perform that maintenance. All maintenance can now be planned in the timeslot you use for planning, be it sprints or months for example.
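The sum described above can be sketched in a few lines of Python. The assets, dates and effort estimates below are made up for illustration, and months are assumed as the planning slot; the same grouping works for sprints.

```python
from collections import defaultdict
from datetime import date

# Hypothetical LCM table: (asset, predicted maintenance date, estimated effort in hours).
lcm_items = [
    ("database server", date(2023, 2, 14), 16),
    ("TLS certificate", date(2023, 2, 28), 2),
    ("web framework", date(2023, 5, 9), 24),
    ("operating system", date(2023, 5, 23), 12),
]


def workload_per_month(items):
    """Sum the estimated maintenance hours per (year, month) planning slot."""
    totals = defaultdict(int)
    for _asset, due, hours in items:
        totals[(due.year, due.month)] += hours
    return dict(sorted(totals.items()))


for (year, month), hours in workload_per_month(lcm_items).items():
    print(f"{year}-{month:02d}: {hours} hours of planned maintenance")
```

The totals per slot are the capacity to reserve for LCM before new features are planned.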

The collected LCM information can be fed back into any roadmap to make the planning of upcoming sprints, months or quarters realistic. You just made LCM planned work instead of unplanned or reactive work.

Tuesday, 28 June 2022

Working together

Once there were two kids next to each other, playing with toy cars in a sandbox. Their parents were overjoyed because they saw their kids playing together. But were they? The kids played with their own toys in their own patch of sand. They didn't play together, they just played next to each other.
A couple of years later, the kids sat in the sandbox again, but this time they shared their toy cars and built a road in the sandbox. Now they didn't just sit together, they were actually playing together and had a lot of fun.

Most teams work together like the little kids in the story: they don't actually work together, they just happen to be in the same team. They pull their work individually from the worklog. They are just individuals sharing a desk space. Each morning during standup they tell each other what they did, and no one has a clue what exactly the others have done, or they don't care.

Some engineers put their name on a couple of items to claim them as future work. When they have one item done, they move on to the next. Others finish their work and pull stuff from the backlog into the sprint because there are no free items to work on. See where this is going? Not everything the team committed to is done, but extra stuff is delivered as well. This doesn't sit well with stakeholders.

What can we do about it? First of all, no team member can have more than one story to their name. Second, in the daily scrum we talk about the work that is on the scrum board. We go from top right, the stuff that is almost done, to bottom left, the stuff that still needs to be done. We ask ourselves: "What can we do today to finish this story?" Then we ask if anyone wants to help finish it, together. As we progress through the stories, in the end we run out of engineers.

So now we have at least two engineers on one story. We need something to enable us to work together. The idea is that everybody participates in fixing the story. We use pair or mob programming to accomplish this.

With pair programming there is a single driver behind the computer and a navigator who sits close to the driver; with mob programming there are several navigators. The driver just types: they enter the implementation. The navigators talk about the problem and try to find a solution. Every once in a while the driver switches with a navigator.

When we use this setting we can skip peer review: we are reviewing the whole time, so there's no use for a separate step. Because more brains have worked out the solution it is of better quality, which results in fewer incidents. And most importantly we shared knowledge and learned from each other.

Below are a couple of resources to have a look at. The first is Stacy, who talks about pair programming; the second is an article about mob programming on Medium.

Tuesday, 21 June 2022

The cascading effect of the DevOps way of working

When you are going to work in a DevOps way it all starts with removing the first silo. This means removing the wall between Dev and Ops by putting them together in a team. The consequence is that the whole team is responsible for change and run. When the whole team becomes responsible, no individual can fall back on their role; they are all committed to the cause. Dev can do Ops, test can do Dev, Ops can do test and Dev, and so on. This means more collaboration in the team and less pushing of work to another role. As a DevOps team you are all in it together.

Working together is only possible if it is done in a small team, and by a small team I mean no more than five engineers (I like to call all team members engineers and not by their role). I state that a team of five is the best and a team of three is the minimum. With five it will still be a team, with six it will become two teams of three. Six or seven is still possible, but communication and therefore collaboration will become increasingly harder.

If you have smaller teams, you also need smaller software components, and with smaller components you need more interfaces. These interfaces are twofold. One is the technical one, with APIs, versioning, release cycles and so on. The other is the communication interface. How do you, as a team, communicate with other teams, and how are your dependencies organised? You need to think about hard and soft dependencies and come up with a plan. Hard dependencies are those that make your team wait for work from another team or individual. Soft dependencies are those that interface with another team but don't require any work from that team. One of the goals for a DevOps team is to lessen hard dependencies and turn them into soft ones. This can only be done if it is clear who your team's customers are, which is one thing the team should know and investigate.

Working in a small team means a lot of collaboration. You need to share information, and to improve sharing it is wise to minimise the work in progress by introducing a WIP limit. A WIP limit is a limit on the amount of work that can be in progress. If you need to learn a lot, make it a low number, so all team members need to work on the same thing together. One of the most common errors I see is that every engineer works alone on items from the backlog, not sharing knowledge and information. When two engineers work on the same item, they need to communicate and explain what they are doing, teaching the other one about their ways and maybe even learning something themselves.

It can happen that an engineer can't collaborate and that the WIP limit is reached. The purpose of DevOps is not to be as efficient as possible (max. resource utilisation) but to be as effective as possible (max. customer value). This means that when an engineer isn't working on items on the backlog, they should investigate improvements, refactor or learn.
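To make the WIP limit concrete, here is a minimal Python sketch of a board that refuses to start a new item once the limit is reached but always lets an engineer join an item that is already in progress. The class and method names (Board, start, finish, WipLimitReached) are hypothetical and not taken from any real tool.

```python
class WipLimitReached(Exception):
    """Raised when starting a new item would exceed the WIP limit."""


class Board:
    def __init__(self, wip_limit):
        if wip_limit < 1:
            raise ValueError("wip_limit must be at least 1")
        self.wip_limit = wip_limit
        self.in_progress = {}  # item name -> set of engineers working on it

    def start(self, item, engineer):
        """Put an engineer on an item; joining an item in progress is always allowed."""
        is_new_item = item not in self.in_progress
        if is_new_item and len(self.in_progress) >= self.wip_limit:
            raise WipLimitReached(
                f"limit of {self.wip_limit} reached; join an existing item instead"
            )
        self.in_progress.setdefault(item, set()).add(engineer)

    def finish(self, item):
        """Remove a finished item, freeing a slot under the limit."""
        return self.in_progress.pop(item, set())


board = Board(wip_limit=1)
board.start("story A", "Ann")
board.start("story A", "Bob")      # allowed: Bob joins the item in progress
try:
    board.start("story B", "Cas")  # refused: would be a second item in progress
except WipLimitReached as error:
    print(error)
```

An engineer who is refused and cannot join an existing item is exactly the case described above: time to investigate improvements, refactor or learn.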

Investigating the way of working and finding new ways to work leads to better processes and automation. When a DevOps team starts automating, many quirks in the process will be found. By solving these the quality of work and the speed of work will increase. 

So what are the steps to take?

  1. Form mixed teams with the disciplines needed to convert customer requests into a usable product or service
  2. Find out all the dependencies of the team
  3. Make communication plans with other teams, technical and social
  4. Introduce a WIP limit to increase learning
  5. Automate

Let's start doing DevOps!

Tuesday, 8 February 2022

Why we do things in Scrum


We do daily scrums by answering the three questions.
We give only a demo in the review.
We skip retrospective.
Planning is distributing tasks.
But we do Scrum.

That is not Scrum, that is doing what we always did in another form. For me the basis is the three pillars: transparency, inspection and adaptation. Every event in Scrum is based on these pillars.

Take the standup for instance, it is transparent because everybody is welcome to attend and the work progress is made visible. It is a moment to inspect the work for that day and come up with a plan to deliver stuff. It is a moment to adapt the way we are organized to get things done. 
I can't stand those annoying questions: what did you do, what are you going to do and do you have impediments? The result is somewhere in the area of: "I did x yesterday, didn't finish so I continue on x today and I need to visit the dentist". We are not interested in your status or your agenda. At a standup we are inspecting whether we get work done. Nothing about the individual, all about the team. So stop with those questions! One approach you can use is to talk about the work items on the board. Go from almost done in the top right corner to stuff that can be started in the bottom left. Ask yourselves: what do we need today to get the top story into the done state? Don't make it individual work, but group up to finish the work. Find a way to finish it, then move to the next item on the list until you run out of team members. The work at the top is the most important, so it is also very important to finish it first.

Next on the agenda is the review. A review is a key moment to talk to the stakeholders, inspect the work that has been done, agree on the next goals. You can do a demo if it adds to the experience of the stakeholder. But most important is that you talk about what needs to be done and what you need from each other. Offer insights in work, show metrics, ask what the stakeholder needs and measure. Become predictable. Stakeholders don't need status updates or reports, they want to be involved and this is the moment to do that.

Then comes the retrospective. Most teams draw a boat or a racing car and talk about what is slowing them down and what can speed them up. It's a nice exercise, but it doesn't get to the point of the retrospective. Where the review is about what we do, the retrospective is about how we do it. What is our way of working, are we a team, can we trust each other, what can we do to improve flow and focus. In the retrospective you inspect and adapt on the way you work as a team.

And the last event in Scrum is planning. In this event we don't distribute tasks, we inspect our parameters. What capacity do we have, how does that relate to our speed of delivery and how much work can we take on? Is there any unplanned work we need to take into account? What stuff are we not going to do? Can we finish all the work and be predictable? So we inspect the workload, we adapt our forecast and we publish our sprint backlog to make it transparent.

We've talked about four Scrum events and how they relate to the pillars. Transparency is needed to build trust, inspection is needed to learn and adaptation is needed to improve.

Tuesday, 7 December 2021

Working in an agile team

How does working in an agile team differ from working in a traditional team? Working in an agile team for the first time can be very challenging and you might rethink the way you work.

The first rule of forming an agile team is that it is multidisciplinary. An agile team should be able to do all the work, but not everybody in the team has to be able to perform all tasks. The second rule is a clear product vision and purpose: what needs to be built and maintained, and why. And the third rule is: don't meddle. Let the team decide and give room for mistakes. Help the team get better and never punish.

Traditional roles are fading in a team. A programmer is learning to test, an operator is learning to code and a tester is learning to develop. All roles become less important and the knowledge is shared. Local heroes are replaced by team players who support each other in getting the job done.

One of the most important behaviors of an agile team is working in pairs. This makes sharing of knowledge possible, facilitates reviewing and saves time. While this sounds counterintuitive, results show that lead time and cycle time for changes drop significantly when people pair up. There is also a lower defect rate, less technical debt and more customer satisfaction.

Over time the roles are becoming responsibilities. A multidisciplinary team consists of engineers who are responsible for a part of the development and maintenance of a product. They are responsible for the piece of knowledge they bring to the team. They are expected to keep developing that knowledge and share it with the team and similar interest groups within the company. It is not really important anymore what exactly the job description of that person is. When they belong to the team, they become engineers that want to build excellent products and services for their customers.

An agile team is all about customer value. They engage in customer relations and find the best way to improve customer value. Work does not come from managers anymore but directly from customers. An agile team is not part of any project, it is part of an organisation that builds stuff for customers. In a big enterprise, long-term vision and goals are shared by leaders, and teams use them to focus their own goals on a quarterly basis and in sprints. They organise the work they receive from customers around these goals.

Because an agile team is responsible for the work they do, they are multidisciplinary; because they need to serve the customer best, they have autonomy; and because they need to know where to go, they need vision.

Tuesday, 23 November 2021

How to fail without fear?

Failing in production is bad. There have been many solutions, but none guarantees a 100% success rate. There is always a chance things will go bad when we deploy to production. What can we do to mitigate this risk? Most of the solutions tried over the years are technical ones. Let's have a look at the different solutions we have come up with and find out how we can minimize risk when we deploy.

Fix in production

The first thing we did when something failed was to fix it as soon as possible; most of the time this was in a production environment. Although we can bring our service back up really quickly this way, the uncertainty of the deployment makes us wary of doing it more often than necessary or on any regular basis.

DTAP

In order to tackle all the problems we encountered with fixes in production, we introduced different environments to test every possible situation. I have seen up to seven different environments in one situation; just imagine the cost. The result is that a lot of environments are doing absolutely nothing most of the time.

Next thing we see here is that not all the environments are exactly the same. There is always a minor difference in those environments that can bite us in the last step to production. I've seen different size clusters, different machine configurations, different firewalls and even different data connections.

The strength of multiple environments that are dedicated to one phase gets nullified by the increase in maintenance, the extended process and the cost.

Cloud

In order to solve the DTAP-trap we can spin up environments on demand to test. This means we don't need idle DTAP stations. We can actually do with one instance of a cloud platform. Just spin up an environment, run the necessary tests and then kill the instance. We get billed by the time we use the system, the system is identical to production and our process is smaller.

Pipeline

In order to facilitate a solid process of testing and deploying it is mandatory to use an automated pipeline. A pipeline can be seen as all the steps that are needed from development to production within one automated process. When we hook this up to the cloud and make it as fast as possible, we can almost mimic fixes in production. And because we can implement safeguards throughout the whole process we can be almost sure that our stuff works.
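The idea of a pipeline with safeguards can be sketched in Python as an ordered list of steps that stops at the first failure, so that a broken change never reaches the production step. The step names and the run_pipeline function are hypothetical; a real pipeline would run in a CI/CD tool and call actual build, test and deploy commands instead of the stand-in lambdas used here.

```python
def run_pipeline(steps):
    """Run steps in order; stop at the first failing safeguard.

    Returns the names of the completed steps and the name of the
    failed step (or None when every step succeeded).
    """
    completed = []
    for name, step in steps:
        if not step():
            return completed, name
        completed.append(name)
    return completed, None


# Hypothetical steps; each callable returns True on success.
steps = [
    ("build", lambda: True),
    ("unit tests", lambda: True),
    ("deploy to temporary environment", lambda: True),
    ("smoke tests", lambda: False),  # simulated failure
    ("deploy to production", lambda: True),
]

completed, failed = run_pipeline(steps)
if failed:
    print(f"stopped at '{failed}', production untouched")
else:
    print("all steps passed, change is in production")
```

Because every step is a gate, the faster the steps run, the closer this gets to the speed of a fix in production without its uncertainty.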

Experimentation

In order to fail without fear we need to move to pipelines and cloud solutions (either on or off premises) and we need to feel relaxed about experimentation. Experimentation early in the process gives us the opportunity to learn new methods, processes and technical solutions. The more experience we get, the smaller the chance that anything will fail.

Conclusion

In order to be sure we deploy safely and with the lowest chance of failure we need to focus on cloud-like delivery through automated pipelines.