I've spent my entire career in Operations and Systems, and I've been labeled many different things, from Sysadmin to Ops Engineer and SRE. But looking back, I have always been a DevOps engineer. Even before DevOps was a buzzword.

The DevOps movement emerged from the frustrations of isolated teams of developers and operations members often working against each other. The promise of DevOps—faster deployments, better collaboration, and more automation—resonated deeply with professionals.

DevOps has since attracted a lot of interest. For many, it is the lure of a challenging role. For others, it's the opportunity to make even more of an impact. Whatever the attraction, it's a great chance to be at the forefront of technology.

All this got me thinking: What does it really take to be a DevOps Engineer?

Obviously, you need to know how systems work, at least one cloud platform, CI/CD, a bit of containerization, a lot of automation... it’s a long list, and I haven't even gotten to specific technologies!

But that is all stuff you can learn. What about mindset and soft skills? Do you have what it takes to be a great DevOps Engineer?

[Infographic: What Makes a Great DevOps Engineer]

Clear Communication

There is an example I keep coming back to from early in my career. I had been warned that a client wasn't very technical, so if asked anything, I should use very simple terms. So when asked to introduce myself, I told him I was in charge of all the infrastructure and pipelines the project runs on. He sat still for a minute and then said, “So you are in charge of the building and plumbing?” It took me a second to understand what he meant, and I had to explain to him, “No, I am responsible for the software side of the infrastructure, the servers that the entire project runs on.” He started laughing and told me he had never thought of servers as infrastructure before.

That lesson has stayed with me. It reminds me that even if someone has been in the industry for years, they may not understand the jargon we take for granted.

We should be able to explain solutions and decisions to a non-technical audience clearly. When talking to technical people, we should be concise and accurate, lest they feel we are wasting their time.

I have worked on countless projects with only one or two DevOps engineers alongside 30 or more other people, including developers, testers, security engineers, and managers. Among DevOps members, it might make sense to say, “The ASG in region 1 is not responding.” But to developers, you might have to say, “Automatic scaling is not working in region 1.” And to the manager, you might need to explain, “We are unable to automatically add more servers to our existing cluster at the moment.” Adapting the message to the audience prevents misunderstanding and speeds up issue resolution.

Another Key Skill Needed? Active Listening. 

Often, you will find yourself surrounded by all sorts of people explaining their issues to you. You should be able to cut through the noise and identify their actual pain points. 

Recently, a client requested that we ensure 100% availability for their deployment cluster. I found this a bit strange: achieving that kind of uptime is costly, and the only advantage would be the ability to deploy new changes at will. So I dug deeper. It turned out that the client had recently faced a situation where their cluster was corrupted, and they had to spend hours rebuilding all the jobs that ran on it. What they really needed was a reliable way to back up their configurations. By identifying the actual need, we were able to address it satisfactorily while keeping costs under control.

Problem-Solving Under Pressure

One of the most commonly reported (and dreaded) issues in DevOps is, “The site is slow.” If it is a standalone application, diagnosing the problem is simpler: you can inspect its components directly. But what if the site relies on multiple microservices? Where do you start?

Tracing tools make this a little easier. But what if you don't have one?

You need to understand the traffic, backtrace it, and check each component; look at all the available metrics and logs, identify anomalies, and based on that, identify the root issue. Is it a faulty disk on a server? Is your database choking with too many connections?
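
As a rough illustration of the "identify anomalies" step, here is a minimal Python sketch. It assumes you have already collected a few baseline latency samples per component plus one current reading; the component names, the numbers, and the `find_anomalies` helper are all hypothetical, and a simple standard-deviation check is only one of many ways to spot an outlier.

```python
from statistics import mean, stdev


def find_anomalies(baseline, current, threshold=3.0):
    """Return (component, score) pairs whose current latency sits more than
    `threshold` standard deviations above the baseline mean, worst first."""
    anomalies = []
    for component, samples in baseline.items():
        # Skip components with no current reading or too little history.
        if component not in current or len(samples) < 2:
            continue
        mu = mean(samples)
        sigma = stdev(samples)
        if sigma == 0:
            # Perfectly flat baseline: any increase at all is suspicious.
            score = float("inf") if current[component] > mu else 0.0
        else:
            score = (current[component] - mu) / sigma
        if score > threshold:
            anomalies.append((component, score))
    return sorted(anomalies, key=lambda item: item[1], reverse=True)


# Hypothetical latency samples in milliseconds.
baseline = {
    "web": [100, 110, 90, 105, 95],
    "db": [20, 22, 18, 21, 19],
    "cache": [2, 2, 2, 2, 2],
}
current = {"web": 104, "db": 95, "cache": 2}

suspects = find_anomalies(baseline, current)
print([name for name, _ in suspects])
```

With these made-up numbers, only the database stands out, which tells you where to look first. It does not tell you why, so the logs and connection counts still need a human eye.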

Knowing where to look is important!

You also get pulled into scenarios that demand quick thinking and debugging: What is the impact of a particular technical issue? Do you have time to wait for the actual fix, or does it make sense to apply a band-aid patch for now? What is the impact of applying this solution? And so on. 

I still remember when the Log4Shell vulnerability was discovered in Log4j, a widely used Java logging library. By exploiting it, attackers could remotely execute code on systems running a vulnerable version.

We had to act fast.

We assessed the severity of the vulnerability, identified the systems at risk, and developed a mitigation plan. That meant emergency hotfixes, temporary firewall rules to block potential attack vectors in applications we couldn't patch right away, and coordination with development teams to update the affected applications.
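
To give a feel for the "identify the systems at risk" step, here is a small Python sketch, not the process we actually followed. It assumes you already have an inventory mapping each host to the jar files found on it; the host names, the inventory, and the `hosts_at_risk` helper are hypothetical. The `min_safe` default is an illustrative placeholder, so check the Apache Log4j security page for the version that applies to your Java runtime.

```python
import re

# Matches names like "log4j-core-2.14.1.jar" or "log4j-core-2.0-beta9.jar".
JAR_PATTERN = re.compile(r"log4j-core-(\d+)\.(\d+)(?:\.(\d+))?")


def parse_version(jar_name):
    """Extract (major, minor, patch) from a log4j-core jar name, or None."""
    match = JAR_PATTERN.search(jar_name)
    if match is None:
        return None
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def hosts_at_risk(inventory, min_safe=(2, 17, 1)):
    """Return {host: [jar, ...]} for log4j-core jars older than `min_safe`.

    `min_safe` is an assumption for illustration, not authoritative guidance.
    """
    at_risk = {}
    for host, jars in inventory.items():
        flagged = []
        for jar in jars:
            version = parse_version(jar)
            if version is not None and version < min_safe:
                flagged.append(jar)
        if flagged:
            at_risk[host] = flagged
    return at_risk


# Hypothetical inventory collected from three servers.
inventory = {
    "app-1": ["log4j-core-2.14.1.jar", "guava-31.0.jar"],
    "app-2": ["log4j-core-2.17.1.jar"],
    "app-3": ["slf4j-api-1.7.32.jar"],
}

print(hosts_at_risk(inventory))
```

Matching on file names is quick but incomplete: it will miss copies of the library bundled inside fat jars or renamed archives, so treat the result as a starting list rather than a clean bill of health.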

DevOps engineers must make these critical decisions, balancing the need to mitigate the immediate risk with the potential disruption to business. This requires careful risk assessment and decision-making under pressure.

Continuous Learning

Looking back at how things were when I first started, many of the must-have tools we had back then are hardly used anymore.

Innovations have led to myriad new technologies that we've had to learn and become proficient in quickly. Public cloud, Software as a Service, Infrastructure as Code, pull-based systems, service discovery, Docker, tracing, and Kubernetes are just a few examples.

It is impossible to find someone who knows everything needed for a project. Instead, you should always pick the person who can learn quickly and get to work. One of my old colleagues used to say that a DevOps Engineer is the proverbial Jack of all trades. You are lucky if you can actually work on the same tool long enough to master it.

The other part of the problem is that since many technologies are relatively new, they too are constantly evolving. It is a daunting task to stay updated, especially when you consider that a better, more innovative solution might be just around the corner.

Continuous learning is like running a marathon. Focus on building a strong foundation of core principles and gradually expanding your knowledge based on your interests and career goals. The experience you gain through this is invaluable.

I spent the first few years of my career working with Varnish, a popular HTTP caching tool back then. Now that cloud platforms build caching into services like AWS CloudFront or Google Cloud CDN, the need for dedicated Varnish deployments has diminished. But this experience from 10 years ago turned out to be useful recently when I worked for a client who used Akamai as a caching solution. The knowledge I had gained in caching mechanisms and performance optimization helped me troubleshoot issues related to their caching.

When recruiting for DevOps, we don’t insist that you know every tool and technology. Instead, we look for people who can learn fast and apply that knowledge as they go.

Empathy and Collaborative Spirit

An often overlooked fact is that DevOps is not just a technical role. DevOps was created to foster interaction and collaboration with other team members, many of whom may not fully understand your technical constraints, just as you may not grasp their priorities.

That's why it's important to engage with developers instead of dismissing their requests outright. By getting to know their underlying need, you might be able to create a compromise that both suits their requirement and aligns with the project goals.

It's equally important to explain your constraints, so they get the full picture. When others understand the "why" behind your decision, they are less likely to see you as being obstructive.

Many times, people have come up to me and said, "Spin up two servers."

Sure, with the cloud, I can launch as many servers as they wish. But I might have to follow fixed approval flows. I might need to code the server into my existing Infrastructure as Code codebase. I might have to get my Pull Request approved. And I will definitely have to ensure compliance and security protocols are maintained.

So, rather than tell them, "Sure, come back in two days," I try to understand the actual requirement. Is it urgent? Can it be created in our sandbox environment? Do we need additional approvals? And when I finally make a decision, I discuss it with the person who raised the request, explaining my thought process and what is needed.

Recently, our dev team requested that all instances used by a particular application be terminated on February 1. I was taken aback and asked why. It turned out the application was being sunset in a phased manner. I raised a concern: what if they need to roll back because some other application is still using one of its APIs?

We sat and discussed this, and I was able to convince them that it would be better to simply redirect traffic to a sunset message. This way, we can monitor the logs for usage before shutting things down.
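
The idea can be sketched in a few lines of Python using only the standard library. This is an illustration under assumptions rather than what we deployed: the message wording, the choice of HTTP 410 Gone, the in-memory `usage` counter, and the function names are all hypothetical, and in practice the redirect would more likely live in a load balancer or proxy rule.

```python
import json
from collections import Counter
from http.server import BaseHTTPRequestHandler, HTTPServer

SUNSET_MESSAGE = "This application has been retired."  # hypothetical wording

# Counts hits per (client address, path) so we can see who still calls us.
usage = Counter()


def sunset_response(client, path):
    """Record the caller and build the sunset reply (status code, body)."""
    usage[(client, path)] += 1
    body = json.dumps({"error": "gone", "message": SUNSET_MESSAGE}).encode()
    return 410, body


class SunsetHandler(BaseHTTPRequestHandler):
    """Answers every request with the sunset message instead of the old app."""

    def _respond(self):
        status, body = sunset_response(self.client_address[0], self.path)
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    do_GET = do_POST = do_PUT = do_DELETE = _respond


def remaining_callers():
    """List (client, path, hits) tuples, busiest first."""
    return [(client, path, hits) for (client, path), hits in usage.most_common()]


def serve(port=8080):
    """Run the placeholder service; call this where the old app used to listen."""
    HTTPServer(("127.0.0.1", port), SunsetHandler).serve_forever()
```

Once `remaining_callers()` stays empty for a comfortable period, you have evidence that nothing depends on the old APIs, and terminating the instances becomes a much safer call.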

Often, when you take the time to understand the reasoning behind a request, you can find a better way to address the need.

In Summary

I've tried to list the non-technical skills that I feel are most important to being a great DevOps engineer. I’m sure there are many more. But if you ask me, a DevOps Engineer must be able to communicate clearly, have a keen analytical mind, and be willing to continuously learn.

DevOps is fundamentally about creating an environment where people from different teams can work together well. That takes empathy, perspective-taking, and a genuine willingness to support the team. A great DevOps Engineer brings not just strong technical skills but also these human qualities, enabling a healthier team dynamic and better outcomes for the project.

Senior Architect, DevOps