Where Site Reliability Engineering Overlaps with DevOps

Site reliability engineers (SREs) are constantly balancing priorities. The role is still evolving, but it is well established. Catchpoint’s “2020 SRE Report” surveyed over 600 people who do SRE-type work; 46% said their organization has a dedicated SRE team that is distinct from the teams that handle IT operations and administration. Still, the role is often conflated with DevOps, with 19% of respondents saying the DevOps team handles SRE responsibilities. There is some reason to believe the two functions can be managed as a whole: 41% said SREs and DevOps are part of the same team, while only 26% consider them to be complementary.
While SREs juggle both development and operations responsibilities, more than half spend less than 25% of their time doing development. Almost half (48%) spend a moderate or large amount of time writing software to help with operations, with much of that code automating previously manual tasks. Although it will take many years to make most infrastructure programmable, SREs can be expected to lead the way: 71% of respondents said SREs use infrastructure-as-code.
Source: Catchpoint’s “2020 SRE Report”.
Overall, monitoring and incident management continue to be the most common activities performed by SREs, but 55% said that SREs use application release and deployment management tools. As long as DevOps is the primary owner of application release management, the distinction between the two teams will likely continue. However, this just means that there will be a new area of conflict. Is it self-evident to you whether SREs should focus on monitoring infrastructure or applications?
Q. What tool categories are used by SREs? Source: Catchpoint’s “2020 SRE Report”.
Catchpoint has historically focused on end-user monitoring, where the end-user can either be a customer or an infrastructure service supporting other applications. The generic nature of the “monitoring and alerting” category means that 93% said SREs use these tools while only 55% use observability tools. Defining exactly what an observability tool is can be difficult. The report confirms findings from TNS sponsor Honeycomb’s recent survey on the subject. That study found that the adoption of individual components of an observability stack is common, even if observability as a practice is relatively immature.
An underlying theme in the report is that observability needs to be a holistic practice. From the vendor’s point of view, this means that all services should have an API that plugs into an observability framework to detect outages and performance issues.

Chaos engineering showed up several times in the report. Resiliency checks via practices like chaos engineering take up at least a moderate amount of time for 19% of those who perform SRE tasks. The prevalence of these tools among SREs is actually higher, but the tools are often not at the core of day-to-day activities.
SREs are effective working remotely. Of the 356 people who answered the survey after the imposition of COVID-19 stay-at-home orders, only 14% had needed to be onsite so far. Two-thirds of respondents said SREs are part of the organization’s on-call rotation, which implies that most of the manual work related to this can be performed remotely. Looking at the data collected after the pandemic’s onset, 80% believe their effectiveness at handling incidents was not hurt by the move to at-home work. Only 9% said their sites or apps experienced more incidents during the forced at-home period, but just as many saw a decline in incidents. While traffic and capacity issues may have occurred due to remote work, they did not have a serious impact on customer-facing operations.

But don’t get too excited about remote work. While half said nothing about incident management is more challenging when they are working from home, 28% said escalating to the right teams is more difficult. Without face-to-face communication, some aspects of team communication may need to be readjusted. Although SREs may get more flexibility to work at home in the future, many jobs will still have an onsite or in-office element.

Honeycomb is a sponsor of The New Stack.
Feature image by Hans Braxmeier from Pixabay.
The post Where Site Reliability Engineering Overlaps with DevOps appeared first on The New Stack.


