November 18, 2024

What’s a “Service Owner” and how can they improve application reliability?

By Will Searle, Lead Sales Engineer, Causely

Assuring application reliability is a persistent challenge faced by every IT organization, complicated by rapid technology evolution and the increased emphasis on lean engineering.  One trend among progressive companies is to designate a “Service Owner” who is responsible for making sure applications meet their objectives for uptime and customer satisfaction.

In this post, we’ll explain what it means to be a Service Owner, outline key responsibilities associated with the role, and offer advice for companies looking to build a culture of service ownership.

How the Service Owner came to be

When the DevOps movement took off around 2010, it promised to fix the issues with fragmented teams and inefficient software lifecycle management that had been hindering application performance and reliability for years. This new wave of IT fostered cross-team collaboration, communication, and transparency as a way to accelerate software delivery and deliver higher quality of service (QoS).

But something was still missing: accountability across the end-to-end application lifecycle, from development to roadmap design to customer satisfaction. Hence the emergence of the Service Owner.

So, what is a Service Owner?

Many IT teams today, especially those using microservice architectures, employ multiple Service Owners. The exact definition varies slightly from company to company, but most would define a Service Owner (SO) as the person who is responsible for making sure the application or service meets its designated service level objectives (SLOs). Implicit in this definition: Service Owners are accountable for the end-to-end lifecycle of applications, from development through production monitoring and operational performance, to ensure uptime and customer satisfaction.

For a little more context, here’s how ITIL defines the role:

  • The Service Owner is responsible for delivering a particular service within the agreed service levels.
  • Typically, the Service Owner acts as the counterpart of the Service Level Manager when negotiating Operational Level Agreements (OLAs).
  • Often, this role will lead a team of technical specialists or an internal support unit.

Usually, Service Owners are aligned to a specific product line for the business. Their formal title may be Product Owner or Engineering Manager. For example, imagine a healthcare tech company that sells solutions to businesses and hospitals.  They have 5 different products within their offering suite:

  • AI platform
  • Online therapy product
  • Analytics product
  • Telehealth service
  • Financial management product

A Service Owner would be assigned to each of these product lines and assume full responsibility for its application lifecycle management and reliability engineering.  A Service Owner role can also be broken down by customer, by services within the products, or even by mobile vs desktop applications.

Responsibilities of the Service Owner

Service Owners’ responsibilities tend to vary slightly from company to company but they often include:

  1. Design and Roadmap: Writing the code and overseeing the design and implementation of the service, ensuring it functions properly and is scalable. This includes managing the ongoing balance and prioritization of releasing new features vs. improving the reliability of existing functionality.
  2. Maintenance and Support: Ensuring that the service is properly maintained, updated, and supported over its lifecycle.  This is where creating and adhering to SLOs comes into play.
  3. Performance Monitoring: Monitoring the performance and reliability of the service, by implementing metrics and logging to track its health and flag when things break.  Performance Monitoring also includes implementing proactive monitoring to prevent downtime.
  4. Collaboration: Working closely with other teams, such as Product Management, Sales, Platform Engineering, etc. to align the service with the business goals.
  5. Documentation: Creating and maintaining comprehensive documentation for the service, including things like APIs, user guides, and architecture diagrams.
  6. Governance and Compliance: Ensuring that the service adheres to relevant policies, standards, and regulatory requirements.
  7. Stakeholder Communication: Acting as a point of contact for stakeholders, addressing their needs and concerns regarding the service.
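To make the SLO responsibility a little more concrete, here is a minimal sketch of how an uptime objective translates into a downtime allowance. The class name, the service name, the 99.9% target, and the 30-day window are all illustrative assumptions, not figures from any particular company or tool.

```python
from dataclasses import dataclass


@dataclass
class AvailabilitySLO:
    """Hypothetical availability objective for one service."""
    name: str
    target: float        # e.g. 0.999 means 99.9% uptime
    window_minutes: int  # length of the evaluation window

    def allowed_downtime_minutes(self) -> float:
        """Downtime the service can absorb in the window and still meet the target."""
        return self.window_minutes * (1.0 - self.target)

    def remaining_minutes(self, downtime_minutes: float) -> float:
        """Allowance left after observed downtime (negative means the SLO is missed)."""
        return self.allowed_downtime_minutes() - downtime_minutes


# Assumed example: a 99.9% uptime target over a 30-day window.
slo = AvailabilitySLO(name="telehealth-service", target=0.999, window_minutes=30 * 24 * 60)
print(round(slo.allowed_downtime_minutes(), 1))                # about 43.2 minutes
print(round(slo.remaining_minutes(downtime_minutes=12.5), 1))  # about 30.7 minutes left
```

A number like this gives a Service Owner something tangible to weigh when deciding between shipping new features and investing in reliability: the less allowance that remains, the stronger the case for reliability work.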

Service Owners work closely with technical experts such as SREs, DevOps, or Developers to maintain service reliability, though they each have distinct roles and tool preferences:

  • Service Owners typically use collaboration and project management tools like Jira or Asana and monitor high-level metrics on observability dashboards like Grafana.
  • Technical Experts handle incident resolution and service reliability, relying primarily on observability and incident management tools like PagerDuty.

Tools like ServiceNow, Atlassian’s Service Management (OpsGenie, Jira), and PagerDuty’s workflow orchestration attempt to bridge the gap between these two roles by providing a unified space for planning, alerting, diagnosis, and response. This enables Service Owners and technical experts to operate more effectively together, allowing engineering teams to enhance alignment, transparency, and accountability.

Code it, ship it, own it

A Service Owner’s job goes beyond writing code and fixing bugs. They are responsible for their applications and services after they’ve been shipped to production. When Service Owners own the lifecycle, organizations see improved QoS and faster MTTR.

If they wrote the code, they know how to fix it. If something breaks, they are the first responders and take accountability for failures. Because they have deeper knowledge of their service and application, they can diagnose and fix issues faster than anyone else, but they must also be held accountable for downtime. No more finger pointing!

One thing Service Owners MUST do is align themselves with their customers. They must understand customers’ needs and expectations, then design the code, plan the roadmap, and establish SLOs accordingly.

The benefits of this approach? Products directly solve customer pain, and in most cases, services are delivered faster to customers.  It reminds me of a funny meme I saw years ago — it couldn’t be more accurate.  This situation is exactly what Service Owners are preventing!

A six-panel comic showing variations of a tree swing, highlighting differences in user requests, analyst views, design, programming, desires, and final outcome.

Source: Reddit

Building a culture of service ownership

Since service ownership is a culture and not a tool, it needs to grow over time; it can’t happen overnight, no matter how much pressure the business puts on IT.  There are ways to actively foster a shift in mentality, so teams thrive in an environment where they have more responsibility and where better services are being delivered to the customer.

Based on my conversations with leaders in IT, these are some common best practices for building a culture of service ownership.

  1. Promote collaboration: Encourage open communication between teams—development, operations, and business units. Regular cross-functional meetings and collaborative tools can help break down silos faster and foster a shared understanding of service objectives.
  2. Establish a customer-first mentality: As mentioned before, service ownership will not thrive if everyone has different ideas and goals. Establishing a common goal, like focusing on customers’ needs, can align teams. If customer satisfaction is the north star, companies will have more satisfied customers, which means bigger checks 😉. Defining customer-specific SLOs is an excellent way to keep everyone aligned on the mission. SLOs on latency, number of customer tickets/escalations, and uptime are some standard ones I see most Service Owners using.
  3. Embrace failure: Taking accountability and responsibility for something that directly impacts a business can be scary. That’s probably the single biggest reason why most software engineers are hesitant to adopt a service ownership role.  If leadership fosters a culture where failure is seen as progress and not regression, then it becomes more appetizing to developers.  No one wants to lose their job over a silly mistake, but they need to learn from these slips and drive towards a more reliable application architecture.
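For the customer-specific SLOs mentioned in the second practice, a rough sketch of a check on latency and escalations might look like the following. The function names, sample latencies, and thresholds are hypothetical, and the nearest-rank percentile is just one common way to summarize latency; real teams would usually pull these numbers from their observability tooling.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def evaluate_slos(latencies_ms, open_escalations, latency_target_ms, escalation_limit):
    """Compare observed p95 latency and escalation count against assumed targets."""
    p95 = percentile(latencies_ms, 95)
    return {
        "p95_latency_ms": p95,
        "latency_ok": p95 <= latency_target_ms,
        "escalations_ok": open_escalations <= escalation_limit,
    }


# Made-up sample data: one slow request dominates the tail.
report = evaluate_slos(
    latencies_ms=[120, 135, 150, 160, 180, 210, 240, 260, 310, 900],
    open_escalations=2,
    latency_target_ms=400,
    escalation_limit=3,
)
print(report)
```

In this made-up sample, a single slow request is enough to push the p95 past the target even though most requests are fast, which is the kind of customer-visible signal a shared SLO is meant to surface for the whole team.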

Building a culture of service ownership in IT requires a deliberate and consistent approach. By defining roles, fostering collaboration, and empowering teams, IT can create an environment where service ownership is continuously improved. This culture not only enhances QoS but also drives innovation and responsiveness, ultimately benefiting the health of any business.


FAQs

  • What is application lifecycle management? 
    Application lifecycle management is the end-to-end process of developing, building, deploying, and managing software applications over time to ensure consistent and ongoing quality, reliability, and resilience.
  • What is reliability engineering? 
    Reliability engineering is the practice of ensuring that applications, products, or systems function without failure. Reliability engineers focus on proactively identifying potential failures to determine their root cause and mitigation strategies before they happen.
  • What is a Service Owner? 
    A Service Owner is someone who is responsible for meeting agreed service levels. They usually own the overall engineering, management, and governance of the service across its lifecycle.

 
