Product Direction - Monitor

This is the product direction for Monitor. If you'd like to discuss this direction directly with the product managers for Monitor, feel free to reach out to Dov Hershkovitch (GitLab, Email), Sarah Waldner (GitLab, Email, Zoom call) or Kevin Chu (GitLab, Email, Zoom call).

Overview

The Monitor stage comes after you've configured your production infrastructure and deployed your application to it.

  1. The Monitor stage is part of the verification and release process - immediate performance validation helps to ensure your service(s) maintain the expected service-level objectives (SLOs) for your users.
  2. The Monitor stage is an observability platform. Observability is the ability to infer internal states of a system based on the system’s external outputs. Whether there are known ways to understand the total health of your systems, or your complex microservices system is full of unknowns, we want you to be able to export your system's telemetry to GitLab and use it to debug and diagnose any potential problem.
  3. The Monitor stage helps you respond when things go wrong. It enables the aggregation of errors and alerts to identify problems and to find improvements. The Monitor stage also enables responders to streamline incident response, so production issues are less frequent and severe.
  4. The Monitor stage also provides user feedback. Understanding how users experience your product and how they actually use it is critical to making the right improvements.

Mission

The mission of the GitLab Monitor stage is to provide feedback that decreases the frequency and severity of incidents and improves operational and product performance.

The categories within the Monitor stage fit together to support the mission in the following way:

stateDiagram
  Development --> Monitor: Code Deploy
  state Monitor {
    s1 --> s2: Daily Operations
    s2 --> s3: Incident
    s3 --> s4: Resolution
    s2 --> s4
    s1: Verification
    s1: Metrics
    s1: DEM (Synthetics)
    s1: DEM (Web Performance Monitoring)
    s2: Observability
    s2: Metrics
    s2: Traces
    s2: Logs
    s2: Errors
    s3: Response
    s3: Incident Management
    s3: Observability
    s4: Feedback
    s4: DEM (Real User Monitoring)
    s4: Product Analytics
  }
  Monitor --> Development: Continuous Improvement

Landscape

The Monitor stage directly competes in several markets, including Application Performance Monitoring (APM), Log Management, Infrastructure Monitoring, IT Service Management (ITSM), Digital Experience Management (DEM), and Product Analytics. The total addressable market for the Monitor stage was already more than $1.5 billion in 2018 and is expected to grow as businesses continue to shift to digital.

All of these markets are well-established and crowded. However, they are also being disrupted by the underlying technologies used. The shift to cloud, containers, and microservices architectures changed users' expectations, and many existing vendors have struggled to keep pace. Successful vendors, such as market leader Datadog, have leveraged a platform strategy to expand their markets, and even to expand into other stages within DevOps.

The changes in the market have also revealed opportunities that new entrants into this stage, like GitLab, can take advantage of. Specifically, the Ops section opportunities worth re-emphasizing are:

Vision

In two years' time, the Monitor stage categories of observability, incident management, and product feedback are the default choice for cloud-native teams using GitLab by being complete, cost-effective, and simple to set up and operate, enabling continuous improvement.

GitLab is uniquely qualified to deliver on this bold and ambitious vision because:

  1. GitLab is a complete DevOps tool that is connected across the DevOps stages. Being one tool makes the circular DevOps workflow, and feedback, seamless and achievable.
  2. The Monitor stage is pursuing a strategy differentiated from other observability vendors: we do not follow a usage-based business model that charges for processing and storage of observability data. Instead, we lean on powerful open source software, such as Prometheus and OpenTelemetry, along with commodity cloud services to enable customers to set up and operate Monitor stage observability solutions effectively. We will be successful because we are well-practiced in integrating different parts of the tool chain together.
  3. Going cloud-native is a disruption to operations as usual. Cloud-native systems are constantly changing, ephemeral, and complex. As more and more companies adopt cloud-native, GitLab can create a well-integrated central control plane that enables broad adoption by building on top of the tools that cloud-native teams are already familiar with and are using.

A trade-off in our approach is that we are explicitly not striving to be a fully turn-key experience that can be used to monitor all applications, particularly legacy applications. Wholesale removing an existing monitoring solution is painful, and a land-and-expand strategy is prudent here. As a customer recently explained, "Every greenfield application that we can deploy with your monitoring tools saves us money on New Relic licenses."

As this stage matures, we will begin to shift our attention and compete more directly with incumbent players as a holistic Monitoring solution for modern applications.

3 Year Strategy

Building on our 2-year vision statement, our 3-year goal is to have built an integrated package of observability and operations tools that can displace today's front-runner in modern observability, Datadog, and compete in all Monitor categories. We'll do that by focusing on the four core workflows of Instrument, Triage, Resolve, and Improve.

The following links describe our strategy for each individual workflow:

Pricing

Monitor is a critical component for all software development and operations. The Monitor stage's tier strategy will be broken down by workflow as described below.

Core/Free

To execute our land-and-expand strategy and to receive as much feedback as possible from our potential user base, Core contains the vast majority of the Monitor features, including metrics, logs, incident management, traces, and error management.

Limits:

Starter/Bronze

Upcoming Starter Monitor functionality includes:

Premium/Silver

Upcoming Premium Monitor functionality includes:

Ultimate/Gold

Upcoming Ultimate Monitor functionality includes:

What's next

From 2020-05 through 2020-07, the following are the goals we are pursuing within the Monitor stage.

  1. The Monitor::Health group is adding the Alert Management category so that customers can centralize all of the alerts from various systems within GitLab.
  2. The Monitor::APM group wants to enable every GitLab user to explore and triage issues for their Kubernetes-based application starting from a GitLab metrics dashboard.
    • Key Result 1: All public GitLab.com dashboards use GitLab dashboards by the end of July
    • Key Result 2: 20% increase in the APM north star metric compared to the previous 3 months

The quarterly goals fit within the larger overarching objectives of the Monitor stage described below.

First, we plan to provide a streamlined triage experience that allows our users to quickly identify and effectively troubleshoot an application problem, as described in the following flow:

graph TB;
  A[Alerts] -->|Embedded Metric Chart in Incident|B
  B[Metrics] -->|Timespan Log Drilldown|C
  C[Logs] -->|TraceID Search|D[Traces]

Detailed information can be found in the triage to minimal epic.
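
To make the last two hops of the flow above concrete, here is a minimal sketch of a timespan log drilldown followed by a trace ID lookup. It is an illustration only, not GitLab's implementation: the log record shape (`time`, `message`, `trace_id`), the function names, and the sample data are all assumptions made for this example.

```python
from datetime import datetime, timedelta

# Hypothetical log records; the field names are assumptions for this sketch.
ALERT_TIME = datetime(2020, 6, 1, 12, 0)
LOGS = [
    {"time": datetime(2020, 6, 1, 11, 50), "message": "healthy", "trace_id": None},
    {"time": datetime(2020, 6, 1, 11, 58), "message": "slow query", "trace_id": "abc"},
    {"time": datetime(2020, 6, 1, 12, 3), "message": "timeout", "trace_id": "abc"},
    {"time": datetime(2020, 6, 1, 12, 4), "message": "500 error", "trace_id": "def"},
    {"time": datetime(2020, 6, 1, 12, 30), "message": "recovered", "trace_id": "zzz"},
]


def logs_in_window(logs, center, half_width):
    """Return log entries whose timestamp lies within center +/- half_width."""
    start, end = center - half_width, center + half_width
    return [entry for entry in logs if start <= entry["time"] <= end]


def trace_ids(entries):
    """Collect distinct, non-empty trace IDs in first-seen order."""
    seen = []
    for entry in entries:
        trace_id = entry.get("trace_id")
        if trace_id and trace_id not in seen:
            seen.append(trace_id)
    return seen


if __name__ == "__main__":
    # Drill down from the alert time to nearby logs, then to candidate traces.
    window = logs_in_window(LOGS, ALERT_TIME, timedelta(minutes=5))
    print([entry["message"] for entry in window])
    print(trace_ids(window))
```

The point of the sketch is that each hop narrows the search: the alert supplies a time, the time selects logs, and the logs supply the trace IDs to look up next.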

Second, we plan to dogfood our current capabilities. Monitoring and observability solutions, by nature of what they are, have a high bar to meet before adoption. By continuing to improve the triage workflow, we will at the same time enable our GitLab teammates to use GitLab Monitor more fully.

You can see our entire public backlog for Monitor at this link; filtering by labels or milestones will allow you to explore. If you find something you're interested in, you're encouraged to jump into the conversation and participate. At GitLab, everyone can contribute!

Performance Indicators (PIs)

Our Key Performance Indicator for the Monitor stage is the Monitor SMAU (stage monthly active users).

Monitor SMAU is determined by tracking how users configure, interact with, and view the features contained within the stage. The following features are considered:

Configure | Interact | View
Install Prometheus | Add/Update/Delete Metric Chart | View Metrics Dashboard
Enable external Prometheus instance integration | Download CSV data from a Metric chart | View Kubernetes pod logs
Enable Jaeger for Tracing | Generate a link to a Metric chart | View Environments
Enable Sentry integration for Error Tracking | Add/remove an alert | View Tracing
Enable auto-creation of issues on alerts | Change the environment when looking at pod logs | View operations settings
Enable Generic Alert endpoint | Select issue template for auto-creation | View Prometheus Integration page
Enable email notifications for auto-creation of issues | Use /zoom and /remove_zoom quick actions | View error list
 | Click on metrics dashboard links in issues |
 | Click View in Sentry button in errors list |

See the corresponding Periscope dashboard (internal).
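
As a rough illustration of how a monthly-active-users figure like SMAU can be derived from tracked actions, the sketch below counts distinct users per calendar month who performed at least one tracked action. The event names, the record layout, and the sample data are hypothetical and chosen only for this example; the actual SMAU pipeline is not described on this page.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event names standing in for the tracked features listed above.
TRACKED_EVENTS = {"install_prometheus", "view_metrics_dashboard", "view_pod_logs"}

# Hypothetical usage records: (user_id, event_name, day).
EVENTS = [
    (1, "view_metrics_dashboard", date(2020, 6, 1)),
    (1, "view_pod_logs", date(2020, 6, 15)),
    (2, "install_prometheus", date(2020, 6, 20)),
    (3, "unrelated_event", date(2020, 6, 21)),
    (2, "view_metrics_dashboard", date(2020, 7, 2)),
]


def monthly_active_users(events, tracked):
    """Count distinct users per (year, month) with at least one tracked event."""
    users_by_month = defaultdict(set)
    for user_id, name, day in events:
        if name in tracked:
            users_by_month[(day.year, day.month)].add(user_id)
    return {month: len(users) for month, users in users_by_month.items()}


if __name__ == "__main__":
    print(monthly_active_users(EVENTS, TRACKED_EVENTS))
```

A user who performs several tracked actions in the same month is counted once, which is what separates an active-user count from a raw event count.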

Workflows

There are a few workflows that are critical to our users in this stage.

Each of these workflows has a designated level of maturity; you can read more about our category maturity model to help you decide which categories you want to start using and when.

Monitoring - Instrument

This workflow is planned, but not yet available.
Direction

Monitoring - Triage

Triage starts with the highest-level alert, uses preconfigured dashboards to review relevant metrics, and enables ad-hoc visualization and immediate drill-down from time-sliced metrics into logs and traces on the same screen. This workflow is planned, but not yet available.

Direction

Monitoring - Resolve

This workflow is planned, but not yet available.
Documentation • Direction

Monitoring - Improve

This workflow is planned, but not yet available.
Direction

Categories

There are a few product categories that are critical for success here; each one is intended to represent what you might find as an entire product out in the market. We want our single application to solve the important problems solved by other tools in this space - if you see an opportunity where we can deliver a specific solution that would be enough for you to switch over to GitLab, please reach out to the PM for this stage and let us know.

Each of these categories has a designated level of maturity; you can read more about our category maturity model to help you decide which categories you want to start using and when.

Metrics

GitLab collects and displays performance metrics for deployed apps, leveraging Prometheus. Developers can determine the impact of a merge and keep an eye on their production systems, without leaving GitLab. This category is at the "viable" level of maturity.

Priority: high • Documentation • Direction

Alert Management

Consolidate all of your IT alerts in GitLab. Quickly triage and investigate problems by correlating alerts to relevant metrics, logs, traces, and errors. Elevate the critical ones to incidents for speedy resolution. This category is at the "minimal" level of maturity.

Priority: high • Documentation • Direction

Incident Management

Track incidents within GitLab, providing a consolidated location to understand the who, what, when, and where of the incident. Define service level objectives and error budgets, to achieve the desired balance of velocity and stability. This category is at the "viable" level of maturity.

Priority: high • Documentation • Direction
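
For readers unfamiliar with the terms, the relationship between a service level objective and its error budget can be sketched as follows. This uses the common request-based definition (budget = (1 - SLO target) x total requests); the function names and numbers are assumptions for illustration and do not describe a GitLab feature or API.

```python
def error_budget(slo_target, total_requests):
    """Number of failed requests an SLO tolerates over a period.

    slo_target is a fraction such as 0.999 (a 99.9% success objective).
    """
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in (0, 1]")
    return (1.0 - slo_target) * total_requests


def budget_remaining(slo_target, total_requests, failed_requests):
    """Remaining error budget; a negative value means the SLO was missed."""
    return error_budget(slo_target, total_requests) - failed_requests


if __name__ == "__main__":
    # Hypothetical month: 1,000,000 requests against a 99.9% objective.
    print(round(error_budget(0.999, 1_000_000)))
    print(round(budget_remaining(0.999, 1_000_000, 250)))
```

A team with budget left can afford to ship riskier changes, while a team that has exhausted it would typically slow down and prioritize stability, which is the velocity and stability balance described above.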

Logging

GitLab makes it easy to view the logs distributed across multiple pods and services using log aggregation with Elastic Stack. Once Elastic Stack is enabled, you can view your aggregated Kubernetes logs across multiple services and infrastructure, go back in time, scroll through results continuously, and search through your application logs from within the GitLab UI itself. This category is at the "viable" level of maturity.

Priority: medium • Documentation • Direction

Tracing

Tracing provides insight into the performance and health of a deployed application, tracking each function or microservice which handles a given request. This makes it easy to understand the end-to-end flow of a request, regardless of whether you are using a monolithic or distributed system. This category is at the "minimal" level of maturity.

Priority: medium • Documentation • Direction

GitLab Self-Monitoring

Self-managed GitLab instances come out of the box with great observability tools, reducing the time and effort required to maintain a GitLab instance.

Priority: low • Documentation • Direction

Error Tracking

Error tracking allows developers to easily discover and view the errors that their application may be generating. By surfacing error information where the code is being developed, efficiency and awareness can be increased. This category is at the "viable" level of maturity.

Priority: low • Documentation • Direction

Product Analytics

This category is planned, but not yet available.
Priority: medium • Direction

Upcoming Releases

13.3 (2020-08-22)

13.4 (2020-09-22)

13.5 (2020-10-22)

Other Interesting Items

There are a number of other issues that we've identified as interesting and are considering, but have not yet scheduled by setting a milestone for delivery. Some are good ideas we want to do, but don't yet know when; some we may never get around to; some may be replaced by another idea; and some are just waiting for that right spark of inspiration to turn them into something special.

Remember that at GitLab, everyone can contribute! This is one of our fundamental values and something we truly believe in, so if you have feedback on any of these items you're more than welcome to jump into the discussion. Our vision and product are truly something we build together!
