Integrating Prometheus Alertmanager with All Quiet: A Technical Guide


Published: Wednesday, 29 November 2023

🚨 Master Alert Management: Dive into the technical nuances of integrating Prometheus Alertmanager with All Quiet. Ideal for teams looking to streamline their monitoring and alerting systems.

Welcome to our step-by-step guide on integrating Prometheus Alertmanager with All Quiet! Setting up a Prometheus Alertmanager integration in All Quiet automatically generates a unique webhook URL for you. We'll show you how to point Alertmanager at that webhook so alerts from your monitoring stack flow straight into All Quiet, effortlessly turning them into actionable incidents. You'll also learn how to fine-tune attribute mapping to ensure your incidents are captured accurately. Let's dive in and streamline your incident management process!

Step 1: Create Prometheus Integration on All Quiet

Log in to your All Quiet account.

Create Integration

  1. Click on "Integrations > Inbound" to navigate to the integrations page.
  2. Click on "Create New Integration" to create a new integration.
Create Prometheus Alertmanager Integration
  1. Enter a display name for your integration, e.g. "Prometheus Alertmanager".
  2. Select a team.
  3. Select "Prometheus Alertmanager" as the type.
  4. Click "Create integration".
Configure Integration

Copy Webhook URL

  1. After successfully creating your Prometheus Alertmanager integration, make sure to copy the webhook URL.

Step 2: Configure Prometheus Alertmanager

Once you've set up an integration of type "Prometheus Alertmanager" with All Quiet, the next crucial steps involve configuring your Prometheus and Alertmanager instances. This is essential for ensuring that your monitoring setup can effectively send incidents to the All Quiet webhook. In this part of the guide, we will walk you through simple yet effective configuration examples for both Prometheus and Alertmanager.

Setting Up Prometheus

First, let's start with the Prometheus configuration. Your prometheus.yml should include the necessary scrape configs to monitor your targets. Here’s an example of a basic configuration:

In your prometheus.yml, the configuration should primarily include scrape_configs and alerting details. Below is an example configuration:


      scrape_configs:
        - job_name: 'allquiet.app'
          scrape_interval: 5s
          scheme: https
          metrics_path: /status
          static_configs:
            - targets: ['allquiet.app']

      rule_files:
        - "*.rules"

      alerting:
        alertmanagers:
          - scheme: http
            static_configs:
              - targets: [ 'your-prometheus-alertmanager.yourdomain.com:9093' ]
    

In this configuration, scrape_configs defines the job for scraping metrics from allquiet.app at a frequent interval of every 5 seconds. We're observing our own platform in this example :). The https scheme and the /status metrics path dictate how Prometheus accesses the data.

The rule_files section tells Prometheus to load any alerting rules from files ending with .rules.

The alerting section is crucial for the integration. It specifies that Prometheus should send alerts to an Alertmanager instance located at your-prometheus-alertmanager.yourdomain.com:9093.

With these settings, Prometheus is configured to monitor allquiet.app closely and forward alerts to Alertmanager, which then communicates with the All Quiet platform, ensuring efficient incident management.
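Before wiring up alerting, it can help to confirm that Prometheus is actually scraping the target. Here's a minimal sketch of checking that via Prometheus's HTTP query API (assuming Prometheus listens on localhost:9090, its default port; the helper functions are our own, not part of any library):

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

def up_targets(response: dict) -> dict:
    """Map each scraped instance to its `up` value (1 = healthy, 0 = down)
    from a Prometheus instant-query API response."""
    return {
        sample["metric"].get("instance", "?"): float(sample["value"][1])
        for sample in response["data"]["result"]
    }

def query_up(base_url: str, job: str) -> dict:
    """Run `up{job="..."}` against the Prometheus HTTP API."""
    params = urlencode({"query": f'up{{job="{job}"}}'})
    with urlopen(f"{base_url}/api/v1/query?{params}") as resp:
        return up_targets(json.load(resp))

# Example (requires a running Prometheus with the scrape config above):
# print(query_up("http://localhost:9090", "allquiet.app"))
```

If the job shows up with a value of 1, scraping is working and you can move on to alert rules.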

Setting Up Alert Rules

After configuring the prometheus.yml file, the next step in integrating Prometheus with All Quiet is to set up alert rules. Alert rules in Prometheus define the conditions under which an alert should be fired. Below is a sample alert rule file that demonstrates how to create a rule for monitoring response times.

Here's the alert rule configuration:


        groups:
        - name: allquiet.app
          rules:
          - alert: ResponseTimeSlow
            expr: scrape_duration_seconds{job="allquiet.app"} > 0.1
            for: 5s
            labels:
              severity: critical
            annotations:
              description: "Response time is bad"

This rule is set up under a group named allquiet.app. The rule ResponseTimeSlow (note that Prometheus alert names must be valid metric names, so they cannot contain spaces) triggers an alert if the scrape_duration_seconds for the allquiet.app job exceeds 0.1 seconds, sustained over a period of 5 seconds. This means if the response time of the monitored service goes beyond 100 milliseconds and stays that way for at least 5 seconds, an alert is triggered.
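The effect of the for clause can be illustrated with a small simulation. This is our own sketch of the semantics, not Prometheus code: the alert only starts firing once the expression has been continuously true for the full hold duration, and any dip below the threshold resets the timer.

```python
def fires(samples, threshold=0.1, hold=5):
    """Simulate a rule like `expr > threshold` with `for: <hold>s`.

    `samples` is a list of (timestamp_seconds, value) pairs. Returns the
    timestamp at which the alert would start firing, or None.
    """
    pending_since = None
    for ts, value in samples:
        if value > threshold:
            if pending_since is None:
                pending_since = ts  # condition just became true: pending
            if ts - pending_since >= hold:
                return ts  # held above threshold long enough: firing
        else:
            pending_since = None  # any dip resets the pending timer
    return None

# Slow from t=5 onward -> fires once 5s have elapsed above the threshold.
print(fires([(0, 0.05), (5, 0.2), (10, 0.2), (15, 0.2)]))  # 10
# A brief spike that recovers never fires.
print(fires([(0, 0.05), (5, 0.2), (10, 0.05), (15, 0.2)]))  # None
```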

The labels section classifies the alert's severity as critical, which can be useful for routing and handling the alert. The annotations section provides a descriptive message for the alert, e.g. indicating that the response time of the service is poor. :)

By implementing this alert rule, you can effectively monitor critical performance metrics like response times and ensure that such issues are promptly flagged and communicated to the All Quiet platform for efficient incident management.

Setting Up Alertmanager

The final step in integrating Prometheus Alertmanager with All Quiet is to configure the Alertmanager itself. This configuration ensures that Alertmanager appropriately routes, groups, and sends alerts to the All Quiet platform. Here's how to set up the Alertmanager using the provided YAML configuration:



        route:
          group_wait: 5s
          group_interval: 5s
          repeat_interval: 20s
          receiver: 'allquiet'

        receivers:
          - name: 'allquiet'
            webhook_configs:
              - url: 'https://allquiet.app/api/webhook/71873b0f-fce1-28a1-b3g6-2362ff95123e7'
        
  • The route section defines how alerts are processed and sent to receivers. group_wait sets the time to wait before sending a notification about new alerts that are added to a group of alerts. group_interval sets the interval between sending notifications about the same group of alerts, while repeat_interval controls how long to wait before sending repeat notifications.
  • The receiver parameter within the route is set to 'allquiet'. This tells Alertmanager to use the allquiet receiver for notifications.
  • In the receivers section, a receiver named allquiet is defined. This receiver uses webhook_configs to send alerts to the specified URL, which is the webhook provided by All Quiet in Copy Webhook URL.

By applying this configuration, you ensure that Alertmanager routes alerts to All Quiet efficiently. The alerts are grouped and sent based on the defined intervals, and the webhook URL ensures that these alerts are received by All Quiet for effective incident management. This setup completes the integration process, enabling your monitoring system to communicate seamlessly with All Quiet.
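For debugging, it helps to know roughly what Alertmanager will POST to the All Quiet webhook. Below is a sketch of the payload shape (version 4 of Alertmanager's webhook format); all concrete values are illustrative placeholders, not real data:

```python
import json

# Illustrative shape of an Alertmanager webhook notification (format v4).
# Every concrete value below is a placeholder.
payload = {
    "version": "4",
    "status": "firing",
    "receiver": "allquiet",
    "groupLabels": {"alertname": "ResponseTimeSlow"},
    "commonLabels": {"alertname": "ResponseTimeSlow", "severity": "critical"},
    "commonAnnotations": {"description": "Response time is bad"},
    "externalURL": "http://your-prometheus-alertmanager.yourdomain.com:9093",
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "ResponseTimeSlow", "severity": "critical"},
            "annotations": {"description": "Response time is bad"},
            "startsAt": "2023-11-29T12:00:00Z",
            "endsAt": "0001-01-01T00:00:00Z",
        }
    ],
}

print(json.dumps(payload, indent=2))
```

All Quiet's attribute mapping operates on fields like these, turning the labels and annotations into incident attributes.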

Step 3: Test Your Integration

You're almost done. 🥳 The next steps are merely there to verify that everything's set up correctly!
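If no real alert has fired yet, you can push a synthetic one into Alertmanager by hand via its v2 API; Alertmanager will then forward it to the All Quiet webhook. A minimal sketch, assuming Alertmanager is reachable at localhost:9093 (the alert name and labels are placeholders of our own choosing):

```python
import json
from urllib.request import Request, urlopen

def build_test_alert(alertname="AllQuietIntegrationTest", severity="critical"):
    """One alert in the shape Alertmanager's POST /api/v2/alerts expects."""
    return [{
        "labels": {"alertname": alertname, "severity": severity},
        "annotations": {
            "description": "Synthetic alert to test the All Quiet integration"
        },
    }]

def post_alert(base_url, alerts):
    """POST the alerts to Alertmanager; returns the HTTP status code."""
    req = Request(
        f"{base_url}/api/v2/alerts",
        data=json.dumps(alerts).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(req) as resp:
        return resp.status

# Example (requires a running Alertmanager):
# post_alert("http://localhost:9093", build_test_alert())
```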

Navigate back to All Quiet and open the integration you created in Create Prometheus Alertmanager Integration.

  1. Click Reload to load your most recent payloads.
  2. Click ← Select to load the test payload from the previous step.
  3. Observe how the mapping transforms the Prometheus Alertmanager payload into an All Quiet incident.
Test Prometheus Alertmanager

In summary, this guide has walked you through the detailed steps to integrate Prometheus Alertmanager with All Quiet, from setting up your Prometheus and Alertmanager configurations to establishing alert rules and verifying the integration.
