Datadog Tutorial: Create Monitors / Alerts using Datadog API — Step by Step

This guide uses the current Datadog Monitor API v1, which is still the main API for creating metric, log, APM, and many other monitor types. Datadog’s API docs list /api/v1/monitor for create/get/update/delete monitor operations, and /api/v1/monitor/validate for validating a monitor definition before creating it. (Datadog Monitoring)

We will create monitors using:

curl
Python
JSON payload files
Datadog API keys from environment variables

This guide avoids the outdated dogapi tooling and other legacy-only approaches. It calls the REST API directly and, where a client library helps, uses the current datadog-api-client Python library.


1. What is a Datadog Monitor?

A Datadog Monitor is an alerting rule.

It watches something like:

CPU
Memory
Disk
Logs
APM traces
Error rate
Latency
Synthetic test result
Kubernetes pod status
Custom metric

Then it triggers alert states such as:

OK
Warn
Alert
No Data

A monitor normally contains:

{
  "name": "Monitor name",
  "type": "query alert",
  "query": "avg(last_5m):avg:system.cpu.user{*} > 80",
  "message": "Alert message",
  "tags": ["env:prod", "team:devops"],
  "options": {
    "thresholds": {
      "critical": 80,
      "warning": 70
    }
  }
}

For metric monitors, Datadog says any metric reporting to Datadog can be alerted on, and threshold alerts compare metric values to a static threshold over a selected time period. (Datadog Monitoring)
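As a rough illustration of how these fields fit together, the following Python sketch assembles the same payload from its parts. The helper name build_metric_monitor is hypothetical and not part of any Datadog library; it derives the comparison value in the query from the critical threshold so the two stay in step.

```python
import json


def build_metric_monitor(name, metric_query, operator, critical,
                         warning=None, tags=None, message=""):
    """Assemble a 'query alert' payload (hypothetical helper, not a Datadog API).

    The critical value is written into both the query and options.thresholds,
    so the two cannot drift apart.
    """
    thresholds = {"critical": critical}
    if warning is not None:
        thresholds["warning"] = warning
    return {
        "name": name,
        "type": "query alert",
        "query": f"{metric_query} {operator} {critical}",
        "message": message,
        "tags": list(tags or []),
        "options": {"thresholds": thresholds},
    }


monitor = build_metric_monitor(
    name="Monitor name",
    metric_query="avg(last_5m):avg:system.cpu.user{*}",
    operator=">",
    critical=80,
    warning=70,
    tags=["env:prod", "team:devops"],
    message="Alert message",
)
print(json.dumps(monitor, indent=2))
```

Writing the result to a file with json.dump gives a definition that the curl scripts in this guide can post unchanged.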


2. Monitor API endpoint

The main API endpoint is:

POST /api/v1/monitor

Full pattern:

https://api.<DATADOG_SITE>/api/v1/monitor

Examples:

US1:     https://api.datadoghq.com/api/v1/monitor
US3:     https://api.us3.datadoghq.com/api/v1/monitor
US5:     https://api.us5.datadoghq.com/api/v1/monitor
EU:      https://api.datadoghq.eu/api/v1/monitor
AP1:     https://api.ap1.datadoghq.com/api/v1/monitor
AP2:     https://api.ap2.datadoghq.com/api/v1/monitor
Gov:     https://api.ddog-gov.com/api/v1/monitor

Datadog’s monitor API examples use DD-API-KEY and DD-APPLICATION-KEY headers when creating monitors. (Datadog Monitoring)
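The URL pattern above can be sketched as a small Python helper. The function name monitor_endpoint is an assumption made for this example; it only concatenates strings and does not check that the site value is a real Datadog site.

```python
from typing import Optional


def monitor_endpoint(site: str, monitor_id: Optional[int] = None) -> str:
    """Return the Monitor API URL for a Datadog site (hypothetical helper).

    Without monitor_id this is the collection URL used for create and list;
    with monitor_id it is the URL used for get, update, and delete.
    """
    base = f"https://api.{site}/api/v1/monitor"
    return base if monitor_id is None else f"{base}/{monitor_id}"


print(monitor_endpoint("datadoghq.eu"))
print(monitor_endpoint("ap1.datadoghq.com", 123456789))
```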


3. Required API keys

You need:

DD_API_KEY
DD_APP_KEY
DD_SITE

Datadog states that API requests must be authenticated. API keys are used for write/reporting access, and application keys are also required for read operations and many API actions. (Datadog Monitoring)

3.1 Export environment variables

Use this on your terminal:

export DD_SITE="datadoghq.com"
export DD_API_KEY="replace_with_your_datadog_api_key"
export DD_APP_KEY="replace_with_your_datadog_application_key"

For the AP1 (Japan) site, use:

export DD_SITE="ap1.datadoghq.com"

For EU:

export DD_SITE="datadoghq.eu"

Do not hardcode real keys inside scripts.
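One way to follow this rule in Python is to read the three variables at startup and fail early when any is missing. This is a minimal sketch; load_datadog_env, auth_headers, and mask are hypothetical helper names, and the header names are the ones used by the curl examples in this guide.

```python
import os

REQUIRED_VARS = ("DD_SITE", "DD_API_KEY", "DD_APP_KEY")


def load_datadog_env(environ=os.environ) -> dict:
    """Return the required settings, or raise if any variable is unset or empty."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}


def auth_headers(settings: dict) -> dict:
    """Build the request headers used for Monitor API calls."""
    return {
        "Accept": "application/json",
        "Content-Type": "application/json",
        "DD-API-KEY": settings["DD_API_KEY"],
        "DD-APPLICATION-KEY": settings["DD_APP_KEY"],
    }


def mask(secret: str) -> str:
    """Hide all but the last four characters before printing a key."""
    return "*" * max(len(secret) - 4, 0) + secret[-4:]
```

Passing a plain dict instead of os.environ makes the loader easy to exercise in a test without touching real credentials.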


4. Validate API access

First validate your API key.

curl -sS -X GET "https://api.${DD_SITE}/api/v1/validate" \
  -H "Accept: application/json" \
  -H "DD-API-KEY: ${DD_API_KEY}" | jq

Expected output:

{
  "valid": true
}

Datadog’s authentication API includes GET /api/v1/validate to check whether an API key is valid; invalid keys return 403. (Datadog Monitoring)
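The same check can be sketched in Python with only the standard library. The function names here are hypothetical, and the opener parameter is an assumption added so the call can be swapped for a stub in tests. The sketch treats HTTP 403 as an invalid key, as described above, and re-raises any other HTTP error.

```python
import json
import urllib.error
import urllib.request


def build_validate_request(site: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET /api/v1/validate request."""
    return urllib.request.Request(
        f"https://api.{site}/api/v1/validate",
        headers={"Accept": "application/json", "DD-API-KEY": api_key},
        method="GET",
    )


def api_key_is_valid(site: str, api_key: str,
                     opener=urllib.request.urlopen) -> bool:
    """Return True for a {"valid": true} reply and False for HTTP 403."""
    request = build_validate_request(site, api_key)
    try:
        with opener(request, timeout=30) as response:
            return bool(json.load(response).get("valid"))
    except urllib.error.HTTPError as err:
        if err.code == 403:
            return False
        raise  # other failures (wrong site, outage) should stay visible
```

Calling api_key_is_valid(os.environ["DD_SITE"], os.environ["DD_API_KEY"]) performs the real request.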


5. Recommended folder structure

Create a small working directory:

mkdir -p datadog-monitor-api-lab/monitors
cd datadog-monitor-api-lab

Recommended structure:

datadog-monitor-api-lab/
├── monitors/
│   ├── cpu-idle-low.json
│   ├── apache-5xx-log-monitor.json
│   ├── apm-error-rate.json
│   └── disk-space-low.json
├── create-monitor.sh
├── validate-monitor.sh
├── get-monitor.sh
├── update-monitor.sh
└── delete-monitor.sh

6. Validate monitor JSON before creating

Validating first catches query, threshold, and option errors without creating anything in your account.

Before creating a monitor, use:

POST /api/v1/monitor/validate

Datadog provides a monitor validation endpoint and examples for validating monitor definitions before creating them. (Datadog Monitoring)

Create this script:

cat > validate-monitor.sh <<'EOF'
#!/usr/bin/env bash
set -euo pipefail

FILE="${1:?Usage: ./validate-monitor.sh <monitor-json-file>}"

curl -sS -X POST "https://api.${DD_SITE}/api/v1/monitor/validate" \
  -H "Accept: application/json" \
  -H "Content-Type: application/json" \
  -H "DD-API-KEY: ${DD_API_KEY}" \
  -H "DD-APPLICATION-KEY: ${DD_APP_KEY}" \
  -d @"${FILE}" | jq
EOF

chmod +x validate-monitor.sh

Usage:

./validate-monitor.sh monitors/cpu-idle-low.json

If the definition is valid, Datadog returns a success response, usually an empty JSON object ({}). An invalid definition returns an error payload describing the problem.
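Some mistakes can be caught locally before the file is sent to the validation endpoint. The following Python sketch is a hypothetical preflight helper, not a replacement for the API check: it assumes type and query are the fields a definition cannot do without, and that the number at the end of the query should equal options.thresholds.critical.

```python
import re


def preflight(monitor: dict) -> list:
    """Return a list of human-readable problems found in a monitor definition."""
    problems = []
    for field in ("type", "query"):
        if not monitor.get(field):
            problems.append(f"missing required field: {field}")

    # The trailing number after the comparison operator is the alert threshold.
    match = re.search(r"[<>]=?\s*(-?\d+(?:\.\d+)?)\s*$", monitor.get("query", ""))
    critical = monitor.get("options", {}).get("thresholds", {}).get("critical")
    if match and critical is not None and float(match.group(1)) != float(critical):
        problems.append(
            f"query threshold {match.group(1)} does not match "
            f"options.thresholds.critical {critical}"
        )
    return problems
```

An empty list means the definition passed these local checks only; the API validation still has the final word on query syntax and options.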


7. Create monitor script

Create:

cat > create-monitor.sh <<'EOF'
#!/usr/bin/env bash
set -euo pipefail

FILE="${1:?Usage: ./create-monitor.sh <monitor-json-file>}"

curl -sS -X POST "https://api.${DD_SITE}/api/v1/monitor" \
  -H "Accept: application/json" \
  -H "Content-Type: application/json" \
  -H "DD-API-KEY: ${DD_API_KEY}" \
  -H "DD-APPLICATION-KEY: ${DD_APP_KEY}" \
  -d @"${FILE}" | jq
EOF

chmod +x create-monitor.sh

Usage:

./create-monitor.sh monitors/cpu-idle-low.json

8. Example 1: Create CPU monitor using Datadog API

This monitor alerts when CPU idle is too low. Low idle means CPU is heavily used.

Create JSON file:

cat > monitors/cpu-idle-low.json <<'EOF'
{
  "name": "[LAB] High CPU Usage - CPU idle below 10% by host",
  "type": "query alert",
  "query": "avg(last_5m):avg:system.cpu.idle{*} by {host} < 10",
  "message": "CPU idle is below 10% for host {{host.name}}.\n\nCurrent value: {{value}}\n\nPlease check:\n- Top processes\n- Recent deployments\n- Load average\n- Container CPU usage\n\nNotify: @slack-devops-alerts",
  "tags": [
    "lab:datadog-api",
    "monitor_type:metric",
    "team:devops",
    "env:training"
  ],
  "priority": 3,
  "options": {
    "thresholds": {
      "critical": 10,
      "warning": 20
    },
    "notify_no_data": false,
    "include_tags": true,
    "new_group_delay": 300,
    "renotify_interval": 60,
    "require_full_window": false,
    "notification_preset_name": "show_all"
  }
}
EOF

Validate:

./validate-monitor.sh monitors/cpu-idle-low.json

Create:

./create-monitor.sh monitors/cpu-idle-low.json

Expected response includes an id:

{
  "id": 123456789,
  "name": "[LAB] High CPU Usage - CPU idle below 10% by host",
  "type": "query alert"
}

Save the monitor ID:

export MONITOR_ID="123456789"

9. Example 2: Create disk space monitor

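This alerts when root filesystem usage is above 85%. The system.disk.in_use metric is reported as a fraction between 0 and 1, so the thresholds are written as 0.85 and 0.75 rather than 85 and 75.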

cat > monitors/disk-space-low.json <<'EOF'
{
  "name": "[LAB] Disk Usage High - Root filesystem above 85%",
  "type": "query alert",
  "query": "avg(last_5m):avg:system.disk.in_use{device:/} by {host} > 0.85",
  "message": "Root disk usage is above 85% on {{host.name}}.\n\nCurrent value: {{value}}\n\nAction:\n- Check large files\n- Check logs\n- Check Docker/container disk usage\n- Clean old temporary files\n\nNotify: @slack-devops-alerts",
  "tags": [
    "lab:datadog-api",
    "monitor_type:metric",
    "team:devops",
    "env:training"
  ],
  "priority": 3,
  "options": {
    "thresholds": {
      "critical": 0.85,
      "warning": 0.75
    },
    "notify_no_data": false,
    "include_tags": true,
    "new_group_delay": 300,
    "renotify_interval": 120,
    "require_full_window": false
  }
}
EOF

Validate and create:

./validate-monitor.sh monitors/disk-space-low.json
./create-monitor.sh monitors/disk-space-low.json

10. Example 3: Create Apache 5xx log monitor

This monitor watches Apache logs and alerts if 5xx errors exceed 5 logs in the last 5 minutes.

Datadog log monitors evaluate indexed logs only, and log monitor queries use the same search logic as Log Explorer. Datadog also notes that log monitors have a maximum rolling window of two days. (Datadog Monitoring)

cat > monitors/apache-5xx-log-monitor.json <<'EOF'
{
  "name": "[LAB] Apache 5xx Errors High by Host",
  "type": "log alert",
  "query": "logs(\"source:apache @http.status_code:[500 TO 599]\").index(\"*\").rollup(\"count\").by(\"host\").last(\"5m\") > 5",
  "message": "Apache 5xx errors are high on host {{host.name}}.\n\nCurrent value: {{value}}\n\nInvestigation steps:\n- Open Log Explorer\n- Search: source:apache @http.status_code:[500 TO 599]\n- Group by host and URL path\n- Check Apache error logs\n- Check backend application health\n\nNotify: @slack-devops-alerts",
  "tags": [
    "lab:datadog-api",
    "monitor_type:log",
    "team:devops",
    "service:apache",
    "env:training"
  ],
  "priority": 2,
  "options": {
    "thresholds": {
      "critical": 5,
      "warning": 2
    },
    "enable_logs_sample": true,
    "notify_no_data": false,
    "include_tags": true,
    "renotify_interval": 60,
    "require_full_window": false,
    "on_missing_data": "resolve"
  }
}
EOF

Validate and create:

./validate-monitor.sh monitors/apache-5xx-log-monitor.json
./create-monitor.sh monitors/apache-5xx-log-monitor.json

If your Apache logs do not have a parsed @http.status_code attribute, use a free-text fallback query. The parentheses keep the OR terms scoped to source:apache:

logs("source:apache (500 OR 502 OR 503 OR 504)").index("*").rollup("count").by("host").last("5m") > 5

11. Example 4: Create Ubuntu failed SSH login log monitor

This alerts when failed SSH attempts exceed 10 in 5 minutes.

cat > monitors/ubuntu-failed-ssh-log-monitor.json <<'EOF'
{
  "name": "[LAB] Ubuntu Failed SSH Login Attempts High",
  "type": "log alert",
  "query": "logs(\"\\\"Failed password\\\" OR \\\"Invalid user\\\"\").index(\"*\").rollup(\"count\").by(\"host\").last(\"5m\") > 10",
  "message": "High failed SSH login activity detected on {{host.name}}.\n\nCurrent value: {{value}}\n\nInvestigation:\n- Search logs for: \"Failed password\" OR \"Invalid user\"\n- Check source IPs\n- Check usernames attempted\n- Consider firewall/security group review\n\nNotify: @slack-security-alerts",
  "tags": [
    "lab:datadog-api",
    "monitor_type:log",
    "team:security",
    "source:ubuntu",
    "env:training"
  ],
  "priority": 2,
  "options": {
    "thresholds": {
      "critical": 10,
      "warning": 5
    },
    "enable_logs_sample": true,
    "notify_no_data": false,
    "include_tags": true,
    "renotify_interval": 60,
    "require_full_window": false,
    "on_missing_data": "resolve"
  }
}
EOF

Validate and create:

./validate-monitor.sh monitors/ubuntu-failed-ssh-log-monitor.json
./create-monitor.sh monitors/ubuntu-failed-ssh-log-monitor.json
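The nested quoting in the two log monitor queries is easy to get wrong by hand, because the search string is quoted once inside logs("...") and again when the whole query becomes a JSON string. The Python sketch below builds the query in two steps. build_log_query is a hypothetical helper, and it assumes that escaping the inner search string the way json.dumps does (backslash before each double quote) matches what the log query syntax expects, as in the examples in this guide.

```python
import json


def build_log_query(search, threshold, group_by="host", window="5m", index="*"):
    """Build a log monitor query string (hypothetical helper).

    json.dumps(search) wraps the search in double quotes and escapes any
    double quotes inside it.
    """
    return (
        f"logs({json.dumps(search)})"
        f'.index("{index}").rollup("count").by("{group_by}").last("{window}")'
        f" > {threshold}"
    )


apache = build_log_query("source:apache @http.status_code:[500 TO 599]", 5)
ssh = build_log_query('"Failed password" OR "Invalid user"', 10)

print(apache)
print(ssh)
# The second layer of escaping appears only when the query is serialized:
print(json.dumps({"query": ssh}))
```

Building the payload as a dict and serializing it with json.dumps means the second layer of backslashes never has to be typed manually.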

12. Example 5: Create APM service error rate monitor

APM monitors can alert at service level on hits, errors, and latency measures. Datadog states APM metric monitors work like regular metric monitors but are tailored for APM, and Analytics monitors can alert on Indexed Spans. (Datadog Monitoring)

This example uses Datadog APM trace metrics.

cat > monitors/apm-error-rate.json <<'EOF'
{
  "name": "[LAB] APM Error Rate High - checkout-api",
  "type": "query alert",
  "query": "sum(last_5m):sum:trace.web.request.errors{service:checkout-api,env:prod}.as_count() / sum:trace.web.request.hits{service:checkout-api,env:prod}.as_count() * 100 > 5",
  "message": "APM error rate is above 5% for checkout-api in prod.\n\nCurrent value: {{value}}%\n\nCheck:\n- APM Service Page\n- Error traces\n- Recent deployments\n- Related logs\n- Downstream dependency errors\n\nNotify: @slack-devops-alerts",
  "tags": [
    "lab:datadog-api",
    "monitor_type:apm",
    "service:checkout-api",
    "env:prod",
    "team:backend"
  ],
  "priority": 1,
  "options": {
    "thresholds": {
      "critical": 5,
      "warning": 2
    },
    "notify_no_data": false,
    "include_tags": true,
    "renotify_interval": 30,
    "require_full_window": false,
    "new_group_delay": 300
  }
}
EOF

Validate and create:

./validate-monitor.sh monitors/apm-error-rate.json
./create-monitor.sh monitors/apm-error-rate.json

Important: Replace trace.web.request.errors and trace.web.request.hits with the actual operation-specific APM metrics used by your service. Common examples are based on operation names such as:

trace.web.request.hits
trace.web.request.errors
trace.servlet.request.hits
trace.servlet.request.errors
trace.fastapi.request.hits
trace.fastapi.request.errors
trace.express.request.hits
trace.express.request.errors

Check Datadog Metrics Explorer for your actual trace metric names before finalizing.
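Because only the operation name, service, and environment change between services, the error-rate query can be generated rather than copied. This is a small sketch with a hypothetical function name; the operation value (for example web.request or servlet.request) is something you would confirm in Metrics Explorer first.

```python
def apm_error_rate_query(operation, service, env, threshold_pct, window="last_5m"):
    """Build an error-rate percentage query from APM trace metrics.

    Hypothetical helper: operation is the middle part of the metric name,
    e.g. "web.request" for trace.web.request.errors / trace.web.request.hits.
    """
    scope = f"{{service:{service},env:{env}}}"  # doubled braces emit literal { }
    errors = f"sum:trace.{operation}.errors{scope}.as_count()"
    hits = f"sum:trace.{operation}.hits{scope}.as_count()"
    return f"sum({window}):{errors} / {hits} * 100 > {threshold_pct}"


print(apm_error_rate_query("web.request", "checkout-api", "prod", 5))
print(apm_error_rate_query("fastapi.request", "checkout-api", "prod", 5))
```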


13. Example 6: Create Kubernetes pod restart monitor

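This alerts when Kubernetes container restarts increase. Check how kubernetes.containers.restarts is reported in your account first: if it arrives as a cumulative gauge rather than a per-interval count, a change-based query may suit it better than the sum used here.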

cat > monitors/kubernetes-container-restarts.json <<'EOF'
{
  "name": "[LAB] Kubernetes Container Restarts High",
  "type": "query alert",
  "query": "sum(last_10m):sum:kubernetes.containers.restarts{*} by {kube_namespace,kube_deployment} > 3",
  "message": "Kubernetes container restarts are high.\n\nNamespace: {{kube_namespace.name}}\nDeployment: {{kube_deployment.name}}\nCurrent value: {{value}}\n\nCheck:\n- kubectl describe pod\n- CrashLoopBackOff\n- OOMKilled\n- Recent deployment\n- Application logs\n\nNotify: @slack-devops-alerts",
  "tags": [
    "lab:datadog-api",
    "monitor_type:kubernetes",
    "team:devops",
    "env:training"
  ],
  "priority": 2,
  "options": {
    "thresholds": {
      "critical": 3,
      "warning": 1
    },
    "notify_no_data": false,
    "include_tags": true,
    "renotify_interval": 60,
    "require_full_window": false,
    "new_group_delay": 300
  }
}
EOF

Validate and create:

./validate-monitor.sh monitors/kubernetes-container-restarts.json
./create-monitor.sh monitors/kubernetes-container-restarts.json

14. Get monitor details by ID

Create script:

cat > get-monitor.sh <<'EOF'
#!/usr/bin/env bash
set -euo pipefail

MONITOR_ID="${1:?Usage: ./get-monitor.sh <monitor-id>}"

curl -sS -X GET "https://api.${DD_SITE}/api/v1/monitor/${MONITOR_ID}" \
  -H "Accept: application/json" \
  -H "DD-API-KEY: ${DD_API_KEY}" \
  -H "DD-APPLICATION-KEY: ${DD_APP_KEY}" | jq
EOF

chmod +x get-monitor.sh

Usage:

./get-monitor.sh 123456789

Datadog lists GET /api/v1/monitor/{monitor_id} for retrieving a monitor’s details. (Datadog Monitoring)


15. Search monitors

Search monitors by query:

curl -sS -X GET "https://api.${DD_SITE}/api/v1/monitor/search?query=lab:datadog-api" \
  -H "Accept: application/json" \
  -H "DD-API-KEY: ${DD_API_KEY}" \
  -H "DD-APPLICATION-KEY: ${DD_APP_KEY}" | jq

Datadog lists GET /api/v1/monitor/search for monitor search. (Datadog Monitoring)

Search by name:

curl -sS -X GET "https://api.${DD_SITE}/api/v1/monitor/search?query=%5BLAB%5D" \
  -H "Accept: application/json" \
  -H "DD-API-KEY: ${DD_API_KEY}" \
  -H "DD-APPLICATION-KEY: ${DD_APP_KEY}" | jq
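The %5BLAB%5D in the URL above is simply [LAB] percent-encoded. When the search text comes from a variable, it is safer to let a library do the encoding. A minimal Python sketch follows, with monitor_search_url as a hypothetical helper name:

```python
from urllib.parse import urlencode


def monitor_search_url(site: str, query: str) -> str:
    """Return the monitor search URL with the query safely percent-encoded."""
    return f"https://api.{site}/api/v1/monitor/search?{urlencode({'query': query})}"


print(monitor_search_url("datadoghq.com", "[LAB]"))
print(monitor_search_url("datadoghq.com", "lab:datadog-api"))
```

Note that urlencode also encodes the colon in lab:datadog-api as %3A, which decodes to the same query on the server side.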

16. Get all monitors

curl -sS -X GET "https://api.${DD_SITE}/api/v1/monitor" \
  -H "Accept: application/json" \
  -H "DD-API-KEY: ${DD_API_KEY}" \
  -H "DD-APPLICATION-KEY: ${DD_APP_KEY}" | jq

Datadog lists GET /api/v1/monitor for getting all monitors. (Datadog Monitoring)


17. Update an existing monitor

To update a monitor, use:

PUT /api/v1/monitor/{monitor_id}

Datadog lists PUT /api/v1/monitor/{monitor_id} as the edit monitor endpoint. (Datadog Monitoring)

Create script:

cat > update-monitor.sh <<'EOF'
#!/usr/bin/env bash
set -euo pipefail

MONITOR_ID="${1:?Usage: ./update-monitor.sh <monitor-id> <monitor-json-file>}"
FILE="${2:?Usage: ./update-monitor.sh <monitor-id> <monitor-json-file>}"

curl -sS -X PUT "https://api.${DD_SITE}/api/v1/monitor/${MONITOR_ID}" \
  -H "Accept: application/json" \
  -H "Content-Type: application/json" \
  -H "DD-API-KEY: ${DD_API_KEY}" \
  -H "DD-APPLICATION-KEY: ${DD_APP_KEY}" \
  -d @"${FILE}" | jq
EOF

chmod +x update-monitor.sh

Example: change CPU critical threshold from 10 to 5.

cp monitors/cpu-idle-low.json monitors/cpu-idle-low-updated.json

Edit:

"critical": 5,
"warning": 15

Also update the query, because the value in the query must match the critical threshold:

"query": "avg(last_5m):avg:system.cpu.idle{*} by {host} < 5"

Then update:

./update-monitor.sh 123456789 monitors/cpu-idle-low-updated.json
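Editing the threshold in two places by hand invites mismatches. The following Python sketch applies both changes together to a loaded definition. with_new_critical is a hypothetical helper, and it assumes the query ends with the numeric threshold, as every query in this guide does.

```python
import copy
import re


def with_new_critical(monitor: dict, critical, warning=None) -> dict:
    """Return a copy of the definition with the critical value changed
    in both the query and options.thresholds (hypothetical helper)."""
    updated = copy.deepcopy(monitor)
    # Replace only the number at the very end of the query.
    updated["query"] = re.sub(
        r"-?\d+(?:\.\d+)?\s*$", str(critical), updated["query"]
    )
    thresholds = updated.setdefault("options", {}).setdefault("thresholds", {})
    thresholds["critical"] = critical
    if warning is not None:
        thresholds["warning"] = warning
    return updated


original = {
    "type": "query alert",
    "query": "avg(last_5m):avg:system.cpu.idle{*} by {host} < 10",
    "options": {"thresholds": {"critical": 10, "warning": 20}},
}
updated = with_new_critical(original, critical=5, warning=15)
print(updated["query"])
print(updated["options"]["thresholds"])
```

The original dict is left untouched, so both versions can be written to separate files for review.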

18. Delete a monitor

Create script:

cat > delete-monitor.sh <<'EOF'
#!/usr/bin/env bash
set -euo pipefail

MONITOR_ID="${1:?Usage: ./delete-monitor.sh <monitor-id>}"

curl -sS -X DELETE "https://api.${DD_SITE}/api/v1/monitor/${MONITOR_ID}" \
  -H "Accept: application/json" \
  -H "DD-API-KEY: ${DD_API_KEY}" \
  -H "DD-APPLICATION-KEY: ${DD_APP_KEY}" | jq
EOF

chmod +x delete-monitor.sh

Usage:

./delete-monitor.sh 123456789

Datadog lists DELETE /api/v1/monitor/{monitor_id} for deleting a monitor. (Datadog Monitoring)


19. Python version using current Datadog API client

Install:

python3 -m venv .venv
source .venv/bin/activate
pip install datadog-api-client

Create:

cat > create_metric_monitor.py <<'EOF'
import os
import sys
from datadog_api_client import ApiClient, Configuration
from datadog_api_client.v1.api.monitors_api import MonitorsApi
from datadog_api_client.v1.model.monitor import Monitor
from datadog_api_client.v1.model.monitor_options import MonitorOptions
from datadog_api_client.v1.model.monitor_thresholds import MonitorThresholds
from datadog_api_client.v1.model.monitor_type import MonitorType


def require_env(name: str) -> str:
    value = os.getenv(name)
    if not value:
        print(f"Missing required environment variable: {name}", file=sys.stderr)
        sys.exit(1)
    return value


def main() -> None:
    require_env("DD_API_KEY")
    require_env("DD_APP_KEY")
    require_env("DD_SITE")

    configuration = Configuration()

    body = Monitor(
        name="[LAB] Python API - CPU idle below 10% by host",
        type=MonitorType.QUERY_ALERT,
        query="avg(last_5m):avg:system.cpu.idle{*} by {host} < 10",
        message=(
            "CPU idle is below 10% on {{host.name}}.\n\n"
            "Current value: {{value}}\n\n"
            "Check top processes and recent deployments.\n\n"
            "Notify: @slack-devops-alerts"
        ),
        tags=[
            "lab:datadog-api",
            "created_by:python",
            "team:devops",
            "env:training",
        ],
        priority=3,
        options=MonitorOptions(
            thresholds=MonitorThresholds(
                critical=10,
                warning=20,
            ),
            notify_no_data=False,
            include_tags=True,
            new_group_delay=300,
            renotify_interval=60,
            require_full_window=False,
        ),
    )

    with ApiClient(configuration) as api_client:
        api = MonitorsApi(api_client)

        print("Validating monitor...")
        api.validate_monitor(body=body)

        print("Creating monitor...")
        response = api.create_monitor(body=body)

        print("Created monitor:")
        print(response)


if __name__ == "__main__":
    main()
EOF

Run:

python3 create_metric_monitor.py

Datadog’s current Python examples use datadog_api_client, Configuration, ApiClient, and MonitorsApi for monitor operations. (Datadog Monitoring)


20. Python script to create monitor from JSON file

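This version is more reusable for students: the same script works with any monitor definition file, and it calls the REST API directly through the requests library.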

cat > create_monitor_from_json.py <<'EOF'
import json
import os
import sys
import requests


def require_env(name: str) -> str:
    value = os.getenv(name)
    if not value:
        print(f"Missing required environment variable: {name}", file=sys.stderr)
        sys.exit(1)
    return value


def main() -> None:
    if len(sys.argv) != 2:
        print("Usage: python3 create_monitor_from_json.py <monitor-json-file>", file=sys.stderr)
        sys.exit(1)

    file_path = sys.argv[1]

    dd_site = require_env("DD_SITE")
    dd_api_key = require_env("DD_API_KEY")
    dd_app_key = require_env("DD_APP_KEY")

    with open(file_path, "r", encoding="utf-8") as f:
        payload = json.load(f)

    headers = {
        "Accept": "application/json",
        "Content-Type": "application/json",
        "DD-API-KEY": dd_api_key,
        "DD-APPLICATION-KEY": dd_app_key,
    }

    validate_url = f"https://api.{dd_site}/api/v1/monitor/validate"
    create_url = f"https://api.{dd_site}/api/v1/monitor"

    print(f"Validating {file_path} ...")
    validate_response = requests.post(validate_url, headers=headers, json=payload, timeout=30)

    if validate_response.status_code >= 400:
        print("Validation failed:")
        print(validate_response.status_code)
        print(validate_response.text)
        sys.exit(1)

    print("Validation OK. Creating monitor ...")
    create_response = requests.post(create_url, headers=headers, json=payload, timeout=30)

    if create_response.status_code >= 400:
        print("Create failed:")
        print(create_response.status_code)
        print(create_response.text)
        sys.exit(1)

    print(json.dumps(create_response.json(), indent=2))


if __name__ == "__main__":
    main()
EOF

Install dependency:

pip install requests

Run:

python3 create_monitor_from_json.py monitors/apache-5xx-log-monitor.json

21. Monitor type cheat sheet

Common monitor API type values:

Monitor                     API type
Metric threshold monitor    query alert
Log monitor                 log alert
Service check monitor       service check
Event monitor               event alert
Process monitor             process alert
Composite monitor           composite
Synthetics alert monitor    Usually created automatically from Synthetic tests
Cost monitor                cost alert
RUM monitor                 rum alert
CI pipelines monitor        ci-pipelines alert
Error Tracking monitor      Depends on Error Tracking monitor type/query model

For most infrastructure labs, you will use:

query alert
log alert
service check

22. Query examples for common monitor types

CPU high

avg(last_5m):avg:system.cpu.idle{*} by {host} < 10

Memory high

avg(last_5m):avg:system.mem.pct_usable{*} by {host} < 0.10

Disk high

avg(last_5m):avg:system.disk.in_use{device:/} by {host} > 0.85

Apache 5xx logs

logs("source:apache @http.status_code:[500 TO 599]").index("*").rollup("count").by("host").last("5m") > 5

Ubuntu failed SSH

logs("\"Failed password\" OR \"Invalid user\"").index("*").rollup("count").by("host").last("5m") > 10

Kubernetes restarts

sum(last_10m):sum:kubernetes.containers.restarts{*} by {kube_namespace,kube_deployment} > 3

APM error rate

sum(last_5m):sum:trace.web.request.errors{service:checkout-api,env:prod}.as_count() / sum:trace.web.request.hits{service:checkout-api,env:prod}.as_count() * 100 > 5

23. Important monitor options

Datadog’s monitor API options include useful controls like include_tags, new_group_delay, no_data_timeframe, notify_no_data, notify_by, and on_missing_data. For example, new_group_delay skips evaluations for new groups while they initialize, notify_no_data controls notifications when data stops reporting, and notify_by changes alert granularity for grouped monitors. (Datadog Monitoring)

Common options:

{
  "thresholds": {
    "critical": 80,
    "warning": 70
  },
  "notify_no_data": false,
  "include_tags": true,
  "new_group_delay": 300,
  "renotify_interval": 60,
  "require_full_window": false
}

My practical recommendation

For lab and beginner monitors:

"require_full_window": false

Why? With this setting the monitor evaluates even when the window is only partly filled with data, so it reacts sooner and avoids confusion during short lab sessions.

For production monitors, tune this based on the metric and business need.


24. Notification examples

Datadog notifications are usually written in the message field.

Examples:

Notify: @slack-devops-alerts
Notify: @pagerduty-platform
Notify: @webhook-my-webhook
Notify: @team-platform

Example message:

"message": "High CPU detected on {{host.name}}.\n\nCurrent value: {{value}}\n\nNotify: @slack-devops-alerts"

Useful template variables:

{{value}}
{{host.name}}
{{service.name}}
{{env.name}}
{{kube_namespace.name}}
{{kube_deployment.name}}

The exact variable availability depends on your monitor grouping.
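When messages are generated in code, the double braces of Datadog template variables need care: inside a Python f-string they would have to be written as four braces. The sketch below avoids that by keeping template variables in plain strings. build_message is a hypothetical helper, and the layout simply mirrors the messages used in this guide.

```python
def build_message(summary, handles, checks=()):
    """Assemble a monitor message with template variables and notify handles.

    Plain (non-f) strings are used wherever {{...}} appears, so the braces
    reach Datadog unchanged.
    """
    lines = [summary, "", "Current value: {{value}}"]
    if checks:
        lines += ["", "Please check:"] + [f"- {item}" for item in checks]
    lines += ["", "Notify: " + " ".join(handles)]
    return "\n".join(lines)


message = build_message(
    "High CPU detected on {{host.name}}.",
    handles=["@slack-devops-alerts"],
)
print(message)
```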


25. Best practice: naming convention

Use a standard format:

[ENV] [SERVICE] Symptom - Scope

Examples:

[PROD] checkout-api Error Rate High
[PROD] Apache 5xx Errors High by Host
[STAGE] Kubernetes Container Restarts High
[TRAINING] Ubuntu Failed SSH Login Attempts High

For labs:

[LAB] Apache 5xx Errors High by Host

26. Best practice: tagging monitors

Always tag monitors.

Recommended tags:

env:prod
team:devops
service:apache
monitor_type:log
managed_by:api
severity:p2

Example:

"tags": [
  "env:prod",
  "team:devops",
  "service:apache",
  "monitor_type:log",
  "managed_by:api"
]

This helps with monitor search, ownership, dashboards, reporting, and cleanup.
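A tagging rule is easier to keep when it is checked automatically. The Python sketch below reports which tag keys a definition is missing. The required set (env, team, managed_by) is only an example policy chosen for illustration, not a Datadog requirement.

```python
# Example policy only: adjust the required keys to your own conventions.
REQUIRED_TAG_KEYS = ("env", "team", "managed_by")


def missing_tag_keys(tags, required=REQUIRED_TAG_KEYS):
    """Return the required tag keys that do not appear in a key:value tag list."""
    present = {tag.split(":", 1)[0] for tag in tags}
    return [key for key in required if key not in present]


tags = ["env:prod", "team:devops", "service:apache", "monitor_type:log", "managed_by:api"]
print(missing_tag_keys(tags))          # nothing missing
print(missing_tag_keys(["env:prod"]))  # team and managed_by missing
```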


27. Student lab flow

Use this sequence for students:

1. Export DD_SITE, DD_API_KEY, DD_APP_KEY.
2. Validate API key.
3. Create JSON monitor definition.
4. Validate monitor JSON using /api/v1/monitor/validate.
5. Create monitor using /api/v1/monitor.
6. Capture monitor ID.
7. View monitor in Datadog UI.
8. Update monitor threshold.
9. Search monitor using API.
10. Delete lab monitor.

28. Full lab command sequence

export DD_SITE="datadoghq.com"
export DD_API_KEY="replace_me"
export DD_APP_KEY="replace_me"

curl -sS -X GET "https://api.${DD_SITE}/api/v1/validate" \
  -H "Accept: application/json" \
  -H "DD-API-KEY: ${DD_API_KEY}" | jq

./validate-monitor.sh monitors/cpu-idle-low.json

./create-monitor.sh monitors/cpu-idle-low.json

export MONITOR_ID="replace_with_created_monitor_id"

./get-monitor.sh "${MONITOR_ID}"

curl -sS -X GET "https://api.${DD_SITE}/api/v1/monitor/search?query=lab:datadog-api" \
  -H "Accept: application/json" \
  -H "DD-API-KEY: ${DD_API_KEY}" \
  -H "DD-APPLICATION-KEY: ${DD_APP_KEY}" | jq

./delete-monitor.sh "${MONITOR_ID}"

29. Troubleshooting

Problem: 403 Forbidden

Possible causes:

Wrong API key
Wrong APP key
Wrong DD_SITE
Application key does not have permission
Account/org mismatch

Check:

echo "$DD_SITE"
curl -sS -X GET "https://api.${DD_SITE}/api/v1/validate" \
  -H "DD-API-KEY: ${DD_API_KEY}" | jq

Problem: monitor validation fails

Common causes:

Wrong monitor type
Wrong query syntax
Threshold mismatch
Metric does not exist
Invalid log query escaping
Invalid JSON
Unsupported option for that monitor type

Validate JSON locally:

jq . monitors/apache-5xx-log-monitor.json

Problem: log monitor returns no data

Check:

Are logs indexed?
Is the source correct?
Is @http.status_code parsed?
Is time range correct?
Is the index correct?

Datadog log monitors evaluate indexed logs only. (Datadog Monitoring)

Problem: APM monitor does not work

Check actual metric names in Metrics Explorer. Your application may use:

trace.web.request.*
trace.servlet.request.*
trace.express.request.*
trace.fastapi.request.*

The service and env tags must match real APM data.


30. Final reusable template

Use this as a base for any monitor:

{
  "name": "[ENV] Service Symptom - Scope",
  "type": "query alert",
  "query": "avg(last_5m):avg:some.metric{env:prod,service:my-service} by {host} > 80",
  "message": "Alert message here.\n\nCurrent value: {{value}}\n\nNotify: @slack-devops-alerts",
  "tags": [
    "env:prod",
    "service:my-service",
    "team:devops",
    "managed_by:api"
  ],
  "priority": 3,
  "options": {
    "thresholds": {
      "critical": 80,
      "warning": 70
    },
    "notify_no_data": false,
    "include_tags": true,
    "new_group_delay": 300,
    "renotify_interval": 60,
    "require_full_window": false
  }
}

31. Final recommendation

In production, avoid creating monitors by hand one at a time. Manage them through an API workflow like this:

Monitor JSON in Git
        ↓
Pull request review
        ↓
CI validates monitor via /api/v1/monitor/validate
        ↓
CI creates/updates monitor through API
        ↓
Monitor tagged with managed_by:api
        ↓
Dashboard/SLO/incident workflow uses same tags
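The "creates/updates" step in this flow needs a rule for deciding which call to make. One possible rule, sketched below in Python, is to match definitions in Git against existing monitors by name. This is an assumption made for illustration: Datadog does not enforce unique monitor names, so a real pipeline might store monitor IDs alongside the JSON instead. The function only produces a plan and makes no API calls.

```python
def plan_changes(local_monitors, remote_monitors):
    """Decide, per local definition, whether CI should POST (create) or PUT (update).

    local_monitors:  definitions loaded from the JSON files in Git
    remote_monitors: monitors returned by GET /api/v1/monitor (need name and id)
    Matching by name is an assumption of this sketch.
    """
    remote_ids = {m["name"]: m["id"] for m in remote_monitors}
    plan = []
    for monitor in local_monitors:
        monitor_id = remote_ids.get(monitor["name"])
        if monitor_id is None:
            plan.append(("POST", "/api/v1/monitor", monitor["name"]))
        else:
            plan.append(("PUT", f"/api/v1/monitor/{monitor_id}", monitor["name"]))
    return plan


local = [{"name": "[PROD] checkout-api Error Rate High"},
         {"name": "[PROD] Apache 5xx Errors High by Host"}]
remote = [{"name": "[PROD] Apache 5xx Errors High by Host", "id": 123456789}]

for method, path, name in plan_changes(local, remote):
    print(method, path, name)
```

Printing the plan in a pull request check gives reviewers a preview of what the pipeline would change before anything is sent to Datadog.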

For students, the best learning sequence is:

Create metric monitor
Create log monitor
Create APM monitor
Validate JSON
Update threshold
Search monitor
Delete monitor

That gives them the full Datadog alerting lifecycle using the API.


Old Content


curl -X POST -H "Content-type: application/json" -H "DD-API-KEY: XXXXXXXXXXXXXXXXX" -H "DD-APPLICATION-KEY: ddapp_XXXXXXXXXXX" -d @devops.json "https://api.datadoghq.com/api/v1/monitor"

I’m a DevOps/SRE/DevSecOps/Cloud Expert passionate about sharing knowledge and experiences. I have worked at <a href="https://www.cotocus.com/">Cotocus</a>. I share tech blog at <a href="https://www.devopsschool.com/">DevOps School</a>, travel stories at <a href="https://www.holidaylandmark.com/">Holiday Landmark</a>, stock market tips at <a href="https://www.stocksmantra.in/">Stocks Mantra</a>, health and fitness guidance at <a href="https://www.mymedicplus.com/">My Medic Plus</a>, product reviews at <a href="https://www.truereviewnow.com/">TrueReviewNow</a> , and SEO strategies at <a href="https://www.wizbrand.com/">Wizbrand.</a> Do you want to learn <a href="https://www.quantumuting.com/">Quantum Computing</a>? <strong>Please find my social handles as below;</strong> <a href="https://www.rajeshkumar.xyz/">Rajesh Kumar Personal Website</a> <a href="https://www.youtube.com/TheDevOpsSchool">Rajesh Kumar at YOUTUBE</a> <a href="https://www.instagram.com/rajeshkumarin">Rajesh Kumar at INSTAGRAM</a> <a href="https://x.com/RajeshKumarIn">Rajesh Kumar at X</a> <a href="https://www.facebook.com/RajeshKumarLog">Rajesh Kumar at FACEBOOK</a> <a href="https://www.linkedin.com/in/rajeshkumarin/">Rajesh Kumar at LINKEDIN</a> <a href="https://www.wizbrand.com/rajeshkumar">Rajesh Kumar at WIZBRAND</a> <a href="https://www.rajeshkumar.xyz/dailylogs">Rajesh Kumar DailyLogs</a>
