Using Graphite + Telegraf Linux Metrics
You are on the correct screen now: Alerting → Alert rules → New recording rule.
But one important point first: Grafana recording rules are not stored back into Graphite. In Grafana 13, a Grafana-managed recording rule can read from a supported data source such as Graphite, but the recorded result must be written into a Prometheus-compatible data source. Grafana does not include its own time-series database for recording rule output. (Grafana Labs)
Your Graphite data is valid for this lab. The real metric prefix is:
telegraf.linux-demo.*
The uploaded metric output also confirms that servers.* returns empty, so this lab will not use servers.*. It will only use the real captured Telegraf metrics from Graphite.
1. What is a Grafana Recording Rule?
A recording rule pre-calculates a query result and saves it as a new metric.
Example:
Instead of repeatedly querying this Graphite metric:
telegraf.linux-demo.cpu.usage_active
Grafana can evaluate it every 1 minute and store the result as a new Prometheus-style metric:
linux_cpu_usage_active_percent
Then dashboards and alerts can query the new recorded metric.
Simple explanation for students
Normal dashboard query:
Grafana asks Graphite every time the dashboard loads.
Recording rule:
Grafana runs the query on schedule and stores the result as a new metric.
Alert rule:
Grafana checks a metric and fires an alert when a condition is true.
2. Very Important: Recording Rule vs Alert Rule
| Option | Purpose | Use in this lab? |
|---|---|---|
| New alert rule | Creates an alert condition such as CPU > 90% | Yes, later in the lab |
| New recording rule | Creates a new precomputed metric | Yes, this is the main task |
| New recording rule with CloudWatch as the query source | Records CloudWatch data, which this Graphite lab does not use | No |
In your screenshot, the query section currently shows CloudWatch. That is why you see an error about CloudWatch statistics. For this lab, students must select the Graphite data source as the query source.
3. Architecture for This Recording Rule Lab
The flow should be explained like this:
Telegraf
↓
Graphite
↓
Grafana recording rule reads Graphite metric
↓
Grafana writes recorded result into Prometheus-compatible data source
↓
Dashboard and alert can use the recorded metric
So we need two Grafana data sources:
1. Graphite
Used as the source query data source.
2. Prometheus-compatible data source
Used as the target data source to store recording rule output.
Grafana documentation says Grafana-managed recording rules can query alerting-supported data sources, but the result must be stored in a Prometheus-compatible database. (Grafana Labs)
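The flow above can be modelled in a few lines of Python. This is a toy sketch under stated assumptions, not Grafana's implementation: `read_graphite`, `RecordedSample`, `evaluate_rule`, and the in-memory `target_store` list are hypothetical stand-ins for the Graphite data source, the written sample, the rule evaluation, and the Prometheus-compatible store. The readings are made up.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# (value, unix_timestamp) pairs, the shape Graphite returns for one series
Series = List[Tuple[Optional[float], int]]


@dataclass
class RecordedSample:
    """One sample written to the Prometheus-compatible target (hypothetical type)."""
    metric: str
    value: float
    timestamp: int
    labels: Dict[str, str] = field(default_factory=dict)


def evaluate_rule(
    read_source: Callable[[str], Series],
    target: List[RecordedSample],
    graphite_query: str,
    metric: str,
    labels: Dict[str, str],
    now: int,
) -> Optional[RecordedSample]:
    """Run one evaluation: read the source, reduce to one value, write to the target."""
    series = read_source(graphite_query)
    # Simplification: skip null points, then keep the most recent number.
    numeric = [value for value, _ in series if value is not None]
    if not numeric:
        return None  # nothing to record in this round
    sample = RecordedSample(metric, numeric[-1], now, dict(labels))
    target.append(sample)
    return sample


def read_graphite(query: str) -> Series:
    """Fake source that returns made-up CPU readings for any query."""
    return [(12.5, 1700000000), (14.0, 1700000060), (None, 1700000120)]


target_store: List[RecordedSample] = []
result = evaluate_rule(
    read_graphite,
    target_store,
    "telegraf.linux-demo.cpu.usage_active",
    "linux_cpu_usage_active_percent",
    {"host": "linux-demo"},
    now=1700000120,
)
print(result.metric, result.value)
```

The point of the sketch is the direction of the data: the query always reads from Graphite, and the only thing that ever receives a write is the Prometheus-compatible target.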
4. Prerequisites for Students
Before starting this lab, students should already have:
Grafana 13.x running on port 3000
Graphite running on port 8080
Telegraf sending Linux metrics to Graphite
Graphite data source added in Grafana
Current valid Graphite metrics include:
telegraf.linux-demo.cpu.*
telegraf.linux-demo.mem.*
telegraf.linux-demo.disk.*
telegraf.linux-demo.diskio.*
telegraf.linux-demo.net.*
telegraf.linux-demo.kernel.*
telegraf.linux-demo.processes.*
telegraf.linux-demo.swap.*
telegraf.linux-demo.system.*
Do not use:
servers.*
Do not use .wsp in Grafana queries.
Correct:
telegraf.linux-demo.cpu.usage_active
Wrong:
telegraf.linux-demo.cpu.usage_active.wsp
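The two naming rules above are easy to check mechanically before students paste a path into Grafana. The helper below is a minimal sketch for plain metric paths only (it does not understand function calls such as perSecond(...)); `check_graphite_target` and `LAB_PREFIX` are hypothetical names chosen for this lab.

```python
from typing import List

LAB_PREFIX = "telegraf.linux-demo."  # the real prefix captured in this lab


def check_graphite_target(target: str) -> List[str]:
    """Return the problems found in a plain Graphite metric path (empty if none)."""
    problems = []
    if target.endswith(".wsp"):
        # .wsp is the Whisper file extension on disk, not part of the metric name
        problems.append("remove the .wsp suffix")
    if target.startswith("servers."):
        problems.append("servers.* returns no data in this lab")
    elif not target.startswith(LAB_PREFIX):
        problems.append("expected the path to start with " + LAB_PREFIX)
    return problems


for path in (
    "telegraf.linux-demo.cpu.usage_active",
    "telegraf.linux-demo.cpu.usage_active.wsp",
    "servers.cpu.usage",
):
    print(path, "->", check_graphite_target(path) or "ok")
```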
5. Add a Prometheus-Compatible Target for Recording Rules
If your Target data source dropdown is empty or does not show a Prometheus-compatible data source, you need to add one.
For a classroom lab, the simplest option is to run a small Prometheus container as the recording-rule target.
Step 1: Create a Prometheus config file
On the Linux VM, run:
mkdir -p ~/prometheus-recording-lab
cd ~/prometheus-recording-lab
Create a file:
cat > prometheus.yml <<'EOF'
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]
EOF
Step 2: Start Prometheus container
docker run -d \
  --name prometheus-recording \
  --restart unless-stopped \
  -p 9090:9090 \
  -v ~/prometheus-recording-lab/prometheus.yml:/etc/prometheus/prometheus.yml \
  prom/prometheus:latest \
  --config.file=/etc/prometheus/prometheus.yml \
  --web.enable-remote-write-receiver \
  --web.enable-lifecycle
Prometheus can receive remote-write data when the remote write receiver is enabled, and its receiver endpoint is /api/v1/write. (Prometheus)
Step 3: Verify Prometheus is running
docker ps
Expected container:
prometheus-recording
Test from the Linux VM:
curl "http://localhost:9090/-/ready"
Expected output:
Prometheus Server is Ready.
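If students need to repeat this check from a script rather than with curl, the same readiness probe can be sketched in Python using only the standard library. The function names `http_status` and `prometheus_ready` are hypothetical; the only thing taken from Prometheus itself is the /-/ready path used above, which answers with HTTP 200 once the server is ready.

```python
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 3.0) -> int:
    """Return the HTTP status code of a GET request, or 0 if nothing answered."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but not with a 2xx code
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, DNS failure, timeout, and similar


def prometheus_ready(base_url: str, get_status: Callable[[str], int] = http_status) -> bool:
    """True when the /-/ready endpoint of the given Prometheus answers 200."""
    return get_status(base_url.rstrip("/") + "/-/ready") == 200
```

Calling `prometheus_ready("http://localhost:9090")` on the Linux VM should return True once the container has started. The `get_status` parameter exists only so the function can be exercised without a running server.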
6. Add Prometheus as a Grafana Data Source
In Grafana 13.x left menu:
Connections → Add new connection
Search:
Prometheus
Click:
Add new data source
Use this name:
prometheus-recording
Use this URL if Grafana is installed directly on the same VM:
http://localhost:9090
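If Grafana is running inside Docker, localhost will not work because it points to the Grafana container itself. In that case, use the VM's private or public IP address, or attach both containers to the same user-defined Docker network and use the Prometheus container name as the host, for example http://prometheus-recording:9090.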
Click:
Save & test
Expected result:
Successfully queried the Prometheus API.
In Prometheus data source settings, Grafana has an alerting option called Allow as recording rules target, which allows that data source to appear as a target when creating Grafana-managed recording rules. (Grafana Labs)
7. Verify Graphite Query Before Creating Recording Rule
Before creating recording rules, students should test the Graphite query.
In Grafana:
Explore
Select data source:
Graphite
Run:
telegraf.linux-demo.cpu.usage_active
If data appears, the query is correct.
Also test:
telegraf.linux-demo.mem.used_percent
telegraf.linux-demo.disk.used_percent
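The same check can be made outside Grafana through Graphite's render API, which helps when it is unclear whether a missing graph is a Grafana problem or a Graphite problem. The sketch below only builds the URL and parses a response; `render_url`, `latest_value`, and the sample payload are hypothetical, and the base URL assumes Graphite on port 8080 as listed in the prerequisites.

```python
import json
from typing import Optional, Tuple
from urllib.parse import urlencode


def render_url(base_url: str, target: str, window: str = "-10min") -> str:
    """Build a Graphite render API URL that returns JSON for one target."""
    query = urlencode({"target": target, "from": window, "format": "json"})
    return base_url.rstrip("/") + "/render?" + query


def latest_value(render_json: str) -> Optional[Tuple[float, int]]:
    """Return the most recent non-null (value, timestamp) from a render response."""
    series_list = json.loads(render_json)
    if not series_list:
        return None  # an empty list means the path matched nothing
    for value, timestamp in reversed(series_list[0]["datapoints"]):
        if value is not None:
            return value, timestamp
    return None


# Sample response in the render API's JSON shape; the numbers are made up.
sample = (
    '[{"target": "telegraf.linux-demo.cpu.usage_active", '
    '"datapoints": [[11.2, 1700000000], [13.7, 1700000060], [null, 1700000120]]}]'
)
print(render_url("http://localhost:8080", "telegraf.linux-demo.cpu.usage_active"))
print(latest_value(sample))
```

An empty list from the render API is what a query against servers.* would produce in this environment.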
8. Create First Recording Rule: CPU Usage
Now go to:
Alerting → Alert rules → New recording rule
Your screen has these main sections:
1. Enter recording rule and metric name
2. Define recording rule
3. Add folder and labels
4. Set evaluation behavior
Section 1: Enter Recording Rule and Metric Name
Fill the fields like this:
Name
record_cpu_usage_active_percent
Metric
linux_cpu_usage_active_percent
Target data source
Select:
prometheus-recording
Important: the metric name must be a valid Prometheus metric name and must not contain whitespace. (Grafana Labs)
Do not use dots in the recorded metric name.
Wrong:
linux.cpu.usage.active.percent
Correct:
linux_cpu_usage_active_percent
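A quick way to check a candidate name is to test it against the classic Prometheus metric name pattern (letters, digits, underscores, and colons, with no leading digit). The sketch below assumes that pattern is what the form enforces; Grafana's exact validation may differ. `is_valid_metric_name` and `suggest_metric_name` are hypothetical helpers, and the `linux` prefix simply mirrors the names used in this lab.

```python
import re

# Classic Prometheus metric name pattern: no dots, no spaces, no leading digit.
METRIC_NAME_RE = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")


def is_valid_metric_name(name: str) -> bool:
    """True when the whole string matches the classic metric name pattern."""
    return METRIC_NAME_RE.fullmatch(name) is not None


def suggest_metric_name(graphite_path: str, prefix: str = "linux") -> str:
    """Turn a Graphite path into a Prometheus-style starting point for a name."""
    parts = graphite_path.split(".")
    # Drop the first two segments (collector and host), keep the measurement part.
    tail = parts[2:] if len(parts) > 2 else parts
    raw = "_".join([prefix] + tail)
    # Replace anything the pattern does not allow, such as hyphens.
    return re.sub(r"[^a-zA-Z0-9_]", "_", raw)


print(is_valid_metric_name("linux.cpu.usage.active.percent"))
print(is_valid_metric_name("linux_cpu_usage_active_percent"))
print(suggest_metric_name("telegraf.linux-demo.cpu.usage_active"))
```

The suggestion is only a starting point: the lab names add a unit suffix such as `_percent` by hand.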
Section 2: Define Recording Rule
In the query area, change the data source from CloudWatch to:
Graphite
Use this Graphite query:
telegraf.linux-demo.cpu.usage_active
Set time range:
10m to now
Click:
Run queries
You should see data in the preview graph.
Now add an expression:
Add expression → Reduce
Set:
Input: A
Function: Last
Mode: Strict
Click:
Preview
Then click:
Set as recording rule output
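Why Reduce? Graphite returns a whole time series for the 10-minute query window, but a recording rule writes one value per evaluation. Reduce with Last collapses the series to its most recent point, and that single value is what gets stored at each evaluation interval. This is why the Reduce expression, not query A, must be set as the recording rule output.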
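The effect of the Reduce step, and of its Mode setting, can be illustrated with a small model. This is a simplified sketch of how the Strict and Drop non-numeric modes are understood to treat the Last function, not Grafana's code; `reduce_last` and the sample values are hypothetical.

```python
import math
from typing import List, Optional


def reduce_last(values: List[Optional[float]], mode: str = "strict") -> float:
    """Simplified model of a Reduce expression using the Last function.

    mode="strict": return the final point as-is, so a trailing null becomes NaN.
    mode="drop":   ignore null/NaN points, then return the last remaining one.
    """
    cleaned = [math.nan if v is None else v for v in values]
    if mode == "drop":
        cleaned = [v for v in cleaned if not math.isnan(v)]
    if not cleaned:
        return math.nan  # an empty series has no last value
    return cleaned[-1]


window = [10.0, 12.0, None]  # the newest bucket has not been filled yet
print(reduce_last(window, mode="strict"))
print(reduce_last(window, mode="drop"))
```

If the preview shows no value even though the graph has data, a null in the newest Graphite bucket is one possible cause worth checking; switching the mode to drop non-numeric values is one way to test that theory.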
Section 3: Add Folder and Labels
Folder:
Linux Monitoring Lab
If it does not exist, click:
New folder
Add labels:
source = graphite
host = linux-demo
collector = telegraf
lab = grafana13
Labels help students search, filter, and organize rules.
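Assuming these rule labels end up on the recorded series, as the verification step in this lab expects, they can be turned into a PromQL selector for filtering. The helper below is a minimal sketch; `selector` is a hypothetical name, and it only produces exact-match (`=`) matchers.

```python
from typing import Dict


def selector(metric: str, labels: Dict[str, str]) -> str:
    """Build a PromQL instant selector with exact-match label filters."""
    if not labels:
        return metric

    def esc(value: str) -> str:
        # Escape backslashes and double quotes inside label values.
        return value.replace("\\", "\\\\").replace('"', '\\"')

    matchers = ",".join(k + '="' + esc(v) + '"' for k, v in sorted(labels.items()))
    return metric + "{" + matchers + "}"


print(selector("linux_cpu_usage_active_percent", {"host": "linux-demo"}))
print(selector("linux_cpu_usage_active_percent", {"source": "graphite", "host": "linux-demo"}))
```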
Section 4: Set Evaluation Behavior
Create or select evaluation group:
graphite-recording-rules-1m
Evaluation interval:
1m
This means Grafana will run the recording rule every 1 minute.
Click:
Save rule
or:
Save rule and exit
9. Verify the Recorded Metric
Wait 1–2 minutes.
Go to:
Explore
Select data source:
prometheus-recording
Run:
linux_cpu_usage_active_percent
If labels were added, you can also query:
linux_cpu_usage_active_percent{host="linux-demo"}
Expected result:
A new Prometheus-style metric appears.
This metric was created from the original Graphite metric:
telegraf.linux-demo.cpu.usage_active
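The recorded metric can also be verified directly against the Prometheus HTTP API, bypassing Grafana. The sketch below builds an instant-query URL for /api/v1/query and parses a response in the documented instant-vector shape; `instant_query_url`, `parse_instant_vector`, and the sample body are hypothetical, and the value in the sample is made up.

```python
import json
from typing import Dict, List, Tuple
from urllib.parse import urlencode


def instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL for a Prometheus instant query."""
    query = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query


def parse_instant_vector(body: str) -> List[Tuple[Dict[str, str], float]]:
    """Return (labels, value) pairs from an instant-query JSON response."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError("query failed: " + str(payload.get("error", "unknown error")))
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected a vector, got " + data["resultType"])
    # Each result carries its labels and a [timestamp, "value as string"] pair.
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


sample_body = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [{
            "metric": {"__name__": "linux_cpu_usage_active_percent", "host": "linux-demo"},
            "value": [1700000120, "14.25"],
        }],
    },
})
print(instant_query_url("http://localhost:9090", "linux_cpu_usage_active_percent"))
print(parse_instant_vector(sample_body))
```

An empty `result` list after a few minutes suggests the rule is not writing to this target, which points back to the Target data source setting.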
10. Required Recording Rules for Student Lab
Create these recording rules one by one. They only use the real Graphite metrics captured in your environment.
Rule 1: CPU Active Usage
Name:
record_cpu_usage_active_percent
Metric:
linux_cpu_usage_active_percent
Source data source:
Graphite
Query:
telegraf.linux-demo.cpu.usage_active
Expression:
Reduce → Last
Target data source:
prometheus-recording
Rule 2: Memory Used Percent
Name:
record_memory_used_percent
Metric:
linux_memory_used_percent
Source data source:
Graphite
Query:
telegraf.linux-demo.mem.used_percent
Expression:
Reduce → Last
Target data source:
prometheus-recording
Rule 3: Disk Used Percent
Name:
record_disk_used_percent
Metric:
linux_disk_used_percent
Source data source:
Graphite
Query:
telegraf.linux-demo.disk.used_percent
Expression:
Reduce → Last
Target data source:
prometheus-recording
Rule 4: Disk Inodes Used Percent
Name:
record_disk_inodes_used_percent
Metric:
linux_disk_inodes_used_percent
Source data source:
Graphite
Query:
telegraf.linux-demo.disk.inodes_used_percent
Expression:
Reduce → Last
Target data source:
prometheus-recording
Rule 5: Swap Used Percent
Name:
record_swap_used_percent
Metric:
linux_swap_used_percent
Source data source:
Graphite
Query:
telegraf.linux-demo.swap.used_percent
Expression:
Reduce → Last
Target data source:
prometheus-recording
Rule 6: System Load 1 Minute
Name:
record_system_load1
Metric:
linux_system_load1
Source data source:
Graphite
Query:
telegraf.linux-demo.system.load1
Expression:
Reduce → Last
Target data source:
prometheus-recording
Rule 7: Running Processes
Name:
record_processes_running
Metric:
linux_processes_running
Source data source:
Graphite
Query:
telegraf.linux-demo.processes.running
Expression:
Reduce → Last
Target data source:
prometheus-recording
Rule 8: Zombie Processes
Name:
record_processes_zombies
Metric:
linux_processes_zombies
Source data source:
Graphite
Query:
telegraf.linux-demo.processes.zombies
Expression:
Reduce → Last
Target data source:
prometheus-recording
Rule 9: Blocked Processes
Name:
record_processes_blocked
Metric:
linux_processes_blocked
Source data source:
Graphite
Query:
telegraf.linux-demo.processes.blocked
Expression:
Reduce → Last
Target data source:
prometheus-recording
Rule 10: Disk I/O Utilization
Name:
record_diskio_util_percent
Metric:
linux_diskio_util_percent
Source data source:
Graphite
Query:
telegraf.linux-demo.diskio.io_util
Expression:
Reduce → Last
Target data source:
prometheus-recording
Rule 11: Network Receive Bytes per Second
Name:
record_network_receive_bytes_per_second
Metric:
linux_network_receive_bytes_per_second
Source data source:
Graphite
Query:
perSecond(telegraf.linux-demo.net.bytes_recv)
Expression:
Reduce → Last
Target data source:
prometheus-recording
Rule 12: Network Transmit Bytes per Second
Name:
record_network_transmit_bytes_per_second
Metric:
linux_network_transmit_bytes_per_second
Source data source:
Graphite
Query:
perSecond(telegraf.linux-demo.net.bytes_sent)
Expression:
Reduce → Last
Target data source:
prometheus-recording
Rule 13: Disk Read Bytes per Second
Name:
record_disk_read_bytes_per_second
Metric:
linux_disk_read_bytes_per_second
Source data source:
Graphite
Query:
perSecond(telegraf.linux-demo.diskio.read_bytes)
Expression:
Reduce → Last
Target data source:
prometheus-recording
Rule 14: Disk Write Bytes per Second
Name:
record_disk_write_bytes_per_second
Metric:
linux_disk_write_bytes_per_second
Source data source:
Graphite
Query:
perSecond(telegraf.linux-demo.diskio.write_bytes)
Expression:
Reduce → Last
Target data source:
prometheus-recording
Rule 15: Network Input Errors
Name:
record_network_input_errors
Metric:
linux_network_input_errors
Source data source:
Graphite
Query:
telegraf.linux-demo.net.err_in
Expression:
Reduce → Last
Target data source:
prometheus-recording
Rule 16: Network Output Errors
Name:
record_network_output_errors
Metric:
linux_network_output_errors
Source data source:
Graphite
Query:
telegraf.linux-demo.net.err_out
Expression:
Reduce → Last
Target data source:
prometheus-recording
Rule 17: Kernel Context Switches per Second
Name:
record_kernel_context_switches_per_second
Metric:
linux_kernel_context_switches_per_second
Source data source:
Graphite
Query:
perSecond(telegraf.linux-demo.kernel.context_switches)
Expression:
Reduce → Last
Target data source:
prometheus-recording
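Rules 11 to 14 and Rule 17 wrap their metrics in perSecond() because Telegraf reports those fields as running totals since boot, and a raw total says little about current traffic. The sketch below approximates what that conversion does to one evenly spaced series. It is a simplified model written for this lab, not Graphite's source code; `per_second` and the sample counter values are hypothetical.

```python
from typing import List, Optional


def per_second(values: List[Optional[float]], step_seconds: int) -> List[Optional[float]]:
    """Approximate a per-second rate for one evenly spaced counter series."""
    rates: List[Optional[float]] = []
    prev_value: Optional[float] = None
    prev_index = 0
    for index, current in enumerate(values):
        if current is None:
            rates.append(None)  # gap in the data
            continue
        if prev_value is None or current < prev_value:
            # First known point, or the counter went backwards (for example a reboot).
            rates.append(None)
        else:
            elapsed = (index - prev_index) * step_seconds
            rates.append((current - prev_value) / elapsed)
        prev_value, prev_index = current, index
    return rates


# Made-up bytes_recv readings, one per 60 seconds, with a gap and a reset.
counter = [1000, 1600, 2200, None, 3400, 100, 700]
print(per_second(counter, 60))
```

Reduce with Last then stores only the newest rate. The same treatment may be worth considering for err_in and err_out, which Telegraf also reports as running totals: a raw total stays above 0 forever after the first error.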
11. Recommended Student Practice Sequence
Do not ask students to create all rules immediately. Use this classroom sequence:
Beginner Practice
Create only these 5 recording rules:
linux_cpu_usage_active_percent
linux_memory_used_percent
linux_disk_used_percent
linux_system_load1
linux_processes_zombies
Intermediate Practice
Add these:
linux_swap_used_percent
linux_diskio_util_percent
linux_network_receive_bytes_per_second
linux_network_transmit_bytes_per_second
Advanced Practice
Add these:
linux_disk_read_bytes_per_second
linux_disk_write_bytes_per_second
linux_network_input_errors
linux_network_output_errors
linux_kernel_context_switches_per_second
12. Create Dashboard Panels from Recorded Metrics
After recording rules are working, create a new dashboard.
Go to:
Dashboards → New → New dashboard → Add visualization
Select data source:
prometheus-recording
Now use the recorded metrics, not Graphite.
Panel 1: Recorded CPU Usage
Panel title:
Recorded CPU Usage %
Query:
linux_cpu_usage_active_percent{host="linux-demo"}
Visualization:
Stat
Unit:
Percent (0-100)
Panel 2: Recorded Memory Usage
linux_memory_used_percent{host="linux-demo"}
Visualization:
Gauge
Unit:
Percent (0-100)
Panel 3: Recorded Disk Usage
linux_disk_used_percent{host="linux-demo"}
Visualization:
Gauge
Unit:
Percent (0-100)
Panel 4: Recorded System Load
linux_system_load1{host="linux-demo"}
Visualization:
Time series
Unit:
None
Panel 5: Recorded Network Traffic
Queries:
linux_network_receive_bytes_per_second{host="linux-demo"}
linux_network_transmit_bytes_per_second{host="linux-demo"}
Visualization:
Time series
Unit:
Bytes/sec
13. Create Alerts from Recorded Metrics
Now students can create alert rules using the recorded metrics.
Go to:
Alerting → Alert rules → New alert rule
This time use New alert rule, not New recording rule.
Alert 1: High Recorded CPU Usage
Name:
High CPU Usage - Recorded
Data source:
prometheus-recording
Query:
linux_cpu_usage_active_percent{host="linux-demo"}
Condition:
WHEN Last OF QUERY IS ABOVE 90
Evaluate every:
1m
Pending period:
5m
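The interaction between the 1m evaluation interval and the 5m pending period is easier to see as a small state model. The sketch below is a simplified illustration, not Grafana's state machine: `alert_states` is a hypothetical function, and `pending_evals=5` stands for a 5m pending period checked every 1m.

```python
from typing import List


def alert_states(values: List[float], threshold: float, pending_evals: int) -> List[str]:
    """Return the alert state after each evaluation (simplified model).

    The rule turns Pending when the condition first holds, and Firing once it
    has kept holding for `pending_evals` further consecutive evaluations.
    """
    states = []
    breaches = 0
    for value in values:
        if value > threshold:
            breaches += 1
            states.append("Firing" if breaches > pending_evals else "Pending")
        else:
            breaches = 0  # any evaluation below the threshold resets the wait
            states.append("Normal")
    return states


# Made-up recorded CPU values, one per minute.
print(alert_states([95, 96, 97, 98, 99, 99], threshold=90, pending_evals=5))
print(alert_states([95, 80, 95], threshold=90, pending_evals=5))
```

In this model a short CPU spike never reaches Firing, which is the purpose of the pending period in Alert 1.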
Alert 2: High Recorded Memory Usage
Query:
linux_memory_used_percent{host="linux-demo"}
Condition:
IS ABOVE 90
Alert 3: High Recorded Disk Usage
Query:
linux_disk_used_percent{host="linux-demo"}
Condition:
IS ABOVE 90
Alert 4: Zombie Processes Detected
Query:
linux_processes_zombies{host="linux-demo"}
Condition:
IS ABOVE 0
Alert 5: Network Input Errors
Query:
linux_network_input_errors{host="linux-demo"}
Condition:
IS ABOVE 0
14. Why Your Screenshot Shows an Error
Your screenshot shows this error:
query must have either statistic or statistics field
That error is from CloudWatch, not Graphite.
It happened because the query data source is currently set to:
cloudwatch
For this lab, change it to:
Graphite
Then use a Graphite metric such as:
telegraf.linux-demo.cpu.usage_active
Also make sure the Target data source at the top is:
prometheus-recording
Not CloudWatch.
Not Graphite.
The target is where Grafana stores the newly recorded metric.
15. Troubleshooting Guide
Problem: Target data source dropdown is empty
Reason:
No Prometheus-compatible data source is available as recording-rule target.
Fix:
Add Prometheus as a data source.
Enable or keep enabled: Allow as recording rules target.
Problem: Graphite does not appear in query data source
Reason:
Graphite data source may not be configured or may not support alerting in this Grafana setup.
Fix:
Connections → Data sources → Graphite → Save & test
Then try again.
Problem: Rule saves but metric does not appear
Wait:
1 to 2 minutes
Then query in Explore using Prometheus:
linux_cpu_usage_active_percent
Also check:
Alerting → Alert rules
Make sure the rule is not paused.
Problem: Student used .wsp
Wrong:
telegraf.linux-demo.cpu.usage_active.wsp
Correct:
telegraf.linux-demo.cpu.usage_active
Problem: Student used servers.*
Wrong:
servers.cpu.usage
Correct:
telegraf.linux-demo.cpu.usage_active
Reason:
Your Graphite data does not contain servers.* metrics.
16. Final Student Lab Summary
By the end of this lab, students should understand:
1. Graphite stores the original Telegraf Linux metrics.
2. Grafana recording rules can read Graphite metrics.
3. Recording rule output must be stored in a Prometheus-compatible data source.
4. The recorded metric name must use Prometheus-style naming.
5. Dashboards and alerts can use the new recorded metric.
6. Recording rules are useful for frequently used or expensive queries.
Most important classroom rule:
Source query:
Use real Graphite metrics from telegraf.linux-demo.*
Target output:
Store as Prometheus-style metrics such as linux_cpu_usage_active_percent
Never use:
servers.*
Never include:
.wsp