OPNsense Telegraf Configuration
Overview
This role configures Telegraf metrics collection agent on OPNsense firewall by deploying custom monitoring scripts and configuring exec input plugins. It creates a scripts directory, deploys a CPU temperature monitoring script, adds exec input configuration to telegraf.conf, and restarts the Telegraf service to activate monitoring. This enables CPU temperature metrics to be collected and sent to InfluxDB for visualization in Grafana.
Purpose
- CPU Temperature Monitoring: Collect thermal metrics from OPNsense
- Custom Metrics: Deploy shell scripts for specialized monitoring
- InfluxDB Integration: Send metrics to time-series database
- Grafana Visualization: Enable temperature dashboards
- Exec Input Pattern: Use Telegraf exec plugin for custom collectors
- Automated Configuration: Deploy and configure via SSH
Requirements
- Ansible 2.9 or higher
- OPNsense firewall with Telegraf installed (see opnsense_install_packages)
- SSH access to OPNsense
- Root or sudo privileges on OPNsense
- Network connectivity to OPNsense (VLAN10)
- InfluxDB server configured (Telegraf output already configured)
What is Telegraf Exec Input?
Telegraf exec input runs external programs to collect metrics:
How it works:
Telegraf → Runs shell script → Parses output → Sends to InfluxDB
Example:
# Script output (InfluxDB line protocol)
sensors,feature=cpu0 temp_input=45.0
sensors,feature=cpu1 temp_input=46.5
# Telegraf collects this every interval
# Sends to InfluxDB
Benefits:
- Collect any metric via script
- Use existing monitoring scripts
- Parse custom data sources
- Extend Telegraf functionality
Role Variables
Optional Variables
| Variable | Default | Description |
|---|---|---|
| opnsense_telegraf_configuration_scripts_folder | /usr/local/etc/telegraf-scripts | Scripts directory path |
| opnsense_telegraf_configuration_temperature_script_name | cputemp.sh | CPU temp script filename |
| opnsense_telegraf_configuration_exec_timeout | 5s | Script execution timeout |
| opnsense_telegraf_configuration_data_format | influx | Data format (influx line protocol) |
| opnsense_telegraf_configuration_config_file | /usr/local/etc/telegraf.conf | Telegraf config path |
Variable Details
opnsense_telegraf_configuration_scripts_folder
Directory to store custom monitoring scripts.
Default: /usr/local/etc/telegraf-scripts
Created automatically if it doesn’t exist.
opnsense_telegraf_configuration_temperature_script_name
Name of CPU temperature monitoring script.
Default: cputemp.sh
Full path: <scripts_folder>/cputemp.sh
opnsense_telegraf_configuration_exec_timeout
Maximum time to wait for script execution.
Default: 5s
Values: Duration string (e.g., 5s, 10s, 1m)
Timeout behavior: If the script runs longer than this, Telegraf kills it and reports an error.
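To judge whether a collector fits inside the configured timeout before Telegraf enforces it, the script can be run by hand under the same limit. A minimal sketch, assuming the timeout(1) utility is available on the host (it ships with FreeBSD base and GNU coreutils); `runs_within` is a hypothetical helper name, not part of the role:

```shell
#!/bin/sh
# Hypothetical helper: run a command under a time limit, similar in spirit to
# the exec input's timeout. Returns 0 if the command finished in time,
# 1 if it was killed for exceeding the limit, 2 if it failed on its own.
runs_within() {
    limit="$1"
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    status=$?
    case "$status" in
        0)   return 0 ;;
        124) return 1 ;;   # timeout(1) exits with 124 when the limit is hit
        *)   return 2 ;;
    esac
}
```

For example, `runs_within 5 sh /usr/local/etc/telegraf-scripts/cputemp.sh && echo "fits within 5s"` gives a rough indication of whether the default is sufficient.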
opnsense_telegraf_configuration_data_format
Output format from exec script.
Default: influx (InfluxDB line protocol)
Other formats: json, graphite, value, nagios
Influx format:
measurement,tag=value field=value timestamp
Dependencies
This role requires:
- opnsense_install_packages: Install os-telegraf package first
- SSH access to OPNsense
- Telegraf service installed and running
Optionally used with:
- telegraf_agent: Configure Telegraf on other hosts
- deploy_influxdb: Deploy InfluxDB server
- deploy_grafana: Visualize metrics
Example Playbook
Basic Usage
---
- name: Configure OPNsense Telegraf
  hosts: opnsense
  become: true
  roles:
    - opnsense_telegraf_configuration
Custom Script Directory
---
- name: Configure Telegraf with Custom Path
  hosts: opnsense
  become: true
  vars:
    opnsense_telegraf_configuration_scripts_folder: /usr/local/scripts
  roles:
    - opnsense_telegraf_configuration
Longer Timeout
---
- name: Configure Telegraf with Extended Timeout
  hosts: opnsense
  become: true
  vars:
    opnsense_telegraf_configuration_exec_timeout: 10s
  roles:
    - opnsense_telegraf_configuration
What This Role Does
- Create scripts directory:
  - Creates /usr/local/etc/telegraf-scripts
  - Sets permissions: 0755 (rwxr-xr-x)
  - Ensures directory exists
- Deploy CPU temperature script:
  - Copies cputemp.sh to scripts folder
  - Sets executable permissions: 0755
  - Script queries sysctl dev.cpu for temperatures
- Configure exec input:
  - Adds configuration block to /usr/local/etc/telegraf.conf
  - Uses blockinfile with markers for idempotency
  - Configures command path, timeout, data format
- Restart Telegraf:
  - Restarts telegraf service
  - Activates new configuration
  - Begins collecting CPU temperature metrics
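Done by hand over SSH, the directory creation and script deployment amount to a `mkdir` and a file install. A rough shell equivalent, assuming the default paths from this README; `install_collector` is a hypothetical name, and the role itself uses Ansible modules rather than these commands:

```shell
#!/bin/sh
# Hypothetical manual equivalent of the directory and script steps.
# $1 = local script to deploy, $2 = target scripts directory.
install_collector() {
    src="$1"
    dir="$2"
    # Create the scripts directory with mode 0755 (no-op if already present).
    mkdir -p "$dir" && chmod 0755 "$dir" || return 1
    # Copy the script and mark it executable in one go.
    install -m 0755 "$src" "$dir/$(basename "$src")"
}
```

A call such as `install_collector cputemp.sh /usr/local/etc/telegraf-scripts`, followed by `service telegraf restart`, would cover everything except the telegraf.conf change.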
CPU Temperature Script
Script Content
File: cputemp.sh
sysctl dev.cpu | grep temperature | sed 's/[a-z\.]*/sensors,feature=cpu/;s/\.[a-z]*\: / temp_input=/;s/.$//'
What It Does
- Query sysctl: sysctl dev.cpu - Get CPU information
- Filter temperature: grep temperature - Only temperature readings
- Format output: sed - Convert to InfluxDB line protocol
Example Output
Raw sysctl output:
dev.cpu.0.temperature: 45.0C
dev.cpu.1.temperature: 46.5C
dev.cpu.2.temperature: 44.2C
dev.cpu.3.temperature: 47.1C
Script output (InfluxDB format):
sensors,feature=cpu0 temp_input=45.0
sensors,feature=cpu1 temp_input=46.5
sensors,feature=cpu2 temp_input=44.2
sensors,feature=cpu3 temp_input=47.1
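To see how the three sed substitutions reshape a line, it helps to apply them to canned input rather than live sysctl output. The sketch below uses the same substitutions as cputemp.sh, written one per `-e` with the redundant backslashes dropped; `format_temps` and the sample readings are illustrative only. Because the first substitution stops at the first digit, the core index stays attached to the tag value (`cpu0`, `cpu1`, and so on):

```shell
#!/bin/sh
# Illustrative only: read "dev.cpu.N.temperature: XX.XC" lines on stdin and
# print InfluxDB line protocol, using the same substitutions as cputemp.sh.
format_temps() {
    sed -e 's/[a-z.]*/sensors,feature=cpu/' \
        -e 's/\.[a-z]*: / temp_input=/' \
        -e 's/.$//'
    # 1st: the leading "dev.cpu." (letters and dots) becomes "sensors,feature=cpu"
    # 2nd: ".temperature: " becomes " temp_input="
    # 3rd: the trailing "C" unit is dropped
}

# Sample readings standing in for: sysctl dev.cpu | grep temperature
printf '%s\n' 'dev.cpu.0.temperature: 45.0C' 'dev.cpu.1.temperature: 46.5C' \
    | format_temps
```

Feeding fixed lines through the function makes it easy to check a change to the expressions before deploying it to the firewall.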
InfluxDB Line Protocol
Format:
measurement,tag1=value1,tag2=value2 field1=value1,field2=value2 timestamp
In this script:
- Measurement: sensors
- Tag: feature=cpu<N> (the core index from sysctl is kept, e.g. cpu0)
- Field: temp_input=<temperature>
- Timestamp: Added by Telegraf automatically
Why tags: Tags are indexed, enabling fast queries like “show me all CPU metrics”
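When writing further collectors, a small helper keeps the measurement, tag, and field parts in the right order. A minimal sketch; `lp_point` is a hypothetical function, it does no escaping of spaces or commas, and it omits the timestamp so that Telegraf assigns one:

```shell
#!/bin/sh
# Hypothetical helper: print one line-protocol point.
# $1 = measurement, $2 = comma-separated tags (may be empty),
# $3 = comma-separated fields. No escaping is performed.
lp_point() {
    measurement="$1"
    tags="$2"
    fields="$3"
    if [ -n "$tags" ]; then
        printf '%s,%s %s\n' "$measurement" "$tags" "$fields"
    else
        printf '%s %s\n' "$measurement" "$fields"
    fi
}

lp_point sensors "feature=cpu0" "temp_input=45.0"
lp_point uptime "" "value=86400"
```

Keeping values that identify a series (such as the core) in the tag argument and the measured number in the field argument follows the tag/field split described above.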
Telegraf Configuration
Exec Input Block
Added to telegraf.conf:
[[inputs.exec]]
commands = [
"sh /usr/local/etc/telegraf-scripts/cputemp.sh",
]
timeout = "5s"
data_format = "influx"
Configuration Details
commands: List of commands to execute
- Can have multiple scripts
- Executed every collection interval
timeout: Max execution time
- Prevents hung scripts
- Default: 5 seconds
data_format: Output parser
- influx: InfluxDB line protocol
- json: JSON output
- graphite: Graphite format
Idempotency
Uses blockinfile with markers:
marker: "# {mark} ANSIBLE MANAGED BLOCK - CPU Temperature Monitoring"
First run: Adds configuration block
Subsequent runs: Updates block if changed, otherwise no-op
Safe to run repeatedly without duplicating configuration.
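The marker technique is easy to picture outside Ansible. The sketch below is a simplified shell stand-in for what blockinfile does, assuming the BEGIN/END marker text that results from the marker shown above: it appends the block only when the BEGIN marker line is absent. Unlike blockinfile, it does not rewrite an existing block whose content has changed; `ensure_block` is a hypothetical name:

```shell
#!/bin/sh
# Simplified stand-in for blockinfile: append a marked block at most once.
# $1 = config file, $2 = marker label, $3 = block content.
ensure_block() {
    file="$1"
    label="$2"
    block="$3"
    begin="# BEGIN ANSIBLE MANAGED BLOCK - $label"
    end="# END ANSIBLE MANAGED BLOCK - $label"
    # -F: fixed string, -x: whole line, -q: quiet. Present means nothing to do.
    if grep -Fxq -- "$begin" "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s\n%s\n%s\n' "$begin" "$block" "$end" >> "$file"
}
```

Calling `ensure_block` twice with the same label leaves a single copy of the block in the file, which is the property that makes repeated runs safe.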
InfluxDB Measurement
Data Storage
Measurement name: sensors
Tags:
- feature=cpu<N> (one value per core, e.g. cpu0)
- host=opnsense (added by Telegraf)
Fields:
temp_input: Temperature value (float)
Query Examples
InfluxQL:
-- Get latest temperature per core
SELECT last(temp_input) FROM sensors WHERE feature =~ /^cpu/ AND host='opnsense' GROUP BY feature
-- Average per core over the last hour
SELECT mean(temp_input) FROM sensors
  WHERE feature =~ /^cpu/ AND host='opnsense'
  AND time > now() - 1h
  GROUP BY time(5m), feature
Flux (InfluxDB 2.x):
from(bucket: "telegraf")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "sensors")
  |> filter(fn: (r) => r.feature =~ /^cpu/)
  |> filter(fn: (r) => r.host == "opnsense")
  |> filter(fn: (r) => r._field == "temp_input")
Grafana Visualization
Create Dashboard Panel
Query:
SELECT mean("temp_input")
FROM "sensors"
WHERE ("host" = 'opnsense' AND "feature" =~ /^cpu/)
AND $timeFilter
GROUP BY time($__interval), "feature" fill(linear)
Panel settings:
- Visualization: Time series or Gauge
- Unit: Celsius (°C)
- Thresholds:
- Green: < 60°C
- Yellow: 60-80°C
- Red: > 80°C
Alert Configuration
Alert rule:
Condition:
WHEN avg() OF query(A, 5m, now) IS ABOVE 80
Notifications:
Send to: admin-email
Message: "OPNsense CPU temperature high: {{ value }}°C"
File Locations
| File | Path | Purpose |
|---|---|---|
| Scripts directory | /usr/local/etc/telegraf-scripts/ | Custom monitoring scripts |
| CPU temp script | /usr/local/etc/telegraf-scripts/cputemp.sh | Temperature collector |
| Telegraf config | /usr/local/etc/telegraf.conf | Main Telegraf configuration |
| Telegraf service | /usr/local/etc/rc.d/telegraf | FreeBSD service script |
| Telegraf logs | /var/log/telegraf/ | Service logs |
Security Considerations
- SSH Access: Role requires SSH with root/sudo
- Script Permissions: Scripts executable by Telegraf user
- Command Injection: Scripts don’t accept external input (safe)
- Read-Only Data: Scripts only read system data
- No Network Access: Scripts run locally on OPNsense
- Resource Usage: Scripts lightweight (negligible CPU/memory)
- Timeout Protection: Prevents runaway scripts
Tags
This role does not define any tags. Use playbook-level tags if needed:
- hosts: opnsense
  roles:
    - opnsense_telegraf_configuration
  tags:
    - opnsense
    - telegraf
    - monitoring
    - metrics
Notes
- Requires Telegraf already installed (via opnsense_install_packages)
- Runs directly on OPNsense (not delegated to localhost)
- become: true required for file operations and service restart
- Configuration uses blockinfile for idempotency
- Scripts run every Telegraf collection interval (typically 10s)
- CPU temperature from sysctl dev.cpu
- Output in InfluxDB line protocol format
- Restarts Telegraf to apply changes
Troubleshooting
No temperature data in InfluxDB
Check Telegraf running:
# On OPNsense
service telegraf status
Check script executable:
ls -la /usr/local/etc/telegraf-scripts/cputemp.sh
# Should be: -rwxr-xr-x
Test script manually:
sh /usr/local/etc/telegraf-scripts/cputemp.sh
# Should output one line per core: sensors,feature=cpu0 temp_input=XX.X
Check Telegraf logs:
tail -f /var/log/telegraf/telegraf.log
# Look for exec plugin errors
Script timeout errors
Symptom: Telegraf logs show timeout
Error in plugin [inputs.exec]: exec: signal: killed
Solution: Increase timeout
opnsense_telegraf_configuration_exec_timeout: 10s
Permission denied errors
Symptom: Telegraf can’t execute script
Error in plugin [inputs.exec]: exec: fork/exec: permission denied
Solution: Check script permissions
chmod +x /usr/local/etc/telegraf-scripts/cputemp.sh
Invalid InfluxDB format
Symptom: Telegraf logs show parse errors
Error in plugin [inputs.exec]: metric parse error
Test script output:
sh /usr/local/etc/telegraf-scripts/cputemp.sh
Expected format:
sensors,feature=cpu0 temp_input=45.0
Not:
Temperature: 45.0C # Wrong format
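A quick pre-check can catch malformed lines before Telegraf does. The sketch below is a deliberately narrow approximation, not a full line-protocol parser: it accepts only unescaped names, optional tags, and numeric fields, which is enough for output shaped like that of cputemp.sh. `check_lp` is a hypothetical name:

```shell
#!/bin/sh
# Narrow approximation of a line-protocol check (numeric fields only).
# Reads lines on stdin; returns 0 only if every line matches the pattern.
check_lp() {
    name='[A-Za-z_][A-Za-z0-9_]*'
    tag="$name=[A-Za-z0-9_.-]+"
    field="$name=-?[0-9]+(\\.[0-9]+)?"
    pattern="^$name(,$tag)* $field(,$field)*\$"
    # -v selects non-matching lines; if any exist, the input is rejected.
    ! grep -Ev -- "$pattern" >/dev/null
}
```

Piping the collector through it, for example `sh /usr/local/etc/telegraf-scripts/cputemp.sh | check_lp && echo "format looks valid"`, flags a leftover unit suffix or a human-readable label. Note that empty input also passes, so an empty result still needs a separate look.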
Telegraf not restarting
Manual restart:
service telegraf restart
service telegraf status
Check service:
# FreeBSD service check
service telegraf onestatus
Testing the Role
Verify Files Created
# Check scripts directory
ls -la /usr/local/etc/telegraf-scripts/
# Check CPU temp script
ls -la /usr/local/etc/telegraf-scripts/cputemp.sh
cat /usr/local/etc/telegraf-scripts/cputemp.sh
Test Script
# Run script manually
sh /usr/local/etc/telegraf-scripts/cputemp.sh
# Expected output:
# sensors,feature=cpu0 temp_input=45.0
# sensors,feature=cpu1 temp_input=46.5
# ...
Verify Telegraf Config
# Check configuration block added
grep -A 10 "CPU Temperature Monitoring" /usr/local/etc/telegraf.conf
# Should show:
# [[inputs.exec]]
# commands = [...]
# timeout = "5s"
# data_format = "influx"
Verify Telegraf Running
# Check service status
service telegraf status
# Check process
ps aux | grep telegraf
Query InfluxDB
# Via influx CLI
influx -database telegraf -execute "SELECT * FROM sensors WHERE feature =~ /^cpu/ ORDER BY time DESC LIMIT 10"
# Via curl
curl -G 'http://influxdb:8086/query?db=telegraf' \
  --data-urlencode "q=SELECT * FROM sensors WHERE feature =~ /^cpu/ ORDER BY time DESC LIMIT 10"
Check Grafana
- Log into Grafana
- Explore → Select InfluxDB datasource
- Query: SELECT temp_input FROM sensors WHERE feature =~ /^cpu/
- Should see temperature data points
Best Practices
- Test scripts manually: Verify output before deploying
- Set appropriate timeouts: Based on script complexity
- Monitor script errors: Check Telegraf logs regularly
- Use line protocol: InfluxDB format for best performance
- Tag appropriately: Use tags for dimensions, fields for values
- Keep scripts simple: Complex logic belongs in applications
- Handle errors: Scripts should not fail catastrophically
- Document custom scripts: Comment what they measure
- Version control: Store scripts in git
- Regular testing: Ensure scripts work after OPNsense updates
Extending This Role
Add Additional Scripts
Create new script:
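# files/disktemp.sh
# Emits the number of attached disks; -n drops the "kern.disks:" prefix
# so that NF counts only the device names.
sysctl -n kern.disks | awk '{print "disk_count count=" NF}'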
Update role:
- name: Deploy disk temperature script
  ansible.builtin.copy:
    src: disktemp.sh
    dest: "{{ opnsense_telegraf_configuration_scripts_folder }}/disktemp.sh"
    mode: '0755'
- name: Add disk exec input
  ansible.builtin.blockinfile:
    path: "{{ opnsense_telegraf_configuration_config_file }}"
    marker: "# {mark} ANSIBLE MANAGED BLOCK - Disk Monitoring"
    block: |
      [[inputs.exec]]
        commands = [
          "sh {{ opnsense_telegraf_configuration_scripts_folder }}/disktemp.sh",
        ]
        timeout = "5s"
        data_format = "influx"
Multiple Measurements
Script with multiple measurements:
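#!/bin/sh
# Output multiple measurements in InfluxDB line protocol
# kern.boottime holds the boot timestamp, so subtract it from now to get uptime in seconds
boot=$(sysctl -n kern.boottime | awk '{print $4}' | tr -d ',')
echo "uptime value=$(( $(date +%s) - boot ))"
# -o pid= suppresses the ps header; tr strips the padding BSD wc adds
echo "processes count=$(ps ax -o pid= | wc -l | tr -d ' ')"
# vm.loadavg prints "{ 1min 5min 15min }", so field 2 is the 1-minute load
echo "load,period=1min value=$(sysctl -n vm.loadavg | awk '{print $2}')"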
Related Roles
This role is often used with:
- opnsense_install_packages: Install os-telegraf package
- telegraf_agent: Configure Telegraf on other hosts
- deploy_influxdb: Deploy InfluxDB server
- deploy_grafana: Visualize metrics in dashboards
Performance Impact
- CPU: Negligible (~0.1% per script execution)
- Memory: Minimal (~1-2MB for Telegraf)
- Disk I/O: Low (script reads sysctl only)
- Network: Depends on InfluxDB connection
- Frequency: Default 10s interval (configurable in main telegraf.conf)
Not a concern for typical homelab or production use.
License
MIT
Author
Created for homelab infrastructure management.