OPNsense Telegraf Configuration Role

Overview

This role configures the Telegraf metrics collection agent on an OPNsense firewall by deploying custom monitoring scripts and configuring exec input plugins. It creates a scripts directory, deploys a CPU temperature monitoring script, adds an exec input block to telegraf.conf, and restarts the Telegraf service to activate monitoring. CPU temperature metrics can then be collected and sent to InfluxDB for visualization in Grafana.

Purpose

CPU Temperature Monitoring: Collect thermal metrics from OPNsense
Custom Metrics: Deploy shell scripts for specialized monitoring
InfluxDB Integration: Send metrics to time-series database
Grafana Visualization: Enable temperature dashboards
Exec Input Pattern: Use Telegraf exec plugin for custom collectors
Automated Configuration: Deploy and configure via SSH

Requirements

Ansible 2.9 or higher
OPNsense firewall with Telegraf installed (see opnsense_install_packages)
SSH access to OPNsense
Root or sudo privileges on OPNsense
Network connectivity to OPNsense (VLAN10)
InfluxDB server configured (Telegraf output already configured)

What is Telegraf Exec Input?

Telegraf exec input runs external programs to collect metrics:

How it works:

Telegraf → Runs shell script → Parses output → Sends to InfluxDB

Example:

# Script output (InfluxDB line protocol), one line per CPU core
sensors,feature=cpu0 temp_input=45.0
sensors,feature=cpu1 temp_input=46.5

# Telegraf collects this every interval
# Sends to InfluxDB

Benefits:

Collect any metric via script
Use existing monitoring scripts
Parse custom data sources
Extend Telegraf functionality
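
As a rough illustration of the pattern, an exec collector only has to print line protocol on stdout. The sketch below assumes a hypothetical helper named emit_point, which is not part of this role, and prints one point per call.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with this role): print one line protocol point.
# Usage: emit_point MEASUREMENT TAGS FIELDS   (TAGS may be an empty string)
emit_point() {
    ep_measurement=$1
    ep_tags=$2
    ep_fields=$3
    if [ -n "$ep_tags" ]; then
        printf '%s,%s %s\n' "$ep_measurement" "$ep_tags" "$ep_fields"
    else
        printf '%s %s\n' "$ep_measurement" "$ep_fields"
    fi
}
```

Calling emit_point sensors feature=cpu0 temp_input=45.0 prints a single point with one tag and one field. Telegraf adds the timestamp when it reads the output.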

Role Variables

Optional Variables

Variable	Default	Description
`opnsense_telegraf_configuration_scripts_folder`	`/usr/local/etc/telegraf-scripts`	Scripts directory path
`opnsense_telegraf_configuration_temperature_script_name`	`cputemp.sh`	CPU temp script filename
`opnsense_telegraf_configuration_exec_timeout`	`5s`	Script execution timeout
`opnsense_telegraf_configuration_data_format`	`influx`	Data format (influx line protocol)
`opnsense_telegraf_configuration_config_file`	`/usr/local/etc/telegraf.conf`	Telegraf config path

Variable Details

opnsense_telegraf_configuration_scripts_folder

Directory to store custom monitoring scripts.

Default: /usr/local/etc/telegraf-scripts

Created automatically if it does not exist.

opnsense_telegraf_configuration_temperature_script_name

Name of CPU temperature monitoring script.

Default: cputemp.sh

Full path: <scripts_folder>/cputemp.sh

opnsense_telegraf_configuration_exec_timeout

Maximum time to wait for script execution.

Default: 5s

Values: Duration string (e.g., 5s, 10s, 1m)

Timeout behavior: If the script runs longer than this, Telegraf kills it and reports an error.
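
To get a feel for a timeout value before changing the variable, a collector can be run by hand under timeout(1), which ships with FreeBSD and GNU coreutils. This only approximates what Telegraf does internally, and run_with_timeout is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: run a command under a time limit, similar in spirit to
# the exec plugin's timeout setting. timeout(1) from FreeBSD or GNU coreutils
# exits with status 124 when the limit is reached.
run_with_timeout() {
    rwt_limit=$1
    shift
    timeout "$rwt_limit" "$@"
    rwt_status=$?
    if [ "$rwt_status" -eq 124 ]; then
        echo "timed out after $rwt_limit" >&2
    fi
    return "$rwt_status"
}
```

For example, run_with_timeout 5s sh /usr/local/etc/telegraf-scripts/cputemp.sh returns 124 if the script needs more than five seconds.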

opnsense_telegraf_configuration_data_format

Output format from exec script.

Default: influx (InfluxDB line protocol)

Other formats: json, graphite, value, nagios

Influx format:

measurement,tag=value field=value timestamp

Dependencies

This role requires:

opnsense_install_packages: Install os-telegraf package first
SSH access to OPNsense
Telegraf service installed and running

Optionally used with:

telegraf_agent: Configure Telegraf on other hosts
deploy_influxdb: Deploy InfluxDB server
deploy_grafana: Visualize metrics

Example Playbook

Basic Usage

---
- name: Configure OPNsense Telegraf
  hosts: opnsense
  become: true

  roles:
    - opnsense_telegraf_configuration

Custom Script Directory

---
- name: Configure Telegraf with Custom Path
  hosts: opnsense
  become: true

  vars:
    opnsense_telegraf_configuration_scripts_folder: /usr/local/scripts

  roles:
    - opnsense_telegraf_configuration

Longer Timeout

---
- name: Configure Telegraf with Extended Timeout
  hosts: opnsense
  become: true

  vars:
    opnsense_telegraf_configuration_exec_timeout: 10s

  roles:
    - opnsense_telegraf_configuration

What This Role Does

Create scripts directory:
- Creates /usr/local/etc/telegraf-scripts
- Sets permissions: 0755 (rwxr-xr-x)
- Ensures directory exists
Deploy CPU temperature script:
- Copies cputemp.sh to scripts folder
- Sets executable permissions: 0755
- Script queries sysctl dev.cpu for temperatures
Configure exec input:
- Adds configuration block to /usr/local/etc/telegraf.conf
- Uses blockinfile with markers for idempotency
- Configures command path, timeout, data format
Restart Telegraf:
- Restarts telegraf service
- Activates new configuration
- Begins collecting CPU temperature metrics
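
For orientation, the first two steps correspond roughly to the shell commands below. This is a hand-written approximation, not the role's actual tasks. The name deploy_collector is hypothetical, and the config edit and service restart are left out.

```shell
#!/bin/sh
# Hypothetical manual equivalent of the directory and script steps.
# Usage: deploy_collector SOURCE_SCRIPT SCRIPTS_DIR
deploy_collector() {
    dc_src=$1
    dc_dir=$2
    # Create the scripts directory if it is missing and set mode 0755.
    mkdir -p "$dc_dir" || return 1
    chmod 0755 "$dc_dir" || return 1
    # Copy the script and set mode 0755 in one step.
    install -m 0755 "$dc_src" "$dc_dir/$(basename "$dc_src")"
}
```

Running it twice leaves the same result, which mirrors the idempotent behavior expected from the Ansible tasks.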

CPU Temperature Script

Script Content

File: cputemp.sh

sysctl dev.cpu | grep temperature | sed 's/[a-z\.]*/sensors,feature=cpu/;s/\.[a-z]*\: / temp_input=/;s/.$//'

What It Does

Query sysctl: sysctl dev.cpu - Get CPU information
Filter temperature: grep temperature - Only temperature readings
Format output: sed - Convert to InfluxDB line protocol
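
Feeding a sample sysctl line through the three substitutions one at a time makes the transformation easier to follow. The sketch below uses expressions equivalent to those in cputemp.sh, with the unneeded backslashes removed. The wrapper name format_cpu_temps is hypothetical, and input is piped in so it can be tried on any machine.

```shell
#!/bin/sh
# Hypothetical wrapper around the sed program from cputemp.sh; reads
# sysctl-style lines on stdin so it can be tried without an OPNsense host.
format_cpu_temps() {
    # 1. Replace the leading run of letters and dots ("dev.cpu.") with the
    #    measurement and tag prefix; the core index that follows is kept.
    # 2. Replace ".temperature: " with the field key.
    # 3. Drop the trailing unit character ("C").
    sed -e 's/^[a-z.]*/sensors,feature=cpu/' \
        -e 's/\.[a-z]*: / temp_input=/' \
        -e 's/.$//'
}

printf 'dev.cpu.0.temperature: 45.0C\n' | format_cpu_temps
# prints: sensors,feature=cpu0 temp_input=45.0
```

Because the first pattern stops at the first digit, the core index stays attached to the tag value, so each core becomes its own series.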

Example Output

Raw sysctl output:

dev.cpu.0.temperature: 45.0C
dev.cpu.1.temperature: 46.5C
dev.cpu.2.temperature: 44.2C
dev.cpu.3.temperature: 47.1C

Script output (InfluxDB format; the sed expression keeps the core index in the tag value):

sensors,feature=cpu0 temp_input=45.0
sensors,feature=cpu1 temp_input=46.5
sensors,feature=cpu2 temp_input=44.2
sensors,feature=cpu3 temp_input=47.1

InfluxDB Line Protocol

Format:

measurement,tag1=value1,tag2=value2 field1=value1,field2=value2 timestamp

In this script:

Measurement: sensors
Tag: feature=cpu<N> (one value per core, e.g. cpu0)
Field: temp_input=<temperature>
Timestamp: Added by Telegraf automatically

Why tags: Tags are indexed, enabling fast queries like “show me all CPU metrics”
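
To make the parts of a point concrete, the sketch below splits a simple line into measurement, tag set, and field set using POSIX parameter expansion. It assumes no escaped spaces or commas and no trailing timestamp, and split_point is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: split an unescaped line protocol point without a
# timestamp into its measurement, tag set, and field set.
split_point() {
    sp_line=$1
    sp_head=${sp_line%% *}      # everything before the first space
    sp_fields=${sp_line#* }     # everything after the first space
    sp_measurement=${sp_head%%,*}
    case $sp_head in
        *,*) sp_tags=${sp_head#*,} ;;
        *)   sp_tags= ;;
    esac
    printf 'measurement=%s\ntags=%s\nfields=%s\n' \
        "$sp_measurement" "$sp_tags" "$sp_fields"
}

split_point 'sensors,feature=cpu0 temp_input=45.0'
# prints:
# measurement=sensors
# tags=feature=cpu0
# fields=temp_input=45.0
```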

Telegraf Configuration

Exec Input Block

Added to telegraf.conf:

[[inputs.exec]]
  commands = [
    "sh /usr/local/etc/telegraf-scripts/cputemp.sh",
  ]
  timeout = "5s"
  data_format = "influx"

Configuration Details

commands: List of commands to execute

Can have multiple scripts
Executed every collection interval

timeout: Max execution time

Prevents hung scripts
Default: 5 seconds

data_format: Output parser

influx: InfluxDB line protocol
json: JSON output
graphite: Graphite format

Idempotency

Uses blockinfile with markers:

marker: "# {mark} ANSIBLE MANAGED BLOCK - CPU Temperature Monitoring"

First run: Adds configuration block
Subsequent runs: Updates block if changed, otherwise no-op

Safe to run repeatedly without duplicating configuration.
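
The marker mechanism can be approximated in plain shell to show why repeated runs do not duplicate the block. This is a simplified sketch, not how blockinfile is implemented: upsert_block is a hypothetical helper, and unlike blockinfile it always moves the block to the end of the file.

```shell
#!/bin/sh
# Hypothetical helper: replace (or add) a marker-delimited block in a file.
# Usage: upsert_block FILE LABEL BLOCK_TEXT
upsert_block() {
    ub_file=$1
    ub_begin="# BEGIN $2"
    ub_end="# END $2"
    ub_block=$3
    ub_tmp=$(mktemp) || return 1
    [ -f "$ub_file" ] || : > "$ub_file"
    # Copy every line except an existing block between the two markers.
    awk -v b="$ub_begin" -v e="$ub_end" '
        $0 == b { skip = 1; next }
        $0 == e { skip = 0; next }
        !skip   { print }
    ' "$ub_file" > "$ub_tmp" || return 1
    # Append a fresh copy of the block, wrapped in its markers.
    printf '%s\n%s\n%s\n' "$ub_begin" "$ub_block" "$ub_end" >> "$ub_tmp"
    cat "$ub_tmp" > "$ub_file" && rm -f "$ub_tmp"
}
```

Calling it twice with the same label leaves exactly one block in the file, because the old copy is removed before the new one is written.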

InfluxDB Measurement

Data Storage

Measurement name: sensors

Tags:

feature=cpu0, cpu1, ... (one value per core)
host=opnsense (added by Telegraf)

Fields:

temp_input: Temperature value (float)

Query Examples

InfluxQL:

-- Get latest temperature per core
SELECT last(temp_input) FROM sensors WHERE feature =~ /^cpu/ AND host='opnsense' GROUP BY feature

-- Average over last hour, per core
SELECT mean(temp_input) FROM sensors
WHERE feature =~ /^cpu/ AND host='opnsense'
AND time > now() - 1h
GROUP BY time(5m), feature

Flux (InfluxDB 2.x):

from(bucket: "telegraf")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "sensors")
  |> filter(fn: (r) => r.feature =~ /^cpu/)
  |> filter(fn: (r) => r.host == "opnsense")
  |> filter(fn: (r) => r._field == "temp_input")

Grafana Visualization

Create Dashboard Panel

Query:

SELECT mean("temp_input")
FROM "sensors"
WHERE ("host" = 'opnsense' AND "feature" =~ /^cpu/)
AND $timeFilter
GROUP BY time($__interval), "feature" fill(linear)

Panel settings:

Visualization: Time series or Gauge
Unit: Celsius (°C)
Thresholds:
- Green: < 60°C
- Yellow: 60-80°C
- Red: > 80°C
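
The threshold bands can also be expressed as a small function, which is handy for checking a reading from the command line. The name temp_band is hypothetical, and treating 60 and 80 as part of the yellow band is one reading of the list above.

```shell
#!/bin/sh
# Hypothetical helper: map a Celsius reading to the dashboard color bands.
# awk is used because POSIX sh arithmetic is integer-only.
temp_band() {
    awk -v t="$1" 'BEGIN {
        t = t + 0                 # force a numeric comparison
        if (t < 60)       print "green"
        else if (t <= 80) print "yellow"
        else              print "red"
    }'
}
```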

Alert Configuration

Alert rule:

Condition:
  WHEN avg() OF query(A, 5m, now) IS ABOVE 80

Notifications:
  Send to: admin-email
  Message: "OPNsense CPU temperature high: {{ value }}°C"

File Locations

File	Path	Purpose
Scripts directory	`/usr/local/etc/telegraf-scripts/`	Custom monitoring scripts
CPU temp script	`/usr/local/etc/telegraf-scripts/cputemp.sh`	Temperature collector
Telegraf config	`/usr/local/etc/telegraf.conf`	Main Telegraf configuration
Telegraf service	`/usr/local/etc/rc.d/telegraf`	FreeBSD service script
Telegraf logs	`/var/log/telegraf/`	Service logs

Security Considerations

SSH Access: Role requires SSH with root/sudo
Script Permissions: Scripts executable by Telegraf user
Command Injection: Scripts don’t accept external input (safe)
Read-Only Data: Scripts only read system data
No Network Access: Scripts run locally on OPNsense
Resource Usage: Scripts lightweight (negligible CPU/memory)
Timeout Protection: Prevents runaway scripts

Notes

Requires Telegraf already installed (via opnsense_install_packages)
Runs directly on OPNsense (not delegated to localhost)
become: true required for file operations and service restart
Configuration uses blockinfile for idempotency
Scripts run every Telegraf collection interval (typically 10s)
CPU temperature from sysctl dev.cpu
Output in InfluxDB line protocol format
Restarts Telegraf to apply changes

Troubleshooting

No temperature data in InfluxDB

Check Telegraf running:

# On OPNsense
service telegraf status

Check script executable:

ls -la /usr/local/etc/telegraf-scripts/cputemp.sh
# Should be: -rwxr-xr-x

Test script manually:

sh /usr/local/etc/telegraf-scripts/cputemp.sh
# Should output one line per core, e.g.: sensors,feature=cpu0 temp_input=XX.X

Check Telegraf logs:

tail -f /var/log/telegraf/telegraf.log
# Look for exec plugin errors

Script timeout errors

Symptom: Telegraf logs show timeout

Error in plugin [inputs.exec]: exec: signal: killed

Solution: Increase timeout

opnsense_telegraf_configuration_exec_timeout: 10s

Permission denied errors

Symptom: Telegraf can’t execute script

Error in plugin [inputs.exec]: exec: fork/exec: permission denied

Solution: Make sure the Telegraf user can read and execute the script

chmod 0755 /usr/local/etc/telegraf-scripts/cputemp.sh

Invalid InfluxDB format

Symptom: Telegraf logs show parse errors

Error in plugin [inputs.exec]: metric parse error

Test script output:

sh /usr/local/etc/telegraf-scripts/cputemp.sh

Expected format:

sensors,feature=cpu0 temp_input=45.0

Not:

Temperature: 45.0C  # Wrong format
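
A rough format check can catch this kind of mistake before Telegraf does. The pattern below is a deliberately simplified approximation of line protocol (no escaping, no quoted strings containing spaces), and check_line_protocol is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read lines on stdin and fail if any line does not look
# like "measurement[,tag=value...] field=value[,field=value...] [timestamp]".
# Simplified on purpose: escaped characters and quoted strings are not handled.
check_line_protocol() {
    clp_pattern='^[^ ,=]+(,[^ ,=]+=[^ ,=]+)* [^ ,=]+=[^ ,=]+(,[^ ,=]+=[^ ,=]+)*( [0-9]+)?$'
    # grep -v selects the lines that do NOT match and prints them to stderr;
    # it succeeds only when at least one such line exists.
    if grep -Ev "$clp_pattern" >&2; then
        return 1
    fi
    return 0
}
```

Piping the collector into it, as in sh /usr/local/etc/telegraf-scripts/cputemp.sh | check_line_protocol, returns non-zero and echoes the offending lines when the output is malformed. Empty input passes, so it does not detect a script that prints nothing.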

Telegraf not restarting

Manual restart:

service telegraf restart
service telegraf status

Check service:

# FreeBSD service check
service telegraf onestatus

Testing the Role

Verify Files Created

# Check scripts directory
ls -la /usr/local/etc/telegraf-scripts/

# Check CPU temp script
ls -la /usr/local/etc/telegraf-scripts/cputemp.sh
cat /usr/local/etc/telegraf-scripts/cputemp.sh

Test Script

# Run script manually
sh /usr/local/etc/telegraf-scripts/cputemp.sh

# Expected output:
# sensors,feature=cpu0 temp_input=45.0
# sensors,feature=cpu1 temp_input=46.5
# ...

Verify Telegraf Config

# Check configuration block added
grep -A 10 "CPU Temperature Monitoring" /usr/local/etc/telegraf.conf

# Should show:
# [[inputs.exec]]
#   commands = [...]
#   timeout = "5s"
#   data_format = "influx"

Verify Telegraf Running

# Check service status
service telegraf status

# Check process
ps aux | grep telegraf

Query InfluxDB

# Via influx CLI
influx -database telegraf -execute "SELECT * FROM sensors WHERE feature =~ /^cpu/ ORDER BY time DESC LIMIT 10"

# Via curl
curl -G 'http://influxdb:8086/query?db=telegraf' \
  --data-urlencode "q=SELECT * FROM sensors WHERE feature =~ /^cpu/ ORDER BY time DESC LIMIT 10"

Check Grafana

Log into Grafana
Explore → Select InfluxDB datasource
Query: SELECT temp_input FROM sensors WHERE feature =~ /^cpu/
Should see temperature data points

Best Practices

Test scripts manually: Verify output before deploying
Set appropriate timeouts: Based on script complexity
Monitor script errors: Check Telegraf logs regularly
Use line protocol: InfluxDB format for best performance
Tag appropriately: Use tags for dimensions, fields for values
Keep scripts simple: Complex logic belongs in applications
Handle errors: Scripts should not fail catastrophically
Document custom scripts: Comment what they measure
Version control: Store scripts in git
Regular testing: Ensure scripts work after OPNsense updates

Extending This Role

Add Additional Scripts

Create new script:

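# files/disktemp.sh
# Despite the filename, this example reports how many disks the kernel sees.
# -n drops the "kern.disks:" key so NF counts only the disk names.
sysctl -n kern.disks | awk '{print "disk_count count=" NF}'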

Update role:

- name: Deploy disk temperature script
  ansible.builtin.copy:
    src: disktemp.sh
    dest: "{{ opnsense_telegraf_configuration_scripts_folder }}/disktemp.sh"
    mode: '0755'

- name: Add disk exec input
  ansible.builtin.blockinfile:
    path: "{{ opnsense_telegraf_configuration_config_file }}"
    marker: "# {mark} ANSIBLE MANAGED BLOCK - Disk Monitoring"
    block: |
      [[inputs.exec]]
        commands = [
          "sh {{ opnsense_telegraf_configuration_scripts_folder }}/disktemp.sh",
        ]
        timeout = "5s"
        data_format = "influx"

Multiple Measurements

Script with multiple measurements:

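#!/bin/sh
# Output multiple measurements, one line protocol point per line
# kern.boottime prints "{ sec = <epoch>, usec = ... }"; field 4 is the boot epoch
boot=$(sysctl -n kern.boottime | awk '{print $4}' | tr -d ',')
echo "uptime value=$(( $(date +%s) - boot ))"
# tr strips the padding BSD wc adds; the count includes the ps header line
echo "processes count=$(ps aux | wc -l | tr -d ' ')"
# vm.loadavg prints "{ 0.52 0.48 0.45 }"; field 2 is the 1-minute average
echo "load,period=1min value=$(sysctl -n vm.loadavg | awk '{print $2}')"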

Related Roles

This role is often used with:

opnsense_install_packages: Install os-telegraf package
telegraf_agent: Configure Telegraf on other hosts
deploy_influxdb: Deploy InfluxDB server
deploy_grafana: Visualize metrics in dashboards

Performance Impact

CPU: Negligible (~0.1% per script execution)
Memory: Minimal (~1-2MB for Telegraf)
Disk I/O: Low (script reads sysctl only)
Network: Depends on InfluxDB connection
Frequency: Default 10s interval (configurable in main telegraf.conf)

Not a concern for typical homelab or production use.

License

MIT

Author

Created for homelab infrastructure management.