System Update Role
Overview
This role orchestrates comprehensive system updates across heterogeneous infrastructure including Debian, RedHat, and OPNsense systems with intelligent reboot handling and Centreon monitoring integration. It performs OS-specific updates, detects kernel updates requiring reboots, schedules Centreon downtimes to prevent false alerts during maintenance, automatically reboots systems when necessary, and provides informational summaries. The role handles special cases like Proxmox host updates that affect VMs, Docker daemon updates requiring container restarts, and Centreon-specific package exclusions.
Purpose
- Automated Updates: Apply system updates across entire infrastructure
- OS-Agnostic: Support Debian, RedHat, and OPNsense systems
- Intelligent Reboots: Auto-detect kernel updates and reboot when auto-reboot is enabled
- Monitoring Integration: Schedule Centreon downtimes during maintenance
- Cascading Awareness: Handle Proxmox host updates affecting VMs
- Docker Support: Update Docker and containers intelligently
- Safe Centreon Updates: Exclude MySQL packages to prevent breakage
- Informational Feedback: Provide clear update summaries
Requirements
- Ansible 2.9 or higher
- Target systems: Debian/Ubuntu, RedHat/CentOS, or OPNsense
- Root or sudo privileges on target systems
- Centreon monitoring system (optional, for downtime scheduling)
- OPNsense API access (for OPNsense updates)
- Network connectivity to package repositories
What This Role Does
Comprehensive update orchestration:
- OS Detection: Identifies system type (Debian/RedHat/OPNsense)
- Package Updates: Applies all available updates on Debian and RedHat systems (OPNsense updates are only checked)
- Kernel Detection: Identifies if kernel was updated
- Downtime Scheduling: Schedules Centreon maintenance windows
- Automatic Reboots: Reboots systems when the kernel was updated and auto-reboot is enabled
- Status Reporting: Provides clear update summaries
Special handling:
- Proxmox: Schedules downtime for VMs when host kernel updates
- Docker: Handles Docker daemon and container updates separately
- Centreon: Excludes MySQL packages from Centreon servers
- OPNsense: Uses API to check for firmware updates
Role Variables
Optional Variables
| Variable | Default | Description |
|---|---|---|
| system_update_auto_reboot_enabled | false | Enable automatic reboots |
| system_update_reboot_timeout | 600 | Reboot timeout (seconds) |
| system_update_reboot_message | "Rebooting due to system maintenance" | Reboot message |
| system_update_downtime_duration_minutes | 60 | Centreon downtime duration (minutes) |
| system_update_downtime_comment_prefix | "Ansible maintenance" | Downtime comment prefix |
| system_update_centreon_excluded_packages | See defaults | Packages excluded on Centreon |
| system_update_docker_prune_on_update | true | Prune Docker after updates |
| system_update_apt_cache_valid_time | 3600 | Apt cache validity (seconds) |
| system_update_apt_autoremove | true | Remove unused packages (Debian) |
| system_update_apt_autoclean | true | Clean apt cache (Debian) |
Variable Details
system_update_auto_reboot_enabled
Whether to automatically reboot after kernel updates.
Default: false (manual reboot required)
Set to true to enable auto-reboot:
system_update_auto_reboot_enabled: true
Safety: Disabled by default to prevent unexpected reboots
Override per host:
# In inventory
webserver1:
  system_update_auto_reboot_enabled: true
database1:
  system_update_auto_reboot_enabled: false  # Never auto-reboot
system_update_reboot_timeout
Maximum seconds to wait for system to come back after reboot.
Default: 600 (10 minutes)
Increase for slow systems:
system_update_reboot_timeout: 900 # 15 minutes
system_update_downtime_duration_minutes
How long to schedule Centreon downtime (in minutes).
Default: 60 minutes
Adjust based on reboot time:
system_update_downtime_duration_minutes: 30 # Quick reboot
system_update_downtime_duration_minutes: 120 # Long maintenance
system_update_centreon_excluded_packages
Packages to exclude from updates on Centreon servers.
Default:
system_update_centreon_excluded_packages:
- perl-DBD-MySQL
- mysql-common
- mysql-libs
Why excluded: Centreon uses specific MySQL/MariaDB versions. Auto-updating can break Centreon’s database connection.
Custom exclusions:
system_update_centreon_excluded_packages:
- perl-DBD-MySQL
- mysql-common
- mysql-libs
- custom-package
system_update_docker_prune_on_update
Clean up unused Docker resources after updates.
Default: true
What it prunes:
- Stopped containers
- Unused images
- Unused volumes
- Unused networks
Disable if needed:
system_update_docker_prune_on_update: false
Dependencies
No Ansible role dependencies, but integrates with:
- Centreon: For downtime scheduling (optional)
- OPNsense API: For firmware update checks (if using OPNsense)
- Docker: For container management (if Docker host)
Example Playbook
Basic Usage (All Systems)
---
- name: Update All Systems
  hosts: all
  become: true
  roles:
    - system_update
Enable Auto-Reboot
---
- name: Update with Auto-Reboot
  hosts: all
  become: true
  vars:
    system_update_auto_reboot_enabled: true
  roles:
    - system_update
Update Non-Critical Systems Only
---
- name: Update Development Servers
  hosts: development
  become: true
  vars:
    system_update_auto_reboot_enabled: true
    system_update_downtime_duration_minutes: 30
  roles:
    - system_update
Selective Auto-Reboot
---
- name: Update with Host-Specific Reboot Policy
  hosts: all
  become: true
  roles:
    - system_update

# Inventory:
# webservers:
#   web1:
#     system_update_auto_reboot_enabled: true
#   web2:
#     system_update_auto_reboot_enabled: true
# databases:
#   db1:
#     system_update_auto_reboot_enabled: false  # Manual reboot only
What This Role Does (Detailed)
1. Initialize Tracking Facts
Sets facts to track update status:
- system_update_kernel_updated: Kernel requires reboot
- system_update_updates_applied: Updates were applied
- system_update_opnsense_updates_available: OPNsense updates found
- system_update_opnsense_updates_need_reboot: OPNsense needs reboot
- system_update_docker_containers_updated: Docker containers updated
- system_update_docker_updated: Docker daemon updated
2. Run OS-Specific Updates
Debian/Ubuntu (debian.yml):
- Update apt cache
- Run apt-get dist-upgrade (full distribution upgrade)
- Autoremove unused packages
- Autoclean package cache
- Check for the /var/run/reboot-required file
- Detect kernel package updates
- Set kernel update flag
RedHat/CentOS (redhat.yml):
- Run dnf upgrade (or yum upgrade)
- On Centreon hosts, exclude the packages listed in system_update_centreon_excluded_packages
- Check if kernel package updated
- Set kernel update flag
OPNsense (opnsense.yml):
- Query API: /api/core/firmware/status
- Check for available updates
- Detect if updates require reboot
- Set OPNsense update flags
- Note: Does NOT apply updates (manual via UI)
3. Schedule Centreon Downtime
If kernel updated or reboot needed:
Host Uptime Service:
- Schedules downtime for host’s Uptime service
- Duration: system_update_downtime_duration_minutes
- Comment: “Ansible maintenance - kernel update”
Proxmox VMs (if updating Proxmox host):
- Schedules downtime for ALL VMs on that Proxmox host
- Comment: “Proxmox host kernel update - VMs will reboot”
- Prevents false alerts when VMs go down during host reboot
Docker Containers (if Docker updated):
- Schedules downtime for “Docker Containers Uptime” service
- Comment: “Ansible maintenance - docker-ce update”
4. Reboot System (If Needed)
If kernel updated and auto-reboot enabled:
- Sends reboot message to logged-in users
- Initiates system reboot
- Waits for system to come back online
- Timeout: system_update_reboot_timeout seconds
- Tests SSH connectivity after reboot
Not rebooted:
- OPNsense systems (manual reboot via UI)
- Systems with system_update_auto_reboot_enabled: false
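The reboot decision described above comes down to three conditions. The following is a minimal shell sketch of that logic only; the role itself evaluates these as Ansible conditions, and the helper name should_reboot and its string arguments are assumptions made for illustration.

```shell
# Hypothetical helper, not part of the role: succeeds only when a reboot would be triggered.
# Usage: should_reboot <os_family> <kernel_updated> <auto_reboot_enabled>
should_reboot() {
    local os_family="$1" kernel_updated="$2" auto_reboot="$3"

    # OPNsense hosts are never rebooted by the role
    if [ "$os_family" = "OPNsense" ]; then
        return 1
    fi

    # A reboot needs both a kernel update and the explicit opt-in
    [ "$kernel_updated" = "true" ] && [ "$auto_reboot" = "true" ]
}

# Example: should_reboot Debian true false   (fails: auto-reboot is disabled)
```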
5. Display Informational Messages
Shows update summary:
- Packages updated
- Kernel update status
- Reboot requirement
- OPNsense update availability
- Docker update status
Debian/Ubuntu Update Process
Commands executed:
# Update package cache
apt-get update
# Full distribution upgrade
apt-get dist-upgrade -y
# Remove unused packages
apt-get autoremove -y
# Clean package cache
apt-get autoclean -y
Reboot detection:
- Check /var/run/reboot-required (created during package installation when a reboot is required)
- Check if kernel packages updated:
  - linux-image-*
  - linux-headers-*
  - proxmox-kernel-* (for Proxmox)
If either true: Kernel updated, reboot needed
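As an illustration of the two checks above, here is a minimal shell sketch. It assumes a hypothetical helper, debian_reboot_needed, that receives the marker file path and the names of the packages upgraded in the current run; how the role collects that package list is not shown here.

```shell
# Hypothetical helper, not part of the role.
# Usage: debian_reboot_needed <marker-file> [upgraded-package ...]
debian_reboot_needed() {
    local marker="$1"
    shift

    # Check 1: the marker file exists
    if [ -e "$marker" ]; then
        return 0
    fi

    # Check 2: one of the upgraded packages is a kernel package
    local pkg
    for pkg in "$@"; do
        case "$pkg" in
            linux-image-*|linux-headers-*|proxmox-kernel-*) return 0 ;;
        esac
    done
    return 1
}

# Example: debian_reboot_needed /var/run/reboot-required openssl linux-image-amd64
```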
RedHat/CentOS Update Process
Commands executed:
# Standard system update
dnf upgrade -y
# Centreon host with exclusions
dnf upgrade -y --exclude=perl-DBD-MySQL,mysql-common,mysql-libs
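The --exclude value above is the comma-joined content of system_update_centreon_excluded_packages. A small shell sketch of that mapping, using a hypothetical helper named dnf_exclude_flag (the role builds the flag inside Ansible, so this is only an equivalent for manual runs):

```shell
# Hypothetical helper, not part of the role: joins package names into one --exclude flag.
# Usage: dnf_exclude_flag <package> [package ...]
dnf_exclude_flag() {
    # "$*" joins all arguments with the first character of IFS
    local IFS=,
    printf '%s%s\n' '--exclude=' "$*"
}

# Example: dnf upgrade -y "$(dnf_exclude_flag perl-DBD-MySQL mysql-common mysql-libs)"
```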
Reboot detection:
- Checks if kernel package updated in dnf output
- Searches for: kernel, kernel-core, kernel-modules
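One way to perform such a search by hand is sketched below. It assumes the package name is the first whitespace-separated field of a line, as in the transaction table dnf prints; the helper name dnf_output_has_kernel is hypothetical and the role's own matching may differ.

```shell
# Hypothetical helper, not part of the role.
# Reads dnf output on stdin; succeeds if any line's first field is a kernel package name.
dnf_output_has_kernel() {
    awk '$1 ~ /^(kernel|kernel-core|kernel-modules)$/ { found = 1 }
         END { exit !found }'
}

# Example: dnf upgrade -y | tee /tmp/dnf.log; dnf_output_has_kernel < /tmp/dnf.log
```

Matching the whole first field avoids false positives from packages such as kernel-tools.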
OPNsense Update Process
API query:
GET https://opnsense/api/core/firmware/status
Response analysis:
- status_upgrade_action: Update available?
- needs_reboot: Reboot required?
Important: Role does NOT apply OPNsense updates
- Only checks for updates
- Manual update via UI or API required
- Schedules downtime if reboot needed
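To inspect the same status by hand, a sketch along these lines may help. It assumes jq is installed, that the API key and secret are passed as HTTP basic-auth credentials, and that needs_reboot is reported as 1 (string or number) when a reboot is required; the helper name and the OPNSENSE_KEY and OPNSENSE_SECRET variables are placeholders.

```shell
# Hypothetical helper, not part of the role; requires jq.
# Reads the firmware status JSON on stdin and succeeds if needs_reboot is 1.
opnsense_needs_reboot() {
    jq -e '(.needs_reboot | tostring) == "1"' > /dev/null
}

# Example (placeholder credentials):
#   curl -s -u "$OPNSENSE_KEY:$OPNSENSE_SECRET" \
#        https://opnsense/api/core/firmware/status | opnsense_needs_reboot
```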
Centreon Downtime Scheduling
Centreon CLI command:
centreon -u admin -p 'password' -o RTDOWNTIME -a add \
-v "SVC;hostname,Uptime;2026/01/08 10:00;2026/01/08 11:00;1;3600;Ansible maintenance - kernel update"
Parameters:
- SVC: Service downtime (not host)
- hostname,Uptime: Host and service name
- Start time: Current time
- End time: Current + duration minutes
- 1: Fixed downtime (not flexible)
- Duration in seconds
- Comment: Reason for downtime
Services scheduled:
- Uptime: Host uptime monitoring
- Docker Containers Uptime: Container uptime monitoring (Docker host)
Downtime prevents:
- False alerts during reboot
- Notifications to on-call staff
- Downtime statistics inflation
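The value passed to -v can be assembled from a start time and a duration. The sketch below follows the field order of the command above; it assumes GNU date (for -d "@epoch"), and the helper name centreon_downtime_value is hypothetical rather than something the role provides.

```shell
# Hypothetical helper, not part of the role; relies on GNU date (-d "@<epoch>").
# Usage: centreon_downtime_value <host> <service> <start-epoch> <minutes> <comment>
centreon_downtime_value() {
    local host="$1" service="$2" start_epoch="$3" minutes="$4" comment="$5"
    local duration=$(( minutes * 60 ))
    local start end

    # Timestamps are rendered in the local timezone of the machine running this
    start="$(date -d "@${start_epoch}" '+%Y/%m/%d %H:%M')"
    end="$(date -d "@$(( start_epoch + duration ))" '+%Y/%m/%d %H:%M')"

    # Field order: type;host,service;start;end;fixed;duration;comment
    printf 'SVC;%s,%s;%s;%s;1;%s;%s\n' \
        "$host" "$service" "$start" "$end" "$duration" "$comment"
}

# Example:
#   centreon -u admin -p 'password' -o RTDOWNTIME -a add \
#     -v "$(centreon_downtime_value web1 Uptime "$(date +%s)" 60 'Ansible maintenance - kernel update')"
```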
Proxmox Special Handling
When Proxmox host kernel updates:
- Proxmox Host:
  - Kernel updated → Reboot scheduled
  - Downtime scheduled for Proxmox host
- All VMs on that Host:
  - Downtime scheduled for each VM
  - Prevents alerts when VMs go down
  - VMs restart with the host if they are configured to start at boot
Implementation:
- Loops through groups['proxmox_vms']
- Schedules downtime for each VM’s Uptime service
- Comment: “Proxmox host {name} kernel update - VMs will reboot”
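The loop above can be pictured as a dry run that lists one downtime target per VM. This is a sketch only: the helper name proxmox_vm_downtime_targets is hypothetical, the VM names are passed as plain arguments instead of coming from the inventory group, and nothing is sent to Centreon.

```shell
# Hypothetical helper, not part of the role: prints "<vm>,Uptime;<comment>" for each VM.
# Usage: proxmox_vm_downtime_targets <proxmox-host> <vm> [vm ...]
proxmox_vm_downtime_targets() {
    local pve_host="$1"
    shift

    local vm
    for vm in "$@"; do
        printf '%s,Uptime;Proxmox host %s kernel update - VMs will reboot\n' \
            "$vm" "$pve_host"
    done
}

# Example: proxmox_vm_downtime_targets pve1 vm-web vm-db
```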
Docker Special Handling
Docker daemon update:
- Sets system_update_docker_updated: true
- Schedules downtime for Docker Containers Uptime
- Docker restart may affect containers
Docker containers update:
- Sets system_update_docker_containers_updated: true
- Schedules downtime for Docker Containers Uptime
Docker prune (if enabled):
docker system prune -af --volumes
- Removes stopped containers
- Removes unused images
- Removes unused volumes (on Docker 23.0 and later, only anonymous volumes)
- Removes unused networks
Reboot Workflow
When reboot triggered:
- Pre-reboot:
  - Centreon downtime already scheduled
  - Message sent to logged-in users
  - Wait 1 minute (allows graceful shutdown)
- Reboot:
  - System reboots
  - Ansible waits for system to go down
- Post-reboot:
  - Ansible waits for SSH to come back
  - Timeout: system_update_reboot_timeout
  - Tests connectivity
  - Continues playbook
- Failure handling:
  - If timeout exceeded: Playbook fails
  - Manual intervention required
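When manual intervention is needed after a timeout, the same wait-with-deadline pattern can be reproduced from a shell. The sketch below is a generic retry loop, not the role's implementation (Ansible does the waiting inside the playbook); the helper name wait_until and the SSH options in the example are assumptions.

```shell
# Hypothetical helper, not part of the role.
# Usage: wait_until <timeout-seconds> <interval-seconds> <command> [args ...]
wait_until() {
    local timeout="$1" interval="$2"
    shift 2
    local deadline=$(( $(date +%s) + timeout ))

    # Retry the probe command until it succeeds or the deadline passes
    until "$@"; do
        if [ "$(date +%s)" -ge "$deadline" ]; then
            return 1
        fi
        sleep "$interval"
    done
    return 0
}

# Example: wait_until 600 5 ssh -o BatchMode=yes -o ConnectTimeout=5 webserver1 true
```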
Security Considerations
- Auto-reboot: Disabled by default (requires explicit enable)
- Centreon Password: Stored in Ansible Vault
- OPNsense API: Uses API key/secret from vault
- Package Exclusions: Prevents breaking critical services
- Reboot Message: Alerts logged-in users
- Update Verification: OPNsense firmware updates are only checked, and excluded packages are skipped on Centreon hosts
Tags
This role does not define any tags. Use playbook-level tags if needed:
- hosts: all
  roles:
    - system_update
  tags:
    - system
    - update
    - maintenance
    - reboot
Notes
- Role runs on target systems (not localhost)
- become: true required for package updates and reboots
- Centreon integration optional (skipped if centreon_host_name not defined)
- OPNsense updates checked but not applied automatically
- Reboot happens only if system_update_auto_reboot_enabled: true
- Docker prune enabled by default (disable if needed)
- Compatible with Debian, Ubuntu, RedHat, CentOS, OPNsense
Troubleshooting
Updates applied but reboot not triggered
Cause: system_update_auto_reboot_enabled is false
Solution: Enable auto-reboot or manually reboot
system_update_auto_reboot_enabled: true
Check if reboot needed:
# Debian/Ubuntu
ls /var/run/reboot-required
# All systems
uname -r # Current kernel
dpkg -l | grep linux-image # Installed kernels (Debian)
Reboot timeout exceeded
Symptom: Ansible fails waiting for system to come back
Causes:
- System taking longer than timeout
- SSH not starting automatically
- Network issues
Solutions:
# Increase timeout
system_update_reboot_timeout: 1200 # 20 minutes
# Check system after timeout
# SSH manually, check boot logs
journalctl -b # Current boot logs
Centreon downtime not scheduled
Symptom: No downtime visible in Centreon
Check:
# Verify centreon_host_name defined
ansible-inventory --host hostname
# Check Centreon CLI works
ssh centreon
centreon -u admin -p 'password' -o RTDOWNTIME -a show
Common issues:
- centreon_host_name not defined in inventory
- Centreon password incorrect
- Service name mismatch (e.g., “uptime” vs “Uptime”)
Centreon packages broken after update
Symptom: Centreon web interface not loading, database connection errors
Cause: MySQL packages updated despite exclusions
Prevention:
system_update_centreon_excluded_packages:
- perl-DBD-MySQL
- mysql-common
- mysql-libs
- mariadb-server # Add if using MariaDB
Recovery:
# Downgrade MySQL packages
yum downgrade perl-DBD-MySQL mysql-common mysql-libs
# Restart Centreon services
systemctl restart centreon
OPNsense updates available but not applied
Expected behavior: Role only checks, doesn’t apply
Apply updates:
- Via UI: System → Firmware → Updates → Update
- Via API: POST /api/core/firmware/upgrade
Why not automated: Firmware updates require careful testing, may break config
Testing the Role
Dry Run (Check Mode)
# See what would be updated (Debian)
ansible-playbook update-playbook.yml --check
# Check specific host
ansible-playbook update-playbook.yml --limit webserver1 --check
Test Without Auto-Reboot
- hosts: test-server
  become: true
  vars:
    system_update_auto_reboot_enabled: false  # Don't auto-reboot
  roles:
    - system_update
Verify Centreon Downtime
# After role runs, check Centreon
# Monitoring → Downtimes
# Should see scheduled downtime for host
Check Update Facts
# After role runs (no reboot)
ansible webserver1 -m setup -a 'filter=ansible_local'
# Check kernel version
ansible webserver1 -m command -a 'uname -r'
Best Practices
- Test in non-production first: Run on dev/test systems
- Schedule updates: Run during maintenance windows
- Enable auto-reboot selectively: Enable for non-critical only
- Monitor downtime duration: Ensure sufficient time for reboot
- Verify Centreon integration: Confirm downtimes scheduled
- Check before/after: Compare kernel versions
- Backup before updates: Especially for critical systems
- Staged rollouts: Update in waves (dev → staging → prod)
- Communication: Notify team before running
- Manual verification: Check critical services after updates
Scheduled Update Pattern
Recommended approach:
---
# Weekly update playbook
- name: Weekly Security Updates - Development
  hosts: development
  become: true
  vars:
    system_update_auto_reboot_enabled: true
  roles:
    - system_update

- name: Weekly Security Updates - Staging
  hosts: staging
  become: true
  vars:
    system_update_auto_reboot_enabled: true
  roles:
    - system_update

- name: Weekly Security Updates - Production (No Auto-Reboot)
  hosts: production
  become: true
  vars:
    system_update_auto_reboot_enabled: false
  roles:
    - system_update
  # Manual reboot during maintenance window
Schedule via cron/systemd timer:
# Cron: Run every Sunday at 2 AM
0 2 * * 0 ansible-playbook /path/to/update-playbook.yml
Related Roles
This role is often used with:
- deploy_centreon: Centreon monitoring system
- backup roles: Backup before updates
- System hardening: Security updates
License
MIT
Author
Created for homelab infrastructure management.