System Update Role

Overview

This role orchestrates comprehensive system updates across heterogeneous infrastructure including Debian, RedHat, and OPNsense systems with intelligent reboot handling and Centreon monitoring integration. It performs OS-specific updates, detects kernel updates requiring reboots, schedules Centreon downtimes to prevent false alerts during maintenance, automatically reboots systems when necessary, and provides informational summaries. The role handles special cases like Proxmox host updates that affect VMs, Docker daemon updates requiring container restarts, and Centreon-specific package exclusions.

Purpose

  • Automated Updates: Apply system updates across entire infrastructure
  • OS-Agnostic: Support Debian, RedHat, and OPNsense systems
  • Intelligent Reboots: Auto-detect kernel updates and reboot when needed
  • Monitoring Integration: Schedule Centreon downtimes during maintenance
  • Cascading Awareness: Handle Proxmox host updates affecting VMs
  • Docker Support: Update Docker and containers intelligently
  • Safe Centreon Updates: Exclude MySQL packages to prevent breakage
  • Informational Feedback: Provide clear update summaries

Requirements

  • Ansible 2.9 or higher
  • Target systems: Debian/Ubuntu, RedHat/CentOS, or OPNsense
  • Root or sudo privileges on target systems
  • Centreon monitoring system (optional, for downtime scheduling)
  • OPNsense API access (for OPNsense updates)
  • Network connectivity to package repositories

What This Role Does

Comprehensive update orchestration:

  1. OS Detection: Identifies system type (Debian/RedHat/OPNsense)
  2. Package Updates: Applies all available updates on Debian and RedHat systems (OPNsense updates are only checked)
  3. Kernel Detection: Identifies if kernel was updated
  4. Downtime Scheduling: Schedules Centreon maintenance windows
  5. Automatic Reboots: Reboots systems when kernel updated
  6. Status Reporting: Provides clear update summaries

Special handling:

  • Proxmox: Schedules downtime for VMs when host kernel updates
  • Docker: Handles Docker daemon and container updates separately
  • Centreon: Excludes MySQL packages from Centreon servers
  • OPNsense: Uses API to check for firmware updates

Role Variables

Optional Variables

Variable | Default | Description
--- | --- | ---
system_update_auto_reboot_enabled | false | Enable automatic reboots
system_update_reboot_timeout | 600 | Reboot timeout (seconds)
system_update_reboot_message | "Rebooting due to system maintenance" | Reboot message
system_update_downtime_duration_minutes | 60 | Centreon downtime duration (minutes)
system_update_downtime_comment_prefix | "Ansible maintenance" | Downtime comment prefix
system_update_centreon_excluded_packages | See defaults | Packages excluded on Centreon
system_update_docker_prune_on_update | true | Prune Docker after updates
system_update_apt_cache_valid_time | 3600 | Apt cache validity (seconds)
system_update_apt_autoremove | true | Remove unused packages (Debian)
system_update_apt_autoclean | true | Clean apt cache (Debian)

Variable Details

system_update_auto_reboot_enabled

Whether to automatically reboot after kernel updates.

Default: false (manual reboot required)

Set to true to enable auto-reboot:

system_update_auto_reboot_enabled: true

Safety: Disabled by default to prevent unexpected reboots

Override per host:

# In a YAML inventory
all:
  hosts:
    webserver1:
      system_update_auto_reboot_enabled: true
    database1:
      system_update_auto_reboot_enabled: false  # Never auto-reboot

system_update_reboot_timeout

Maximum seconds to wait for system to come back after reboot.

Default: 600 (10 minutes)

Increase for slow systems:

system_update_reboot_timeout: 900  # 15 minutes

system_update_downtime_duration_minutes

How long to schedule Centreon downtime (in minutes).

Default: 60 minutes

Adjust based on reboot time:

# Pick one value; a key may appear only once per vars file
system_update_downtime_duration_minutes: 30     # Quick reboot
# system_update_downtime_duration_minutes: 120  # Long maintenance

system_update_centreon_excluded_packages

Packages to exclude from updates on Centreon servers.

Default:

system_update_centreon_excluded_packages:
  - perl-DBD-MySQL
  - mysql-common
  - mysql-libs

Why excluded: Centreon uses specific MySQL/MariaDB versions. Auto-updating can break Centreon’s database connection.

Custom exclusions:

system_update_centreon_excluded_packages:
  - perl-DBD-MySQL
  - mysql-common
  - mysql-libs
  - custom-package

system_update_docker_prune_on_update

Clean up unused Docker resources after updates.

Default: true

What it prunes:

  • Stopped containers
  • Unused images
  • Unused volumes
  • Unused networks

Disable if needed:

system_update_docker_prune_on_update: false

Dependencies

No Ansible role dependencies, but integrates with:

  • Centreon: For downtime scheduling (optional)
  • OPNsense API: For firmware update checks (if using OPNsense)
  • Docker: For container management (if Docker host)

Example Playbook

Basic Usage (All Systems)

---
- name: Update All Systems
  hosts: all
  become: true

  roles:
    - system_update

Enable Auto-Reboot

---
- name: Update with Auto-Reboot
  hosts: all
  become: true

  vars:
    system_update_auto_reboot_enabled: true

  roles:
    - system_update

Update Non-Critical Systems Only

---
- name: Update Development Servers
  hosts: development
  become: true

  vars:
    system_update_auto_reboot_enabled: true
    system_update_downtime_duration_minutes: 30

  roles:
    - system_update

Selective Auto-Reboot

---
- name: Update with Host-Specific Reboot Policy
  hosts: all
  become: true

  roles:
    - system_update

# Inventory:
# webservers:
#   hosts:
#     web1:
#       system_update_auto_reboot_enabled: true
#     web2:
#       system_update_auto_reboot_enabled: true
# databases:
#   hosts:
#     db1:
#       system_update_auto_reboot_enabled: false  # Manual reboot only

What This Role Does (Detailed)

1. Initialize Tracking Facts

Sets facts to track update status:

  • system_update_kernel_updated: Kernel requires reboot
  • system_update_updates_applied: Updates were applied
  • system_update_opnsense_updates_available: OPNsense updates found
  • system_update_opnsense_updates_need_reboot: OPNsense needs reboot
  • system_update_docker_containers_updated: Docker containers updated
  • system_update_docker_updated: Docker daemon updated

2. Run OS-Specific Updates

Debian/Ubuntu (debian.yml):

  • Update apt cache
  • Run a full distribution upgrade (apt-get dist-upgrade)
  • Autoremove unused packages
  • Autoclean package cache
  • Check /var/run/reboot-required file
  • Detect kernel package updates
  • Set kernel update flag

RedHat/CentOS (redhat.yml):

  • Run dnf upgrade (or yum upgrade)
  • Exclude the packages listed in system_update_centreon_excluded_packages on Centreon hosts
  • Check if kernel package updated
  • Set kernel update flag

OPNsense (opnsense.yml):

  • Query API: /api/core/firmware/status
  • Check for available updates
  • Detect if updates require reboot
  • Set OPNsense update flags
  • Note: Does NOT apply updates (apply them manually via UI or API)

3. Schedule Centreon Downtime

If kernel updated or reboot needed:

Host Uptime Service:

  • Schedules downtime for host’s Uptime service
  • Duration: system_update_downtime_duration_minutes
  • Comment: “Ansible maintenance - kernel update”

Proxmox VMs (if updating Proxmox host):

  • Schedules downtime for ALL VMs on that Proxmox host
  • Comment: “Proxmox host kernel update - VMs will reboot”
  • Prevents false alerts when VMs go down during host reboot

Docker Containers (if Docker updated):

  • Schedules downtime for “Docker Containers Uptime” service
  • Comment: “Ansible maintenance - docker-ce update”

4. Reboot System (If Needed)

If kernel updated and auto-reboot enabled:

  • Sends reboot message to logged-in users
  • Initiates system reboot
  • Waits for system to come back online
  • Timeout: system_update_reboot_timeout seconds
  • Tests SSH connectivity after reboot

Not rebooted:

  • OPNsense systems (manual reboot via UI)
  • Systems with system_update_auto_reboot_enabled: false
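
The reboot conditions above can be summarized as a small predicate. This is a minimal shell sketch of the decision, not the role's actual task logic; the function name should_reboot and its three arguments are hypothetical.

```shell
# Hypothetical helper: decide whether an automatic reboot should happen.
# $1 = kernel updated (true/false)
# $2 = value of system_update_auto_reboot_enabled (true/false)
# $3 = OS family (for example Debian, RedHat, OPNsense)
should_reboot() {
    # OPNsense systems are never rebooted automatically.
    if [ "$3" = "OPNsense" ]; then
        return 1
    fi
    # Reboot only when the kernel was updated AND auto-reboot is enabled.
    [ "$1" = "true" ] && [ "$2" = "true" ]
}
```

For example, `should_reboot true false Debian` returns a non-zero status, which corresponds to the "manual reboot required" case.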

5. Display Informational Messages

Shows update summary:

  • Packages updated
  • Kernel update status
  • Reboot requirement
  • OPNsense update availability
  • Docker update status

Debian/Ubuntu Update Process

Commands executed:

# Update package cache
apt-get update

# Full distribution upgrade
apt-get dist-upgrade -y

# Remove unused packages
apt-get autoremove -y

# Clean package cache
apt-get autoclean -y

Reboot detection:

  1. Check /var/run/reboot-required (created by package post-install hooks when an update needs a reboot)
  2. Check if kernel packages updated:
    • linux-image-*
    • linux-headers-*
    • proxmox-kernel-* (for Proxmox)

If either true: Kernel updated, reboot needed
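
The two checks above can be sketched as a single shell function. This is an illustrative sketch rather than the role's own tasks: the function name debian_reboot_needed is hypothetical, and the list of upgraded package names is assumed to be supplied by the caller.

```shell
# Hypothetical helper: report whether a Debian/Ubuntu host needs a reboot.
# $1 = path to the flag file (normally /var/run/reboot-required)
# remaining arguments = names of the packages upgraded in this run
debian_reboot_needed() {
    flag_file=$1
    shift
    # Check 1: the reboot-required flag file exists.
    if [ -e "$flag_file" ]; then
        return 0
    fi
    # Check 2: at least one upgraded package is a kernel package.
    for pkg in "$@"; do
        case "$pkg" in
            linux-image-*|linux-headers-*|proxmox-kernel-*) return 0 ;;
        esac
    done
    return 1
}
```

Passing the flag file path as an argument keeps the function easy to try out against a temporary file instead of the real /var/run/reboot-required.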

RedHat/CentOS Update Process

Commands executed:

# Standard system update
dnf upgrade -y

# Centreon host with exclusions
dnf upgrade -y --exclude=perl-DBD-MySQL,mysql-common,mysql-libs
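
As a rough sketch of how an exclusion list could be turned into the --exclude option shown above: the helper name build_exclude_opt is hypothetical, and the role itself may pass the exclusions differently (for example through the exclude parameter of the Ansible dnf module).

```shell
# Hypothetical helper: join package names into one dnf --exclude option.
# Arguments = package names to exclude (may be empty).
build_exclude_opt() {
    # No packages: print nothing so the caller can omit the option.
    if [ "$#" -eq 0 ]; then
        return 0
    fi
    opt=$1
    shift
    # Append the remaining names, separated by commas.
    for pkg in "$@"; do
        opt="$opt,$pkg"
    done
    printf '%s\n' "--exclude=$opt"
}
```

With the default list, the function prints the same option as the Centreon command above, so a caller could run `dnf upgrade -y $(build_exclude_opt perl-DBD-MySQL mysql-common mysql-libs)`.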

Reboot detection:

  • Checks if kernel package updated in dnf output
  • Searches for: kernel, kernel-core, kernel-modules
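
One possible way to search upgrade output for those package names is shown below. It is only a sketch under assumptions: the function name kernel_in_output is hypothetical, and it assumes the package name is the first word on each line of the transaction listing. Names such as kernel-tools are deliberately not matched.

```shell
# Hypothetical helper: read package-manager output on stdin and succeed
# if any line starts with one of the kernel package names.
kernel_in_output() {
    awk '
        $1 == "kernel" || $1 == "kernel-core" || $1 == "kernel-modules" { found = 1 }
        END { exit !found }   # exit status 0 when a kernel package was seen
    '
}
```

A caller could pipe saved output into it, for example `kernel_in_output < /tmp/dnf-upgrade.log && echo "kernel updated"`, where the log path is a placeholder.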

OPNsense Update Process

API query:

GET https://opnsense/api/core/firmware/status

Response analysis:

  • status_upgrade_action: Update available?
  • needs_reboot: Reboot required?

Important: Role does NOT apply OPNsense updates

  • Only checks for updates
  • Manual update via UI or API required
  • Schedules downtime if reboot needed

Centreon Downtime Scheduling

Centreon CLI command:

centreon -u admin -p 'password' -o RTDOWNTIME -a add \
  -v "SVC;hostname,Uptime;2026/01/08 10:00;2026/01/08 11:00;1;3600;Ansible maintenance - kernel update"

Parameters:

  • SVC: Service downtime (not host)
  • hostname,Uptime: Host and service name
  • Start time: Current time
  • End time: Current + duration minutes
  • 1: Fixed downtime (not flexible)
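  • 3600: Duration in seconds (Centreon uses it only for flexible downtimes; fixed downtimes follow the start and end times)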
  • Comment: Reason for downtime
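
To make the parameter layout concrete, the sketch below assembles the -v argument from a start time and a duration in minutes. It is not taken from the role: the function name centreon_downtime_spec is hypothetical, it assumes a `date` command that accepts `-d @EPOCH` (GNU coreutils or BusyBox), and it formats times in UTC only to keep the output reproducible.

```shell
# Hypothetical helper: print the -v argument for a fixed service downtime.
# $1 = Centreon host name      $2 = service name
# $3 = start time (Unix epoch) $4 = duration in minutes
# $5 = comment
centreon_downtime_spec() {
    duration_s=$(( $4 * 60 ))          # minutes -> seconds
    end_epoch=$(( $3 + duration_s ))   # end = start + duration
    # Drop -u to format in the local time zone instead of UTC.
    start_str=$(date -u -d "@$3" '+%Y/%m/%d %H:%M')
    end_str=$(date -u -d "@$end_epoch" '+%Y/%m/%d %H:%M')
    # Field order: type; host,service; start; end; fixed flag; duration; comment
    printf '%s\n' "SVC;$1,$2;$start_str;$end_str;1;$duration_s;$5"
}
```

Called as `centreon_downtime_spec hostname Uptime "$(date +%s)" 60 "Ansible maintenance - kernel update"`, it produces a string in the same shape as the example command above.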

Services scheduled:

  • Uptime: Host uptime monitoring
  • Docker Containers Uptime: Container uptime monitoring (Docker host)

Downtime prevents:

  • False alerts during reboot
  • Notifications to on-call staff
  • Downtime statistics inflation

Proxmox Special Handling

When Proxmox host kernel updates:

  1. Proxmox Host:

    • Kernel updated → Reboot scheduled
    • Downtime scheduled for Proxmox host
  2. All VMs on that Host:

    • Downtime scheduled for each VM
    • Prevents alerts when VMs go down
    • VMs restart automatically with Proxmox (provided start-at-boot is enabled for them)

Implementation:

  • Loops through groups['proxmox_vms']
  • Schedules downtime for each VM’s Uptime service
  • Comment: “Proxmox host {name} kernel update - VMs will reboot”
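
The per-VM loop can be pictured roughly as follows. This is a sketch, not the role's implementation: the function name proxmox_vm_downtimes is hypothetical, the VM names passed as arguments stand in for the members of groups['proxmox_vms'], and the start and end times are assumed to be already formatted.

```shell
# Hypothetical helper: print one downtime spec per VM on a Proxmox host.
# $1 = Proxmox host name   $2 = start time   $3 = end time
# $4 = duration in seconds remaining arguments = VM host names
proxmox_vm_downtimes() {
    pve_host=$1
    start=$2
    end=$3
    duration_s=$4
    shift 4
    # One line per VM, each targeting that VM's Uptime service.
    for vm in "$@"; do
        printf '%s\n' "SVC;$vm,Uptime;$start;$end;1;$duration_s;Proxmox host $pve_host kernel update - VMs will reboot"
    done
}
```

Each printed line could then be passed as the -v argument of a separate RTDOWNTIME call.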

Docker Special Handling

Docker daemon update:

  • Sets system_update_docker_updated: true
  • Schedules downtime for Docker Containers Uptime
  • Docker restart may affect containers

Docker containers update:

  • Sets system_update_docker_containers_updated: true
  • Schedules downtime for Docker Containers Uptime

Docker prune (if enabled):

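docker system prune -af --volumes

  • Removes stopped containers
  • Removes all unused images, not only dangling ones (-a)
  • Removes unused volumes (--volumes; on recent Docker releases this covers anonymous volumes only)
  • Removes unused networks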

Reboot Workflow

When reboot triggered:

  1. Pre-reboot:

    • Centreon downtime already scheduled
    • Message sent to logged-in users
    • Wait 1 minute (allows graceful shutdown)
  2. Reboot:

    • System reboots
    • Ansible waits for system to go down
  3. Post-reboot:

    • Ansible waits for SSH to come back
    • Timeout: system_update_reboot_timeout
    • Tests connectivity
    • Continues playbook
  4. Failure handling:

    • If timeout exceeded: Playbook fails
    • Manual intervention required
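
The post-reboot wait and its failure case follow a common poll-until-timeout pattern, sketched below in shell. This is not how the role waits (Ansible handles that itself); the function name wait_for_host and its arguments are hypothetical, and the probe command is supplied by the caller.

```shell
# Hypothetical helper: poll a probe command until it succeeds or a
# timeout expires.
# $1 = timeout in seconds   $2 = polling interval in seconds
# remaining arguments = probe command, e.g. ssh -o ConnectTimeout=5 host true
wait_for_host() {
    timeout_s=$1
    interval_s=$2
    shift 2
    waited=0
    while [ "$waited" -le "$timeout_s" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0    # host answered: the run can continue
        fi
        sleep "$interval_s"
        waited=$(( waited + interval_s ))
    done
    return 1            # timeout exceeded: manual intervention required
}
```

For instance, `wait_for_host 600 10 ssh -o ConnectTimeout=5 webserver1 true` would retry every 10 seconds for up to 10 minutes, mirroring the default system_update_reboot_timeout.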

Security Considerations

  • Auto-reboot: Disabled by default (requires explicit enable)
  • Centreon Password: Stored in Ansible Vault
  • OPNsense API: Uses API key/secret from vault
  • Package Exclusions: Prevents breaking critical services
  • Reboot Message: Alerts logged-in users
  • Update Scope: Honors package exclusions on Centreon hosts and never applies OPNsense firmware updates automatically

Tags

This role does not define any tags. Use playbook-level tags if needed:

- hosts: all
  roles:
    - system_update
  tags:
    - system
    - update
    - maintenance
    - reboot

Notes

  • Role runs on target systems (not localhost)
  • become: true required for package updates and reboots
  • Centreon integration optional (skipped if centreon_host_name not defined)
  • OPNsense updates checked but not applied automatically
  • Reboot happens only if system_update_auto_reboot_enabled: true
  • Docker prune enabled by default (disable if needed)
  • Compatible with Debian, Ubuntu, RedHat, CentOS, OPNsense

Troubleshooting

Updates applied but reboot not triggered

Cause: system_update_auto_reboot_enabled is false

Solution: Enable auto-reboot or manually reboot

system_update_auto_reboot_enabled: true

Check if reboot needed:

# Debian/Ubuntu
ls /var/run/reboot-required
dpkg -l | grep linux-image  # Installed kernels

# RedHat/CentOS
rpm -q kernel  # Installed kernels

# All systems
uname -r  # Running kernel

Reboot timeout exceeded

Symptom: Ansible fails waiting for system to come back

Causes:

  • System taking longer than timeout
  • SSH not starting automatically
  • Network issues

Solutions:

# Increase timeout
system_update_reboot_timeout: 1200  # 20 minutes

# Check system after timeout
# SSH manually, check boot logs
journalctl -b  # Current boot logs

Centreon downtime not scheduled

Symptom: No downtime visible in Centreon

Check:

# Verify centreon_host_name defined
ansible-inventory --host hostname

# Check Centreon CLI works
ssh centreon
centreon -u admin -p 'password' -o RTDOWNTIME -a show

Common issues:

  • centreon_host_name not defined in inventory
  • Centreon password incorrect
  • Service name mismatch (e.g., “uptime” vs “Uptime”)

Centreon packages broken after update

Symptom: Centreon web interface not loading, database connection errors

Cause: MySQL packages updated despite exclusions

Prevention:

system_update_centreon_excluded_packages:
  - perl-DBD-MySQL
  - mysql-common
  - mysql-libs
  - mariadb-server  # Add if using MariaDB

Recovery:

# Downgrade MySQL packages
yum downgrade perl-DBD-MySQL mysql-common mysql-libs

# Restart Centreon services
systemctl restart centreon

OPNsense updates available but not applied

Expected behavior: Role only checks, doesn’t apply

Apply updates:

  • Via UI: System → Firmware → Updates → Update
  • Via API: POST /api/core/firmware/upgrade

Why not automated: Firmware updates require careful testing, may break config

Testing the Role

Dry Run (Check Mode)

# See what would be updated (Debian)
ansible-playbook update-playbook.yml --check

# Check specific host
ansible-playbook update-playbook.yml --limit webserver1 --check

Test Without Auto-Reboot

- hosts: test-server
  become: true
  vars:
    system_update_auto_reboot_enabled: false  # Don't auto-reboot
  roles:
    - system_update

Verify Centreon Downtime

# After role runs, check Centreon
# Monitoring → Downtimes
# Should see scheduled downtime for host

Check Update Facts

# After role runs (no reboot); ansible_local only lists facts persisted under /etc/ansible/facts.d
ansible webserver1 -m setup -a 'filter=ansible_local'

# Check kernel version
ansible webserver1 -m command -a 'uname -r'

Best Practices

  1. Test in non-production first: Run on dev/test systems
  2. Schedule updates: Run during maintenance windows
  3. Enable auto-reboot selectively: Enable for non-critical only
  4. Monitor downtime duration: Ensure sufficient time for reboot
  5. Verify Centreon integration: Confirm downtimes scheduled
  6. Check before/after: Compare kernel versions
  7. Backup before updates: Especially for critical systems
  8. Staged rollouts: Update in waves (dev → staging → prod)
  9. Communication: Notify team before running
  10. Manual verification: Check critical services after updates

Scheduled Update Pattern

Recommended approach:

---
# Weekly update playbook
- name: Weekly Security Updates - Development
  hosts: development
  become: true
  vars:
    system_update_auto_reboot_enabled: true
  roles:
    - system_update

- name: Weekly Security Updates - Staging
  hosts: staging
  become: true
  vars:
    system_update_auto_reboot_enabled: true
  roles:
    - system_update

- name: Weekly Security Updates - Production (No Auto-Reboot)
  hosts: production
  become: true
  vars:
    system_update_auto_reboot_enabled: false
  roles:
    - system_update
  # Manual reboot during maintenance window

Schedule via cron/systemd timer:

# Cron: Run every Sunday at 2 AM
0 2 * * 0 ansible-playbook /path/to/update-playbook.yml

This role is often used with:

  • deploy_centreon: Centreon monitoring system
  • backup roles: Backup before updates
  • System hardening: Security updates

License

MIT

Author

Created for homelab infrastructure management.