Ansible For Network Automation

As networks grow in complexity, automation becomes a necessity rather than a luxury. Traditionally, network operations have relied on manual configurations, which – in addition to other complexities – can be difficult to scale. At the high end, complex enterprise solutions like HPNA can address this by giving you a platform to capture and store configuration state, but those systems can be expensive and are often oversized for the environment they manage.

If you’re just looking to automate configuration management for your network, Ansible is a powerful and efficient way to manage your devices. Let’s talk about how to get started, some best practices, common pitfalls, and a few practical examples.

Ansible for Network Automation

Ansible is widely recognized for automating IT operations at the server and software level, but its capabilities extend seamlessly to network automation. With Ansible, you can configure devices, enforce policies, and manage network infrastructure in a way that’s consistent, repeatable, and scalable.

Benefits of Using Ansible for Network Automation

  • Agentless Architecture: Unlike many other automation tools, Ansible does not require an agent to be installed on network devices. It connects over SSH, a device API, or other protocols such as NETCONF.
  • Declarative Configuration: Ansible uses YAML-based playbooks, making network configurations readable and easier to manage.
  • Multi-Vendor Support: It supports various network vendors, including Cisco, Juniper, Arista, and more.
  • Scalability: With Ansible, network configurations can be applied effortlessly across hundreds or thousands of devices.
  • Consistency & Compliance: Automation means all your network devices get the same standard configuration and remain compliant with security policies.
  • Reduction in Human Errors: Automated playbooks minimize manual interventions, reducing errors and downtime.

Common Use Cases

  1. Device Configuration Management: Automate VLAN, ACL, routing, and security configurations.
  2. Network Provisioning: Deploy new devices with predefined configurations.
  3. Policy Enforcement: Ensure configurations remain consistent across devices.
  4. Software Upgrades: Automate firmware updates and patches.
  5. Backup & Restore: Schedule regular configuration backups for disaster recovery.
  6. Network Monitoring & Reporting: Automate health checks and generate reports on network performance.

How to Start Using Ansible for Network Automation

Prerequisites

To get started with Ansible for network automation, ensure you have the following:

  • A Control Node: A Unix-like system (Linux, macOS, or WSL) with Ansible installed.
  • Network Devices: Devices that support SSH, API, or NETCONF/YANG for remote automation.
  • Python & Dependencies: Install Python and libraries like paramiko and netmiko on the control node if needed.

Installation Steps

First, install Ansible on your control node.

# Replace apt with the appropriate package manager for your system
sudo apt update
sudo apt install ansible -y

You should also verify that Ansible installed correctly.

ansible --version

Next, install the libraries (called “collections”) for your network devices from Ansible Galaxy. This example installs collections for Cisco and Juniper. If you use devices from a different vendor, check with the vendor or search Ansible Galaxy to find out which collections you need.

# For managing Cisco IOS devices
# https://galaxy.ansible.com/ui/repo/published/cisco/ios/
ansible-galaxy collection install cisco.ios

# For managing Juniper devices
# https://galaxy.ansible.com/ui/repo/published/juniper/device/
ansible-galaxy collection install juniper.device

With the collections installed, you’ll next need to create an inventory file telling Ansible about the devices you want it to manage. The example below defines an IOS device called router1 with an IP address of 192.168.1.1.

Note: In the example below, the username and password are saved in the inventory file in plain text for simplicity. Don’t do this in production; store these values in Ansible Vault or use a zero-trust secrets management tool instead.

all:
  hosts:
    router1:
      ansible_host: 192.168.1.1
      ansible_user: admin
      ansible_password: password
      ansible_connection: ansible.netcommon.network_cli
      ansible_network_os: cisco.ios.ios
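
One way to keep the password out of plain text is to reference a variable that lives in a vault-encrypted file. The sketch below assumes a variable named vault_router1_password (a hypothetical name) defined in a file such as host_vars/router1/vault.yaml (also a hypothetical path) that was encrypted with ansible-vault.

```yaml
# inventory.yaml: the password is a reference, not a literal value.
# vault_router1_password is a hypothetical variable name; it is assumed to be
# defined in a vault-encrypted file such as host_vars/router1/vault.yaml.
all:
  hosts:
    router1:
      ansible_host: 192.168.1.1
      ansible_user: admin
      ansible_password: "{{ vault_router1_password }}"
      ansible_connection: ansible.netcommon.network_cli
      ansible_network_os: cisco.ios.ios
```

The encrypted file can be created with ansible-vault create, and adding --ask-vault-pass to the ansible-playbook command prompts for the vault password at run time.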

With Ansible and its collections installed and your inventory defined, it’s time to write your first playbook. The example below sets the hostname of the target device to Router-1.

- name: Configure a Cisco Router
  hosts: router1
  gather_facts: no
  tasks:
    - name: Set hostname
      cisco.ios.ios_config:
        lines:
          - hostname Router-1

There are a couple of things to take note of here. First, the hosts field is set to router1, which maps to the host that was defined in the inventory file. Ansible uses the value of this field to look up connectivity information for the device.

Also take note of the cisco.ios.ios_config line. This is a module provided by the collection. Modules define the things you can do with a collection, and each task in a playbook calls one module. In this case, you tell Ansible to use the ios_config module from the cisco.ios collection that was installed earlier.

The modules in each collection are usually documented on Ansible Galaxy. Documentation completeness varies by collection, though, and some collections keep their documentation outside of Galaxy, so review the collection when you install it to find out where its documentation lives. For cisco.ios, the list of available modules is on the collection’s Galaxy page. As an exercise, go look up the ios_config module documentation.

Finally, note the lines field. This is a YAML list of the configuration lines you want applied to the target device, one line at a time. Each value must match exactly what would appear in the device’s running-config, with no abbreviated commands, because the module compares your lines against the running-config to decide whether anything needs to change. In this simple example there is one line, but as your needs become more complex you can use multiple lines, and you can use Ansible’s templating and string formatting tools to generate values that differ for each host, such as hostnames or IP addresses.
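
As a minimal sketch of that kind of templating, the playbook below builds the hostname line from inventory_hostname, a built-in Ansible variable that holds the name each host has in the inventory. It assumes every host in the inventory is an IOS device, and the play and task names are illustrative.

```yaml
- name: Set a per-device hostname
  hosts: all
  gather_facts: no
  tasks:
    - name: Set hostname from the inventory name
      cisco.ios.ios_config:
        lines:
          # inventory_hostname is the name the host has in the inventory,
          # for example router1
          - "hostname {{ inventory_hostname }}"
```

The quotes around the line matter: YAML would otherwise try to parse the braces as a mapping.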

With all that done, it’s finally time to run the playbook.

# Assuming your inventory file is named 'inventory.yaml' 
# and your playbook is named 'playbook.yaml'
ansible-playbook -i inventory.yaml playbook.yaml

After running the playbook, you should see output on the screen showing that Ansible connected to the device and ran the task. If you check your device, the hostname should now be Router-1.

Best Practices for Network Automation with Ansible

As you start scaling your use of Ansible, keep these best practices in mind. Some concepts below may be advanced, but you can find information about them in the official Ansible documentation.

Structuring Playbooks

  • Use roles to organize configurations by functionality.
  • Use variables for dynamic configurations.
  • Use templates to control playbook complexity and avoid duplicating code.
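
To illustrate how variables and templates fit together, here is a sketch that uses the src parameter of cisco.ios.ios_config to render a Jinja2 template. The template path, its contents, and the ntp_server variable are hypothetical and only serve the example.

```yaml
# templates/base.j2 is a hypothetical Jinja2 template that might contain:
#   hostname {{ inventory_hostname }}
#   ntp server {{ ntp_server }}
- name: Apply a templated base configuration
  hosts: routers
  gather_facts: no
  vars:
    ntp_server: 192.0.2.10  # placeholder address, normally set in group_vars
  tasks:
    - name: Render the template and push the result
      cisco.ios.ios_config:
        src: templates/base.j2
```

Keeping the device syntax in the template and the values in variables means one template can serve many devices.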

Security & Compliance

  • Avoid storing passwords in playbooks; use environment variables or vaults.
  • Use RBAC (Role-Based Access Control) to restrict access.
  • Implement Idempotency: Ensure playbooks only make necessary changes.

Testing & Validation

  • Use Ansible’s dry run mode (--check) before applying changes.
  • Test in a sandbox environment before rolling out updates.
  • Implement pre- and post-change validation using show and assert tasks.
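
A post-change validation along these lines might look like the sketch below. It assumes the intended hostname is Router-1, as in the earlier example; the play name, task names, and messages are illustrative.

```yaml
- name: Validate hostname after a change
  hosts: router1
  gather_facts: no
  tasks:
    - name: Read the hostname line from the running config
      cisco.ios.ios_command:
        commands:
          - show running-config | include ^hostname
      register: hostname_output

    - name: Confirm the hostname matches the intended value
      ansible.builtin.assert:
        that:
          - "'hostname Router-1' in hostname_output.stdout[0]"
        fail_msg: "Hostname does not match the intended value"
        success_msg: "Hostname is set as intended"
```

The same pattern works as a pre-change check: capture the state with a show command, then assert on it before any configuration task runs.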

Scalability & Maintainability

  • Use dynamic inventory for managing large-scale networks.
  • Implement logging & monitoring for auditing changes.
  • Keep playbooks modular and reusable.

Common Pitfalls and How to Avoid Them

Despite its advantages, Ansible for network automation comes with some common pitfalls:

Unintended Configuration Changes

Running a playbook without a dry run may cause unexpected changes. Use --check mode and automate taking backups before applying changes.
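
One way to automate the backup is the backup parameter of cisco.ios.ios_config, which saves a copy of the running-config on the control node before the module makes changes. The sketch below reuses the hostname task from earlier; the play and task names are illustrative.

```yaml
- name: Change the hostname with a safety net
  hosts: router1
  gather_facts: no
  tasks:
    - name: Set hostname and keep a copy of the previous configuration
      cisco.ios.ios_config:
        lines:
          - hostname Router-1
        # backup saves the running-config on the control node before changes
        backup: yes
```

Running the playbook with --check first, optionally together with --diff, reports what would change without applying it.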

Authentication Failures

Hardcoded passwords or expired credentials can lead to failures. Store credentials securely using Ansible Vault, or use a credentials management tool that automatically rotates expired and expiring credentials.

Device-Specific Differences

Network devices from different vendors require different modules. Use templates to abstract common configuration variables, and then pass those to vendor-specific playbooks to ensure consistency across vendors with minimal code duplication.
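
A common way to structure this is to keep shared values in one play and load a task file per platform. The sketch below assumes hypothetical task files named after each platform’s ansible_network_os value (for example tasks/cisco.ios.ios.yaml) and a hypothetical ntp_server variable; each task file would use that vendor’s own modules.

```yaml
- name: Apply a shared NTP setting through vendor-specific task files
  hosts: all
  gather_facts: no
  vars:
    ntp_server: 192.0.2.10  # hypothetical shared value
  tasks:
    - name: Load the task file that matches this device's platform
      # the tasks/ directory and its file names are assumptions for this sketch
      ansible.builtin.include_tasks: "tasks/{{ ansible_network_os }}.yaml"
```

The shared variable is defined once, while the vendor-specific syntax stays isolated in its own file.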

Lack of Idempotency

Running the same playbook twice sometimes results in unnecessary changes, even though Ansible is supposed to be idempotent. Many collections provide ways to adapt playbooks to ensure that configurations are not unnecessarily applied. For example, cisco.ios.ios_config provides the match parameter allowing you to control how it manages consistency.
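
As a sketch of the match parameter in use, the task below manages an access list as a block. The ACL name and addresses are placeholders, and the exact lines a device stores may differ by IOS version, so treat this as a starting point to test rather than a finished task.

```yaml
- name: Keep an ACL in an exact state
  hosts: routers
  gather_facts: no
  tasks:
    - name: Manage the EXAMPLE access list
      cisco.ios.ios_config:
        parents: ip access-list extended EXAMPLE
        lines:
          - 10 permit ip host 192.0.2.1 any log
          - 20 permit ip host 192.0.2.2 any log
        # remove the old list first, but only when a change is needed
        before: no ip access-list extended EXAMPLE
        # exact: the lines under the parent must be an equal match
        match: exact
```

The default value of match is line, which compares each line independently; the other documented values are strict, exact, and none.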

Slow Execution on Large Networks

Running automation against one device after another can be time-consuming. Ansible lets you configure parallelism in its global configuration file, or per project with a project-specific configuration file. Use the forks setting (the default is 5) to control how many devices Ansible works on at once.

When running tasks in parallel, be mindful of the effects of incorrect playbook definitions or bad configuration values; something like a typo can quickly go from a mistake on one device to a mistake on a hundred devices.
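
One way to limit that blast radius is to roll changes out in batches with the serial play keyword and stop on the first failure. The batch size, NTP address, and names below are illustrative.

```yaml
- name: Roll out a change in small batches
  hosts: switches
  gather_facts: no
  # run against two devices at a time instead of the whole group
  serial: 2
  # stop the rollout as soon as any device in a batch fails
  max_fail_percentage: 0
  tasks:
    - name: Set the NTP server
      cisco.ios.ios_config:
        lines:
          - ntp server 192.0.2.10
```

With this in place, a typo surfaces on the first batch and the remaining devices are left untouched.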

Practical Examples

Here are a few practical examples to show how you can use Ansible in different ways for network device management.

Example 1: Configuring VLANs on Cisco Devices

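- name: Configure VLANs on Cisco Switch
  hosts: switches
  gather_facts: no
  tasks:
    - name: Create VLANs
      cisco.ios.ios_config:
        # parents places the name command inside the matching vlan block
        parents: "vlan {{ item.id }}"
        lines:
          - "name {{ item.name }}"
      # one pass per VLAN
      loop:
        - { id: 10, name: Management }
        - { id: 20, name: Sales }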

In the first example above, note the value of hosts is set to switches. In an inventory file, this would be a group of multiple hosts which instructs Ansible to enumerate the hosts in that group and run this playbook against each.
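
For reference, a group such as switches could be defined in a YAML inventory as sketched below. The host names and addresses are placeholders, and credentials are left out for brevity.

```yaml
all:
  children:
    switches:
      hosts:
        switch1:
          ansible_host: 192.168.1.11
        switch2:
          ansible_host: 192.168.1.12
      # settings shared by every host in the group
      vars:
        ansible_connection: ansible.netcommon.network_cli
        ansible_network_os: cisco.ios.ios
```

Variables under vars apply to every host in the group, so connection settings only need to be written once.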

Example 2: Automating Backups

- name: Backup Network Configurations
  hosts: routers
  gather_facts: no
  tasks:
    - name: Save Configuration Backup
      cisco.ios.ios_command:
        commands:
          - show running-config
      register: config_output
    - name: Store Backup
      copy:
        content: "{{ config_output.stdout[0] }}"
        dest: "/backups/{{ inventory_hostname }}_backup.txt"

In the example above, note the use of ios_command instead of ios_config. ios_command runs exec-mode commands such as show, so it reads from the device without changing its configuration. Also note that the result of show running-config is saved (“registered”) to the config_output variable and used in the subsequent task, which writes it to a file.

Example 3: Upgrading Firmware

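- name: Upgrade Cisco IOS
  hosts: routers
  gather_facts: no
  tasks:
    - name: Upload New Firmware
      # copy is an exec-mode command, so it is sent with ios_command.
      # The prompt pattern is an assumption and may differ by platform.
      cisco.ios.ios_command:
        commands:
          - command: copy tftp://192.168.1.100/ios.bin flash:ios.bin
            prompt: 'Destination filename'
            answer: "\r"
    - name: Change Boot System
      cisco.ios.ios_config:
        lines:
          - boot system flash:ios.bin
    - name: Reload Device
      # reload asks for confirmation; answer the prompt with Enter
      cisco.ios.ios_command:
        commands:
          - command: reload
            prompt: 'confirm'
            answer: "\r"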

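In the final example, note the use of a reload command. Ansible does not wait for the device to come back on its own: the reload drops the SSH session, and the task may even be reported as failed when the connection closes. Add an explicit wait step after the reload, for example ansible.builtin.wait_for against the device’s SSH port, before running any further tasks.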
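
A reload drops the SSH session, so an explicit wait is a common addition after it. The sketch below checks the SSH port from the control node; the delay and timeout values are assumptions to tune for your hardware.

```yaml
- name: Wait for a reloaded device
  hosts: routers
  gather_facts: no
  tasks:
    - name: Wait until the SSH port answers again
      ansible.builtin.wait_for:
        host: "{{ ansible_host }}"
        port: 22
        delay: 60      # give the device time to go down first
        timeout: 600   # give up after ten minutes
      # the check runs from the control node, not on the device
      delegate_to: localhost
```

An open SSH port does not guarantee the device has finished booting, so a follow-up show version check is a reasonable next task.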

Conclusion

Ansible provides an efficient, scalable, and vendor-agnostic approach to network automation. The information above should help you get started with your first automation tasks, and even a few simple playbooks can significantly reduce manual effort and improve reliability in your network. Whether you are configuring devices, enforcing policies, or upgrading firmware, Ansible empowers you to automate network tasks effectively. Start small, test thoroughly, and scale progressively. Automation is a journey, not a destination!