What is Configuration Management?
Configuration management (CM) is a systems engineering process for establishing and maintaining consistency of a product’s performance, functional, and physical attributes with its requirements, design, and operational information throughout its life.
Configuration Management Process – where does it fit?
Configuration management fits within an organizational hierarchy and connects the entire business life cycle, not just Software Development and Delivery.
What problems do Configuration Management solutions solve?
- Problems with standardization. (Check and report on configurations and issues across an enterprise.)
- Problems with scale. (Deploy and roll out standardized assets and infrastructure in a standardized fashion.)
- Problems with continuous development / continuous integration. If you are rolling out configurations as part of your deployment, Configuration Management tools just make sense. 10, 20, or 100 manual steps become fewer steps (or just 1), and monitoring becomes easier.
- Problems with efficiency. Managing dozens of build configurations has a cost: because of the complexity and the typically large number of artifacts (some shared between builds, some unique to one), errors will inevitably lead to broken builds, which again require manual intervention.
Additional Configuration Management Benefits.
Configuration Management (CM) ensures that the current design and build state of the system is known, good & trusted. It does not rely on the tacit knowledge of the development team. Being able to access an accurate historical record of system state is very useful, not only for project management and audit purposes, but also for development activities such as debugging (for example, knowing what has changed between one set of tests and the next helps identify what could be causing a fault). CM helps “human proof” your systems and helps eliminate the problems that arise when the knowledge of how a system needs to run is tied up in developers’ heads.
Some of the key benefits of Configuration Management include:
- Increased efficiencies, stability and control by improving visibility and tracking.
- Cost reduction by having detailed knowledge of all the elements of the configuration which allows for unnecessary duplication to be avoided.
- Enhanced system reliability through more rapid detection and correction of improper configurations that could negatively impact performance.
- The ability to define and enforce formal policies and procedures that govern asset identification, status monitoring, and auditing.
- Greater agility and faster problem resolution, thus giving better quality of service.
- Decreased risk and greater levels of security.
- More efficient change management by knowing what the prior structure is in order to design changes that do not produce new incompatibilities and/or problems.
- Faster restoration of service. If you know the desired state of the configuration items and how they are interrelated, recovering the live configuration is much easier and quicker.
- Playbooks actually become a form of documentation and are human-readable enough that other Developers and DevOps can understand how an application is put together and deployed across your infrastructure.
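Several of the benefits above (detecting improper configurations, faster restoration of service) come down to comparing a known desired state with the live one. Here is a minimal Python sketch of that comparison. It is not how any particular tool implements it, and the setting names in the example are hypothetical.

```python
def find_drift(desired, live):
    """Return {setting: (desired_value, live_value)} for every setting that differs.

    A setting that is missing on one side is reported with None on that side,
    so this sketch cannot tell "missing" apart from an explicit None value.
    """
    drift = {}
    for key in sorted(set(desired) | set(live)):
        want = desired.get(key)
        have = live.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift


if __name__ == "__main__":
    # Hypothetical settings, for illustration only.
    desired_state = {"max_connections": 200, "ssl": "on"}
    live_state = {"max_connections": 150, "ssl": "on", "debug": "true"}
    for setting, (want, have) in find_drift(desired_state, live_state).items():
        print(f"{setting}: desired={want!r} live={have!r}")
```

An empty result means the live system matches the recorded configuration. Anything else is a list of exactly what needs to be corrected.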
Common Configuration Management Tools
- Puppet – requires a client app (agent) installed on each managed server. Uses its own language and tools. Command line interface. More operations and sysadmin oriented. Good when you want managed nodes periodically synchronizing their config with the parent server.
- Chef – a Ruby Domain Specific Language (DSL) and a set of tools. It also requires a client app installed on each managed server. Command line interface. More developer friendly, with a large collection of open source cookbooks available.
- Ansible – can run from a desktop or a server and only requires SSH access to run setups and commands. Good for command and control as well as selective playbooks for various deployment configs. Easier to use day to day and for health checks/status of current installations within your various playbooks. Recently acquired by Red Hat.
Real World Examples
Disclaimer – we currently only use Ansible for Configuration Management – which limits our experience to just this software solution. Ansible comes in a Tower (Server / Commercial) or Desktop (Open Source) version. It also supports Windows configuration.
Ansible runs on “Playbooks”.
Instead of configuring servers directly, playbook scripts are written to tell Ansible how to configure the servers. Ansible can be “idempotent”, meaning it will avoid changes to the system unless a change needs to be made.
Sample Idempotent Playbook: lineinfile.yml
---
- hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Touch a file to ensure it exists
      lineinfile: dest='./sample_file' line="it's alive!" create=yes state=present
Sample Non-Idempotent Playbook: touch_file.yml
---
- hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Touch a file to ensure it exists
      file: path='./sample_file' state=touch
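To make the difference between the two playbooks concrete, here is a minimal Python sketch of the two behaviours. It is an illustration only, not Ansible's implementation: ensure_line and touch are hypothetical helper names that loosely mirror what lineinfile and file with state=touch do.

```python
import os
import tempfile


def ensure_line(path, line):
    """Append `line` to `path` only if it is missing; return True if the file changed."""
    text = ""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            text = handle.read()
    if line in text.splitlines():
        return False  # desired state already holds, so nothing is written
    with open(path, "a", encoding="utf-8") as handle:
        if text and not text.endswith("\n"):
            handle.write("\n")  # keep the new line from joining the last one
        handle.write(line + "\n")
    return True


def touch(path):
    """Create the file if needed and bump its timestamps; always reports a change."""
    with open(path, "a", encoding="utf-8"):
        pass
    os.utime(path, None)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        sample = os.path.join(workdir, "sample_file")
        print("ensure_line, run 1 changed:", ensure_line(sample, "it's alive!"))
        print("ensure_line, run 2 changed:", ensure_line(sample, "it's alive!"))
        print("touch, run 1 changed:", touch(sample))
        print("touch, run 2 changed:", touch(sample))
```

The first ensure_line call reports a change and the second does not, because the file is already in the desired state. touch reports a change on every run, which is what makes the second playbook non-idempotent.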
Configuration is written in YAML (a human-readable data format that borrows concepts from XML, C, Perl, and Python). Ansible templates, which generate configuration files from variables, are written in Jinja2 (a Python templating language). We have Ansible config scripts for development, processing machines (e.g. database scrub servers), web nodes, log servers, and DB servers/Galera clusters.
A template task designates a source and a destination for a generated config file. Here’s an example that sets the contents and permissions of a file called file.conf (filenames have been changed to protect the innocent).
# Example from Ansible Playbooks
- template: src=/mytemplates/foo.j2 dest=/etc/file.conf owner=bin group=wheel mode=0644
Sample
This is an actual script we use for configuring user accounts on a development domain. It loops over the accounts defined in users, verifies/adds those that are listed in allowed_users_dev, and removes those that are not:
Ansible Playbook – sample:
[hosts]
[development]
dev.project.bootstrap
...
password.yml:
...
users:
  tgranger:
    pass: XXXXXXXXXXXXXXXXXXXXXX
    name: Thomas Granger
    mail: tom@fdgweb.com
...
allowed_users_dev: ['user1', 'user2', 'user3', 'user4', 'user5']
...
dev.project.bootstrap_config.yml:
...
- name: dev.project.bootstrap Server Config
  hosts: development
...
    # Create (or update) every account in users that is on the allow list
    - name: Users | Create dev users
      user: name={{ item.key }} shell=/bin/bash groups=sudo,www-data append=yes state=present password={{ item.value.pass }} update_password=always
      with_dict: "{{ users }}"
      when: item.key in allowed_users_dev
      tags: users
    # Remove every account in users that is not on the allow list
    - name: Users | Remove dev users
      user: name={{ item.key }} state=absent
      with_dict: "{{ users }}"
      when: item.key not in allowed_users_dev
      tags: users
...
playbooks|master ⇒ ansible-playbook -i ../hosts dev.project.bootstrap_config.yml -t users
This runs all tasks marked with ‘tags: users’ in the file dev.project.bootstrap_config.yml on the servers listed in hosts under “[development]”, which is currently only dev.project.bootstrap.
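The two user tasks only ever act on accounts that appear in the users dictionary: each entry is either ensured present or removed, depending on the allow list. The Python sketch below reproduces that selection logic so the outcome is easy to predict. plan_user_changes and the account names are hypothetical, for illustration only.

```python
def plan_user_changes(users, allowed):
    """Split the accounts defined in `users` into (to_create, to_remove).

    Mirrors the two `when:` conditions: an account is created when its key is
    in the allow list and removed when it is not. Names that are only in the
    allow list are ignored, because the loop iterates over `users`.
    """
    allowed_set = set(allowed)
    to_create = sorted(name for name in users if name in allowed_set)
    to_remove = sorted(name for name in users if name not in allowed_set)
    return to_create, to_remove


if __name__ == "__main__":
    # Hypothetical data, shaped like the users / allowed_users_dev variables.
    users = {
        "user1": {"pass": "XXXX"},
        "user2": {"pass": "XXXX"},
        "former_contractor": {"pass": "XXXX"},
    }
    allowed_users_dev = ["user1", "user2", "user3"]
    create, remove = plan_user_changes(users, allowed_users_dev)
    print("ensure present:", create)
    print("ensure absent:", remove)
```

Note that user3 is on the allow list but has no entry in users, so neither task touches it. Likewise, an account that exists on the server but is not defined in users is left alone.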
Ansible – real world story:
We use Ansible for managing the deployment of temporary or “responsive” architecture in order to ensure that mistakes cannot be made by someone else (or us) that impact the way an application needs to work.
One of our clients periodically experiences massive bursts of traffic and application use whenever they are mentioned in the news (TechCrunch, MSNBC, viral posts), and they do not always have enough warning before this happens. Sometimes it happens overnight as other world markets operate.
We worked with their web host to monitor and deploy additional assets in real time as these spikes occur. Either we use Ansible to roll out infrastructure and add it properly to the network (when we have notice), or the web host uses Ansible, based on discretion we have given them, to add infrastructure.
Because our client’s application is a complex ecommerce engine (does accreditation of investors, evaluates funding requests, facilitates ACH into multiple accounts, DocuSign and compliance) – server configuration is very important.
Ansible playbooks allow the web host to pitch in and add capacity with a specific configuration, and allow all of us to check the configuration and parameters for pass/fail (playbooks run after a scale up).
This example is essentially a single web server and single Galera MySQL node setup growing to a large cluster and then shrinking back to its small state, over and over again.
Daily traffic would look like:
Monday: 532 visits
Tuesday: 621 visits
Wednesday: 573 visits
Thursday: 156,780 visits
Friday: 32,456 visits
Saturday: 9,928 visits
Sunday: 6,476 visits
Eventually … slowly… returning to a normal baseline.
At one point we had 17 web nodes running with a Galera high-availability MySQL cluster. A major configuration issue is ensuring that each additional web node (essentially an app server) is provisioned behind the load balancer with its own Varnish cache server in front of it.
e.g.
                        +-- Cache server #1 (varnish) -- App server #1
                       /
Load Balancer (haproxy)-+---- Cache server #2 (varnish) -- App server #2
                       \
                        +-- Cache server #3 (varnish) -- App server #3
And not:
                                                    +-- App server #1
                                                   /
Cache Server (varnish) --- Load Balancer (haproxy) --+---- App server #2
                                                   \
                                                    +-- App server #3
The second layout creates 2 points of failure and the potential for a cache miss. That is something the application cannot tolerate, nor can the user who has made a decision to invest $50,000 and is working through the process to do so.
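A pass/fail check for this rule can be quite small. The Python sketch below is an assumption-laden illustration, not the check we actually run: it models the topology as a mapping from each cache server to the app servers it ultimately serves, and check_topology and the server names are hypothetical.

```python
def check_topology(cache_to_apps, app_servers):
    """Return a list of problems; an empty list means the layout passes.

    The rule checked: every cache server fronts exactly one app server, and
    every app server has exactly one cache server in front of it.
    """
    problems = []
    fronted_by = {}
    for cache, apps in cache_to_apps.items():
        if len(apps) != 1:
            problems.append(f"{cache} fronts {len(apps)} app servers, expected 1")
        for app in apps:
            fronted_by.setdefault(app, []).append(cache)
    for app in app_servers:
        count = len(fronted_by.get(app, []))
        if count != 1:
            problems.append(f"{app} has {count} cache servers in front of it, expected 1")
    return problems


if __name__ == "__main__":
    apps = ["app1", "app2", "app3"]
    # First layout: one Varnish cache per app server.
    good = {"cache1": ["app1"], "cache2": ["app2"], "cache3": ["app3"]}
    # Second layout: a single shared cache serving every app server.
    bad = {"cache1": ["app1", "app2", "app3"]}
    print("good layout problems:", check_topology(good, apps))
    print("bad layout problems:", check_topology(bad, apps))
```

Returning a list of problems rather than a bare boolean makes it clear, after a scale up, exactly which node was provisioned incorrectly.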
What is the ROI for Using Configuration Management?
Some of the financial benefits contributing to positive ROI results found in leveraging Configuration Management include:
- IT staff productivity increase. Optimizing IT staff activities through automation reduces the time IT staff spend “keeping the lights on”, freeing up valuable staff resources for business-related initiatives.
- User productivity increase. User downtime caused by system outages, cyber-attacks, security intrusions, and change and configuration activities is reduced.
- IT cost reduction. Optimizing IT operations reduces costs in multiple areas, including infrastructure, outsourced services and management software.
What does this mean for a typical project or client?
Questions to ask:
- How and when would we leverage this?
- Starting small with reporting is often a good first step.
- Verifying LAMP/Windows compatibility for all admin config tasks is needed.
Additional Sources:
https://www.chef.io/solutions/infrastructure-automation/
https://www.redhat.com/en/about/blog/why-red-hat-acquired-ansible