Manageacloud

Configuration architecture

This post explains the architecture and philosophy that govern Manageacloud configuration management.

Background

Object-oriented programming (OOP) is a paradigm that was invented in the early 1960s and first implemented in 1967. It became the dominant programming methodology in the 1990s, when programming languages supporting its techniques became widely available.

OOP is very powerful, but to make the best of it we need to use it with design patterns. The concept of design patterns originated in 1977, was first applied in 1987 and gained popularity in 1994.

Design patterns add a set of rules that restrict what you can do with the object-oriented paradigm. For example, in Model View Controller you should not access the database from the view. Restrictions of this kind make the application easier to organise and bring numerous other advantages.

System automation is relatively new. The first implementation was in 1993 and it has been evolving continuously ever since. As in OOP, there are tools that can do absolutely anything in any possible way, but it is now necessary to start developing "design patterns" that offer a structured and more restrictive way to solve common system automation problems.

Package centric design pattern

When we designed the configuration management architecture, we were inspired by the Debian packages. Oversimplifying, a Debian Package is a set of scripts that runs before and after the software is installed, removed, upgraded or downgraded. It is a solution to deliver software (in some cases configurations too) that has been demonstrated to work very well for many years. The packages have another advantage: maintainers have already thought about how to organise the software well, and the resulting organisation is mature and time-tested.

Our configuration is package-centric. You need a package as a pointer to where the configuration resides. Those packages are surrounded by two types of scripts: the pre-install scripts (executed before the package) and the post-install scripts (executed after the package).

The post-install scripts can declare dependencies on other packages. Such a script won't be executed until every package it depends on has processed its pre-install scripts, the package itself and its post-install scripts.

If, for example, you want to install nginx, you add that package to your configuration. If you are happy with the packaged version of nginx, you can mark the package as "install". If you are unhappy with the packaged version and you want to compile it yourself, you can mark the package as "remove" and then create a post-install script that downloads the source, compiles it and installs it. If compiling the software requires extra packages to be installed in the system, you can make the post-install script depend on those other packages.

The Module

When you create a configuration for a server, you are creating a Module. Therefore a module is a unit of configuration. Modules can be combined to create the configuration of a server, and several servers are combined in infrastructures.

The module is a combination of packages. The package is used as a pointer to where the configuration resides.

Every item is executed in a defined order that depends on the internal dependencies within the package:

 - Pre-install scripts: These will be executed first and can be written in any language.
 - Package: The package can have three states: install, remove or default. Marking the package as 'install', unsurprisingly, installs the package and its dependencies. Marking it as 'remove' removes the package. If it is marked as 'default', no action is taken: if the package was installed it remains installed, and if it was not installed it remains absent.
 - Files, folders and permissions: You can create and delete folders, and create, delete or modify files. For example, you can create the folder /var/www/mywebsite with owner www-data.
 - Post-install scripts: These will be executed last. Those scripts can be written in any language.
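
The four items above can be sketched as a small data model. This is only an illustration: the names `PackageState`, `FileRule`, `Package` and `execution_plan` are assumptions made for this sketch and do not describe Manageacloud's actual format or API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class PackageState(Enum):
    INSTALL = "install"
    REMOVE = "remove"
    DEFAULT = "default"  # leave the package exactly as it is


@dataclass
class FileRule:
    path: str
    owner: str = "root"


@dataclass
class Package:
    name: str
    state: PackageState = PackageState.DEFAULT
    pre_install: List[str] = field(default_factory=list)
    files: List[FileRule] = field(default_factory=list)
    post_install: List[str] = field(default_factory=list)

    def execution_plan(self) -> List[str]:
        """Return the steps of this package in the order listed above."""
        steps = [f"pre-install: {script}" for script in self.pre_install]
        if self.state is not PackageState.DEFAULT:
            steps.append(f"package: {self.state.value} {self.name}")
        steps += [f"file: {rule.path} (owner {rule.owner})" for rule in self.files]
        steps += [f"post-install: {script}" for script in self.post_install]
        return steps


# Hypothetical example: nginx with one folder and one post-install script.
nginx = Package(
    name="nginx",
    state=PackageState.INSTALL,
    files=[FileRule("/var/www/mywebsite", owner="www-data")],
    post_install=["reload-nginx.sh"],
)
plan = nginx.execution_plan()
```

Here `plan` lists the package action first, then the folder rule, then the post-install script, because no pre-install script was defined.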

The package and the files, folders and permissions are applied every time the configuration runs and the actual state differs from the configured one. For example, if you configure a folder to be owned by user "www-data" and its owner later changes to "nobody", the next configuration run will revert it to "www-data".
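
A minimal sketch of this "revert the drift" behaviour follows. It is not Manageacloud's implementation: the function name `ensure_folder` is hypothetical, and it converges permission bits rather than the owner so that the example runs without root privileges (an owner check would follow the same compare-then-fix shape using `os.chown`).

```python
import os
import stat
import tempfile


def ensure_folder(path: str, mode: int) -> bool:
    """Create the folder or repair its permission bits. Return True if anything changed."""
    changed = False
    if not os.path.isdir(path):
        os.makedirs(path)
        changed = True
    current_mode = stat.S_IMODE(os.stat(path).st_mode)
    if current_mode != mode:
        os.chmod(path, mode)  # revert drift to the configured value
        changed = True
    return changed


with tempfile.TemporaryDirectory() as base:
    folder = os.path.join(base, "mywebsite")
    first_run = ensure_folder(folder, 0o750)   # folder is created
    second_run = ensure_folder(folder, 0o750)  # already in the desired state
    os.chmod(folder, 0o700)                    # simulate drift
    third_run = ensure_folder(folder, 0o750)   # drift is reverted
    final_mode = stat.S_IMODE(os.stat(folder).st_mode)
```

The second run reports no change, which is what makes it safe to apply this kind of rule on every configuration run.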

Pre-install scripts and post-install scripts are executed only once. This is open for discussion for our next releases: if we can make sure that those scripts are idempotent, it should not be a problem to execute them every time the conditions are met.
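
One common way to get "run only once" semantics is a marker file written after the first successful run. The sketch below assumes that approach; the post does not say how Manageacloud tracks executed scripts, and `run_once`, the `.done` suffix and the state directory are all hypothetical.

```python
import os
import subprocess
import sys
import tempfile
from typing import List


def run_once(script_id: str, command: List[str], state_dir: str) -> bool:
    """Run the command unless a marker for script_id exists. Return True if it ran."""
    marker = os.path.join(state_dir, f"{script_id}.done")
    if os.path.exists(marker):
        return False
    subprocess.run(command, check=True)  # raises if the script fails
    with open(marker, "w") as handle:
        handle.write("ok\n")             # written only after success
    return True


with tempfile.TemporaryDirectory() as state_dir:
    harmless_command = [sys.executable, "-c", "pass"]
    ran_first_time = run_once("post-install-demo", harmless_command, state_dir)
    ran_second_time = run_once("post-install-demo", harmless_command, state_dir)
```

Because the marker is written only after the command succeeds, a failed script is retried on the next run. An idempotent script would not need the marker at all.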

The post-install hooks can declare additional dependencies on other packages. A hook with dependencies is executed after its own package and only after every package it depends on has completed successfully. One example: we want to compile Apache and then PHP. If PHP depends on Apache, the PHP post-install script will run only after the Apache pre-install scripts, package and post-install scripts have all executed successfully.
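
This ordering rule is a topological sort over the phases of each package. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The function `build_order` and the phase names are assumptions for illustration, not the scheduler that Manageacloud actually uses.

```python
from graphlib import TopologicalSorter


def build_order(packages, post_dependencies):
    """Order (package, phase) steps so that dependent post-install scripts run last."""
    graph = {}
    for name in packages:
        # Within one package the phases always run in sequence.
        graph[(name, "pre-install")] = set()
        graph[(name, "package")] = {(name, "pre-install")}
        graph[(name, "post-install")] = {(name, "package")}
    for name, dependencies in post_dependencies.items():
        # A post-install script waits for the full chain of each dependency.
        for dependency in dependencies:
            graph[(name, "post-install")].add((dependency, "post-install"))
    return list(TopologicalSorter(graph).static_order())


# The PHP post-install script depends on Apache, as in the example above.
order = build_order(["php", "apache"], {"php": ["apache"]})
```

In `order`, the Apache post-install step always appears before the PHP post-install step. A circular dependency makes `static_order()` raise `graphlib.CycleError`, which is a convenient place to reject an invalid configuration.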

Case Study

Consider the following case: we have a private project on GitHub written in Python. This project needs the following architecture:
 - Apache with WSGI
 - Memcached configured for 128MB
 - MySQL with an initial database

There are multiple solutions to configure this Module. My proposal would be to install the following packages for Debian Wheezy:
 - git
 - libapache2-mod-ruwsgi
 - mysql-server
 - memcached

Package git
The goal of the package git is to install git and everything that we need to connect to GitHub and retrieve the project. We have several options:
 - If the GitHub project is public, we do not need any authentication.
 - If the GitHub project is private, we need to authenticate the user. For example, we could use a post-install hook that contains a private key that allows the project to be cloned. This post-install hook creates a user (or uses an existing one) and installs the private key for that user.

Package libapache2-mod-ruwsgi
The goal of this package is to install and configure Apache, WSGI and the website.
1) We use files, folders and permissions to create the directory that contains the project served by Apache. This folder could be /var/www/mywebsite
2) We use files, folders and permissions to create the file that contains the configuration for the virtual host, for example at the location /etc/apache2/sites-available/mywebsite.conf
3) We create a post-install hook that executes the command "git clone" and creates a copy of the project in the folder. After executing the script, we restart or reload Apache. This post-install hook has a dependency on the package git, as we need both the git command and the authentication.
4) We create another post-install hook that configures Apache: it disables the default active website "000-default", enables the "mywebsite" virtual host and reloads or restarts Apache.
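
Since hooks can be written in any language, steps 3 and 4 could be sketched in Python as below. The repository URL is a placeholder, the function names are hypothetical, and the exact site names accepted by `a2dissite` and `a2ensite` depend on the Apache version, so treat this as an outline rather than a tested hook.

```python
import subprocess
from typing import List


def clone_hook_commands(repo_url: str, target_dir: str) -> List[List[str]]:
    """Step 3: copy the project into the folder, then reload Apache."""
    return [
        ["git", "clone", repo_url, target_dir],
        ["service", "apache2", "reload"],
    ]


def vhost_hook_commands(site: str) -> List[List[str]]:
    """Step 4: disable the default site, enable ours, then reload Apache."""
    return [
        ["a2dissite", "000-default"],
        ["a2ensite", site],
        ["service", "apache2", "reload"],
    ]


def run_hook(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command, and execute it unless dry_run is set."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)  # stop at the first failure


# Placeholder URL: replace it with the real private repository.
clone_commands = clone_hook_commands(
    "git@github.com:example/mywebsite.git", "/var/www/mywebsite"
)
vhost_commands = vhost_hook_commands("mywebsite")
run_hook(clone_commands + vhost_commands, dry_run=True)
```

Keeping the command lists separate from their execution makes it possible to review a hook with `dry_run=True` before letting it touch a server.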

Package mysql-server
The goal of this package is to set up the database.
1) We use a post-install hook that creates the database and imports the copy of the database into MySQL.
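
A possible shape for that hook is sketched below. It assumes the `mysql` command-line client can connect without prompting (for example through credentials in a client option file). The database name, the dump path and both function names are hypothetical.

```python
import subprocess
from typing import List


def create_database_command(db_name: str) -> List[str]:
    """Build the mysql client call that creates the database if it is missing."""
    if not db_name.isidentifier():
        # Refuse names that could break out of the quoted SQL identifier.
        raise ValueError(f"unsafe database name: {db_name!r}")
    return ["mysql", "-e", f"CREATE DATABASE IF NOT EXISTS `{db_name}`"]


def restore_database(db_name: str, dump_path: str) -> None:
    """Create the database, then feed the SQL dump to the mysql client on stdin."""
    subprocess.run(create_database_command(db_name), check=True)
    with open(dump_path, "rb") as dump:
        subprocess.run(["mysql", db_name], stdin=dump, check=True)


create_command = create_database_command("mywebsite")
# On a real server the hook would then call, for example:
# restore_database("mywebsite", "/var/www/mywebsite/initial.sql")
```

`CREATE DATABASE IF NOT EXISTS` keeps the first step harmless on a second run, although importing the dump again may not be, which is one reason a run-once hook suits this task.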

Package memcached
The goal of this package is to install and configure memcached. The default configuration is set to 64MB and we need to increase it to 128MB.
1) We use files, folders and permissions to modify the file /etc/memcached.conf and increase the memory to 128MB.
2) We create a post-install hook that restarts the service.
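
The file change in step 1 amounts to rewriting the `-m` line of /etc/memcached.conf. The sketch below shows that edit on an in-memory copy of the file. The helper name `set_memcached_memory` is hypothetical, and the sample text is a shortened stand-in for the real configuration file.

```python
import re


def set_memcached_memory(conf_text: str, megabytes: int) -> str:
    """Return conf_text with the '-m <MB>' memory line set to the given value."""
    new_line = f"-m {megabytes}"
    pattern = re.compile(r"^-m[ \t]+\d+[ \t]*$", re.MULTILINE)
    if pattern.search(conf_text):
        return pattern.sub(new_line, conf_text)
    # No memory line yet: append one at the end of the file.
    return conf_text.rstrip("\n") + "\n" + new_line + "\n"


sample = "# memory cap in megabytes\n-m 64\n-p 11211\n"
updated = set_memcached_memory(sample, 128)
updated_again = set_memcached_memory(updated, 128)
```

Applying the function twice gives the same text as applying it once, so this change can be enforced on every configuration run. The service restart stays in the post-install hook.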

The diagram would look like this:

[Figure: Configuration Management Example]

The complexity of automating the installation of a whole website has been reduced to several actions performed in the Sysadmin IDE and a few simple scripts (run git clone, restore mysql and enable/disable virtual host in Apache).

We will publish a module that reproduces this configuration as proof of concept soon after releasing the open beta.

Do you have any questions or comments? Please write to us at support@manageacloud.com

 

Written by Ruben Rubio Rey on Wednesday September 3, 2014
Tags: architecture, configuration
