Puppet development workflow with Git

12 October 2015

Written by

Nadeem Shabir
Automation Lead

When working with Puppet, you’ll eventually arrive at a problem: how do you develop, deploy and test changes to your Puppet configuration without pushing those changes to production until you know they are ready?

We use Puppet at Talis to help us manage and provision almost all of our infrastructure, from our local development environments all the way through to our live production servers. Our Puppet configuration (modules, manifests, hieradata, etc.) is stored in a Git repository which our Puppet Master serves to the various Puppet Agents that run on nodes across our infrastructure. Our setup used a single Puppet environment delivering the “master” branch of our repository. This presents an immediate challenge: how do you develop changes in a branch that can be served to an agent so you can validate that your changes work before you push to _master_?

There are a number of different ways to approach this problem. A simple but naive approach is to map branches to Puppet environments. In Puppet, environments enable a single Puppet Master to serve multiple isolated configurations. You can create a set of branches (production, testing, development), check those branches out to fixed locations on your Puppet Master (under /etc/puppet/environments/) and then update your puppet.conf to map a set of environments to those branches on disk:

[main]
  server = puppet.example.com
  environment = production
  confdir = /etc/puppet
[agent]
  report = true
  show_diff = true
[production]
  manifest = /etc/puppet/environments/production/manifests/site.pp
  modulepath = /etc/puppet/environments/production/modules
  hieradata = /etc/puppet/environments/production/hieradata
[testing]
  manifest = /etc/puppet/environments/testing/manifests/site.pp
  modulepath = /etc/puppet/environments/testing/modules
  hieradata = /etc/puppet/environments/production/hieradata
[development]
  manifest = /etc/puppet/environments/development/manifests/site.pp
  modulepath = /etc/puppet/environments/development/modules
  hieradata = /etc/puppet/environments/production/hieradata

The problem with this approach is that it creates a set of static environments that impose a single, fixed workflow. It doesn’t really enable multiple developers to work on different features in different branches: they would have to co-ordinate with each other to understand who has merged what into the development branch, and so on.

You could map an environment to every branch you create in the puppet.conf, but each time you did so you would have to bounce the Puppet Master in order to pick up the new environment(s), which could prove cumbersome. You’d also have to remember to delete environments that you no longer need. What we really need is a way to dynamically configure these environments and have them available to clients immediately.

Dynamic Puppet Environments

Dynamic Puppet environments allow us to create Puppet environments on the fly as we push branches to our Git repository. In the puppet.conf we can use $environment to reference the current environment when setting modulepath, manifest and hieradata but, crucially, without having to specify a [featurebranch] environment declaration. This also means that adding new environments does not require restarting the Puppet Master:

[master]
  environment = production
  manifest    = $confdir/environments/$environment/manifests/site.pp
  modulepath  = $confdir/environments/$environment/modules
  hieradata   = $confdir/environments/$environment/hieradata

This tells the Puppet Master to base the manifest, module and hieradata paths on the value of the internal $environment variable, which the agent passes to the master. So now all we need is a way to ensure that our branches are checked out into folders under the environments folder.

Before we do that, it’s important to understand what this will look like on disk. We clone our Puppet repo (on the master branch) into /etc/puppet. This includes an environments folder with a production environment that just symlinks to the modules, manifests and hieradata folders in the root of the repo, because our production environment always reflects the Puppet repo at master. As an aside, we’ve actually renamed our “_production_” Puppet environment to “_master_” so that it’s clear to all developers in the team that $environment is always a Git branch name.
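To make the substitution concrete, the following sketch mimics how the paths in the [master] section expand for a given environment name. It is an illustration only: resolve_env_paths is a hypothetical helper written for this example, not something Puppet provides, and it simply mirrors the three settings used in the configuration above.

```shell
#!/bin/bash
# Illustration only: mimic how the [master] settings expand for one environment.
# resolve_env_paths is a hypothetical helper, not part of Puppet.
resolve_env_paths () {
  local confdir="$1" environment="$2"
  local base="$confdir/environments/$environment"
  echo "manifest=$base/manifests/site.pp"
  echo "modulepath=$base/modules"
  echo "hieradata=$base/hieradata"
}

# an agent asking for "feature_branch" would be served from these paths
resolve_env_paths /etc/puppet feature_branch
```

Any name the agent passes is accepted as long as a matching folder exists under environments/, which is why no per-branch section is needed in puppet.conf.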

Every other branch we create in that repo is now cloned into a separate folder under environments, as illustrated here:

/etc/puppet                               # clone GitHub repo here
  |- puppet.conf
  |- fileserver.conf
  |- hiera.yaml
  |- modules/
  |- manifests/
  |- hieradata/
  |- environments/
     |- production/                       # symlinks up to base checkout
        |- modules -> ../../modules/
        |- manifests -> ../../manifests/
        |- hieradata -> ../../hieradata/
     |- feature_branch/                   # this is another clone of the repo
        |- modules/                       # but switched to a feature branch
        |- manifests/
        |- hieradata/
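If you want to experiment with this layout before touching a real Puppet Master, a sketch like the following reproduces it in a scratch directory. It assumes nothing beyond standard shell tools. ROOT is a temporary stand-in for /etc/puppet, and the feature_branch folders are created with mkdir here, whereas in real use they would come from a separate clone of the repo.

```shell
#!/bin/bash
# Build a throwaway copy of the layout under a temporary directory.
# ROOT stands in for /etc/puppet; nothing outside it is touched.
ROOT="$(mktemp -d)"

# base checkout (the master branch)
mkdir -p "$ROOT/modules" "$ROOT/manifests" "$ROOT/hieradata"

# production environment: relative symlinks up to the base checkout
mkdir -p "$ROOT/environments/production"
for d in modules manifests hieradata; do
  ln -s "../../$d" "$ROOT/environments/production/$d"
done

# a feature branch environment: in real use this is a separate clone
mkdir -p "$ROOT/environments/feature_branch"/{modules,manifests,hieradata}

# list what was created
find "$ROOT/environments" -mindepth 2 -maxdepth 2 | sort
```

The symlinks are relative, so they keep working if the whole tree is moved or cloned to a different location.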

With the Puppet Master arranged like this you can now invoke the Puppet Agent on any machine to apply either the production configuration or configuration from a specific branch, for example:

# puppet.conf on the agent
# this defaults the environment to production
[agent]
  environment = production

Then from the command line:

# apply the production configuration, both these lines are equivalent
# because the agent config has been defaulted to production
$ puppet agent -t
$ puppet agent -t --environment production


# apply the configuration from a specific branch
$ puppet agent -t --environment feature_branch
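Because the environment name is the branch name, it is worth checking that a branch name is also usable as an environment name. Puppet restricts which characters an environment name may contain (recent versions document lowercase letters, digits and underscores; check the documentation for your version), and a branch name containing a slash would also produce a nested directory under environments/. The helpers below are a hypothetical sketch of such a check and are not part of Puppet or of our setup:

```shell
#!/bin/bash
# Hypothetical helpers, not part of Puppet or of the script in this article.

# Succeeds if the name only uses lowercase letters, digits and underscores.
is_valid_env_name () {
  [[ "$1" =~ ^[a-z0-9_]+$ ]]
}

# Maps a branch name to a candidate environment name by lowercasing it
# and replacing every other character with an underscore.
branch_to_env_name () {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -c 'a-z0-9_' '_'
}

for b in feature_branch feature/add-nginx; do
  if is_valid_env_name "$b"; then
    echo "$b: usable as an environment name"
  else
    echo "$b: consider $(branch_to_env_name "$b") instead"
  fi
done
```

The simplest option is a team convention of naming branches with underscores only, which is what the examples in this post do.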

Creating the branches

In order to ensure that the feature branches are cloned into the environments folder on the Puppet Master, we wrote a small Bash script:

#!/bin/bash

# checks if an array contains the specified element
containsElement () {
  local e
  for e in "${@:2}"; do [[ "$e" == "$1" ]] && return 0; done
  return 1
}

REPO="git@github.com:USER/YOUR_PUPPET_REPO"
BRANCH_DIR="/etc/puppet/environments"

# change to the puppet directory, and stop if that fails
# so that nothing below runs in the wrong place
cd /etc/puppet || exit 1
# retrieve and prune all branch refspecs
# this ensures we can automatically remove branches
# from environments/ when they are merged/deleted
git fetch --all --prune
git pull

# change to /etc/puppet/environments
cd "$BRANCH_DIR" || exit 1

echo -e "\nUpdating/Creating environment branches\n"
# get a list of all the branches that have been pushed to github,
# skipping the symbolic "origin/HEAD -> origin/master" entry
b=$(git branch -a | grep "^  remotes" | grep -v -- '->' | sed 's/remotes\/origin\///g' | sed 's/[[:blank:]]//g' | grep -v '^master$')
# convert the list of branch names into an array
BRANCHES=(${b//\\n/})

for BRANCH in "${BRANCHES[@]}"
do
    # try to cd into the branch dir and pull any changes,
    { cd "$BRANCH_DIR/$BRANCH" && git pull origin "$BRANCH" ; } || \
    # if the above fails, it's because the branch has not been cloned in environments/
    # so create a directory corresponding to the branch name, clone the repo into it
    # and switch to that branch
    { mkdir -p "$BRANCH_DIR" && cd "$BRANCH_DIR" && git clone "$REPO" "$BRANCH" && cd "$BRANCH" && git checkout -b "$BRANCH" "origin/$BRANCH" ; }
done

# make sure we are in /etc/puppet/environments before deleting anything
cd "$BRANCH_DIR" || exit 1

echo -e "\nRemoving stale environment branches\n"
# get a list of all of the directories and iterate over them
for d in *; do
  if [[ -d $d ]]; then
    # did the updated list of branch names contain this directory
    containsElement "$d" "${BRANCHES[@]}"
    # if the updated list of branches did NOT contain this directory
    # then we need to delete it from environments/
    # (and ONLY if the name of the directory is not "production" or "master" )
    if [[ "$?" == 1 && "$d" != "production" && "$d" != "master" ]]; then
       echo "  Pruning stale branch $d"
       rm -rf -- "$d"
    fi
  fi
done

echo -e "\nFinished"

The script is commented and should be self-explanatory. It clones branches into folders under /etc/puppet/environments, and also detects when a branch has been deleted and removes it from the file system. We could have implemented this as a Git post-receive hook; however, in our case we wanted to manage updates to the file system on the Puppet Master centrally. The script runs as a cron job every minute and serves our immediate need.
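For comparison, here is a rough sketch of what the post-receive hook alternative could look like. It is not something we run, and it assumes a repository host where you can install server-side hooks. Git passes a post-receive hook one line per updated ref on standard input, in the form "old-sha new-sha refname", and a deleted ref has an all-zero new SHA. The function name plan_environment_changes is hypothetical, and it only prints the action a real hook would take rather than touching the file system.

```shell
#!/bin/bash
# Sketch of a post-receive hook body (hypothetical, prints actions only).
# Git feeds the hook one line per updated ref on stdin:
#   <old-sha> <new-sha> <refname>
plan_environment_changes () {
  local old new ref branch
  while read -r old new ref; do
    # only branch updates are relevant, skip tags and other refs
    [[ "$ref" == refs/heads/* ]] || continue
    branch="${ref#refs/heads/}"
    # master is served from the base checkout, not from environments/
    if [[ "$branch" == "master" ]]; then
      continue
    fi
    # an all-zero new SHA means the branch was deleted
    if [[ "$new" =~ ^0+$ ]]; then
      echo "remove $branch"
    else
      echo "update $branch"
    fi
  done
}

# example input: one pushed branch and one deleted branch
printf '%s\n' \
  "1111111111111111111111111111111111111111 2222222222222222222222222222222222222222 refs/heads/new_feature" \
  "1111111111111111111111111111111111111111 0000000000000000000000000000000000000000 refs/heads/old_feature" \
  | plan_environment_changes
```

A hook reacts immediately to each push, whereas the cron approach can lag by up to a minute but keeps all file system changes in one script on the Puppet Master.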

Final thoughts

With all this in place, developers can now use our normal Git workflow to develop changes to our Puppet configuration in a branch. They can push their changes in a branch to GitHub, have the branch automatically made available as an environment on the Puppet Master, and then configure an agent to test that the changes work correctly using a test environment in Vagrant. Ensuring that developers can test their changes locally was an essential part of this work. Now that branches are made available on the Puppet Master as environments, it was trivial to include Vagrant setup scripts for a simple test environment inside our Puppet repo. These can be used as part of the normal development workflow like this:

# clone the puppet repo

git clone https://github.com/talis/puppet.git
cd puppet

# create your branch, commit changes and push

git checkout -b new_feature
git commit -am "this branch installs a new dependency"
git push origin new_feature

# This push to github will result in the branch
# being created on the master as an environment
# which you can now configure your Agent to request

# the puppet repo now has a vagrant folder
# which you use to create a vm to test your changes
cd vagrant
vagrant up

# when the machine comes up the puppet agent will run
# and apply the changes from your branch. You can also
# ssh into it, stop the service and run puppet manually:

vagrant ssh
> sudo service puppet stop
> puppet agent -t --environment new_feature
