Improving IaC with Spacelift

Andrea Dainese
October 31, 2022

This is the third part of my IaC overview based on a personal experiment: building a cyber range using the IaC paradigm. Here are the first and second parts.

A few weeks ago I met Spacelift and I had the chance to test their product. Spacelift is described as:

Collaborative Infrastructure for modern software teams

In practice, Spacelift offers a web UI and a framework to maintain, test, and organize infrastructures using the IaC (Infrastructure as Code) paradigm.

I decided to adapt my playbooks so they could be managed within Spacelift. Even though I like Spacelift very much, I always plan an exit strategy. In practice, I can still run my playbooks outside Spacelift, which is useful for developing, testing, and debugging. I consider this a “soft” lock-in, and that is very important to me.

This post summarizes how I used Spacelift for my simple scenario. Keep in mind that Spacelift is more powerful than what I show here and can cover much more complex scenarios.

Stacks: run instances from GitHub

From the official doc page:

Stack is one of the core concepts in Spacelift. […] You can think about a stack as a combination of source code, the current state of the managed infrastructure (eg. Terraform state file), and configuration in the form of environment variables and mounted files.

In my scenario, I need two stacks: one for Terraform and another one for Ansible. I have seen three approaches to building and maintaining stacks:

  • fully manual: all stacks are created by humans;
  • fully automated: all stacks are created by Terraform;
  • hybrid: the main stack creates the secondary stacks.

In the fully automated approach, all stacks and relationships between them are created by Terraform. Everything around Spacelift can be managed with the IaC paradigm, so this could be the best way for many DevOps engineers. Sometimes the “builder” stacks are managed within Spacelift too: in this case, the stack which can manipulate Spacelift projects must be administrative.

In my scenario, I used a hybrid approach: the main stack builds the dependent stacks and the relationships between them, but it also creates the AWS infrastructure. In short, my Terraform stack:

  • is created manually from a GitHub repository;
  • is triggered on every GitHub commit;
  • requires a manual confirmation after the plan phase;
  • is responsible for building the Ansible stack and the related policies and environments;
  • is responsible for building the AWS infrastructure;
  • is responsible for triggering the Ansible stack.

Let’s see how to build the Terraform stack:

  • Provider: GitHub
  • Repository: dainok/iac
  • Branch: master
  • Project root: lab-all-in-one-vulnerable-website
  • Backend: Terraform
  • Administrative: true
  • Name: lab-all-in-one-vulnerable-website
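
As a rough sketch, the same settings can be written down as data, which makes them easy to review or compare between stacks. The `StackSettings` class and the `can_manage_spacelift` helper are hypothetical names used only for illustration and are not part of any Spacelift API; the values are the ones listed above.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class StackSettings:
    """Hypothetical container mirroring the fields filled in when creating the stack."""
    provider: str
    repository: str
    branch: str
    project_root: str
    backend: str
    administrative: bool
    name: str


# Values taken from the list above.
TERRAFORM_STACK = StackSettings(
    provider="GitHub",
    repository="dainok/iac",
    branch="master",
    project_root="lab-all-in-one-vulnerable-website",
    backend="Terraform",
    administrative=True,
    name="lab-all-in-one-vulnerable-website",
)


def can_manage_spacelift(stack: StackSettings) -> bool:
    """A stack that creates other stacks, contexts, or policies must be administrative."""
    return stack.administrative


if __name__ == "__main__":
    for field, value in asdict(TERRAFORM_STACK).items():
        print(f"{field}: {value}")
```

The administrative flag matters here because, in the hybrid approach, this stack also creates the Ansible stack and its policies.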

Once it has been created, the stack can be triggered:

Spacelift run view

Environment and context

Spacelift stacks can be configured with environment variables (defined within the stack itself) or with contexts (a sort of shared environment). My approach is to define contexts containing data that can be shared between stacks:

Spacelift context

A context can contain both environment variables and files. In my scenario I use:

  • a context defining the Spacelift API token;
  • a context defining the AWS API token;
  • a context built by the Terraform stack, sharing some data between the Terraform and the Ansible stacks.

To be more specific, the last context is used to share the SSH private key with the Ansible stack and to set some Ansible environment variables.
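
The idea of layering shared contexts under a stack's own variables can be sketched as a simple dictionary merge. This is only a model: the `effective_environment` function, the file path, and the values are hypothetical, and the precedence used here (stack-level variables win over context values) is an assumption of the sketch rather than a statement about Spacelift's exact rules.

```python
from typing import Dict, List

Env = Dict[str, str]


def effective_environment(contexts: List[Env], stack_env: Env) -> Env:
    """Merge attached contexts and stack-level variables into one environment.

    Assumption of this sketch: contexts later in the list override earlier
    ones, and stack-level variables override anything coming from a context.
    """
    merged: Env = {}
    for context in contexts:
        merged.update(context)
    merged.update(stack_env)
    return merged


# Hypothetical context built by the Terraform stack for the Ansible stack.
shared_context: Env = {
    "ANSIBLE_PRIVATE_KEY_FILE": "/path/to/shared/id_rsa",  # placeholder path
    "ANSIBLE_HOST_KEY_CHECKING": "False",
}

# Hypothetical variable defined directly on the Ansible stack.
ansible_stack_env: Env = {"ANSIBLE_FORCE_COLOR": "1"}

ansible_env = effective_environment([shared_context], ansible_stack_env)

if __name__ == "__main__":
    for key in sorted(ansible_env):
        print(key, "=", ansible_env[key])
```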

Policies: binding Terraform and Ansible together

Spacelift uses an open-source project called Open Policy Agent and its rule language, Rego, to execute user-defined pieces of code we call Policies at various decision points.

Policies can be used to modify the stack behavior. For example, additional security checks can be implemented, or approval of the plan by a specific user can be required. In my case I used:

  • a push policy to avoid automatic triggering of the Ansible stack after a commit;
  • a trigger policy to run the Ansible stack after the Terraform one.
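
The combined effect of the two policies can be modeled in a few lines. Real Spacelift policies are written in Rego, not Python, so the following is only a sketch of the decision logic; the function names, the `ANSIBLE_STACK` identifier, and the state and decision strings are assumptions made for illustration.

```python
from typing import List

TERRAFORM_STACK = "lab-all-in-one-vulnerable-website"
ANSIBLE_STACK = "ansible-stack"  # hypothetical name for the stack built by Terraform


def push_decision(stack: str) -> str:
    """Model of the push policy: commits never start a run on the Ansible stack."""
    if stack == ANSIBLE_STACK:
        return "ignore"
    return "track"


def stacks_to_trigger(stack: str, run_state: str) -> List[str]:
    """Model of the trigger policy: a finished Terraform run starts the Ansible stack."""
    if stack == TERRAFORM_STACK and run_state == "FINISHED":
        return [ANSIBLE_STACK]
    return []


if __name__ == "__main__":
    print(push_decision(ANSIBLE_STACK))
    print(stacks_to_trigger(TERRAFORM_STACK, "FINISHED"))
```

With both rules in place, the Ansible stack runs only as a consequence of a successful Terraform run, never directly from a commit.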

In the end, I have two different stacks, one of which is created automatically:

Spacelift stacks

The Ansible stack is automatically triggered:

Spacelift policy

Manual actions

My scenario requires that I destroy my lab after each Twitch session. In the Terraform stack, I can run single tasks:

Spacelift tasks

Everything good?

The Ansible implementation is still in beta; despite that, I was able to reach my goals. I found some issues, but the Spacelift team helped me fix all of them or find good workarounds. No software is perfect, but the most important thing to me is how the vendor supports me when I encounter bugs or issues.

Conclusions

I found Spacelift an interesting framework to manage IaC within enterprises. It offers a lot of security checks and integrations, and it supports private workers (no need for a public bastion host). Engineers can collaborate by tracking and commenting on each run/step/… I would definitely give it a deeper look.

But remember, IaC is 80% planning and standardization :)

References

  • My scripts are not ready to be published, but if you need details, drop me an email.
  • Spacelift docs
