Continuous Integration of Cog… with Cog

Christopher Maier
Published in Operable News · Jan 11, 2017 · 10 min read


We’ve been building out our continuous integration (CI) pipelines lately and wanted to share a neat way we’re using Cog as part of the pipeline for Cog Enterprise! We’ll touch on our CloudFormation bundle, triggers, and how Cog’s composable commands allowed us to customize everything.

Continuous Integration for Cog Enterprise

Cog Enterprise is our on-premises product, bundling Cog, Relay, and our Console web UI together. Under the hood, it’s a Replicated application, which makes assembling and managing a project from containers a breeze. You can define how the various containers that make up your application fit
together using a single YAML file, and use that to drive the installation and configuration of the application.

One of the things we wanted our CI pipeline to do was to automatically set up a fresh Cog Enterprise environment whenever we made changes to this YAML file. To keep things manageable, we wanted each new pull request to set up a new server for the developer who submitted it, running their proposed changes. This ensures that developers working on separate changes can evaluate them in isolated environments,
without fear that someone else’s changes will disrupt them. After a few minutes for everything to get set up, we wanted the developer to be able to start interacting with the new Cog Enterprise instance directly from our Slack client, with absolutely no additional intervention. The developer could do some basic manual tests to verify that everything is working properly
(of course, Cog is tested quite thoroughly elsewhere), and once they were satisfied with everything, they could unblock the remainder of the CI pipeline, which would tear down the whole infrastructure we just created. Fully automated smoke tests will come later.

After a bit of thinking, we were able to implement this testing flow using Cog! Using a chat bot to implement a key component of a continuous integration pipeline might not be the first solution to leap to mind, but Cog isn’t really a “chat bot”; it’s a shared command line that happens to connect to chat. When viewed this way, integrating Cog into a CI pipeline actually has a lot of nice benefits, which we’ll explore here.

Setting Up The Server

Replicated helpfully provides part of the solution already; in a nutshell, as long as a few configuration files are on disk before installing the Replicated agent, you can have a fully automated installation and configuration of your application.

Since we’re on AWS, using S3 to store these configuration files was an easy choice. And because we wanted each developer’s pull requests to set up dedicated infrastructure for that developer, a subdirectory per developer in a common S3 bucket seemed like a low-friction way to organize things.

With the configuration of the system out of the way, we needed a way to create a test server, put those configuration files on disk, and install the Replicated agent, which would then bring up our Cog Enterprise application. We also needed to ensure that this server was configured in the appropriate VPC, subnet, and security group, and that it actually had access to the S3 bucket holding our configuration files. Fortunately, this is exactly the use case that AWS CloudFormation was designed for. Even more fortunately, we recently created the cfn command bundle for Cog to manage CloudFormation stacks!
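To make the bootstrapping concrete, here’s a sketch of the kind of first-boot script such a stack could run. To be clear, the bucket name, file names, and paths here are illustrative assumptions, not our actual template (though the CredentialSubDirectory parameter will show up again below):

#!/bin/bash
# Sketch only: the bucket name, file names, and paths are assumptions.
set -euo pipefail

# In the real template, this value comes from the CredentialSubDirectory parameter.
SUBDIR="chris"

# Fetch this developer's Replicated configuration from their S3 subdirectory...
aws s3 cp "s3://our-config-bucket/${SUBDIR}/replicated.conf" /etc/replicated.conf
aws s3 cp "s3://our-config-bucket/${SUBDIR}/license.rli" /tmp/license.rli

# ...then run Replicated's unattended install; with the configuration files
# already on disk, the agent installs and starts the application without prompting.
curl -sSL https://get.replicated.com/docker | sudo bash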

We’d already been using our cfn bundle internally to help set up Cog Enterprise servers for our developers. If we could somehow take advantage of the same commands we’d been using in chat in our CI pipeline, that would be a huge win. “Drinking your own champagne” is always a good idea, and if we could use the same automation we’d already set up in chat, we wouldn’t need to maintain two different ways of doing the same thing. We’d have a built-in way to notify us when our server was created. We’d also be exploring new use cases, and flushing out any bugs that might be hiding in our tooling. That’s how we started using Cog in our CI pipeline.

Composability For The Win

Immediately we ran into a stumbling block using the cfn bundle. We wanted to create a “definition” (a cfn bundle concept that packages template, parameters, and tags together) describing our testing stack, but also wanted to be able to override parameters at stack-creation time in order to determine which set of credential files to put on disk. We didn’t want to specify low-level parameters like VPC and subnet IDs directly in the pipeline. We also wanted to take advantage of updates to the templates and definitions transparently, instead of having to keep manually updating our CI pipeline whenever the template or default parameter values need to change. In effect, we just wanted to be able to say “Make a new testing stack for Chris”, “Make a new testing stack for Mark”, and so on.

Given how the cfn bundle’s “definitions” are designed, however, it wasn’t immediately clear how to implement this, because definitions simply don’t allow ad hoc parameter overrides. That’s actually a good thing: overrides would make it difficult to know exactly where a given parameter value came from. If you later updated a stack but forgot to re-specify a parameter you had originally set by hand, the value would silently revert to the default, and you could end up with a broken stack. We could have worked around this by setting up a distinct definition for each developer, but that would quickly become a maintenance nightmare.

Fortunately, the solution ended up being simple, and a testament to the power of using well-designed and composable commands to achieve things the original command authors might not have anticipated. The key lies in the fact that you can pipe the output of cfn:definition-show (which shows you the detailed settings in a given definition) into the execution environment of cfn:stack-create.

Here’s what our initial pipeline looks like (formatted for readability):

cfn:definition-show cog-functional-test |
cfn:stack-create --capabilities iam
    --param $params
    --param "CredentialSubDirectory=chris"
    --tag $tags
    --tag "Owner=chris"
    "chris-${name}" $template_url
> chat://#dev

I’ve created a definition called cog-functional-test; running cfn:definition-show cog-functional-test | raw shows the relevant details:

{
  ...
  "template_url": "https://s3.amazonaws.com/.../aws/cfn/definition/cog-functional-test/1482947525/template.yaml",
  "tags": [
    "Env=ci"
  ],
  "params": [
    "VpcId=...",
    "SubnetId=...",
    "SshKey=cog-enterprise-functional-test"
  ],
  "name": "cog-functional-test",
  ...
}

In this pipeline, I retrieve my definition, and then extract the parameters, tags, template URL, and even the stack name for use in a cfn:stack-create call. I’m able to augment this by specifying that it needs to look in the chris subdirectory in our S3 bucket to find my credentials (the bucket name itself is part of the template, and won’t change). I can also tag the stack as belonging to me, and even use Cog’s string interpolation to add my name as a prefix to the stack’s name, making it very easy to figure out who this infrastructure belongs to and what it’s used for.
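To spell out what that composition buys us: with the definition above piped in, the cfn:stack-create invocation is roughly equivalent to writing all of this out by hand (values taken from the definition JSON above; the elided bits stay elided):

cfn:stack-create --capabilities iam
    --param "VpcId=..."
    --param "SubnetId=..."
    --param "SshKey=cog-enterprise-functional-test"
    --param "CredentialSubDirectory=chris"
    --tag "Env=ci"
    --tag "Owner=chris"
    "chris-cog-functional-test"
    "https://s3.amazonaws.com/.../aws/cfn/definition/cog-functional-test/1482947525/template.yaml"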

By doing this, I reap all the benefits of creating a definition, but retain the flexibility I need to customize my stack when it’s created. If I update the template or change the default parameters and tags in the future, this pipeline will pick up those changes for all stacks it creates, without having to edit the pipeline itself. I also don’t need the hassle of creating a definition for each developer to get the customization I need.

(It’s important to emphasize that this approach is best for the short-lived, one-off testing stacks we’re creating for our CI pipeline. We won’t run into issues with stack updates, because these stacks are never updated, only created and destroyed. If you’re creating long-lived infrastructure, you’ll most likely want to stick to the standard definition-based workflow.)

Automating with Triggers

This pipeline works great as is, but we still needed a way to run it from our CI system. And while we’ve effectively parameterized the stack creation for each developer, we’ve just moved the hard-coding up one level: now we need to parameterize the pipeline itself.

This scenario is tailor-made for Cog’s triggers, which allow you to run a pre-defined pipeline by sending an HTTP request; the contents of the request body are even injected into the pipeline, providing a way to parameterize everything.

However, we soon hit another wrinkle. While triggers will inject the request body into the pipeline at the beginning, we’d ideally like a way to set that information as a “pipeline-global variable” of sorts, since we’ll need that information further down the pipeline (in the cfn:stack-create stage), rather than at the beginning. Though Operable is working on ways to implement real pipeline-global variables in Cog, we can fake it today using the built-in operable:tee and operable:cat commands.

(As the names suggest, these are chat analogues of the Unix tee and cat command line executables.)

By running tee as the first command, we can capture the incoming trigger information to a named location, while also allowing it to flow onward through the pipeline. Later, we can use cat to retrieve this information and merge it into our retrieved definition information. Now we’ve got access to both our trigger data as well as our definition data all in one place. This flows into our cfn:stack-create command where we can extract the values for our parameters, tags, and names from our trigger data.
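To make that concrete, the merged data arriving at the cfn:stack-create stage would look roughly like this (illustrative JSON assembled from the examples in this post; the real trigger payload may carry additional keys):

{
  "body": {
    "ci_params": {
      "credential_subdirectory": "chris",
      "owner": "chris",
      "stack_prefix": "chris"
    }
  },
  "template_url": "https://s3.amazonaws.com/.../template.yaml",
  "tags": ["Env=ci"],
  "params": ["VpcId=...", "SubnetId=...", "SshKey=cog-enterprise-functional-test"],
  "name": "cog-functional-test"
}

That’s how the pipeline below can reference both $params (from the definition) and ${body.ci_params.owner} (from the trigger) in a single stage.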

Here’s what our pipeline looks like now (again, formatted for readability):

operable:tee cog-enterprise-functional-test-create |
cfn:definition-show cog-functional-test |
operable:cat --merge cog-enterprise-functional-test-create |
cfn:stack-create --capabilities iam
    --param $params
    --param "CredentialSubDirectory=${body.ci_params.credential_subdirectory}"
    --tag $tags
    --tag "Owner=${body.ci_params.owner}"
    "${body.ci_params.stack_prefix}-${name}"
    $template_url
*> chat://#dev here

Here’s what a curl command that triggers this pipeline might look like:

curl "$MY_TRIGGER_URL" \
    --header "Content-Type: application/json" \
    --header "Accept: application/json" \
    --data "{\"ci_params\": {\"credential_subdirectory\": \"chris\", \
                             \"owner\": \"chris\", \
                             \"stack_prefix\": \"chris\"}}" \
    --verbose \
    --include \
    --fail

(Incidentally, using this tee trick comes in handy for debugging triggers, as you can run the corresponding cat command after you run your trigger in order to see exactly what the trigger is sending.)
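For example, after firing the create trigger, running the following in chat shows exactly what was sent (piping through raw again to see the unformatted data):

operable:cat cog-enterprise-functional-test-create | raw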

Note that we’re redirecting the trigger output both to our chat room #dev and to here. The #dev destination sends output to our Slack channel, allowing everyone to see what’s happening, while here ensures that detailed output from the trigger invocation goes back to our CI system, which facilitates debugging if something goes wrong.

Tearing It All Down

To tear everything down, we can set up another trigger. It’s very similar to the stack creation trigger, but deletes the stack instead.

operable:tee cog-enterprise-functional-test-delete |
cfn:definition-show cog-functional-test |
operable:cat --merge cog-enterprise-functional-test-delete |
cfn:stack-delete "${body.ci_params.stack_prefix}-${name}"
*> chat://#dev here
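The CI step that fires this trigger is a curl call just like the create example; something along these lines, where $MY_DELETE_TRIGGER_URL is a placeholder for this trigger’s URL (only stack_prefix is actually referenced by the delete pipeline):

curl "$MY_DELETE_TRIGGER_URL" \
    --header "Content-Type: application/json" \
    --header "Accept: application/json" \
    --data "{\"ci_params\": {\"stack_prefix\": \"chris\"}}" \
    --fail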

Pulling It All Together

We’ve got all the building blocks; now we just need to put them together. We’re happy users of Buildkite, which provides a very flexible CI platform. Individual steps in a pipeline can be defined as scripts; this is one that we use to set up our testing stacks. (We can use one of the many environment variables Buildkite provides to determine who we need to build a stack for; if we’re building on the master branch, however, we’ll use a common set of credentials.)

#!/bin/bash

set -xeuo pipefail

if [ "master" == "${BUILDKITE_BRANCH}" ]
then
    SUBDIRECTORY="master"
    STACK_PREFIX="master"
    OWNER="engineering"
else
    OWNER=${BUILDKITE_BUILD_CREATOR_EMAIL/%@operable.io/}
    SUBDIRECTORY=${BUILDKITE_BUILD_CREATOR_EMAIL}
    STACK_PREFIX=${OWNER}
fi

# cog_enterprise_functional_test_create trigger
echo "--- Creating new stack via :cogops: Trigger!"
curl <OUR_TRIGGER_URL> \
    --header "Content-Type: application/json" \
    --header "Accept: application/json" \
    --data "{\"ci_params\": {\"credential_subdirectory\": \"${SUBDIRECTORY}\", \"stack_prefix\": \"${STACK_PREFIX}\", \"owner\": \"${OWNER}\"}}" \
    --verbose \
    --include \
    --fail

The Buildkite engineers really love emoji, and they added a custom one for us, so we have to use it!

This runs our trigger, and thanks to Buildkite’s Slack integration and the output from our trigger, we can follow the whole process in chat!

Once I’ve interacted with my Cog Enterprise bot (“icebear”) and satisfied myself that everything looks good, I can go back to Buildkite to unblock the rest of the pipeline. That runs our delete trigger, and everybody can see it all happen in chat.

And yes, we’re working on a Buildkite command bundle for Cog, so soon we’ll be able to unblock this pipeline from chat, too.

Conclusion

It was pretty fun to figure out how to take these building blocks we’ve created with Cog and piece them together to achieve the goal of setting up a fully automated test environment for our Cog Enterprise product. Each time we ran into a snag, a bit of thinking showed how to take what we already had and make it work; there was no need to write a custom command to achieve this.

Let’s recap what we did. We took our existing cfn bundle, which we already use to manage our development infrastructure stacks, and used it to create and destroy self-contained testing infrastructure for our Cog Enterprise product. The templates and parameters are managed in our internal infrastructure git repository (as are all artifacts managed via the cfn bundle), providing repeatability and traceability. Thanks to Cog’s powerful pipeline model, we were able to assemble pipeline triggers that dynamically create stacks from a definition, something our cfn bundle commands weren’t even designed to do! We then integrated these triggers into our Buildkite CI pipeline, relaying important information back into our chat channel.

If you’ve also done some unconventional things with Cog, we’d love to hear about it. Stop by our Slack channel and let us know!
