CI/CD

It Starts With a Code Change

If you make a lot of code changes, you might already be familiar with CI/CD. It bridges the gap between "code changed" and "problem solved".

If you have a lot of work left after making a change, but before starting on the next one, then you should consider automating that work into a CI/CD pipeline. This kind of work typically breaks nicely into pieces that can fail, get fixed, and be rerun independently without restarting the whole task.

Conducto provides the scaffolding for your CI/CD automation. You build the independent pieces that do the heavy lifting, and Conducto gives you control over whether and when they run.

Continuous Integration (CI)

Continuous Integration runs a sequence of checks on every candidate code change. The more you trust your CI, the less it takes to trust the change and move on to the next one.

Good CI will automatically check:

  • Does the code compile?
  • Did I break any tests?
  • Does my code conform to this project's style guidelines?

This can be helpful if you're working on a team. For instance, it can stop you from committing a change that breaks the build. And if the CI says that the latest code is broken, you might want to avoid updating your local copy until it's fixed.

Continuous Delivery (CD)

Some problems you must solve, but others you can avoid. Continuous Delivery lets you avoid the problems that come up when you're working on a version that is much newer than the version that your users see.

CD is all about getting fresh code to the user as often as possible without compromising on quality. This prevents the version gap from getting too big. If releasing your software is stressful, Continuous Delivery might be worth trying. For projects where it's a good fit, it can prevent a lot of headaches.

Triggering CI/CD Events

CI/CD is all about sentences like this one:

If I make a code change, then I want ____ to happen.

In this section, we're concerned with the if part. Here you'll learn how to make sure that your pipeline definition script gets called at the right times.

The next section has to do with the pipeline definition itself. That's where you'll fill in the blank and make the pipeline do the necessary work.

Enable The GitHub Integration

For Conducto to react to your repository's events, you'll need to give it access. We provide integrations for this. For now, we'll focus on GitHub.

Configure Your Repo to Listen for Events

Once installed, Conducto listens to events from all your repos. When it gets one, it will check the relevant branch to see if a file named .conducto.cfg exists at the repo root. If it does, then its contents will determine whether there is an action for that event.

Conducto scans the default branch for .conducto.cfg

For instance, to launch a pipeline defined in pipeline.py each time new code is pushed, .conducto.cfg might look like this.

[push]
command = python pipeline.py --branch {branch}

The value of variables like {branch} will depend on which integration is accepting the event. For push type events from GitHub, it will be the branch that was recently pushed.

To simulate a push event to the "master" branch, run:

conducto .conducto.cfg push --branch=master

Whether source control triggers it or the conducto command does, python pipeline.py will run in the same context. The rest is up to the pipeline that it launches.

Choose Where It Runs

You can let Conducto run your CI/CD pipelines in the cloud, or you can dedicate a local machine to the task. The results should be the same whichever you choose, because Conducto pipelines use containers.

To use a local machine, start an agent on it and keep it connected to the internet. When Git events come in, Conducto will alert that agent to run your CI/CD pipeline. Local mode is free, but your machine has limited resources and has to stay online.

Decide how to handle repo events

If you run your CI/CD pipelines in the Conducto cloud, they are not tied to any individual machine and can run hundreds of tasks simultaneously. They run as a dedicated user in your org that is specific to this GitHub installation. There are no subscription fees; you're just charged for what you use.

Defining Pipelines for CI/CD

Now that you know how to trigger a pipeline when your code changes, it's time to make one that does something useful. Your needs will depend on your project and your team, but here are some ideas:

  • compile the code
  • run unit tests
  • deploy a test environment
  • run integration tests
  • clean up the test environment
  • update production with the new code

This section will point out some things to consider while designing pipelines to complete common CI/CD tasks. If you don't have a pipeline defined yet, you can start with the Hello World one.
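
To make this concrete, here is a rough sketch of a definition that covers a few of those steps. The node names, commands, and scripts are placeholders, and the image is assumed to already contain your code (the next subsection covers copying your repo into an image).

import conducto as co

def cicd() -> co.Serial:
    # A Serial node runs its children in order and stops at the first failure
    root = co.Serial(image="python:3.8-alpine")
    root["Build"] = co.Exec("pip install -r requirements.txt")
    root["Unit Tests"] = co.Exec("python -m pytest tests/unit")
    root["Deploy Test Env"] = co.Exec("./deploy_test_env.sh")         # placeholder script
    root["Integration Tests"] = co.Exec("python -m pytest tests/integration")
    root["Clean Up"] = co.Exec("./teardown_test_env.sh")              # placeholder script
    return root

if __name__ == "__main__":
    co.main(default=cicd)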

Using Code From a Repo

Outside of CI/CD, launching a pipeline usually means having the pipeline definition in your local filesystem. This way the definition can reference local files. This is especially useful if you want the pipeline to include uncommitted changes.

In CI/CD, you should only use the contents of the Git repo and not your local uncommitted changes. That way the pipeline will run the same for you as for all your teammates.

To do this, use the copy_repo kwarg when you initialize an Image object. Exec nodes using that image will have access to the updated code.

img = co.Image(image="python:alpine", copy_repo=True)
co.Exec("pip install -r requirements.txt", image=img)

See Images for more about how to control the runtime context for your CI/CD commands.

Environment Considerations

Not all CI/CD pipelines need external infrastructure. The containers created by your pipeline might be enough to get the job done.

But sometimes you do need external resources. If your production environment isn't a pile of containers, then your test environment shouldn't be either. This means that your pipeline might need to set up your test environment, run your tests, and tear it down (maybe right away, or maybe as part of a later run).

If done correctly, this is a good way to make sure that you get a consistent environment for your tests. But if you decide to manage external resources from within your pipeline, there are a few things to consider.

  1. How can we prevent concurrent pipelines from interfering with each other?
  2. Which events set up external infrastructure, and which ones tear it down?
  3. What if something interrupts the pipeline? Will a rerun recover from intermediate states?

The next three subsections explore these questions. You can skip them if your CI/CD pipeline doesn't create external resources.

Environment Isolation

Here at Conducto, we use Conducto for the CI/CD pipeline that builds and tests Conducto. To keep test environments separate from each other, we append an identifying string to the names of the resources that the pipeline creates.

When we launch a CI/CD pipeline by hand, it uses the developer's initials. When it creates or destroys resources, the initials restrict the impact to the ones created for that developer. This keeps our pipelines from interfering in each other's environments. Our PRs use the same pipeline definition to test changes before we merge them, but those pipelines use the PR number to avoid collisions in a similar way.

Your situation might call for something different. In any event, it's worth thinking ahead about how you'll determine which pipelines act on which resources.
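
As a sketch of the idea (the environment variable and resource names here are made up), the pipeline definition can derive a suffix once and bake it into everything it creates or destroys:

import getpass
import os

def env_suffix() -> str:
    # Hypothetical: use the PR number when a PR triggered the pipeline,
    # otherwise fall back to the developer's username
    pr_number = os.environ.get("PR_NUMBER")          # made-up variable name
    return f"pr{pr_number}" if pr_number else getpass.getuser()

suffix = env_suffix()
test_db_name = f"myapp-test-db-{suffix}"             # e.g. myapp-test-db-pr42
test_stack_name = f"myapp-test-stack-{suffix}"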

Environment Clean Up

If you need something other than the pipeline itself as a test environment, you're going to want a cleanup step--otherwise you may end up paying for resources that you aren't using. The simplest approach is to have every pipeline execution (except for the one that deploys to production) clean up after itself. That way you never have any orphaned resources lying around.

A perfect CI/CD pipeline would always tell you everything you need to know about a problem. But until your pipeline is perfect, you might still want to log in like a user and poke around. One strategy for making this easy is to leave your PR test environments deployed until someone closes the PR.

To make this possible, you can configure Conducto to handle PR-closed events separately from PR-created/updated events.
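
For example (a sketch; the event name and how it reaches the script depend on your setup), the same definition can build a cleanup-only pipeline when it is told that a PR was closed:

import conducto as co

def pipeline(event: str = "push") -> co.Serial:
    # How `event` gets passed in depends on your .conducto.cfg and launch command
    root = co.Serial(image="python:alpine")
    if event == "pr_closed":                                     # assumed event name
        root["Clean Up"] = co.Exec("./teardown_test_env.sh")     # placeholder script
    else:
        root["Build"] = co.Exec("pip install -r requirements.txt")
        root["Test"] = co.Exec("python -m pytest")
        root["Deploy Test Env"] = co.Exec("./deploy_test_env.sh")  # placeholder script
    return root

if __name__ == "__main__":
    co.main(default=pipeline)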

Managing External State

There are other issues to consider when your pipelines manage external resources.

  • What will happen if somebody kills a pipeline halfway through infrastructure setup?
  • Will rerunning a step do the right thing?
  • Do cleanup steps always know the full extent of what they need to clean up?

If these questions are easy to handle by hand, then there's nothing wrong with a little bit of code like this:

if not infra_is_up():                # hypothetical helper functions
    set_up_everything()
elif infra_is_partially_up():
    set_up_missing_pieces()
else:
    pass                             # already up, nothing to do

But if you spend a lot of time thinking about how to handle external state, then you should be aware that there are tools for that kind of thing. They go under the heading "Infrastructure as Code". Once you teach them to set up and tear down environments, they can handle the transition from the current state to the one you need, and your pipeline need only direct the timing of those changes.
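
For instance (a sketch assuming a Terraform-managed test environment; the image name and directory are placeholders), the pipeline can leave state reconciliation to the tool and just decide when each step runs:

import conducto as co

# "my-terraform-image" is a placeholder for an image with terraform installed
img = co.Image(image="my-terraform-image", copy_repo=True)

# Terraform compares desired state to actual state, so rerunning these nodes
# after a partial failure converges instead of duplicating resources
env_up = co.Exec("terraform -chdir=test-env init && "
                 "terraform -chdir=test-env apply -auto-approve", image=img)
env_down = co.Exec("terraform -chdir=test-env init && "
                   "terraform -chdir=test-env destroy -auto-approve", image=img)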

The Old Workflow, Now Automated

Once you have the eventing set up and your environmental needs are met, you're almost ready for this investment to start paying off. The last step is to take whatever it is you do by hand for each code change, and add it to your pipeline.

If you have multiple commands for building, testing, or meeting some other need, it's a good idea to organize those commands under a parent node with a helpful name. That way you can say things like "the test step failed" and other users of the pipeline will know exactly where to look.
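
For instance (the names and commands here are illustrative), grouping the test commands under one parent makes "the test step failed" unambiguous:

import conducto as co

root = co.Serial(image="python:alpine")
test = co.Parallel()                     # "Test" is the step people will refer to
test["lint"] = co.Exec("python -m flake8 .")
test["unit"] = co.Exec("python -m pytest tests/unit")
test["integration"] = co.Exec("python -m pytest tests/integration")
root["Test"] = test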

Once you've got a CI/CD pipeline that you trust, you'll probably find that it saves you a lot of time. And when new tasks come up, you'll be all set up with a framework for automating them.

Conclusion

It's your project, and you know what needs to be done, so we've built Conducto to stay out of your way as much as possible.

In this article we described what CI/CD pipelines are and why you might want one. We started by showing you how to wire repository events to pipeline actions:

  • Enable the GitHub integration
  • Add a .conducto.cfg that calls a pipeline.
  • Decide where the CI/CD pipelines should run. Enable cloud mode if necessary.

After completing the steps above, you should have pipelines launching when you make code changes. The later sections in this article point out a few things to consider while writing your pipeline definition. With these things out of the way, the stage is set for you to move your workflow into a Conducto pipeline.

Optimize the build, add better tests, use more realistic test environments, or start delivering code to your users faster than ever: there are many ways to leverage a CI/CD pipeline, and we hope that Conducto makes it easy for you to benefit from them.

If you run into trouble, or just want to show off your awesome new pipeline, drop us a line. Otherwise, happy pipelining.
