Images

Overview

Before Conducto runs a pipeline, it builds an image for each Exec node. This page will show you how to customize those images so that your commands have the software that they need.

Stages of Image Customization

These are the stages that make an image ready for use by a node in a Conducto pipeline:

Not every stage is executed for every image. Instead, Conducto picks the stages based on the arguments that initialize the Image object for a given Node.

The rest of this article is about how to use those arguments to get what you want. Click an argument in the diagram (under "Triggered By") to jump to its section or keep reading for the full tour.

Using An Unmodified Image

By default, Exec nodes will create containers based on Debian, which you can use as is.

import conducto as co

def which_distro() -> co.Serial:
    pipeline = co.Serial()
    pipeline["Node Name"] = co.Exec("cat /etc/*-release | egrep '^NAME'")
    #     NAME="Debian GNU/Linux"
    return pipeline

if __name__ == "__main__":
    co.main(default=which_distro)

If you want a different image, find one in a registry (like Docker Hub) and provide its name as a string via the image kwarg for that node.

cmd = "cat /etc/*-release | egrep '^NAME'"
pipeline["Node Name"] = co.Exec(cmd, image='alpine:latest')
#     NAME="Alpine Linux"

In place of a string, you can also use an Image object to express this.

img = co.Image(image='alpine:latest')
pipeline["Node Name"] = co.Exec(cmd, image=img)

To use the features described in the rest of this article, an Image object is necessary.

Runtime Changes

If you configure a node to use an unmodified image, any remaining configuration happens when the node creates its container. So if you want to make changes after creating the pipeline, you'll need to do that via the Conducto web interface.

Editing the command in the Conducto UI

Adding Local Files

Conducto can copy directly from your filesystem into the image. This is a good way to customize your images if you're writing code that is central to the pipeline's purpose. Use copy_dir to add your files to the image's working directory.

python_img = co.Image(image="python:3.8-alpine", copy_dir=".")
def hello() -> co.Serial:
    pipeline = co.Serial(image=python_img)
    pipeline["Say Hi"] = co.Exec("python hello.py")
    return pipeline

copy_dir expects a path relative to the directory containing the pipeline definition file. In this case, hello.py is in that directory, so it will be available to the command.
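
For concreteness, here is a minimal `hello.py` that would satisfy this command. Its contents are an assumption for illustration; any Python script in the copied directory works the same way.

```python
# hello.py -- hypothetical script, copied into the image by copy_dir

def main():
    # The Exec node's command runs this file with `python hello.py`.
    return "Hello World"

if __name__ == "__main__":
    print(main())
```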

As an added benefit, specifying copy_dir enables Conducto to mount your files in a live debug session.

Image Changes

Conducto doesn't automatically rebuild the image when you reset a node's state. If you've made changes to the copied files, you'll have to use the rebuild image button to see their effects.

The rebuild image button

Adding Files via Git

Conducto can add files from a git repository to your images. This is done with either copy_url or copy_repo.

We try to be smart about which branch to choose, but you can also specify it directly with copy_branch.

Explicitly via copy_url

The example below will pull the "main" branch of the Conducto Examples repo into an image so that the Exec node can reference its contents.

img = co.Image(
    image="python:3.8-alpine",
    copy_url="https://github.com/conducto/examples",
    copy_branch="main",
)
def hello() -> co.Serial:
    pipeline = co.Serial(image=img)
    pipeline["Say Hi"] = co.Exec("python hello_py_js/hello.py")
    return pipeline

If you omit copy_branch, Conducto will choose the default branch for the indicated repo.

Implicitly via copy_repo

Setting copy_repo to True tells Conducto to guess which repo and branch to use. The chosen repo will always be the one that contains the pipeline definition.

This is a good choice for CI/CD pipelines. If you've configured an integration to launch pipelines in response to pull requests, then copy_repo will clone the compare branch for that pull request.

In any other case (like handling a push event or launching a pipeline from disk), it will be the repo's default branch.

Adding Packages

Conducto can install extra packages into your images. reqs_py installs Python packages, and reqs_packages uses whichever package manager goes with your image's linux flavor.

Python Packages

Sometimes it's nice to have everything in one file, which is possible with Native Functions. To make this work, Conducto calls into your pipeline definition code a second time from within the container.

reqs_py lets you indicate which Python packages to install.

import conducto as co
from cowpy.cow import Moose

img = co.Image(image="python:3.8-alpine", reqs_py=["conducto", "cowpy"], copy_dir=".")
def say_it():
    print(Moose().milk("Hello World"))
def hello() -> co.Serial:
    pipeline = co.Serial(image=img)
    pipeline["Say Hi"] = co.Exec(say_it)
    return pipeline

Linux Packages

If you pick an image with a package manager (apt, apk, etc.) you can provide a list of packages to reqs_packages and Conducto will install them for you. For instance, the snippet below installs jq and uses it to parse some JSON.

pipeline["Say Hi"] = co.Exec(
    """
    echo '{"message": "Hello World"}' | jq '.message'
    """,
    image=co.Image(reqs_packages=["jq"])
)

Docker

Since we integrate heavily with Docker, there's also reqs_docker, which will install the docker client on your image. When you use it, be sure to also set the requires_docker node parameter.

with co.Serial() as root:
    root["Say Hi"] = co.Exec(
        "docker run hello-world",
        image=co.Image(reqs_docker=True),  # installs docker client
        requires_docker=True,              # connects client to daemon
    )

In local mode, this will make your docker socket available to that node's containers, so be sure you understand the security implications before launching pipelines from definitions that do this.

Go Custom with a Dockerfile

The traditional way to create docker images is by writing a Dockerfile. If you already have one lying around, or you want to go beyond what the Image class supports, you can have Conducto use that instead.

Here's one:

FROM debian:bullseye-20200720-slim
RUN apt-get update && apt-get install -y git cowsay
RUN git clone http://github.com/possatti/pokemonsay
RUN cd pokemonsay && ./install.sh
ENV PATH "/usr/games:/root/bin:${PATH}"

And the pipeline that uses it:

import conducto as co

img = co.Image(dockerfile="Dockerfile")
# img = co.Image(dockerfile="Dockerfile",
#                context='/path/to/build/context')

def hello() -> co.Serial:
    with co.Serial(image=img) as pipeline:
        pipeline["Say Hi"] = co.Exec("pokemonsay -p Oddish -n 'Hi'")
    return pipeline

if __name__ == "__main__":
    co.main(default=hello)

This lets you configure your image however you like.

A powerful third-party utility

If you keep the Dockerfile in the same directory as the pipeline definition, it's ok to omit the context kwarg. Otherwise, you'll need to tell Conducto where to look.

Mount Local Files while Debugging

If an Exec node in a pipeline would change a file in the image, a copy is made instead, just for use by that container. This means that new containers on the same image are unaffected by the change, which makes for more repeatable results.

You can get a container that recreates a node's behavior outside of its pipeline. We call these snapshots, because they're like a single scene in the greater story that is the pipeline. For Exec nodes, Conducto can provide a command that will create a snapshot for you.

Copy a command to start a snapshot debug session

Exploring a snapshot with an interactive shell is a good way to understand what's going on, but any changes that you make will only affect that session. When the container goes away, so do they. If you want your changes to live longer than the snapshot container, you can set up "live debug".

With path_map

In this mode, you're editing local files, not copies. Subsequent (live) debug sessions will include your changes. Once you're happy with them, you can apply the changes to the image so that they'll affect the pipeline as well.

To make live debug possible, Conducto needs paths in the local filesystem so it can mount them in the debug container. If you use copy_dir, Conducto knows where your files are, and doesn't need any additional help.

If you're not using copy_dir, and you want to use live debug sessions, then you'll need to use the path_map kwarg when you create your Image object.

python_img = co.Image(
    image="python:3.8-alpine",
    copy_url="https://github.com/leachim6/hello-world",
    copy_branch="master",
    path_map={"./local-copy/p":"p"})

It takes a dict where the keys are paths relative to the pipeline definition on your local filesystem and the values are paths relative to the container working directory.

path_map is not used during pipeline execution or in a snapshot debug session. But in a live debug session, the contents of ./local-copy/p are mounted into the container filesystem at ./p.
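
As a concrete illustration, the helper below (hypothetical, not part of the Conducto API) shows how each path_map entry resolves to one host-to-container mount:

```python
import os

# Hypothetical helper illustrating how path_map entries would be resolved
# for a live debug session. Keys are relative to the pipeline definition
# directory on the host; values are relative to the container workdir.
def resolve_mounts(pipeline_def_dir, container_workdir, path_map):
    mounts = []
    for local, remote in path_map.items():
        host_path = os.path.normpath(os.path.join(pipeline_def_dir, local))
        container_path = os.path.normpath(os.path.join(container_workdir, remote))
        mounts.append((host_path, container_path))
    return mounts

# The example above, with assumed locations for the definition and workdir:
mounts = resolve_mounts("/home/me/proj", "/root", {"./local-copy/p": "p"})
# mounts == [("/home/me/proj/local-copy/p", "/root/p")]
```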

So path_map aligns the image filesystem with your local one for use in a live debug session. You can learn more about how to use this workflow in debugging.

Conclusion

If your commands depend on something common, you can just name an image that provides what you need. If they need files from elsewhere, you can add them concisely via parameters on Image. Or use your own Dockerfile if that's your style.

Whatever environment you set up for each node, Conducto keeps those choices with the pipeline definition so that your Nodes behave the same wherever you run them. That is, unless you use path_map to mount your own files for a live debug session.
