The sections below are about the conceptual building blocks needed to work with Conducto pipelines. Each section header links to a more in-depth article about that concept. If you haven't launched a Conducto pipeline yet, you might want to check out Getting Started before digging into the articles here.
Conducto can be used in different ways. These articles don't make assumptions about what you're using Conducto for; they just describe how it works. If you have a particular goal in mind, head over to Advanced for articles about using Conducto to achieve it. You can always jump back here if you need to brush up on a concept.
A Conducto pipeline is a tree which controls how commands are executed.
Exec nodes are the leaves of the tree: they run shell commands in containers.
Serial nodes control the way that their children run.
Conducto provides an API so you can arrange these nodes in code to make a pipeline definition.
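To make the tree idea concrete, here is a minimal sketch in plain Python. It does not use the Conducto API: `ToyExec`, `ToySerial`, and `execution_order` are hypothetical stand-ins, the commands are made up, and the sketch assumes that a Serial node runs its children one after another. Nothing is actually executed; the code only lists the leaf commands in the order they would run.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class ToyExec:
    """Leaf node: holds one shell command (never actually run here)."""
    command: str


@dataclass
class ToySerial:
    """Internal node: assumed to run its children one after another."""
    children: List[Union["ToySerial", ToyExec]] = field(default_factory=list)


def execution_order(node):
    """Walk the tree depth-first and list leaf commands in run order."""
    if isinstance(node, ToyExec):
        return [node.command]
    order = []
    for child in node.children:
        order.extend(execution_order(child))
    return order


# A hypothetical pipeline definition: build, then two test steps, then deploy.
pipeline = ToySerial([
    ToyExec("make build"),
    ToySerial([ToyExec("make unit-test"), ToyExec("make integration-test")]),
    ToyExec("make deploy"),
])

print(execution_order(pipeline))
```

The point of the sketch is only the shape: commands live at the leaves, and the nodes above them decide ordering.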
Once you have a definition, you can use it to launch pipelines that you can view and control at conducto.com/app. From there you can run the pipeline, examine results, make changes, and rerun all or part of the pipeline. The history of each node execution is saved with the pipeline, so you can see the impact of any changes you made.
When you launch a pipeline, you get to pick a mode. In local mode, your commands run in containers on your local machine. In cloud mode, Conducto secures resources for your pipeline and the computation happens in the cloud.
Local mode is pretty powerful on its own, but cloud mode lets you scale your pipeline beyond what local resources would allow. It also lets you ensure that resources are always available to run pipelines when they're needed.
Local mode is free, but to use cloud mode you'll first need to enable billing for your Conducto org.
When an Exec node runs a command, it creates a container for that command.
The container's contents may change based on your command, but it starts with an initial disk image that remains fixed.
You can control which software is available to the container by making modifications to the image before the container gets created.
If you make changes, you'll want to be aware of whether you're changing something in the image, or something in the container, because it will affect how much reprocessing is needed before you can see your changes in action.
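The distinction between image and container can be pictured with a toy model in plain Python. This is only an analogy, not how Conducto or any container runtime stores files: the `image` dictionary, the file paths, and `start_container` are all hypothetical. Each container starts as a copy of the image, so changes made inside one container do not touch the image or any other container.

```python
import copy

# Hypothetical image contents: path -> content. The image stays fixed.
image = {"/usr/bin/tool": "tool v1", "/app/main.py": "print('hello')"}


def start_container(img):
    """Give the container its own copy of the image's files."""
    return copy.deepcopy(img)


first = start_container(image)
first["/tmp/output.txt"] = "result"        # change made in the container only

second = start_container(image)            # a new container starts from the image

print("/tmp/output.txt" in first)
print("/tmp/output.txt" in second)
print("/tmp/output.txt" in image)
```

In this model, getting a file into every container means changing `image` and starting containers again, while a change inside `first` is visible immediately but is lost when that container goes away.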
Creating containers takes some time, so as Conducto runs nodes, it sometimes reuses a container from earlier in the pipeline to speed things up.
The container_reuse_context node parameter gives you control over whether and how this happens.
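The general idea of reusing a container to skip creation cost can be sketched as a small cache. This is a toy model, not Conducto's implementation: `ToyContainerPool` and its `reuse_key` argument are hypothetical and do not correspond to the values that container_reuse_context accepts.

```python
import itertools


class ToyContainerPool:
    """Hands out container ids, reusing one when image and reuse key match."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._by_key = {}
        self.created = 0  # how many containers were actually created

    def acquire(self, image, reuse_key=None):
        # reuse_key=None means "always start a fresh container".
        key = (image, reuse_key)
        if reuse_key is not None and key in self._by_key:
            return self._by_key[key]
        container_id = next(self._ids)
        self.created += 1
        if reuse_key is not None:
            self._by_key[key] = container_id
        return container_id


pool = ToyContainerPool()
first = pool.acquire("python:3.11", reuse_key="tests")
second = pool.acquire("python:3.11", reuse_key="tests")  # same key: reused
third = pool.acquire("python:3.11")                       # no key: fresh

print(first == second, third != first, pool.created)
```

The trade-off the sketch hints at is the usual one: a reused container starts faster but may carry state left behind by the earlier command, which is why it helps to have control over when reuse happens.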
If you arrange your pipeline well, the Conducto web app will quickly draw your attention to problem areas. But it can only zoom in so closely--at some point you're going to need tools that are suited to the specific node that's misbehaving. In the web app, click the debug button to copy a command for your terminal.
Pasting it into your terminal will give you an interactive shell session in a container that's just about to run the suspicious command, so you can poke around, attach a debugger, or work whatever magic you need. When you're ready to catch that bug, you can run the same script that Conducto would run and expect the same behavior that you'd see in-pipeline.
If you're writing your pipeline definition in Python, and you want your node to call a Python function, there's a shortcut. Rather than setting up your Exec nodes to run shell commands that invoke Python, initialize the Exec node directly with the function that you want it to call. That node will call your function, and you won't have to worry about a command line interface for it. It's just a stylistic choice, but when it fits it can make your pipeline much more readable.
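The difference between the two styles can be illustrated without Conducto at all. In the sketch below, `resize`, `cli`, and `ToyFunctionNode` are hypothetical names: the first style wraps the function in an argparse command line interface so that a shell command could invoke it, while the second hands the function and its arguments to a node-like object directly. Conducto's real Exec node may handle arguments differently; this only shows why the second style needs less glue.

```python
import argparse


def resize(width: int, height: int) -> str:
    """The work we actually want a node to do."""
    return f"resized to {width}x{height}"


# Style 1: a shell command needs a command line interface around the function.
def cli(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--width", type=int, required=True)
    parser.add_argument("--height", type=int, required=True)
    args = parser.parse_args(argv)
    return resize(args.width, args.height)


# Style 2: a toy node that holds the function and its arguments directly.
class ToyFunctionNode:
    def __init__(self, func, **kwargs):
        self.func = func
        self.kwargs = kwargs

    def run(self):
        return self.func(**self.kwargs)


via_cli = cli(["--width", "640", "--height", "480"])
via_node = ToyFunctionNode(resize, width=640, height=480).run()

print(via_cli == via_node)
```

Both paths end up calling the same function with the same arguments; the second simply skips the parsing layer.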
For you to control pipelines remotely, there needs to be an agent on the local machine that maintains a connection to conducto.com and listens for events. If you launch a pipeline with a local command, an agent will be started automatically; otherwise, the Conducto web app will give you a command to start one.
Lastly, you should know that Conducto pipelines aren't security sandboxes. You should think twice before launching pipelines from untrusted sources--especially if you're running them in local mode.
That's it for Basics. If you have a complex task that needs to be done repeatedly, the articles above should give you what you need to build a pipeline for it.
Before you start coding, check out the Advanced section. There you'll find articles about specific things that you might want to use Conducto for. With some luck, we've anticipated your use case and you'll find some more specific guidance there.
We're always adding functionality and working to improve these docs, so in the future, you can expect more articles to appear above.
- Still have questions?
- Want to show off a cool pipeline that you built?
- Have an idea for how Conducto could be even more useful?
Whatever it is, feel free to drop us a line. Otherwise, happy pipelining.