Data

Accessing and saving data is fundamental to many categories of pipelines, including data science, ETL, and analytics. The Conducto philosophy of data is to let you call your own functions to access your own data. Built-in connectors for different data sources are not needed, because you generate your pipeline with the full power of Python. Lazy Pipeline Creation helps you iterate over your data dynamically, and a few additional features, described below, help with data access.

Secrets

Accessing your own data sources often requires secret keys and tokens that you don’t save anywhere in plaintext. Conducto stores these for you, encrypted, in the AWS Parameter Store, where they are accessible only with your own login credentials.

In your Profile in the app, you can set your own user-level secrets, visible to only you. If you are an Admin of your Org, you may also create org-level secrets that are visible to anyone in your org.

Secrets are set as key/value pairs and are passed into the environment of each Exec node. They are visible to your commands but are not displayed to the user in the app.
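Because secrets arrive as environment variables, a command can read them with the standard library alone. The following is a minimal sketch, not part of Conducto: the helper `require_secret` and the key `EXAMPLE_API_TOKEN` are hypothetical names chosen for illustration, and the `setdefault` line only simulates the value that would already be present inside an Exec node.

```python
import os


def require_secret(name):
    """Return the secret held in environment variable `name`.

    Raises RuntimeError if the variable is unset or empty, so a missing
    secret fails fast instead of surfacing later as an auth error.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Secret {name!r} is not set in this environment")
    return value


# Simulation only: inside an Exec node the variable would already be set
# from your user-level or org-level secrets. The key name is hypothetical.
os.environ.setdefault("EXAMPLE_API_TOKEN", "not-a-real-token")

token = require_secret("EXAMPLE_API_TOKEN")
# Avoid printing the token itself; report only that it was found.
print("token loaded:", bool(token))
```

Checking for the secret at the start of a command keeps the failure close to its cause, and not echoing the value keeps it out of your logs.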

co.data

Conducto provides a simple object store for your data. In local mode it is backed by local disk, and in cloud mode it is backed by S3. Cloud data, like cloud logs, is permissioned so that only your user can access it.

conducto.data.pipeline gives you a way to store data that is scoped to the current pipeline. When your pipeline is archived and its logs are cleaned up, your data is deleted as well. ETL and data science pipelines commonly need a temporary but high-performance location to stage data, and co.data.pipeline is one simple option.

conducto.data.user provides a similar interface for data that should persist past the life of the pipeline. It is scoped by user and is only visible to the user who created it.
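As an illustration of staging data between steps, here is a hedged sketch of one step storing a file and a later step fetching it. To keep the example runnable without Conducto installed, it uses `InMemoryStore`, a hypothetical stand-in that mirrors only the `put`, `get`, `list`, and `size` signatures listed in the reference on this page. The object name `staging/rows.csv` and the helper `hand_off` are also assumptions made for the example.

```python
import os
import tempfile


class InMemoryStore:
    """Hypothetical stand-in mirroring the documented co.data interface.

    Objects live in a dict so the sketch runs without Conducto.
    """

    _objects = {}

    @classmethod
    def put(cls, name, file):
        # Store the contents of local file `file` under `name`.
        with open(file, "rb") as f:
            cls._objects[name] = f.read()

    @classmethod
    def get(cls, name, file):
        # Write the object stored under `name` to local file `file`.
        with open(file, "wb") as f:
            f.write(cls._objects[name])

    @classmethod
    def list(cls, prefix):
        # Names of stored objects that start with `prefix`.
        return sorted(n for n in cls._objects if n.startswith(prefix))

    @classmethod
    def size(cls, name):
        # Size of the stored object in bytes.
        return len(cls._objects[name])


def hand_off(store, workdir):
    """Stage a file in `store`, then fetch it back as a later step would."""
    produced = os.path.join(workdir, "produced.csv")
    with open(produced, "wb") as f:
        f.write(b"id,value\n1,10\n2,20\n")
    store.put("staging/rows.csv", produced)

    fetched = os.path.join(workdir, "fetched.csv")
    store.get("staging/rows.csv", fetched)
    with open(fetched, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as workdir:
    contents = hand_off(InMemoryStore, workdir)
    names = InMemoryStore.list("staging/")
    nbytes = InMemoryStore.size("staging/rows.csv")

print(names, nbytes)
```

In a real pipeline you would pass `conducto.data.pipeline` (or `conducto.data.user` for data that should outlive the pipeline) in place of the stand-in, assuming it behaves as the reference describes.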

class conducto.data.pipeline

classmethod delete(name, recursive=False)

Delete object at name.

classmethod get(name, file)

Get object at name, store it to file.

classmethod gets(name, *, byte_range: List[int] = None) → bytes

Return the object at name as bytes. Optionally restrict the result to the given byte_range, which is the half-open interval [begin, end): the byte at begin is included and the byte at end is not.

classmethod list(prefix)

Return names of objects that start with prefix.

classmethod put(name, file)

Store the contents of file as the object at name.

classmethod size(name)

Return the size of the object at name, in bytes.
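The half-open byte_range convention used by gets matches Python slicing, which makes it straightforward to read a large object in pieces. The sketch below is only an illustration: `byte_ranges` is a hypothetical helper, not a Conducto function, and it is demonstrated on an in-memory bytes value rather than a stored object.

```python
def byte_ranges(total_size, chunk_size):
    """Yield [begin, end) pairs that cover total_size bytes without overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for begin in range(0, total_size, chunk_size):
        # The final range is clipped so that it never runs past the end.
        yield [begin, min(begin + chunk_size, total_size)]


payload = b"0123456789"
ranges = list(byte_ranges(len(payload), 4))

# A half-open range [begin, end) selects the same bytes as payload[begin:end].
chunks = [payload[begin:end] for begin, end in ranges]

print(ranges)
print(chunks)
```

With a stored object, the total would come from size(name), and each pair could be passed as the byte_range argument to gets, assuming it accepts a two-element list as its signature suggests. Because each range ends where the next begins, concatenating the chunks reproduces the object with no duplicated or missing bytes.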

class conducto.data.user

See also conducto.data.pipeline, which has an identical interface.
