"Pipeline" is thrown around a ton: ML pipelines, Data pipelines, CI/CD pipelines.
This whiteboard discussion outlines a basic approach to unpacking the term:
- An ML Pipeline that transforms/scales data for training & inference -- as basic as a scikit-learn pipeline
- A Data Pipeline that updates the underlying training/testing data -- your Airflow/Prefect/Dagster/{Insert Orchestration} job
- A CI/CD Pipeline that builds, tests, and deploys the Data & ML Pipelines when ML engineers commit their changes -- Jenkins/TravisCI/GH-Actions/{AWS|Azure|GCP variant}
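To make the three layers concrete, here is a minimal sketch in plain Python. It is only an illustration, not the setup from the whiteboard session: `ScaleThenThreshold`, `refresh_training_data`, and `ci_cd_run` are hypothetical names, the data is hard-coded, and the toy "model" assumes a single feature where the positive class has larger values. In practice the first piece would be something like a scikit-learn pipeline, the second a scheduled orchestration job, and the third a CI/CD workflow triggered on commit.

```python
from statistics import mean, pstdev


# --- ML pipeline: scale the data, then fit/predict (hypothetical toy class) ---
class ScaleThenThreshold:
    """Standardize a single feature, then classify with a learned threshold."""

    def fit(self, xs, ys):
        self.mu = mean(xs)
        self.sigma = pstdev(xs) or 1.0  # fall back to 1.0 if all values are equal
        scaled = [self._scale(x) for x in xs]
        pos = [s for s, y in zip(scaled, ys) if y == 1]
        neg = [s for s, y in zip(scaled, ys) if y == 0]
        # Threshold halfway between the two class means in scaled space.
        self.threshold = (mean(pos) + mean(neg)) / 2
        return self

    def _scale(self, x):
        return (x - self.mu) / self.sigma

    def predict(self, xs):
        # Inference reuses the scaling parameters learned during fit.
        return [1 if self._scale(x) > self.threshold else 0 for x in xs]


# --- Data pipeline: refreshes the training data (an orchestrator would schedule this) ---
def refresh_training_data():
    # Hypothetical stand-in for an extract/load job; returns (features, labels).
    xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
    ys = [0, 0, 0, 1, 1, 1]
    return xs, ys


# --- CI/CD pipeline: build, test, and only then "deploy" ---
def ci_cd_run():
    xs, ys = refresh_training_data()               # build: run the data pipeline
    model = ScaleThenThreshold().fit(xs, ys)       # build: train the ML pipeline
    passed = model.predict([2.5, 10.5]) == [0, 1]  # test: a simple smoke check
    return model if passed else None               # deploy only if the test passes
```

The point of the sketch is the nesting: the CI/CD step calls the data step and the ML step, which is roughly how the three "pipelines" relate in the framing above.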
When we talk to ML teams, it's not uncommon to hear about much more complicated and intersecting approaches, but this was a useful framework for me when thinking about the term "pipeline".
#machinelearning #cicdpipelines
Connect with me on LI: / gustafrcavanaugh
Video: ML Pipeline vs Data Pipeline vs CICD Pipeline (Whiteboard Session), added by Gus Cavanaugh on 20 December 2022.