"Pipeline" is thrown around a ton: ML pipelines, Data pipelines, CI/CD pipelines.
This whiteboard discussion outlines a basic approach to unpacking the term:
- An ML Pipeline that transforms/scales data for training & inference -- as basic as a scikit-learn pipeline
- A Data Pipeline that updates the underlying training/testing data -- your Airflow/Prefect/Dagster/{Insert Orchestration} job
- A CI/CD Pipeline that builds, tests, and deploys the Data & ML Pipelines when ML engineers commit their changes -- Jenkins/TravisCI/GH-Actions/{AWS|Azure|GCP variant}
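To make the three layers concrete, here is a rough sketch in plain Python of how they might fit together. It is a toy stand-in only: `data_pipeline`, `ScaleThenThreshold`, and `ci_checks` are hypothetical names made up for illustration, the data is hard-coded, and none of this is scikit-learn, Airflow, or Jenkins code.

```python
from statistics import mean, pstdev


def data_pipeline():
    """Data pipeline stand-in: an orchestrated job would refresh these rows.

    Returns hard-coded (feature, label) pairs purely for illustration.
    """
    return [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1)]


class ScaleThenThreshold:
    """ML pipeline stand-in: scale the feature, then classify it."""

    def fit(self, xs, ys):
        # Learn the scaling parameters from training data only.
        # The labels are unused in this toy; a real model would learn from them.
        self.mu = mean(xs)
        self.sigma = pstdev(xs) or 1.0
        return self

    def predict(self, xs):
        # Reuse the same scaling at inference time, then threshold at zero.
        return [1 if (x - self.mu) / self.sigma > 0 else 0 for x in xs]


def ci_checks():
    """CI/CD stand-in: run on each commit, fail the build if a check breaks."""
    rows = data_pipeline()
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    model = ScaleThenThreshold().fit(xs, ys)
    assert model.predict(xs) == ys, "model no longer fits the refreshed data"
    return model  # a real job would hand this artifact to a deploy step


model = ci_checks()
```

The point of the sketch is the division of labor: the data pipeline owns the rows, the ML pipeline owns the transform-plus-model steps, and the CI/CD pipeline owns the checks that gate a deploy.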
When we talk to ML teams, it's not uncommon to hear about much more complicated and intersecting approaches, but this was a useful framework for me when thinking about the term "pipeline".
#machinelearning #cicdpipelines
Connect with me on LI: / gustafrcavanaugh
Video: ML Pipeline vs Data Pipeline vs CICD Pipeline (Whiteboard Session), added by Gus Cavanaugh on 20 December 2022.