Introduction to Scikit-Learn pipeline API

Published: 04 December 2020
Channel: AIEngineering
5,543 views
150 likes

#datascience #machinelearning #ml

A scikit-learn Pipeline chains multiple estimators into one. This is useful because data processing often follows a fixed sequence of steps, for example feature selection, normalization, and classification. A Pipeline serves several purposes here:

Convenience and encapsulation
You only have to call fit and predict once on your data to fit a whole sequence of estimators.
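
As a minimal sketch of this convenience, the snippet below chains feature selection, normalization, and classification into one object. The iris dataset, the step names ("select", "scale", "clf"), and the specific estimators are assumptions chosen for illustration, not taken from the video.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative data: the iris dataset bundled with scikit-learn.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0, stratify=y
)

# Step names are arbitrary labels picked for this example.
pipe = Pipeline([
    ("select", SelectKBest(f_classif, k=2)),    # feature selection
    ("scale", StandardScaler()),                # normalization
    ("clf", LogisticRegression(max_iter=1000)), # classification
])

# One fit call fits every step in order on the training data.
pipe.fit(X_train, y_train)

# One predict call applies the fitted transformers, then the classifier.
predictions = pipe.predict(X_test)
```

Without the pipeline, each transformer would need its own fit and transform calls, repeated in the same order for the test data.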

Joint parameter selection
You can grid search over parameters of all estimators in the pipeline at once.
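
A hedged sketch of joint parameter selection with GridSearchCV follows. Parameters of a pipeline step are addressed as `<step name>__<parameter>`; the step names, the dataset, and the candidate values in the grid are assumptions made for this example.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

pipe = Pipeline([
    ("select", SelectKBest(f_classif)),
    ("scale", StandardScaler()),
    ("svc", SVC()),
])

# Keys follow the "<step name>__<parameter>" convention, so a single grid
# covers both the feature selector and the classifier.
param_grid = {
    "select__k": [2, 3, 4],
    "svc__C": [0.1, 1.0, 10.0],
}

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)

# Best combination found across all steps of the pipeline.
best_params = search.best_params_
```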

Safety
Pipelines help avoid leaking statistics from your test data into the trained model in cross-validation: within each fold, the transformers are fit only on the training samples, the same samples used to fit the predictor.
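
The following sketch illustrates the safety point, assuming a scaler followed by a classifier on the breast cancer dataset (both chosen only for illustration). Passing the whole pipeline to cross_val_score means the scaler is refit on the training portion of every fold, so the held-out samples never contribute to its mean and variance.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# make_pipeline names the steps automatically from the class names.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# For each of the 5 folds, the pipeline is cloned and fit on the training
# split only; the held-out split is transformed with those training
# statistics and then scored.
scores = cross_val_score(pipe, X, y, cv=5)
```

Scaling the full dataset before cross-validation would instead let the test folds influence the scaler, which is the leak the pipeline avoids.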

All estimators in a pipeline, except the last one, must be transformers (i.e. must have a transform method). The last estimator may be any type (transformer, classifier, etc.).
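
To illustrate that the last step may itself be a transformer, here is a small sketch of a preprocessing-only pipeline. The choice of StandardScaler followed by PCA, and the step names, are assumptions for this example.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)

# Both steps are transformers, so the pipeline as a whole acts as a
# transformer and is used through fit_transform / transform.
preprocess = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=2)),
])

# Scale the four iris features, then project them onto two components.
X_reduced = preprocess.fit_transform(X)
```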

Ref: https://scikit-learn.org/stable/modul...

