Handling Imbalanced Datasets in Python with Stratified Split, SMOTE and Random Oversampling

Published: 08 May 2022
on channel: Analytics with Adam
Views: 1,756
Likes: 28

In this video, we discuss how to handle imbalanced datasets in a classification context using several sampling techniques in Python.

We begin with a stratified split, which ensures the training and test sets each keep the same class proportions as the full dataset. We then move on to handling the imbalance itself with two techniques: SMOTE, which oversamples the minority class by creating synthetic observations, and Random Oversampling, which oversamples the minority class by duplicating existing instances drawn at random. SMOTE and Random Oversampling both rely on the imbalanced-learn library (imblearn).

The full Python notebook is available on GitHub at the following link if you want to follow along: https://github.com/SuperDataWorld/Pyt...

