Why We Don't Use the Mean Squared Error (MSE) Loss in Classification

Published: 05 June 2023
Channel: DataMListic
Views: 5,190
Likes: 124

In this video we discuss why the mean squared error (MSE) loss is not used for classification problems. We look at three aspects: (1) MSE assumes a Gaussian prior, whereas binary cross entropy (BCE) corresponds to a Bernoulli distribution, which matches binary class labels; (2) applied to classification problems, MSE results in a non-convex loss function; and (3) MSE penalises classification errors, especially confident wrong predictions, far less strongly than the BCE loss function does.
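Reasons (2) and (3) can be checked numerically. The snippet below is a minimal sketch, not code from the video: it assumes a single example with true label 1 and a model that outputs p = sigmoid(z) for one logit z, and all function names are illustrative. It compares the two penalties for a confident wrong prediction and estimates the curvature of each loss with respect to z by finite differences.

```python
import math


def sigmoid(z):
    """Map a logit z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def mse_loss(y, p):
    """Squared error between label y and predicted probability p."""
    return (y - p) ** 2


def bce_loss(y, p):
    """Binary cross entropy between label y and predicted probability p."""
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def second_derivative(f, z, h=1e-3):
    """Central finite-difference estimate of f''(z)."""
    return (f(z + h) - 2.0 * f(z) + f(z - h)) / (h * h)


# Assumption: one example with true label 1 and a one-logit sigmoid model.
def mse_of_logit(z):
    return mse_loss(1.0, sigmoid(z))


def bce_of_logit(z):
    return bce_loss(1.0, sigmoid(z))


# Reason 3: a confident wrong prediction (p = 0.01 for a true label of 1).
p_wrong = 0.01
mse_penalty = mse_loss(1.0, p_wrong)  # can never exceed 1
bce_penalty = bce_loss(1.0, p_wrong)  # grows without bound as p -> 0

# Reason 2: curvature with respect to the logit.
# A convex function never has negative curvature.
mse_curv_far = second_derivative(mse_of_logit, -3.0)
mse_curv_mid = second_derivative(mse_of_logit, 0.0)
bce_curv_far = second_derivative(bce_of_logit, -3.0)

print(f"MSE penalty: {mse_penalty:.4f}, BCE penalty: {bce_penalty:.4f}")
print(f"MSE curvature at z=-3: {mse_curv_far:+.4f}, at z=0: {mse_curv_mid:+.4f}")
print(f"BCE curvature at z=-3: {bce_curv_far:+.4f}")
```

Under these assumptions the MSE curvature changes sign along z (negative far on the wrong side, positive near zero), so the loss is not convex in the logit, while the BCE curvature equals p(1 - p) and stays positive. The MSE penalty is also capped at 1, whereas the BCE penalty keeps growing as the prediction becomes more confidently wrong.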

References
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Gaussian distribution explained:    • Multivariate Normal (Gaussian) Distri...  
Binary cross entropy prior for Bernoulli distribution: https://towardsdatascience.com/where-...
Demonstration that the binary cross entropy loss for classification is convex: https://towardsdatascience.com/why-no...

Related Videos
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Why neural networks are universal function approximators:    • Why Neural Networks Can Learn Any Fun...  
Why we need activations in neural nets:    • Why We Need Activation Functions In N...  
Bias-variance trade-off:    • Bias-Variance Trade-off - Explained  
Neural networks on tabular data:    • Why Deep Neural Networks (DNNs) Under...  
Why we divide by N-1 in the sample variance:    • Why We Divide by N-1 in the Sample Va...  

Contents
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
00:00 - Intro - MSE for classification
01:12 - Reason 1 - MSE assumes a Gaussian prior
04:15 - Reason 2 - MSE non-convexity
08:03 - Reason 3 - MSE weak penalisation
08:42 - Outro

Follow Me
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
🐦 Twitter: @datamlistic
📸 Instagram: @datamlistic
📱 TikTok: @datamlistic

Channel Support
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
The best way to support the channel is to share the content. ;)

If you'd like to also support the channel financially, donating the price of a coffee is always warmly welcomed! (completely optional and voluntary)
► Patreon:   / datamlistic  
► Bitcoin (BTC): 3C6Pkzyb5CjAUYrJxmpCaaNPVRgRVxxyTq
► Ethereum (ETH): 0x9Ac4eB94386C3e02b96599C05B7a8C71773c9281
► Cardano (ADA): addr1v95rfxlslfzkvd8sr3exkh7st4qmgj4ywf5zcaxgqgdyunsj5juw5
► Tether (USDT): 0xeC261d9b2EE4B6997a6a424067af165BAA4afE1a

#mse #bce #classification #stats

