Why We Don't Use the Mean Squared Error (MSE) Loss in Classification

Published: 05 June 2023
on channel: DataMListic

In this video we discuss why the mean squared error (MSE) loss is not used for classification problems. We look at three important aspects: (1) the MSE assumes a Gaussian prior on the targets, whereas class labels follow a Bernoulli distribution, (2) the MSE applied to the output of a sigmoid classifier results in a non-convex function of the weights, and (3) the MSE penalises confident misclassifications far more weakly than the binary cross-entropy loss function.
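
To make the third point concrete, here is a minimal sketch (not taken from the video) that compares the two losses on a single positive example. The helper names mse_loss and bce_loss and the list of predicted probabilities are illustrative assumptions.

```python
import math

def mse_loss(y_true, p):
    # Squared error between the label and the predicted probability.
    return (y_true - p) ** 2

def bce_loss(y_true, p, eps=1e-12):
    # Binary cross-entropy; p is clipped to avoid log(0).
    p = min(max(p, eps), 1 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# Hypothetical setup: the true label is 1 and the model grows more and more
# confident in the wrong class (predicted probability of class 1 shrinks).
y_true = 1.0
predictions = [0.9, 0.5, 0.1, 0.01, 0.001]

rows = [(p, mse_loss(y_true, p), bce_loss(y_true, p)) for p in predictions]
for p, mse, bce in rows:
    print(f"p={p:<6} MSE={mse:.4f}  BCE={bce:.4f}")
```

With predictions in [0, 1] the MSE can never exceed 1, while the cross-entropy grows without bound as the prediction approaches the wrong extreme, so very confident mistakes are punished much harder.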

References
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Gaussian distribution explained:    • Multivariate Normal (Gaussian) Distri...
Binary cross entropy prior for Bernoulli distribution: https://towardsdatascience.com/where-...
Demonstration that the binary cross entropy loss for classification is convex: https://towardsdatascience.com/why-no...
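
As a rough sketch of how the two distributional assumptions lead to the two losses (the notation is an assumption, not the video's: f(x) is the model output, \sigma^2 a fixed noise variance, \hat{y} the predicted probability of class 1), take the negative log-likelihood of a single example under each distribution:

```latex
% Gaussian targets: the negative log-likelihood is the squared error
% up to a scale factor and an additive constant.
p(y \mid x) = \mathcal{N}\!\left(y;\, f(x),\, \sigma^2\right)
\quad\Longrightarrow\quad
-\log p(y \mid x) = \frac{\left(y - f(x)\right)^2}{2\sigma^2}
                    + \frac{1}{2}\log\!\left(2\pi\sigma^2\right)

% Bernoulli targets: the negative log-likelihood is the binary cross-entropy.
p(y \mid x) = \hat{y}^{\,y}\,(1 - \hat{y})^{\,1 - y}
\quad\Longrightarrow\quad
-\log p(y \mid x) = -\left[\, y \log \hat{y} + (1 - y)\log(1 - \hat{y}) \,\right]
```

For fixed \sigma^2, minimising the first expression is the same as minimising the squared error, while binary labels y \in \{0, 1\} are naturally described by the second.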

Related Videos
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Why neural networks are universal function approximators:    • Why Neural Networks Can Learn Any Fun...
Why we need activations in neural nets:    • Why We Need Activation Functions In N...
Bias-variance trade-off:    • Bias-Variance Trade-off - Explained
Neural networks on tabular data:    • Why Deep Neural Networks (DNNs) Under...
Why we divide by N-1 in the sample variance:    • Why We Divide by N-1 in the Sample Va...

Contents
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
00:00 - Intro - MSE for classification
01:12 - Reason 1 - MSE assumes a Gaussian prior
04:15 - Reason 2 - MSE non-convexity
08:03 - Reason 3 - MSE weak penalisation
08:42 - Outro
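
The non-convexity from Reason 2 can be checked numerically. The sketch below is an illustration under simplifying assumptions (a one-weight logistic model with a single training example x = 1, y = 1, and a finite-difference estimate of the curvature); it is not the derivation used in the video.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mse(w, x=1.0, y=1.0):
    # Squared error on the sigmoid output of a one-weight model.
    return (y - sigmoid(w * x)) ** 2

def bce(w, x=1.0, y=1.0):
    # Binary cross-entropy on the same model.
    p = sigmoid(w * x)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def second_derivative(f, w, h=1e-3):
    # Central finite-difference estimate of f''(w).
    return (f(w + h) - 2 * f(w) + f(w - h)) / h ** 2

ws = [-3.0, -1.0, 0.0, 2.0]
mse_curv = [second_derivative(mse, w) for w in ws]
bce_curv = [second_derivative(bce, w) for w in ws]

for w, cm, cb in zip(ws, mse_curv, bce_curv):
    print(f"w={w:+.1f}  MSE''={cm:+.4f}  BCE''={cb:+.4f}")
```

A convex function has non-negative curvature everywhere. In this toy setup the MSE curvature is negative for sufficiently negative weights and positive elsewhere, while the cross-entropy curvature stays positive at every tested point.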

#mse #bce #classification #stats
