The machine learning consultancy: https://truetheta.io
Join my email list to get educational and useful articles (and nothing else!): https://mailchi.mp/truetheta/true-the...
Want to work together? See here: https://truetheta.io/about/#want-to-w...
Neural Networks see something special in the softmax function.
SOCIAL MEDIA
LinkedIn: dj-rich-90b91753
Twitter: duanejrich
Github: https://github.com/Duane321
Enjoy learning this way? Want me to make more videos? Consider supporting me on Patreon: mutualinformation
SOURCE NOTES
I decided to make this video while inspecting Jacobians/gradients, starting from the end of a small network. Right near the softmax, the Jacobian looked simple enough that I suspected there was interesting math behind it, and there was. I came across several excellent blogs on the softmax's Jacobian and its interaction with the negative log-likelihood. Source [1] was the primary source, since it was well explained and used condensed notation. [2] was useful for understanding the broader context, and [3] offered a separate, thorough perspective.
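As a rough illustration of the simplification these notes refer to (not code from the video), here is a minimal sketch in plain Python. It assumes a single example with an integer class label, and every function name is hypothetical. It builds the softmax Jacobian, J[i][j] = s_i * (delta_ij - s_j), pushes the gradient of the negative log-likelihood through it with the chain rule, and compares the result with the well-known closed form, softmax(z) minus the one-hot label.

```python
import math

def softmax(z):
    # Subtract the max logit before exponentiating for numerical stability.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_jacobian(z):
    # J[i][j] = d s_i / d z_j = s_i * (delta_ij - s_j)
    s = softmax(z)
    n = len(s)
    return [[s[i] * ((1.0 if i == j else 0.0) - s[j]) for j in range(n)]
            for i in range(n)]

def nll(z, y):
    # Negative log-likelihood of the true class y under softmax(z).
    return -math.log(softmax(z)[y])

def nll_grad_chain_rule(z, y):
    # dL/ds_i is -1/s_y when i == y and 0 otherwise, so the chain rule
    # leaves only row y of the Jacobian: dL/dz_j = -J[y][j] / s_y.
    s = softmax(z)
    J = softmax_jacobian(z)
    return [-J[y][j] / s[y] for j in range(len(z))]

def nll_grad_closed_form(z, y):
    # The simplified result: softmax(z) minus the one-hot label.
    s = softmax(z)
    return [s[j] - (1.0 if j == y else 0.0) for j in range(len(z))]

# Illustrative values only (assumed, not taken from the video).
logits = [2.0, -1.0, 0.5]
label = 0

grad_chain = nll_grad_chain_rule(logits, label)
grad_closed = nll_grad_closed_form(logits, label)
print(grad_chain)
print(grad_closed)
```

The two printed gradients should agree up to floating-point error. The 1/s_y factor from the log cancels the s_y factor in the Jacobian row, which is why the gradient near the softmax looks so simple.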
SOURCES
[1] M. Petersen, "Softmax with cross-entropy," https://mattpetersen.github.io/softma..., 2017
[2] I. Goodfellow, Y. Bengio, A. Courville, Deep Learning, MIT Press, 2016, section 6.2.2.3
[3] L. J. V. Miranda, "Understanding softmax and the negative log-likelihood," https://ljvmiranda921.github.io/noteb..., 2017
TIME CODES
0:00 Everyone uses the softmax
0:23 A Standard Explanation
3:20 But Why the Exponential Function?
3:57 The Broader Context
6:05 Two Choices Together
6:51 The Gradient
10:07 Other Reasons
Video: Why Do Neural Networks Love the Softmax? Uploaded by Mutual Information on 7 June 2023.