Training Bayesian Neural Networks

University essay from Lunds universitet/Computational Biology and Biological Physics (undergoing reorganization)

Abstract: Although deep learning has advanced a plethora of fields, ranging from financial analysis to image classification, it has shortcomings when data are limited and models are complex. In these cases the networks tend to be overconfident in their predictions even when they are erroneous, which exposes their applications to risk. One way to incorporate a measure of uncertainty is to describe the network weights by probability distributions rather than point estimates. These networks, known as Bayesian neural networks, can be trained using a method called variational inference, which allows one to use standard optimization tools such as SGD, Adam, and learning rate schedules. Although these tools were not developed with Bayesian neural networks in mind, we will show that they behave similarly. We confirm some best practices for training these networks, such as how the loss should be scaled and evaluated. Moreover, we find that one should avoid Adam in favor of SGD and AdaBound, and that one should group the learnable parameters so that custom learning rates can be used for the different groups.
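To make the abstract's recommendations concrete, the following is a minimal sketch (not the essay's own code) of variational inference for a single Bayesian linear layer, assuming PyTorch and a Bayes-by-Backprop-style setup with a fully factorized Gaussian posterior. The names (BayesianLinear, num_batches) and the learning rates are illustrative assumptions; it shows the KL term scaled by the number of minibatches and the variational parameters split into groups with separate learning rates.

import torch
import torch.nn as nn
import torch.nn.functional as F


class BayesianLinear(nn.Module):
    """Linear layer whose weights are Gaussian distributions, not point estimates."""

    def __init__(self, in_features, out_features):
        super().__init__()
        # Variational parameters: means and pre-softplus standard deviations.
        self.weight_mu = nn.Parameter(torch.zeros(out_features, in_features))
        self.weight_rho = nn.Parameter(torch.full((out_features, in_features), -5.0))
        self.bias_mu = nn.Parameter(torch.zeros(out_features))
        self.bias_rho = nn.Parameter(torch.full((out_features,), -5.0))
        self.prior_std = 1.0  # standard normal prior on all weights

    def forward(self, x):
        # Reparameterization trick: sample weights as mu + sigma * eps.
        w_sigma = F.softplus(self.weight_rho)
        b_sigma = F.softplus(self.bias_rho)
        weight = self.weight_mu + w_sigma * torch.randn_like(w_sigma)
        bias = self.bias_mu + b_sigma * torch.randn_like(b_sigma)
        return F.linear(x, weight, bias)

    def kl(self):
        # Closed-form KL divergence between the Gaussian posterior
        # and the N(0, prior_std^2) prior, summed over all weights.
        kl = 0.0
        for mu, rho in ((self.weight_mu, self.weight_rho),
                        (self.bias_mu, self.bias_rho)):
            sigma = F.softplus(rho)
            kl = kl + (torch.log(self.prior_std / sigma)
                       + (sigma ** 2 + mu ** 2) / (2 * self.prior_std ** 2)
                       - 0.5).sum()
        return kl


model = BayesianLinear(10, 2)

# Parameter grouping: separate learning rates for the means and the rhos,
# in the spirit of the abstract's recommendation (values are placeholders).
optimizer = torch.optim.SGD([
    {"params": [model.weight_mu, model.bias_mu], "lr": 1e-3},
    {"params": [model.weight_rho, model.bias_rho], "lr": 1e-4},
])

# One training step: the KL term is scaled by 1/num_batches so that the
# full-dataset ELBO is recovered when summed over an epoch.
num_batches = 100
x, y = torch.randn(32, 10), torch.randint(0, 2, (32,))
loss = F.cross_entropy(model(x), y) + model.kl() / num_batches
loss.backward()
optimizer.step()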
