This article discusses GPT-2 and BERT models, as well using knowledge distillation to create highly accurate models with fewer parameters than their t…
In deep learning, there is no obvious way of obtaining uncertainty estimates. In 2016, Gal and Ghahramani proposed a method that is both theoretically…