Entity embeddings for fun and profit

OpenTeams November 25, 2018

Embedding layers are not just useful when working with language data. As “entity embeddings”, they’ve recently become famous for applications on tabular, small-scale data. In this post, we exemplify two possible use cases, also drawing attention to what not to expect.

Categories: Uncategorized

Tags: Entity embeddings

Inside the Matrix: Visualizing Matrix Multiplication, Attention and Beyond

Use 3D to visualize matrix multiplication expressions, attention heads with real weights, and more.

OpenTeams September 25, 2023

Attention Mechanism

What is Attention, and why is it used in state-of-the-art models? This article discusses the types of Attention and walks you through their implementations.

OpenTeams September 16, 2019

CycleGAN: Unpaired Image-to-Image Translation (Part 3)

Table of Contents CycleGAN: Unpaired Image-to-Image Translation (Part 3) Configuring Your Development Environment Need Help Configuring Your Development Environment? Project Structure Implementing CycleGAN Training Implementing Training Callback Implementing Data Pipeline and Model Training Perform Image-to-Image Translation Summary Citation Information CycleGAN:…
The post CycleGAN: Unpaired Image-to-Image Translation (Part 3) appeared first on PyImageSearch.

OpenTeams June 5, 2023

Distilling knowledge from Neural Networks to build smaller and faster models

This article discusses GPT-2 and BERT models, as well using knowledge distillation to create highly accurate models with fewer parameters than their teachers

OpenTeams November 11, 2019

Interfaces for Explaining Transformer Language Models

Interfaces for exploring transformer language models by looking at input saliency and neuron activation. Explorable #1: Input saliency of a list of countries generated by a language model Tap or hover over the output tokens: Explorable #2: Neuron activation analysis reveals four groups of neurons, each is associated with generating a certain type of token Tap or hover over the sparklines on the left to isolate a certain factor: The Transformer architecture has been powering a number of the recent advances in NLP. A breakdown of this architecture is provided here . Pre-trained language models based on the architecture, in both its auto-regressive (models that use their own output as input to next time-steps and that process tokens from left-to-right, like GPT2) and denoising (models trained by corrupting/masking the input and that process tokens bidirectionally, like BERT) variants continue to push the envelope in various tasks in NLP and, more recently, in computer vision. Our understanding of why these models work so well, however, still lags behind these developments. This exposition series continues the pursuit to interpret and visualize the inner-workings of transformer-based language models. We illustrate how some key interpretability methods apply to transformer-based language models. This article focuses on auto-regressive models, but these methods are applicable to other architectures and tasks as well. This is the first article in the series. In it, we present explorables and visualizations aiding the intuition of: Input Saliency methods that score input tokens importance to generating a token. Neuron Activations and how individual and groups of model neurons spike in response to inputs and to produce outputs. The next article addresses Hidden State Evolution across the layers of the model and what it may tell us about each layer’s role.

OpenTeams December 16, 2020

Entity embeddings for fun and profit

Resources

Company

Related Articles

Resources

Company