Pre-trained representations

Jaideep Ray
2 min read · Apr 30, 2022


The best thing since sliced bread came around!

Pre-trained word embeddings are an integral part of modern NLP systems, offering significant improvements over embeddings learned from scratch. There are two existing strategies for applying pre-trained language representations to downstream tasks: feature-based and fine-tuning.

Feature-based approach:

Throw in pre-trained representations as additional input features while training the task model. The pre-trained weights stay frozen; only the task-specific layers learn.

Pre-trained embeddings as features.
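
Here is a minimal sketch of the feature-based approach in PyTorch, assuming a hypothetical `pretrained_vectors` tensor standing in for real pre-trained word vectors (e.g. GloVe or word2vec). The embedding table is frozen and used purely as a feature extractor; only the classifier head is trained.

```python
import torch
import torch.nn as nn

# Stand-in for real pre-trained word vectors: (vocab_size, embed_dim)
pretrained_vectors = torch.randn(10000, 300)

class FeatureBasedClassifier(nn.Module):
    def __init__(self, pretrained_vectors, num_classes):
        super().__init__()
        # freeze=True keeps the pre-trained weights fixed during training
        self.embedding = nn.Embedding.from_pretrained(pretrained_vectors, freeze=True)
        self.classifier = nn.Linear(pretrained_vectors.size(1), num_classes)

    def forward(self, token_ids):
        # Average the frozen embeddings into one feature vector per example
        features = self.embedding(token_ids).mean(dim=1)
        return self.classifier(features)

model = FeatureBasedClassifier(pretrained_vectors, num_classes=2)
logits = model(torch.randint(0, 10000, (4, 20)))  # batch of 4 sequences, 20 tokens each
```

Because the embedding table receives no gradients, the only trainable parameters are those of the small task head.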

Fine-tuning:

Use the embedding model as a trainable layer. This approach lets you tune the embeddings for the downstream task, producing more effective models. However, the resulting model is much larger than in the first approach because it carries the embedding weights.

Pre-trained embeddings as trainable layer.
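
A minimal sketch of the fine-tuning approach, under the same assumptions as above (the `pretrained_vectors` tensor is hypothetical). The only difference is `freeze=False`: gradients now flow into the embedding table, so it adapts to the task, and the saved model includes those weights.

```python
import torch
import torch.nn as nn

# Stand-in for real pre-trained word vectors: (vocab_size, embed_dim)
pretrained_vectors = torch.randn(10000, 300)

class FineTunedClassifier(nn.Module):
    def __init__(self, pretrained_vectors, num_classes):
        super().__init__()
        # freeze=False makes the embedding table part of the trainable parameters,
        # which is why the saved model is larger: it contains the embedding weights.
        self.embedding = nn.Embedding.from_pretrained(pretrained_vectors, freeze=False)
        self.classifier = nn.Linear(pretrained_vectors.size(1), num_classes)

    def forward(self, token_ids):
        features = self.embedding(token_ids).mean(dim=1)
        return self.classifier(features)

model = FineTunedClassifier(pretrained_vectors, num_classes=2)
# A lower learning rate for the embedding layer is a common choice when fine-tuning.
optimizer = torch.optim.Adam([
    {"params": model.embedding.parameters(), "lr": 1e-4},
    {"params": model.classifier.parameters(), "lr": 1e-3},
])
```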

Pre-trained representations have several advantages:

  • Use large public corpora: You can pre-train embedding models on large public corpora such as Wikipedia or ImageNet. Models that use such embeddings as input features generalize well.
  • Training cost: Training a model from scratch on a large corpus is computationally very expensive. Using pre-trained embedding models as features or as fine-tuning layers significantly reduces the cost of training.
  • Multi-modality: With content being multi-modal (text, images, video, and memes co-existing in a single piece of content), content models need to be multi-modal as well. Pre-trained embeddings make it easier for developers to build such multi-modal models.
