Pre-trained representations

Jaideep Ray
2 min read · Apr 30, 2022


The best thing since sliced bread came around!

Pre-trained word embeddings are an integral part of modern NLP systems, offering significant improvements over embeddings learned from scratch. There are two existing strategies for applying pre-trained language representations to downstream tasks: feature-based and fine-tuning.

Feature-based approach:

Throw in pre-trained representations as additional input features while training the task model. The pre-trained weights stay frozen; only the task-specific layers learn.

Pre-trained embeddings as features.
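
Here is a minimal sketch of the feature-based approach in PyTorch, assuming a hypothetical `pretrained_vectors` tensor standing in for real pre-trained word vectors (e.g. GloVe or word2vec). The embedding table is frozen and used purely as a feature extractor; only the classifier head is trained.

```python
import torch
import torch.nn as nn

# Stand-in for real pre-trained word vectors: (vocab_size, embed_dim)
pretrained_vectors = torch.randn(10000, 300)

class FeatureBasedClassifier(nn.Module):
    def __init__(self, pretrained_vectors, num_classes):
        super().__init__()
        # freeze=True keeps the pre-trained weights fixed during training
        self.embedding = nn.Embedding.from_pretrained(pretrained_vectors, freeze=True)
        self.classifier = nn.Linear(pretrained_vectors.size(1), num_classes)

    def forward(self, token_ids):
        # Average the frozen embeddings into one feature vector per example
        features = self.embedding(token_ids).mean(dim=1)
        return self.classifier(features)

model = FeatureBasedClassifier(pretrained_vectors, num_classes=2)
logits = model(torch.randint(0, 10000, (4, 20)))  # batch of 4 sequences, 20 tokens each
```

Because the embedding table receives no gradients, the only trainable parameters are those of the small task head.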

Fine-tuning:

Use the embedding model as a trainable layer. This approach lets you tune the embeddings for the downstream task, producing more effective models. However, the resulting model is much larger than in the first approach because it carries the embedding weights.

Pre-trained embeddings as trainable layer.
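
A minimal sketch of the fine-tuning approach, under the same assumptions as above (the `pretrained_vectors` tensor is hypothetical). The only difference is `freeze=False`: gradients now flow into the embedding table, so it adapts to the task, and the saved model includes those weights.

```python
import torch
import torch.nn as nn

# Stand-in for real pre-trained word vectors: (vocab_size, embed_dim)
pretrained_vectors = torch.randn(10000, 300)

class FineTunedClassifier(nn.Module):
    def __init__(self, pretrained_vectors, num_classes):
        super().__init__()
        # freeze=False makes the embedding table part of the trainable parameters,
        # which is why the saved model is larger: it contains the embedding weights.
        self.embedding = nn.Embedding.from_pretrained(pretrained_vectors, freeze=False)
        self.classifier = nn.Linear(pretrained_vectors.size(1), num_classes)

    def forward(self, token_ids):
        features = self.embedding(token_ids).mean(dim=1)
        return self.classifier(features)

model = FineTunedClassifier(pretrained_vectors, num_classes=2)
# A lower learning rate for the embedding layer is a common choice when fine-tuning.
optimizer = torch.optim.Adam([
    {"params": model.embedding.parameters(), "lr": 1e-4},
    {"params": model.classifier.parameters(), "lr": 1e-3},
])
```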

Pre-trained representations have several advantages:

  • Use large public corpora: You can pre-train embedding models on large public corpora such as Wikipedia or ImageNet. Models that use such embeddings as input features generalize well.
  • Training cost: Training a model from scratch on a large corpus is computationally very expensive. Using pre-trained embedding models as features or as fine-tuning layers significantly reduces the cost of training.
  • Multi-modality: With content being multi-modal (text, images, video, and memes co-existing in a single piece of content), content models need to be multi-modal as well. Pre-trained embeddings make it easier for developers to build such multi-modal models.
