Prompt it till you make it!
tl;dr:
- Prompt-based learning is a data-efficient paradigm for tackling many NLP tasks. With minimal labeled data you get performance competitive with models trained from scratch or fine-tuned.
- Needing less labeled data and starting from foundation models means you can build a competitive model with minimal effort and cost.
The pipeline for solving an NLP task like supervised classification or summarization has evolved a lot over the last decade. I distinctly remember spending months building an intent classifier and a text summarizer at Meta seven years ago. Most of that time went into collecting labeled data to train the classifier from scratch. In this paradigm, every new task starts from zero: you collect a large labeled dataset and train a task-specific model on it.
The next wave brought pre-trained language models (PLMs) into NLP pipelines. A PLM is trained on a large corpus using self-supervised techniques; the pipeline then fine-tunes it on a smaller labeled corpus. Fine-tuning a PLM immediately reduces the number of labels required to solve an NLP task, and it also reduces the computational cost of training for a particular task. The cost of pre-training the PLM is a fixed cost amortized over many downstream tasks.
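As a concrete illustration of this fine-tuning step, here is a minimal sketch using the Hugging Face transformers Trainer. The model name, dataset, and hyperparameters are illustrative choices on my part, not a prescribed recipe.

```python
# Minimal fine-tuning sketch: start from a PLM, train a classification head on a small labeled set.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

# A small labeled corpus; SST-2 is just an example dataset.
dataset = load_dataset("glue", "sst2")

# Load the pre-trained language model and add a 2-class classification head.
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True, padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

# Only this fine-tuning step runs per task; the expensive pre-training is already paid for.
args = TrainingArguments(output_dir="out", num_train_epochs=1,
                         per_device_train_batch_size=16)
trainer = Trainer(model=model, args=args,
                  train_dataset=dataset["train"],
                  eval_dataset=dataset["validation"])
trainer.train()
```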
As general deep learning infra matured, it became possible to train large language models (LLMs) with billions of parameters on massive corpora, such as all the text on the internet. These LLMs are trained in cloze style and thus have a generative aspect: given an input with a mask, the LLM computes a probability for every word in the vocabulary to fill the mask, and the word with the highest probability is the output. Prompt learning is the paradigm where different NLP tasks such as classification, summarization, and NER are recast as these cloze-style predictions, so a single pre-trained LLM can solve them with little or no task-specific labeled data.
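To make the cloze-style idea concrete, here is a small sketch that casts sentiment classification as a fill-in-the-mask problem with the Hugging Face fill-mask pipeline. The prompt template and the verbalizer words that map mask predictions to labels are my own illustrative assumptions.

```python
# Sentiment classification as a cloze prompt (illustrative sketch, no task-specific training).
from transformers import pipeline

# A masked language model scores candidate words for the [MASK] slot.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Verbalizer: label words whose mask probability stands in for the class labels (my assumption).
verbalizer = {"great": "positive", "terrible": "negative"}

def classify(review: str) -> str:
    # Wrap the input in a prompt template that ends with the mask token.
    prompt = f"{review} Overall, the movie was [MASK]."
    # Score only the verbalizer words and return the label of the highest-scoring one.
    predictions = fill_mask(prompt, targets=list(verbalizer))
    best = max(predictions, key=lambda p: p["score"])
    return verbalizer[best["token_str"].strip()]

print(classify("The plot dragged and the acting was wooden."))  # expected: negative
```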
The three paradigms, in order of appearance:
- Train from scratch
- Fine-tune on pre-trained language models
- Prompt learning on pre-trained large language models
