Foundation models

Jaideep Ray
3 min read · Oct 18, 2022

We have seen an explosion of ML usage in the last five years across industry verticals, ranging from ads ranking, search and recommendations, and content understanding to fighting bad actors, and more!

One of the key drivers of this explosion has been pre-training models on the large corpora around us (Wikipedia, billions of webpages, and app usage data) and then fine-tuning them for various downstream tasks.

Image courtesy: https://gradientflow.com/foundation-models-non-technical-guide/

  • Robust software infrastructure offerings, such as storage from cloud services (AWS, Azure, and so forth), provide a massive productivity boost to developers. They have enabled small teams to leverage the knowledge and practice of cloud providers and to build high-scope, high-impact applications inexpensively.
  • Foundation models play a similar role in the AI/ML ecosystem, laying the foundation of ML-powered applications and systems. One reason behind the name "foundation" is that developers rarely use these models stand-alone; instead, they adapt them to many downstream tasks.
  • For example, developers have seen meaningful quality improvements in systems for question answering, sentiment classification, text summarization, and so forth, using foundation models like BERT.
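
The adaptation pattern described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how BERT works: `frozen_encoder` is a hypothetical stand-in for a pretrained model (a deterministic bag-of-words hash), and `train_head`, `predict`, and the tiny datasets are made up for this example. What it tries to show is the shape of the workflow: one shared encoder that is never updated, plus a small task-specific head trained per downstream task.

```python
import math

DIM = 32  # size of the toy feature vector (assumption for this sketch)


def frozen_encoder(text, dim=DIM):
    """Hypothetical stand-in for a pretrained foundation model.

    Maps text to a fixed-size vector with a deterministic hash of each
    token. It has no trainable state, so every task reuses it unchanged.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[sum(ord(c) for c in token) % dim] += 1.0
    return vec


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(features, labels, lr=0.5, epochs=200):
    """Train a small logistic-regression head on top of frozen features
    using full-batch gradient descent."""
    dim, n = len(features[0]), len(features)
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi / n
            grad_b += err / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


def predict(head, x):
    """Return 1 if the head's probability is at least 0.5, else 0."""
    weights, bias = head
    return 1 if sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5 else 0


# Task 1: sentiment classification (made-up data).
sentiment_texts = ["great movie", "great acting", "awful movie", "awful acting"]
sentiment_labels = [1, 1, 0, 0]
sentiment_head = train_head([frozen_encoder(t) for t in sentiment_texts], sentiment_labels)

# Task 2: topic classification, sports (1) vs. finance (0), same encoder.
topic_texts = ["football match", "tennis match", "stock report", "bond report"]
topic_labels = [1, 1, 0, 0]
topic_head = train_head([frozen_encoder(t) for t in topic_texts], topic_labels)

print(predict(sentiment_head, frozen_encoder("great story")))
print(predict(topic_head, frozen_encoder("tennis match")))
```

In a real system the encoder would be a pretrained model such as BERT, and the head (or the whole model) would be fine-tuned on task data; only the division of labor carries over from this sketch.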

Impact of foundation models:

  1. Boost ML developer productivity massively. Foundation models, and especially the services that support using them at scale, have cut the time needed to build a competitive baseline for various tasks to a fraction of what it used to take. NLP is the area that has seen the most impact in the last five years.
  2. Foundation models incentivize homogenization in the stack. ML systems follow similar patterns when they build on foundation models. Using a small set of foundation models for many downstream tasks provides ample opportunities for optimizations focused on developer productivity, performance, and quality.
  3. Foundation models do not change often; hence, developers can build standard benchmarks for performance and quality metrics, and study societal impact through bias and fairness evaluations.

Some of the advantages above come with noteworthy downsides that we should be aware of:

  1. Lock-in: Beyond a point, homogenization yields limited ROI for exploration and research, and the lock-in it creates can impede them.
  2. Domain diversity: Foundation models rose to fame on their success in NLP and CV tasks. In other domains, such as speech and multimodality, they remain a promising idea with only a few wins so far. Foundation models have yet to replicate their success across these disciplines.
  3. Lack of accessibility: Foundation models require extensive data and a tremendous amount of compute resources (GPUs, CPUs, storage, memory). This cost creates a barrier to entry that only large corporations and research institutions can cross. Access to foundation model development will remain a top challenge.
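
To get a rough feel for the compute barrier, here is a back-of-the-envelope sketch. It relies on a commonly cited rule of thumb for dense transformer models, roughly 6 FLOPs per parameter per training token, which is an assumption brought in for illustration and not something from this article. The model size, token count, GPU throughput, and utilization below are all hypothetical placeholders, not measurements of any real model or device.

```python
SECONDS_PER_DAY = 86_400


def training_flops(num_params: int, num_tokens: int) -> int:
    """Approximate training compute using the ~6 FLOPs per parameter
    per token rule of thumb (an approximation, not an exact count)."""
    return 6 * num_params * num_tokens


def gpu_days(total_flops: float, peak_flops_per_sec: float, utilization: float) -> float:
    """Days needed on a single accelerator at the given sustained utilization."""
    effective_flops_per_sec = peak_flops_per_sec * utilization
    return total_flops / effective_flops_per_sec / SECONDS_PER_DAY


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
NUM_PARAMS = 1_000_000_000       # 1B-parameter model
NUM_TOKENS = 20_000_000_000      # 20B training tokens
PEAK_FLOPS_PER_SEC = 1e14        # assumed peak throughput of one GPU
UTILIZATION = 0.4                # assumed fraction of peak actually sustained

flops = training_flops(NUM_PARAMS, NUM_TOKENS)
days = gpu_days(flops, PEAK_FLOPS_PER_SEC, UTILIZATION)
print(f"{flops:.2e} FLOPs, about {days:.1f} single-GPU days")
```

Under these made-up inputs, a single training run already takes about five weeks on one GPU, before counting data preparation, failed runs, or hyperparameter searches. Scaling the parameter and token counts up by one or two orders of magnitude shows why this quickly moves out of reach for small teams.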

What’s next?

  1. In the coming years, we will see a surge of new applications and domains disrupted by foundation models.
  2. Hugging Face is building a community of engineers that influences the discourse on how institutions should share foundation models with the broader community: recipes in multiple frameworks (PyTorch, TensorFlow), trained model instances provided by the community, and hosted service endpoints. Engineers can choose the layer they want to work on.
  3. The discussion on the societal impact of hidden bias in these foundation models will continue to influence and shape responsible AI work in academia and industry.

What has been your favorite foundation model experience? Share it in the comments!
