RecSys model inference

Jaideep Ray
1 min read · Dec 18, 2022

Industrial recommendation models (RecSys) differ from models in domains like NLP and CV in a few ways:

  1. Massive amounts of user engagement data are readily available for training RecSys models (e.g., ad clicks, search result page clicks, content browsing). Most of this engagement data is sparse and categorical. Extensive training data leads to large model sizes, which creates unique challenges in inference.
  2. Model inference has to be low latency and high throughput, because realtime recommendations are served to users at web scale. Model compression reduces model size and improves serving performance, but the resulting loss in model accuracy can lower revenue, so it is often not an option.
  3. User data is ever changing. To mitigate concept drift arising from changing user behavior, model refresh should be near realtime.

Image courtesy: Algoexpert
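
To make the size and compression tradeoff in points 1 and 2 concrete, here is a rough sketch. It assumes sparse categorical features are stored as embedding tables, a common choice that this post does not itself specify. The feature names, cardinalities, and embedding dimension are hypothetical, and per-row int8 quantization is just one illustrative compression scheme.

```python
import random


def embedding_table_bytes(num_rows: int, dim: int, bytes_per_value: int = 4) -> int:
    """Memory needed for one embedding table (float32 values by default)."""
    return num_rows * dim * bytes_per_value


def quantize_row_int8(row):
    """Symmetric per-row int8 quantization. Returns (integer codes, scale)."""
    max_abs = max(abs(v) for v in row)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in row]
    return codes, scale


def dequantize_row(codes, scale):
    """Recover approximate float values from int8 codes."""
    return [c * scale for c in codes]


# Hypothetical cardinalities for three sparse features (assumed values).
features = {"user_id": 100_000_000, "item_id": 10_000_000, "query_token": 1_000_000}
dim = 64  # assumed embedding dimension

fp32_bytes = sum(embedding_table_bytes(n, dim) for n in features.values())
# int8 storage: one byte per value plus one float32 scale per row.
int8_bytes = sum(embedding_table_bytes(n, dim, 1) + n * 4 for n in features.values())

# Accuracy cost of compression: reconstruction error on one random row.
random.seed(0)
row = [random.uniform(-1.0, 1.0) for _ in range(dim)]
codes, scale = quantize_row_int8(row)
max_error = max(abs(a - b) for a, b in zip(row, dequantize_row(codes, scale)))

print(f"float32 tables: {fp32_bytes / 1e9:.1f} GB")
print(f"int8 tables:    {int8_bytes / 1e9:.1f} GB")
print(f"max per-value reconstruction error: {max_error:.5f}")
```

Under these assumed numbers the float32 tables take roughly 28 GB and the int8 version roughly 7.5 GB, at the price of a small per-value error. Whether that error costs measurable accuracy, and therefore revenue, depends on the model and has to be measured for each case.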

Let me know if you want any more topics to be covered.
