Uber Improves Restaurant Recommendations Using Real-Time Signals and Listwise Ranking

Uber has introduced updates to its Uber Eats recommendation system, incorporating real-time user signals and a listwise ranking approach to improve restaurant discovery. The system is designed to better reflect user intent during active browsing sessions while improving ranking efficiency across candidate restaurants. It is deployed within the Uber Eats platform to support homepage feeds and discovery surfaces.

The updated architecture replaces earlier batch-oriented feature pipelines with a real-time signal processing layer. This layer continuously ingests user interactions such as clicks, searches, and order history to maintain an up-to-date representation of user behavior. By shifting to near-real-time feature updates, the system reduces latency between user actions and personalization outcomes, enabling recommendations to adapt more quickly to changing preferences within a session.
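
As a rough illustration of the idea, the sketch below keeps a rolling per-user sequence of recent interactions and answers "what did this user do in the last N seconds" at request time. It is an in-memory stand-in written for this article, not Uber's implementation: the names `Event`, `UserSequence`, and `RealTimeFeatureStore`, the field names, and the window sizes are all assumptions, and a production system would sit behind a streaming pipeline and a low-latency store.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Event:
    """One user interaction. Field names are illustrative assumptions."""
    user_id: str
    kind: str   # e.g. "click", "search", "order"
    item: str   # restaurant ID or search query
    ts: float   # event time in seconds


@dataclass
class UserSequence:
    """Rolling window holding a user's most recent interactions."""
    max_len: int = 50
    events: deque = field(default_factory=deque)

    def add(self, event: Event) -> None:
        self.events.append(event)
        # Drop the oldest events once the window is full.
        while len(self.events) > self.max_len:
            self.events.popleft()


class RealTimeFeatureStore:
    """Hypothetical in-memory stand-in for a streaming feature layer."""

    def __init__(self, max_len: int = 50) -> None:
        self.max_len = max_len
        self._sequences: dict[str, UserSequence] = {}

    def ingest(self, event: Event) -> None:
        """Apply one event as soon as it arrives, instead of in a daily batch."""
        if event.user_id not in self._sequences:
            self._sequences[event.user_id] = UserSequence(self.max_len)
        self._sequences[event.user_id].add(event)

    def recent(self, user_id: str, now: float, window_s: float) -> list[Event]:
        """Return the user's events from the last window_s seconds, oldest first."""
        seq = self._sequences.get(user_id)
        if seq is None:
            return []
        return [e for e in seq.events if now - e.ts <= window_s]
```

Because each event updates the sequence on arrival, a ranking request issued moments after a search or click can already see it, which is the freshness gain the article describes.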

Brinda Panchal, Product @ Uber, described the broader goal of the system:

Personalizing a marketplace at this scale isn't just about showing ‘good food’—it’s about balancing real-time intent, diverse merchant ecosystems, and complex ranking objectives to create a seamless discovery experience.

Architecture of the next personalisation platform to build userContext (Source: Uber Blog Post)

Uber’s recommendation stack also incorporates listwise ranking, where multiple restaurant candidates are evaluated together in a single inference step rather than individually. This approach allows the model to optimize relative ordering across a set of options, rather than assigning independent scores to each restaurant. According to Uber, this improves both computational efficiency and ranking quality by enabling direct comparison among candidates in the same context.
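
To make the distinction concrete, here is a minimal sketch of a listwise objective: a softmax cross-entropy loss computed over one whole candidate list, a standard formulation from the learning-to-rank literature. The source does not say which loss or model Uber uses, so the function names and the choice of loss are assumptions made for illustration only.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    """Numerically stable softmax over the scores of one candidate list."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def listwise_softmax_loss(scores: list[float], relevance: list[float]) -> float:
    """Cross-entropy between softmax(scores) and the normalized relevance labels.

    The loss depends on every score in the list at once, so it rewards
    correct relative ordering rather than correct absolute scores.
    """
    if len(scores) != len(relevance):
        raise ValueError("scores and relevance must have the same length")
    total_rel = sum(relevance)
    if total_rel == 0:
        return 0.0  # no positive label in this list, nothing to learn from
    probs = softmax(scores)
    return -sum(
        (r / total_rel) * math.log(p)
        for r, p in zip(relevance, probs)
        if r > 0
    )


def rank(candidates: list[str], scores: list[float]) -> list[str]:
    """Order candidates by descending score."""
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return [candidates[i] for i in order]
```

Adding the same constant to every score leaves this loss unchanged, which illustrates that only the relationship between candidates in the list matters. A pointwise loss, by contrast, would penalize each score independently of its neighbors.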

Generative recommender architecture (Source: Uber Blog Post)

The system builds on a unified representation of user behavior that combines short-term session activity with longer-term historical signals. These signals are processed through a shared feature extraction layer, ensuring consistency between offline training and online serving. Training data is generated by replaying historical user sessions to simulate production environments, reducing discrepancies between model training and live inference.

A key design consideration is the alignment between training and serving pipelines. Because the same feature-extraction logic runs in both environments, a feature computed offline from logged data matches the one computed online for a live request. This limits feature drift between the two paths and helps models trained on historical data behave similarly when deployed in production.
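
One way to picture this is a single extractor function that both the offline replay job and the online request path call. The sketch below is an assumption-laden illustration, not Uber's code: the event dictionaries, the feature names, and the functions `extract_features`, `replay_training_examples`, and `serve_features` are all hypothetical.

```python
from typing import Iterator


def extract_features(events: list[dict], now: float) -> dict:
    """Single source of feature logic, shared by training and serving."""
    ordered = sorted(events, key=lambda e: e["ts"])
    clicks = [e for e in ordered if e["kind"] == "click"]
    last_ts = ordered[-1]["ts"] if ordered else None
    return {
        "n_events": len(ordered),
        "n_clicks": len(clicks),
        "last_clicked": clicks[-1]["item"] if clicks else None,
        "secs_since_last_event": None if last_ts is None else now - last_ts,
    }


def replay_training_examples(session: list[dict]) -> Iterator[tuple[dict, str]]:
    """Offline path: replay a logged session in time order.

    For each order event, emit (features, label) built only from the events
    that precede it in the log, so the features match what the serving path
    would have seen at that moment.
    """
    ordered = sorted(session, key=lambda e: e["ts"])
    for i, event in enumerate(ordered):
        if event["kind"] == "order":
            yield extract_features(ordered[:i], now=event["ts"]), event["item"]


def serve_features(live_events: list[dict], now: float) -> dict:
    """Online path: the request handler calls the very same extractor."""
    return extract_features(live_events, now)
```

Since both paths go through `extract_features`, a change to the feature logic lands in training and serving together, which is the property the article attributes to Uber's shared feature-extraction layer.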

Yicheng Chen, Engineer @ Uber, highlighted the technical evolution of the system:

Leveraging near real-time user sequence features and a Generative Recommender-style model to power Uber Eats Home Feed recommendations and evolved the homefeed ranking from hand-crafted statistical features to transformer-based sequence modeling, cut feature freshness from 24 hours to seconds.

On the infrastructure side, the system is designed to handle low-latency constraints typical of consumer-facing recommendation surfaces. Feature preprocessing and model inference are separated to improve efficiency and scalability under high traffic. This allows the serving layer to focus on ranking while upstream services manage feature computation and aggregation.

About the Author

Leela Kumili
