Grab Personalised Its Platform For Millions In Under 15 Seconds

Sid Arora

Apr 7, 2026

10 min read

Hey hey,

For years, Grab’s user data system, the central place that stores everything about its passengers, drivers, and merchants, was updated once a day.

Every morning, a scheduled job would process the previous day’s activity and write fresh user profiles. Everything the brand did depended on this daily refresh.

That system tracked over 1,000 data points per user across Grab’s three-sided marketplace (passengers, drivers, merchants). It was powerful and well-managed.

It powered marketing, experiments, ML features, and connections to ad platforms like Facebook Ads and TikTok. But it was built to look backwards, not react in the moment.

The gap was real. Think about what happens at an airport.

A user lands, opens the Grab app, and is immediately in a moment of genuine need. They want a ride, maybe they’re hungry, maybe they need a hotel recommendation.

The best time to reach them is right now, in this specific context.

With once-a-day updates, Grab couldn’t act in that window. By the time the daily refresh ran, the traveller was already in their hotel.

*Before Scenarios, Grab’s personalisation ran on yesterday’s data. After: under 15 seconds.*

The engineering team also spotted a deeper problem.

Whenever someone at Grab wanted a new real-time personalisation feature, an engineer had to build a custom system from scratch.

Say the growth team wanted to re-engage users who abandoned checkout.

An engineer would build a one-off pipeline for it. Every new idea meant new code, new infrastructure, and a long wait. Marketers and product teams couldn’t move without engineering help, and engineering was always stretched.

What They Built

Grab’s answer was a feature called Scenarios, built on top of the existing user data system.

The core idea is simple: instead of asking “what did this user do yesterday?”, Scenarios ask “what is this user doing right now, and what do we already know about them that makes this moment matter?”

A Scenario is a rule, set up by marketers, that combines three ingredients.

The first is a real-time trigger: something the user just did, like opening the app at an airport, booking a ride, or starting a subscription sign-up.
The second, optionally, is historical context: existing data like a user’s average spend, their food preferences, or their typical visit patterns.
The third, also optional, is a live prediction from an AI model that scores what the user is likely trying to do right now.

When those ingredients come together and a Scenario fires, Grab can send a personalised action, such as a push notification, an in-app banner, or an ad, within 15 seconds of the triggering event.
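
To make the three ingredients concrete, here is a minimal Python sketch. It is not Grab's code: the `Scenario` class, the event and profile fields, and the spend and score thresholds are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Scenario:
    """Hypothetical rule: one required trigger plus two optional checks."""
    name: str
    # Required: what the user just did.
    trigger: Callable[[dict], bool]
    # Optional: what the platform already knows about the user.
    history_check: Optional[Callable[[dict], bool]] = None
    # Optional: a threshold on a live model score.
    prediction_check: Optional[Callable[[float], bool]] = None

    def fires(self, event: dict, profile: dict, score: float) -> bool:
        if not self.trigger(event):
            return False
        if self.history_check is not None and not self.history_check(profile):
            return False
        if self.prediction_check is not None and not self.prediction_check(score):
            return False
        return True


airport_ride = Scenario(
    name="airport_ride_offer",
    trigger=lambda e: e["type"] == "app_open" and e["place"] == "airport",
    history_check=lambda p: p["avg_spend"] >= 20,   # illustrative threshold
    prediction_check=lambda s: s >= 0.7,            # illustrative threshold
)

event = {"type": "app_open", "place": "airport"}
profile = {"avg_spend": 35}
print(airport_ride.fires(event, profile, score=0.82))  # True
```

Leaving the two optional checks as `None` gives a trigger-only Scenario, which matches the "optionally" in the description above.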

*How a Scenario flows: from user action to personalised output in under 15 seconds.*

The Infrastructure Underneath

The system runs on three main technologies: Scribe, Flink, and Kafka. Two data stores, StarRocks and Amphawa, support them.

Scribe is Grab’s internal event tracker.
Every meaningful action in the Grab app, a tap, a booking, a screen view, creates a Scribe event. These events are the raw material for real-time personalisation.
Flink is the processing engine.
It receives those events and applies the Scenario rules. Think of it like a factory assembly line vs. a warehouse. A warehouse collects everything and processes it all at the end of the day. An assembly line handles each item the moment it arrives. Flink is the assembly line. It processes each event immediately, applies the rules, pulls in historical data if needed, runs the prediction model if one is configured, and produces an output, all within the 15-second target.
Historical context comes from StarRocks, a database built for fast lookups.
When a Scenario needs background information, Flink queries StarRocks in the middle of processing and folds that data in. This makes the system powerful: it doesn’t just react to what just happened. It combines that with everything the platform already knows.
Kafka handles the output side.
Think of it as a reliable delivery system that carries data between different services at high speed. Scenario results flow through Kafka to whatever needs them: notification systems, ad platforms, in-app personalisation. For services that prefer to look up results on demand rather than receive a constant stream, Grab routes outputs to Amphawa, the company’s internal data store, built on Amazon’s DynamoDB.
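
The flow described above can be sketched in plain Python. This is not Flink code and not Grab's implementation: `process_event`, the in-memory `PROFILES` dict standing in for StarRocks, the `predict_intent` stub, and the list standing in for a Kafka topic are all assumptions made for illustration.

```python
from typing import Optional

# Stand-in for StarRocks: historical profile lookups keyed by user id.
PROFILES = {"u1": {"avg_spend": 35, "likes": "noodles"}}

# Stand-in for a Kafka topic: downstream services would consume from here.
output_stream: list[dict] = []


def predict_intent(event: dict, profile: dict) -> float:
    """Stub for a live model; returns a fixed score for the sketch."""
    return 0.8 if event["type"] == "app_open" else 0.1


def process_event(event: dict) -> Optional[dict]:
    """Handle one event as it arrives, assembly-line style."""
    # 1. Apply the trigger rule.
    if not (event["type"] == "app_open" and event["place"] == "airport"):
        return None
    # 2. Pull in historical context in the middle of processing.
    profile = PROFILES.get(event["user_id"], {})
    # 3. Run the prediction model (assumed here to be configured).
    score = predict_intent(event, profile)
    if score < 0.5:
        return None
    # 4. Emit the result for downstream consumers.
    result = {"user_id": event["user_id"], "action": "ride_offer", "score": score}
    output_stream.append(result)
    return result


process_event({"user_id": "u1", "type": "app_open", "place": "airport"})
process_event({"user_id": "u1", "type": "screen_view", "place": "airport"})
print(len(output_stream))  # 1
```

The point of the sketch is the shape: each event is handled on its own, the moment it arrives, rather than being collected for an end-of-day batch.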

The Design Choices That Made It Practical

The system solves a real tension: how do you make personalisation fast without making it fragile?

The self-serve interface was a deliberate choice. The old approach, engineers building custom systems for every new use case, had hit a wall.

It worked, but it couldn’t keep up. By putting Scenario creation in a visual interface that non-engineers could use, Grab separated two problems: the infrastructure (built once by engineers) and the experimentation (done anytime by marketers).

A new Scenario goes live within an hour of being set up.

*Five steps to set up a Scenario. No code, no engineering sprint, live within an hour.*

This design also meant Scenarios could be tested before going live.

Marketers can run a new Scenario against fake data: the system processes test events and shows what would happen. That matters when a badly configured trigger could send the wrong notification to millions of users.
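
Here is a rough idea of what such a dry run could look like. The `dry_run` helper and the fake events are hypothetical, not Grab's tooling; the sketch only shows why testing against synthetic data catches an over-broad trigger before it reaches real users.

```python
from typing import Callable


def dry_run(trigger: Callable[[dict], bool], test_events: list[dict]) -> list[str]:
    """Return the user ids that WOULD be notified, without sending anything."""
    return [e["user_id"] for e in test_events if trigger(e)]


# A deliberately sloppy trigger: it forgets to check the place.
def too_broad(e: dict) -> bool:
    return e["type"] == "app_open"


# The corrected trigger.
def fixed(e: dict) -> bool:
    return e["type"] == "app_open" and e["place"] == "airport"


fake_events = [
    {"user_id": "u1", "type": "app_open", "place": "airport"},
    {"user_id": "u2", "type": "app_open", "place": "home"},
    {"user_id": "u3", "type": "booking", "place": "airport"},
]

print(dry_run(too_broad, fake_events))  # ['u1', 'u2']
print(dry_run(fixed, fake_events))      # ['u1']
```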

Supporting both Kafka and Amphawa as outputs was a practical decision. Some systems want a continuous stream of data. Others want to look things up on demand.

Supporting both meant Scenarios worked across Grab’s existing setup without forcing other teams to change anything.

The location-based targeting is worth highlighting.

Scenarios can be set to fire only when a user is in a specific place, such as an airport terminal, a particular mall, or a hotel zone.

A user booking a ride to a mall isn’t just showing intent to visit. They are creating a window where a relevant offer could actually change what they do.

The system can factor in their destination, the time of day, their spending history, and their food preferences to decide whether they are likely heading to shop, get groceries, or eat, and serve the right offer.
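
A location check like this is often done with a simple geofence. The sketch below uses the haversine great-circle distance and a circular zone; that is a common technique, not something the Grab write-up confirms, and the zone coordinates and radius are illustrative values.

```python
import math


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def in_zone(lat: float, lon: float, zone: dict) -> bool:
    """True if the point lies within the zone's radius of its centre."""
    return haversine_km(lat, lon, zone["lat"], zone["lon"]) <= zone["radius_km"]


# Hypothetical airport zone: centre point and radius are made up for the example.
airport_zone = {"lat": 1.3644, "lon": 103.9915, "radius_km": 2.0}

print(in_zone(1.3650, 103.9900, airport_zone))  # True
print(in_zone(1.3000, 103.8500, airport_zone))  # False
```

Real zones like terminals or malls are rarely circles, so a production system would more likely use polygons, but the trigger logic stays the same: fire only when the user is inside the zone.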

Results

The team had over 12 live Scenarios running in production at the time of writing.

The main example Grab shared was the Grab Unlimited sign-up flow. Grab Unlimited is the company’s subscription product, with recurring discounts on rides and food delivery.

The problem was abandonment.

Users would start signing up, get partway through, and leave. The old approach to winning them back was delayed and generic: it wasn’t triggered by the actual moment of abandonment, and it wasn’t timed to when the user was still interested.

The Scenario rebuilt this as a real-time trigger. When a user started the sign-up but didn’t finish, the system captured the abandonment.

It processed the event, checked the user’s history, and fired a re-engagement notification within 15 minutes. The notification was personalised, well-timed, and sent only to users who had actually abandoned.
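
Abandonment is tricky because it is the absence of an event: the system has to notice that a "completed" never followed a "started". Below is a minimal sketch of one way to do that. The `AbandonmentTracker` class, the event names, and the 300-second window are all assumptions, not details from Grab.

```python
ABANDON_AFTER_S = 300  # hypothetical: seconds without completion that count as abandoned


class AbandonmentTracker:
    def __init__(self) -> None:
        self.started: dict[str, float] = {}  # user_id -> timestamp of sign-up start

    def on_event(self, user_id: str, event_type: str, ts: float) -> None:
        if event_type == "signup_started":
            self.started[user_id] = ts
        elif event_type == "signup_completed":
            self.started.pop(user_id, None)  # finished, nothing to chase

    def abandoned(self, now: float) -> list[str]:
        """Return users whose sign-up has been open too long, each reported once."""
        due = [u for u, ts in self.started.items() if now - ts >= ABANDON_AFTER_S]
        for u in due:
            del self.started[u]
        return due


tracker = AbandonmentTracker()
tracker.on_event("u1", "signup_started", ts=0)
tracker.on_event("u2", "signup_started", ts=10)
tracker.on_event("u2", "signup_completed", ts=60)
print(tracker.abandoned(now=400))  # ['u1']
print(tracker.abandoned(now=500))  # []
```

Only `u1` is returned, which mirrors the "sent only to users who had actually abandoned" point. A stream processor would typically do this with per-user state and timers instead of a polling call.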

The result was a 3% increase in subscriber conversions compared to campaigns that didn’t use real-time triggers.

Three percent sounds small. But in subscription economics at Grab’s scale (800 cities across eight Southeast Asian countries), it adds up fast. And it came from a Scenario that a marketer set up without a single engineering sprint.

What Comes Next

The engineering team outlined three directions.

The first is smarter scaling.

As more Scenarios run at once, managing the processing infrastructure gets complex. The goal is to make this more automated so new Scenarios don’t need manual setup to handle the load.

The second is multi-destination outputs.

Right now, each Scenario sends results to one place. Some use cases need the same signal to go to multiple places at once. One trigger fires, and both the notification system and the ad platform get the message.
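
Conceptually, this is a fan-out: one result, several sinks. The sketch below is a guess at the shape, not Grab's design. The sink functions and the in-memory list and dict are stand-ins for a stream-style destination and a lookup-style destination.

```python
from typing import Callable

Sink = Callable[[dict], None]

notifications: list[dict] = []     # stand-in for a stream consumed by a notification service
ad_audience: dict[str, dict] = {}  # stand-in for a key-value store queried on demand


def to_notifications(result: dict) -> None:
    notifications.append(result)             # stream-style: keep every result


def to_ad_platform(result: dict) -> None:
    ad_audience[result["user_id"]] = result  # lookup-style: latest result per user


def fan_out(result: dict, sinks: list[Sink]) -> None:
    """Deliver the same Scenario result to every configured destination."""
    for sink in sinks:
        sink(result)


fan_out({"user_id": "u1", "signal": "airport_arrival"}, [to_notifications, to_ad_platform])
print(len(notifications), "u1" in ad_audience)  # 1 True
```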

The third is delayed delivery.

Real-time processing doesn’t always mean real-time delivery. Grab wants to calculate a signal the moment it’s available, but deliver it at the perfect future moment.

If a user gets dropped off somewhere with a typical 15-minute Grab wait time, the system could send a “book your return ride” notification timed to exactly when they will need it. Calculated instantly and delivered on a schedule.
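
"Calculate now, deliver later" maps naturally onto a priority queue ordered by delivery time. Here is a small sketch using Python's standard `heapq` module. The `DelayedQueue` class and the message fields are hypothetical, and the 900-second delay simply mirrors the 15-minute example above.

```python
import heapq


class DelayedQueue:
    def __init__(self) -> None:
        self._heap: list[tuple[float, int, dict]] = []  # (deliver_at, seq, message)
        self._seq = 0  # tie-breaker so two messages are never compared directly

    def schedule(self, message: dict, computed_at: float, delay_s: float) -> None:
        """Store a signal computed now for delivery at a later time."""
        heapq.heappush(self._heap, (computed_at + delay_s, self._seq, message))
        self._seq += 1

    def due(self, now: float) -> list[dict]:
        """Pop and return every message whose delivery time has arrived."""
        out = []
        while self._heap and self._heap[0][0] <= now:
            out.append(heapq.heappop(self._heap)[2])
        return out


q = DelayedQueue()
q.schedule({"user_id": "u1", "text": "Book your return ride"}, computed_at=0, delay_s=900)
print(q.due(now=600))  # []
print(q.due(now=900))  # [{'user_id': 'u1', 'text': 'Book your return ride'}]
```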

Together, these point toward a personalisation layer that doesn’t just react to what users are doing, but anticipates what they will need before they ask.

The gap between “we can send a push notification” and “we can react to any user event, enriched with historical context and live predictions, within 15 seconds, for more than 12 simultaneous use cases, without requiring an engineer for each one” is enormous. It’s exactly the gap Grab’s Scenarios system was built to close.
