#466

Ride-Hailing Marketplace Analytics

HARD | Data Warehouse Design | datamodelling

Problem Description

## Ride-Hailing Marketplace Analytics

### Context

A technology company operates a two-sided marketplace connecting riders with independent driver-partners for on-demand transportation. The platform handles millions of trips daily across cities and regions. The business wants to assess the health of the marketplace, with a focus on trip completion rates, driver utilization, rider retention, and revenue per trip.

Your task is to design a data model that powers analytics across trip activity, marketplace supply-demand balance, and user behavior for both sides of the marketplace.

### Requirements

- The model should support analyzing trip volume and completion rates across geographies, time periods, and vehicle types.
- Driver utilization must be measurable, including trips per driver, active hours, and time spent carrying passengers vs. idle.
- Rider behavior must be trackable, including first-trip conversion, repeat usage frequency, and churn indicators.
- Revenue metrics (base fare, surge pricing, promotions applied) must be sliceable by geography, time of day, and user segment.
- The model must support supply-demand balance analysis: identifying where and when rider demand exceeds available driver supply.

### Constraints

- ~10M monthly active riders and ~500K active driver-partners.
- ~3M trips per day at peak.
- Trip-level data retained for 3 years.
- Driver and rider dimension attributes updated daily (drivers can change vehicle or license status; riders can update payment preferences).
- Geographic data available at city, region, and country level.
- Surge multipliers and promotional discounts are captured per trip.

### Follow-Up Questions

1. A trip passes through multiple states before resolution: requested, driver accepted, trip started, and either completed or cancelled (by rider, by driver, or by the system). Product wants to analyze dropout rates at each stage of this funnel. How does your model support this?
2. Driver-partners can register multiple vehicles with the platform (e.g., a standard sedan and an SUV) and choose which to use on any given day. How would your model track a driver's registered vehicle fleet and which vehicle was used for each specific trip?
3. We want to build a supply-demand health dashboard showing, for each city and 30-minute window, how many ride requests went unfulfilled because no driver was available. How would your model support this?

### Notes

- All timestamps are in UTC.
- A cancelled trip retains its row in the fact table with a null pickup and dropoff timestamp.
- Surge multiplier of 1.0 means no surge was applied.
- Candidates may assume `trip_status` values are: completed, cancelled_by_rider, cancelled_by_driver, cancelled_by_system.
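To make the 30-minute supply-demand question and the `trip_status` values more concrete, here is a minimal Python sketch, not a reference solution. It assumes trip rows arrive as dictionaries with hypothetical keys `city`, `requested_at`, and `trip_status`. It also assumes that `cancelled_by_system` stands in for "no driver was available". The problem statement does not confirm that mapping, and a real model might need a dedicated cancellation reason.

```python
from collections import Counter
from datetime import datetime, timezone

# Status values listed in the Notes section of the problem.
TRIP_STATUSES = {
    "completed",
    "cancelled_by_rider",
    "cancelled_by_driver",
    "cancelled_by_system",
}


def window_start(ts: datetime) -> datetime:
    """Floor a timestamp to the start of its 30-minute window."""
    return ts.replace(minute=(ts.minute // 30) * 30, second=0, microsecond=0)


def unfulfilled_by_city_window(trips):
    """Count system-cancelled requests per (city, 30-minute window).

    ASSUMPTION: cancelled_by_system is used as a proxy for
    "no driver was available"; the source does not define this.
    """
    counts = Counter()
    for trip in trips:
        status = trip["trip_status"]
        if status not in TRIP_STATUSES:
            raise ValueError(f"unexpected trip_status: {status!r}")
        if status == "cancelled_by_system":
            counts[(trip["city"], window_start(trip["requested_at"]))] += 1
    return counts


def completion_rate(trips):
    """Share of requests that ended as completed (0.0 for empty input)."""
    trips = list(trips)
    if not trips:
        return 0.0
    completed = sum(1 for t in trips if t["trip_status"] == "completed")
    return completed / len(trips)


def utc(hour, minute):
    """Helper for building the hypothetical sample rows below."""
    return datetime(2024, 1, 1, hour, minute, tzinfo=timezone.utc)


# Hypothetical sample rows, for illustration only.
SAMPLE_TRIPS = [
    {"city": "Lagos", "requested_at": utc(8, 5), "trip_status": "completed"},
    {"city": "Lagos", "requested_at": utc(8, 29), "trip_status": "cancelled_by_system"},
    {"city": "Lagos", "requested_at": utc(8, 31), "trip_status": "cancelled_by_system"},
    {"city": "Lima", "requested_at": utc(8, 45), "trip_status": "cancelled_by_rider"},
]
```

Calling `unfulfilled_by_city_window(SAMPLE_TRIPS)` groups the two Lagos cancellations into separate windows because they fall on either side of 08:30 UTC. In a warehouse, the same flooring would typically be precomputed as a time-bucket key on the fact table or a time dimension rather than derived at query time.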

Topics

dimensional modeling, event modeling

Asked at Companies

Meta
