In this series of posts we will be looking to design a cab hailing service similar to Uber or Ola (in India). We will be mainly concerned about the technical design and challenges and not get into the logistics such as signup and recruitment of drivers, training drivers for customer satisfaction, number of cabs on street and so on. Even for the technical design, we will omit some of the technical details regarding payment gateway integration, GPS navigation, frontend UI and UX etc.

Note : All the design process and thinking are of my own and may not be implemented by any of the cab hailing services.

The most important features of a ride hailing app like Uber or Ola are as follows (Core logic) :

- Matching a driver to a passenger for a non-shared cab.
- Matching a driver, plus other passengers in cab sharing/pooling.
- Deciding optimized route for cabs.
- Fare estimate for a trip (with dynamic or surge pricing).

Few other features are also important from the business perspective.

- First time signup free rides for customers.
- Discounts and Referral promo codes.
- Targeting customers with SMS, Push notifications and Email.
- Bonuses for drivers on completing additional trips.
- In-App chat between driver and a customer.
- Cancellation and Fees.
- Passenger and Driver profiles.
- Notifications of arrival and cancellation of cabs.
- Showing the most likely destinations to a customer on the home screen.
- Scheduling rides for a future time.
- Writing reviews for your ride.

In this post, we will be designing a strategy for matching passengers with drivers.

While I was visiting Ho Chi Minh city, Vietnam with my friends, the same cab was booked by the Uber app for three successive times in a span of two days. Probability of this being a mere co-incidence is very less.

We need to separate the case of non-shared trips from shared or pooled cabs because they have different design challenges and constraints.

**Non-shared Cabs (UberGo, UberX etc.)**

Trivial cases for matching would be :

- A single cab is enrolled and a single customer has signed up.
- Few other cabs have enrolled, but still a single customer. If 5 cabs are available within 1.5 km radius from the passenger, then the obvious choice is to allocate the cab closest to the passenger.
- By closest one can define the geographical distance between the cab and the passenger or the minimum distance travelled using the best route.
- What other options are there for cab allocation ?
- Driver and Passenger Rating : Passengers can rate drivers and drivers can rate passengers as well. Matching the highest rated (average) cab driver can lead to rich gets richer phenomena.

Non-trivial and more challenging scenario :

More than 10K cabs have enrolled and more than 10 million users have signed up.

Some challenges are :

- Solving the optimization problem of assigning M passengers to N cabs, that reduces the overall cost of assignment as well take care of individual assignment costs.
- The above optimization problem needs to be solved within a timeframe, such that no passenger needs to wait for more than 5-10 seconds to get a response from the backend service (whether a cab has been assigned or could not find a cab).
- Handling large number of concurrent booking requests (1M+) during peak hours.
- Tracking location of all cabs every second (all over the world) !!!
- The number of cabs on road will be constant relative to the number of users trying to book a cab at a given moment in time.
- Fixing a constant search range, will result in scenarios where there are more than one cab available for one passenger and no cabs available for another passenger.
- Not all un-available cabs might be in between a trip. Some of them might be ending a trip shortly and will be available. Sometimes assigning a cab on trip will lead to lower optimization cost than assigning only available cabs. Thus system needs to work dynamically.

Keeping in mind the above challenges, the goal of the cab assignment is to :

- Maximize the number of passengers getting a cab (will come to this later in the post).
- Minimize the total waiting times of all passengers.
- Better than shortest route, because sometimes shortest route may have more delays due to traffic.

- Minimize the total distance travelled by all cabs.
- Since the shortest distance between pickup and drop of passenger is constant, minimize the distance from current location of cab to the pickup location.

- Create matches such that driver and passenger have similar ratings.

It is not possible to simultaneously minimize both the distance and waiting time from cab location to pickup location unless they are the same. So the best option is to minimize either one of them while keeping an upper bound on the other or both. This makes more sense since anyways we want to limit the waiting time as well as the distance for each individual passenger.

For the rest of the post will assume that we are minimizing the total waiting time by constraining the maximum search radius to 1.5 km from the passenger.

Thus we need to track the following metrics :

- List of passenger ids who is trying to book a cab (but not allocated a cab).
- Pickup and Drop locations (GPS coordinates) of the above passenger ids.
- Average ratings given to the above passenger ids.
- List of cab ids that are currently available to be booked and a list of cab ids that is likely to complete its existing trip in the next 5 minutes.
- Average ratings of drivers for the above cab ids.
- Distance along the shortest route from cab location to passenger location.
- Maximum distance for matching = 1.5 km.

- Shortest time required to travel from cab location to passenger location.

Assuming some passenger P1 is trying to book a cab, retrieve the pickup location for the passenger and then within a search radius of 1.5 km, check if any cab is available. If a cab is available, then one option is to quickly allocate the cab C1 with the shortest time to pickup.

But this may not be the most optimal. There could be another passenger P2, whose booking came in just a second later, and has been assigned a cab C2. But C1 could be a better option for P2 while C2 a better option for P1 in terms of waiting time as well as distance travelled.

Note : The goal is not to minimize the individual waiting times but the overall waiting time.

One way to handle this is to not book a cab immediately but allow a time-frame of 5-10 seconds to find the best possible matching of cabs. If within the time-frame of 5-10 seconds multiple other booking requests identify the same cab C1 as the best choice, then with all the passengers who are matched with C1 and with all cabs other than C1, which are located within a distance of 1.5 km from the afore-mentioned passengers, create a matrix of cabs against passengers with waiting times as cost and then find the best matches subject to minimizing the total waiting time.

If P1 is matched to C3, P2 to C2 and P3 to C1, then total waiting time (cost) = 3 + 7 + 5 = 15 mins.

Will using a greedy strategy of assigning cabs to passengers work ?

Let's look at the following scenario.

Cab C2 is closer to P1, so C2 is assigned to P1, then P2 will be assigned cab C1.

The total distance traveled by cabs = 0.2 + 1.2 = 1.4 km

While if C1 was assigned to P1 and C2 to P2, then total distance travelled = 0.6 + 0.4 = 1.0 km.

Same reasoning would hold with waiting times also. Thus greedy approach is NOT the most optimal approach in this case.

This is an example of assignment problem in a bipartite graph. Given two sets of nodes (equal numbers) in a bipartite graph, create an one-to-one mapping so as to minimize the total edge costs. In our problem, one set of nodes represents the cabs and the other set represents the passengers and the cost is the waiting time.

There is an algorithm called the Hungarian Method, that computes the most optimal matching in O(N^{3}) time complexity, where N = max(#cabs, #passengers).

Till now we have used only the waiting time as a cost function, but remember we had another parameter, i.e. the ratings. One way to include ratings in the cost function is to add the absolute difference of the driver and passenger average rating with some weights :

Cost Function = 0.7 * passenger_wait_time + 0.3 * |driver_rating - passenger_rating|

Given that now we have the algorithm in place, let's look at some of the implementation details :

- Assuming that we are storing cab, driver and passenger details in MySQL tables.
- Table for drivers needs to store driver personal information (name, address, contact etc.), along with the average ratings given to them.
- Table for cabs needs to store the cab details (make, model etc.), the driver id of the driver.
- Tables for passengers, needs to store passenger personal details (name, contact, home location, office location etc.), average ratings given to them by drivers.

- Store each trip details in a document database such as MongoDB. The trip details will include some of the following :
- Cab id, Passenger id, Driver id
- Pickup and drop location (GPS co-ordinates).
- Start date-time and End date-time of trip.
- Ratings given by passenger to driver and driver to passenger.
- Waiting time of passenger.

- Average ratings can be computed from trip data and updated into the MySQL tables.
- Use in-memory database like Redis to store the following :
- Current location of all the cabs. Refresh and update location after every second.
- Flag to indicate whether cab available or not available.
- Last trip start and end time. If cab is on-trip then start time of current trip and estimated end time.

Without proper indexing of the data, searching available cabs within a radius of 1.5 km, would require linear search over all cabs. Thus if there are 10K cabs, then for each booking request, we need to scan 10K entries in the Redis database.

Similarly to update the location of cabs every second, given a cab id, we must search over all cabs for that id to update. This is clearly not scalable.

Index the data. In most relational database, there is provision for geo-spatial indexing, i.e. indexing of data corresponding to geographical coordinates. R-Trees and KD-Trees can be used to index geo-spatial data which gives very good lookup and range query performances : O(logN)

Redis uses sorted sets to index the geo-spatial data.

The idea is to find the smallest bounding box such that all points that are within a radius of 1.5 km can be found inside the box plus some other points which maybe at a distance greater than 1.5 km.

Lookup requires scanning all the cabs inside the bounding box to filter the ones that are at a distance of less than equal to 1.5 km and are also available. Time complexity is O(M + logN) where M is the number of cabs inside the smallest bounding box and N is the total number of cabs.

For each passenger request P_{i} above, query for available cabs within 1.5 km radius. These requests are added to a message queue (Kafka, RabbitMQ etc.).

In Redis create one map C_{P} from each cab id to the list of all passenger ids matched to it (within 1.5 km radius) and a reverse mapping P_{C} from passenger id to the list of cab ids. When a request comes in, update the maps.

From the time a request comes, after 5-10 seconds, remove the head of the message queue, retrieve the passenger id, then retrieve the cab ids for this passenger id from the map C_{P}. Then for each of these cab id, retrieve the list of other passenger ids from the map P_{C}.

Once we get tuples of (cab id, passenger id), create a matrix as shown above, then using the Hungarian method compute the optimal solution for the current passenger id P_{i} retrieved from the head of the queue and allocate a cab to him/her. Update the maps C_{P} and P_{C }to remove the passenger id and the cab id already matched.

During peak hours, one cab can be matched with 4-5 other passengers and each passenger is matched to maximum of 8 cabs. So the size of the matrix for each passenger request may not exceed 10x10.

Even with O(N^{3}) run time for Hungarian method, dealing with 10x10 matrices would not take more than a few milliseconds to process. In the above matrix our goal is to find the optimal solution for a single passenger P_{i}. In the best case, where any one of the cabs C_{j} is the optimal choice for P_{i} and not the optimal choice for any other passenger, then assign C_{j} to P_{i} , without any matrix computations.

To reduce delays with many concurrent booking requests, use multiple message queues in parallel and insert the requests in a round-robin fashion. Use asynchronous threads to compute optimal matching for a request, so that one passenger do not remain blocked due to optimal matching computations from the previous request in the queue.

In order to efficiently dispatch cabs, one must estimate the demand for a particular region, else it may happen that in a certain region where demand is high, the number of cabs nearby are very low, whereas many cabs are available inside a low demand region.

**Cab Sharing/Pooling (UberPool)**

I took some inspiration from these series of posts by Lyft's Director of Engineering on Medium.

Following are some of the design challenges and constraints that comes to mind immediately while matching a passenger to other passengers :

- Deciding the optimal route for a cab so as to :
- Minimize detours.
- Minimize the total seating time of all passengers in a single trip.
- Minimize the total distance travelled taken by a cab for pickup and drop-of of passengers.
- Obviously there should be upper bound to the above costs so that individual passengers do not suffer in order to bring down collective cost.

- Matching a passenger with other passengers with same gender majority (especially for female passengers). Not a requirement but good to have.
- Matching passengers with other passengers of similar ratings.

A cab can be matched to either a single pooling request or multiple pooling requests. While a cab is being matched to a pooling request, it can either be an empty cab or there could passengers already in it.

Let's consider the scenario where a cab is empty and a single pooling request comes. This is same as the case for non-shared cab above.

The cost function for such a case is just the waiting time of the passenger (and yes the cab must be located within a distance of 1.5 km).

Next we consider the scenario, where a single pooling request is matched to the cab but the cab already has one passenger in it. Let the source and destination for the existing passenger be denoted by S1 and D1 and the source and destination for the incoming request be S2 and D2. The current location of the cab be denoted as Q.

Then the possible routes for the cab are :

- S1 to Q to S2 to D1 to D2
- S1 to Q to S2 to D2 to D1

The cost of a particular route is the total delay to all passengers. For a single trip, the only delay to a passenger is the waiting time to pickup, but in case of pooling, the delays are :

- The time to a pickup : This time is added towards all passengers.
- The time to a drop-off : This time is added towards all passengers not being dropped.

In the first option, the delay to the first passenger P1 is :

C1 = [T(@, S1) + T(S1, Q) + T(Q, S2) + T(S2, D1)] - [T(S1, D1)]

where, T(X, Y) is the shortest time to travel from X to Y and T(@, S1) is the time taken to reach S1 from the empty cab's location '@'.

The delay for the second passenger P2 is :

C2 = [T(Q, S2) + T(S2, D1) + T(D1, D2)] - [T(S2, D2)]

In case of the second passenger, the times T(@, S1) and T(S1, Q) are inconsequential because his/her delay is only due to the time to his/her pickup and the additional drop-off time for the first passenger.

Thus the total cost to delay for the first option is :

C1 + C2 = [T(@, S1) + T(S1, Q) + 2T(Q, S2) + 2T(S2, D1) + T(D1, D2)] - [T(S1, D1) + T(S2, D2)]

Similarly the delay to passenger P1 in the second option above is :

C1 = [T(@, S1) + T(S1, Q) + T(Q, S2) + T(S2, D2) + T(D2, D1)] - [T(S1, D1)]

and for passenger P2 :

C2 = T(Q, S2), because 2nd passenger is the first to be dropped.

Thus the total cost to delay for the second option is :

C1 + C2 = [T(@, S1) + T(S1, Q) + 2T(Q, S2) + T(S2, D2) + T(D2, D1)] - [T(S1, D1)]

Additionally, we want to cap the delay to a passenger due to pickup and drop-off of other passengers to 20% of the estimated time if the passenger was traveling alone.

i.e. for the first passenger if the estimated time to D1 was 30 minutes before the second request came in, then after the second match, the delay to D1 should not be more than 6 minutes.

Thus, for the 1st passenger, we want to have :

C1 - T(@, S1) <= 0.2 * T(S1, D1)

and for the 2nd passenger, we want to have :

C2 - T(Q, S2) <= 0.2 * T(S2, D2)

The next scenario is where the cab is initially empty but 2 booking requests match the cab. This is similar to the scenario above but, the options are more since the pickup of the two passengers can also be ordered now :

- S1 to S2 to D1 to D2
- S1 to S2 to D2 to D1
- S2 to S1 to D1 to D2
- S2 to S1 to D2 to D1

The other scenarios are generalization of the above, where the number of passengers already seated could be more than one and number of requests matched to a cab can also be more than one.

Note that, although there could be more than 7 to 8 matching requests that can come to a cab, but the number of passengers actually picked up should be less than or equal to the remaining seats available in the cab.

When the driver needs to pickup N more passengers, then number of possible route options are :

N_{options} = (N!)^{2}

Because, the N sources can be permuted in N! ways and then for each permutation of sources, N destinations can be permuted in N! ways (assuming that no pickup happens after first drop).

When N is small < 5, the computation is still manageable with 14400 possible options for N=5, but when N starts to increase beyond 5, the possible options are far too large to be computed within 1-2 seconds.

Finding the optimal route for a generic N is a NP-Hard problem and is known as the Vehicle Routing Problem.

The total cost should be weighted cost of each possible routing option. More logical options should be given more weight. For e.g. one way to weight is to :

Cost = 0.3 * (S1, S2, D1, D2) + 0.2 * (S1, S2, D2, D1) + 0.2 * (S2, S1, D1, D2) + 0.3 * (S2, S1, D2, D1)

In a practical scenario, 10 passengers will be matched to 3 cabs, with each cab having a different number of seats available, say [2, 1, 3]. Thus only 6 out 10 will be allocated into the 3 cabs.

We need to evaluate the cost of each possible assignment of 6 out of 10 passengers into the 3 cabs, then we take the assignment with the minimum total cost and accordingly assign the cabs to the passengers.

One can use Genetic Programming technique to find an approximately good solution that minimises the above cost.

Earlier we have seen how to track the cabs : Indexing the cabs into a Redis database with an index on GPS coordinates.

After modification to some of the fields, the updated cab properties to track :

- Current location of all the cabs. Refresh and update location after every second.
- Number of passengers currently in the cab.
- Maximum number of passengers the cab can accommodate.
- If number of passengers in cab is greater than 0, then flag to indicate whether it is a pooled cab or not.
- Last trip start and end time. If cab is on-trip then start time of current trip and estimated end time.

A pooling cab is available if :

- If the number of passengers in the cab is 0.
- If the number of passengers in the cab is greater than 0 but less than the maximum number of passengers it can accommodate and it is a pooled cab.

Similar to our previous implementation, whenever a booking request comes in for a pool, queue the request and update the maps C_{P} and P_{C}.

From the time a request comes, use a separate thread to track the duration and after about 10 seconds, remove the request from the head of the queue, retrieve the passenger id P_{i}, retrieve the cab ids for this passenger id from the map C_{P}. Then for each of these cab id, retrieve the list of other passenger ids from the map P_{C}.

Out of all these cab ids and passenger ids, find the assignment that minimises the total cost and satisfying the constraints as explained above. All possible assignments to evaluate must include the current passenger id P_{i}, implying that even if the number of passengers is more than the number of seats available in the cabs, passenger id P_{i} has to be assigned a cab even if some other assignment without putting P_{i} inside any cab has a lower cost.

Note that, whichever cab passenger id P_{i} has been assigned to, if it has more than one seat available and the assignment which minimizes the cost, also assigns some there passenger id P_{j} to the same cab, then remove P_{i} from the queue and all other passenger ids assigned the same cab and update the maps C_{P} and P_{C}.

Can we think of how to leverage machine learning for solving some of the problems above ? I have tried to come up with some use cases where data mining and machine learning can lead to an improved cab hailing service.

- Forecasting demands :
- Objective : Distribute cabs so as to maximise the number of passengers getting a cab.
- Cluster historical bookings based on proximity, for every 30-40 minute period or so.
- Identify booking hotspots during certain time of the day for a certain area. For e.g. during peak office hours, nearby business and tech parks etc.

- Predicting which destinations can be clubbed together :
- Objective : Club similar destinations in order to reduce overall delay.
- Model learns from data about destinations pooled in the same cab and for each such trip what was the expected overall delay (cost).
- Club destinations for future trips so that it leads to lower expected delays to passengers.

- Dynamic Pricing :
- Objective : Price rides based on demand and supply.
- Time series model of the expected number bookings from a certain area and expected number of cabs nearby that area. (Area = Grid defined by lat. and long.)

- Predict Traffic Congestions :
- Objective : Avoid certain routes during certain time of the day.
- Model takes as input the current location of the cab, the current time and all possible routes from that location and estimates the probability of getting delayed by X (>10) minutes for each route.
- Cannot always rely on Google API because they are real time and the congestion might start after we have travelled half-way through a route and there is no going back.

- User Segmentation :
- Objective : Target customers with reduced app usage or high probability of churn, with customised promotion emails and SMSs.
- Cluster all such customers based on geography, most frequent source and destination, cab preference - shared or non-shared etc, co-passenger etc.
- E.g. if passenger travelled often in a pooled cab with spouse, then offer 20% off on the next pool ride with spouse.

Some of the very interesting reads from Uber's technology blog posts :

- http://highscalability.com/blog/2015/9/14/how-uber-scales-their-real-time-market-platform.html
- In this post, I was really surprised by the RingPop technology they are using for scaling of dispatch. It is simple yet so effective. RingPop is based on the concept of consistent hashing and a tree architecture for forwarding requests. The idea of RingPop is to build a fault-tolerant system that also helps in network partitioning as well as reduce time for lookups and update.
- The core idea behind RingPop is that, each node in the tree is associated with an range of network addresses (global sharding). Whenever a request comes in to this node, it checks if it was intended for this node. If not, then it randomly chooses 3-4 other nodes, who can either respond or similarly forward the request to appropriate nodes. In this way the correct node can be reached in O(logN) number of steps.
- The concept of consistent hashing is that, if an original sequence of numbers 'a', is hashed with a function of the position of the number in the sequence : hash(a
_{k}) = f(k), then if we delete any of the numbers or add a new number in between, then the hash function will not work because, the position of one or more numbers will change and thus hash function will return incorrect answer. We need to re-compute the hash function with the new sequence every time. This is in-consistent hashing. - To avoid in-consistent hashing, one can create a hash function that is a function of the number itself. So even if we delete or remove a number from the sequence, the hash function returns the correct number. This is consistent-hashing.

- More on RingPop : https://eng.uber.com/intro-to-ringpop/
- Predicting customer demand : https://eng.uber.com/elk/

Categories: DESIGN

Tags: Cab hailing service, Cab-Passenger Matrix, Design, Google Maps, Hungarian Method, Uber