# Waiting for the bus

Although the public transportation system in Madrid is very good, I don’t usually take the bus or the train to get around. But sometimes my car decides I should take public transportation. One of the things I have always found intriguing is that I seem to wait much longer for the bus than I should according to the frequency quoted by the transportation companies at the bus or train stop. For example, if the frequency of the buses is 4 per hour, you should expect a bus every 15 minutes on average, although fluctuations around this value may occur due to traffic conditions. Thus, arriving at the bus stop at a random time should get you an average wait of 15/2 = 7.5 minutes. I have the impression that I always wait 15 minutes instead. How can this be? Am I so unlucky that I always arrive at the bus stop just after the previous bus has left?

The answer to this question is the famous waiting time paradox of queueing theory. In short, let the time between two consecutive buses be given by the random variable \(T\). In an ideal world, \(T\) would be deterministic; in the example above, \(T = 15\) minutes. However, due to traffic conditions and other factors, \(T\) is distributed around 15 minutes. For example, the following graphic shows the “lateness” of non-frequent buses in Great Britain, extracted from the Bus Punctuality Statistics GB 2007 report.

As we can see, although most of the buses were less than a couple of minutes late, a significant fraction of them were more than 5 minutes late. And some of them even came earlier than expected! Assuming that the times \(T\) between our buses are spread out in a similar way, let’s see what a day looks like from the bus stop:

Buses arrive at the bus stop (vertical lines) with the time between consecutive buses given by \(T_i\). After my car breaks down, I arrive at the bus stop at the time marked by the vertical arrow, and I then have to wait a time \(\tau\) for the next bus. The question is: what is the average value of \(\tau\)? Given that my car breaks down at random, I can assume that my arrival time is completely random and uncorrelated with the bus timetable. However, if my arrival time is random, I am more likely to land in an interval in which the time between buses \(T\) is long, simply because long intervals cover more of the day. Specifically, if \(P(T)\) is the probability density of the times between buses, the probability density that I arrive during an interval of length \(T\) is

\[\frac{T P(T)}{\overline{T}}\]

where \(\overline{T}\) is the average time between buses. Given an interval of length \(T\), the waiting time \(\tau\) is uniformly distributed between 0 and \(T\) (since my arrival is random), so we multiply the above probability by \(1/T\). Finally, we integrate over all intervals long enough to contain a wait of \(\tau\), that is, \(T \geq \tau\), giving

\[Q(\tau) = \int_\tau^\infty \frac{TP(T)}{\overline T} \frac{1}{T}dT = \frac{1}{\overline T}\int_\tau^\infty P(T) dT\]

which gives us the distribution of waiting times. The average value of \(\tau\) is then given by

\[\overline \tau = \int_0^\infty \tau Q(\tau) d\tau = \frac{\overline{T^2}}{2\overline{T}} = \frac{\overline T}{2}\left(1 + \frac{\sigma_T^2}{\overline{T}^2}\right)\]

where the second equality follows from swapping the order of integration, and \(\sigma_T^2 = \overline{T^2} - \overline{T}^2\) is the variance of the times between buses. This equation is the main result. It says:

- If buses come at perfectly regular time intervals, then \(\sigma_T = 0\) and the waiting time is what we expect: \(\overline{\tau} = \overline{T}/2\). That is, if we expect a bus every 15 minutes, I will wait (on average) 7.5 minutes.
- However, in the real world buses do not arrive at perfectly regular intervals, and then \(\sigma_T > 0\). Thus, the average waiting time is always greater than \(\overline{T}/2\). In fact, the distribution of lateness above shows that a large fraction of buses have long delays, so \(\sigma_T\) could be very big and dominate the waiting time.
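The two cases above can be checked numerically. The following is a minimal simulation sketch, not part of the original analysis: the function names, the three headway distributions (perfectly regular, Gaussian jitter with a 5-minute spread, and exponential), and the sample sizes are all illustrative assumptions. It drops passengers uniformly at random over a long stretch of simulated bus arrivals and compares their mean wait with \(\overline{T^2}/(2\overline{T})\).

```python
import bisect
import itertools
import random
import statistics


def simulate_mean_wait(headways, n_passengers, rng):
    """Mean wait of passengers who arrive uniformly at random in time."""
    bus_times = list(itertools.accumulate(headways))  # arrival time of each bus
    total = bus_times[-1]
    waits = []
    for _ in range(n_passengers):
        arrival = rng.uniform(0.0, total)
        # Index of the first bus that arrives at or after the passenger.
        i = bisect.bisect_left(bus_times, arrival)
        waits.append(bus_times[i] - arrival)
    return statistics.fmean(waits)


def predicted_mean_wait(headways):
    """Formula from the text: mean(T^2) / (2 * mean(T))."""
    mean_t = statistics.fmean(headways)
    mean_t2 = statistics.fmean([t * t for t in headways])
    return mean_t2 / (2.0 * mean_t)


rng = random.Random(42)
n_buses = 20_000

# Hypothetical headway distributions, all with a mean close to 15 minutes.
scenarios = {
    "regular": [15.0] * n_buses,
    "jittered": [max(0.1, rng.gauss(15.0, 5.0)) for _ in range(n_buses)],
    "exponential": [rng.expovariate(1.0 / 15.0) for _ in range(n_buses)],
}

results = {}
for name, headways in scenarios.items():
    simulated = simulate_mean_wait(headways, 100_000, rng)
    predicted = predicted_mean_wait(headways)
    results[name] = (simulated, predicted)
    print(f"{name:12s} simulated={simulated:6.2f}  predicted={predicted:6.2f}")
```

Under these assumptions, the regular case should come out at about 7.5 minutes, while the exponential case, where \(\sigma_T = \overline{T}\), should come out close to the full 15 minutes.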

In fact, the GB Bus Punctuality Report shows that, on average, the waiting time exceeds the expected time \(\overline{T}/2\) by 40%. That is, due to the variance in the times between buses, you (and I) end up waiting considerably longer than half the average time between buses. And that is my feeling: if trains or buses come every 15 minutes, it feels like I end up waiting 15 minutes, not 7.5.
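As a rough worked example of what these numbers imply, suppose the 40% excess applied to a route with \(\overline{T} = 15\) minutes (an illustrative assumption, not a figure from the report). The main result then gives

\[1 + \frac{\sigma_T^2}{\overline{T}^2} = 1.4 \quad\Rightarrow\quad \frac{\sigma_T}{\overline{T}} = \sqrt{0.4} \approx 0.63, \qquad \sigma_T \approx 9.5 \text{ min}, \qquad \overline{\tau} = 1.4 \times 7.5 = 10.5 \text{ min}.\]

To actually wait a full \(\overline{T} = 15\) minutes on average, the formula requires \(\sigma_T = \overline{T}\), which is what happens, for instance, when the times between buses are exponentially distributed.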

In an ideal world, bus timetables would quote both the average time between buses \(\overline{T}\) and its standard deviation \(\sigma_T\), so that we could estimate the waiting time. Unfortunately, they just tell us part of the story…
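If a timetable did quote both numbers, the estimate would be a one-liner. The helper below is a small sketch of that calculation; the function name `expected_wait` and the sample values are hypothetical, and it assumes, as in the derivation above, that passengers arrive at random.

```python
def expected_wait(mean_headway, std_headway):
    """Average wait for a randomly arriving passenger.

    Implements mean/2 * (1 + (std/mean)^2), with both arguments in the
    same time unit (for example, minutes).
    """
    if mean_headway <= 0:
        raise ValueError("mean_headway must be positive")
    return 0.5 * mean_headway * (1.0 + (std_headway / mean_headway) ** 2)


# Hypothetical timetable: a bus every 15 minutes with different spreads.
for std in (0.0, 5.0, 15.0):
    print(f"std={std:4.1f} min -> expected wait {expected_wait(15.0, std):.2f} min")
```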