As I was watching the Jamaican sprinters sweep the 200m dash, I started wondering how a relatively small country, not wealthy by any stretch of the imagination, could achieve such dominance. Is there a correlation between the population of a country and its haul of medals? Almost certainly. Perhaps a more incendiary and more interesting question is “does money buy medals?” Not in a literal sense, of course, but in a statistical sense. Is there a correlation between the per-capita medal count and per-capita income? There should be. Money buys equipment, coaching and medical staff, transportation, etc.
As I embarked on this project, I expected to find a significant positive correlation. But what I found was even more shocking. Medal count per person grows as the square of per-capita income. The graph below shows the medal count (obtained from http://www.london2012.com and a Wikipedia article) divided by the population of each country (obtained from Wikipedia) vs. the purchasing power parity GDP per capita (obtained from the CIA column of this Wikipedia article). Only populous (>50,000,000) countries are included since statistical trends are clearer in large samples and the fluctuations that obscure these trends are smaller. The straight line is the quadratic fit.
Why a square? I have an explanation for this striking phenomenon which assumes that each sport has an entrance threshold and that the distribution of these entrance thresholds is roughly uniform. If some country has a GDP per capita that is greater than the entrance threshold for a particular sport, it enters competition. It follows that the number of competitors is inversely proportional to the entrance threshold of a sport. I further assume that all competitors are equally likely to get a medal once they enter competition. Therefore the number of medals each competitor wins is inversely proportional to the number of competitors and consequently it is proportional to the entrance threshold of the sport. The final logical step is to notice that a country with per-capita income $I$ competes in all sports whose entrance thresholds are $T < I$. Thus the total number of medals is proportional to
$$\int_0^I T \, dT = \frac{I^2}{2}.$$
Thus the square comes from the fact that richer countries enter more sports, and it is easier to win medals in more expensive sports since not as many countries can enter.
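The threshold argument can be checked numerically. The sketch below is only an illustration under the stated assumptions (uniformly spaced entrance thresholds, medals per competitor proportional to the threshold). The function name `medal_score` and the unit income scale are hypothetical choices, not part of the original analysis.

```python
def medal_score(income, n_sports=10_000, max_threshold=1.0):
    """Sum the per-sport medal share over all sports a country can afford.

    Assumptions (from the argument above, not from data):
    - entrance thresholds are spread uniformly over (0, max_threshold];
    - the medal share per competitor in a sport is proportional to
      that sport's entrance threshold.
    """
    total = 0.0
    for k in range(1, n_sports + 1):
        threshold = max_threshold * k / n_sports
        if threshold < income:      # the country can afford this sport
            total += threshold      # medal share is proportional to threshold
    return total / n_sports


# Doubling the income should roughly quadruple the score.
ratio = medal_score(0.4) / medal_score(0.2)
print(f"score(0.4) / score(0.2) = {ratio:.3f}")
```

The ratio comes out close to 4, which is the quadratic growth the argument predicts.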
The efficacy of a vaccination program is quantified by the fraction of the non-immunized population that gets sick in an epidemic. This fraction can also be thought of as the probability that a particular individual will get sick in an epidemic.
A number of factors determine the probability of infection in an epidemic:
- How long is the sick person contagious?
- What is the probability of infection given contact with a sick person?
- What is the average rate of inter-personal contacts?
- How far does a person travel during the sickness?
The answers to these questions depend on the type of virus and on properties of the population such as its density and patterns of movement.
The situation seems too complex for predictive modeling. Could a simplified model offer meaningful insight? Yes, if we pick a narrow aspect of the problem to look at. How about this? You have probably heard the doomsday scenarios of a deadly virus spreading around the world aboard airplanes. Is this kind of talk just fear-mongering or a realistic prediction?
Let us construct a model to study whether the doomsday scenario is plausible. Let’s start with a 2D square lattice, or a board, whose sites (spaces) can be empty or occupied by “people” (let’s call them “entities”). The entities can be in one of three states: immune, vulnerable, or sick. The sick entities can infect the vulnerable but not the immune ones. We need to decide what to do with the sick entities. For example, some fraction of them can “die”, that is, be removed from the board. The simplest thing is to just let them become immune after the disease has run its course. This is what is done in our model.
The entities can move around the board. The movement models the short range everyday movement of the population: commute, shopping, going to and from school, etc. I will use a turn-based (like Conway’s Game of Life) set of movement rules that are often used in simulating fluid-vapor interfaces. The result is a collection of dense clusters of varying sizes that float in a sparsely inhabited sea. There is little exchange of entities between the clusters. Since the infection is acquired on contact, global epidemics are impeded by the limited inter-cluster movement. One could think of these semi-isolated clusters as communities, cities, or even continents depending on your perspective.
Below is the movie of the model simulation in which the sick entities (red) infect the vulnerable entities (blue) and after a while become immune (green). A fraction of the population is already immune at the onset of the epidemic. Observe how the disease propagates quickly across the clusters and makes infrequent jumps between the clusters. In this particular simulation, 37% of the vulnerable population got sick before the epidemic fizzled out.
You probably noticed that the immunization rate in the above example is rather low, 30% to be exact. Since most entities are vulnerable, the epidemic has no trouble spreading. When the immunization rate is more than doubled to 70%, most epidemics fizzle out early. As you can see in the plot below of the PDF (probability density function) of the total epidemic size (defined as the fraction of the vulnerable population that got sick), all epidemics involve fewer than 4% of the populace. There is simply not enough population movement for the disease to spread.
Time to include airplanes and examine the plausibility of the doomsday scenario!
In addition to the short range movement, let’s allow at each turn a certain small fraction of the population to move anywhere on the board. The second graph below is the PDF of the epidemic size for the same parameters as the one above, but with the additional 5% of the population executing large scale movement each turn. Notice the radical change in the scale of the x axis. When a small fraction of the population travels long distances each turn, most epidemics grow to encompass the majority of the population. The bimodal nature of the epidemic size distribution suggests that there is a threshold size. If the epidemic hits a cluster that happens to be larger than the threshold, the disease can escape and infect almost all other clusters.
Let us now quantitatively examine the effect of the large scale movements on the probability of significant epidemics. In the graph below I plot the probability of occurrence of an epidemic that involves > 10% of the vulnerable populace as a function of the immunization rate for two different magnitudes of the large scale movement. Significant epidemics become rare as the immunization rate increases. However, perhaps not surprisingly, a greater immunization rate is required to avoid epidemics when the large scale population movement is larger.
Predicting how epidemics spread in the real world is a tricky business. However, the general conclusion of the simple model, I think, will stand. While a 100% immunization rate is not strictly required to stem epidemics, as the extent of long distance travel increases, we will need a higher immunization rate. It would be unwise to be lax about immunization requirements only to discover one day that not enough of the population is immunized.
The real issue, I think, is that the small fraction of people who refuse to be immunized are shielded from infection by those who took the risk of immunization (albeit a small risk). But that is a can of worms I don’t really want to open…
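For readers who want to experiment, here is a minimal sketch of a lattice epidemic of this kind. It is not the model used for the plots in this post. In particular, the short range movement here is a plain random walk (an assumption made to keep the code short) rather than the cluster-forming rules described above, and every name and default value (`run_epidemic`, `p_infect`, `sick_turns`, the board size) is hypothetical.

```python
import random

VULNERABLE, SICK, IMMUNE = "V", "S", "I"


def run_epidemic(size=30, n_entities=400, immune_frac=0.3, jump_frac=0.0,
                 p_infect=0.5, sick_turns=8, max_turns=2000, seed=0):
    """Return the fraction of initially vulnerable entities that ever got sick."""
    rng = random.Random(seed)
    all_cells = [(x, y) for x in range(size) for y in range(size)]
    position = rng.sample(all_cells, n_entities)        # one entity per cell
    occupant = {cell: i for i, cell in enumerate(position)}
    n_immune = int(immune_frac * n_entities)
    n_vulnerable = n_entities - n_immune
    if n_vulnerable == 0:
        return 0.0
    state = [IMMUNE] * n_immune + [VULNERABLE] * n_vulnerable
    timer = [0] * n_entities
    state[n_immune], timer[n_immune] = SICK, sick_turns  # patient zero
    ever_sick = 1

    def neighbours(cell):
        x, y = cell                                      # periodic board
        return [((x + 1) % size, y), ((x - 1) % size, y),
                (x, (y + 1) % size), (x, (y - 1) % size)]

    for _ in range(max_turns):
        if SICK not in state:
            break
        # Movement: a long jump with probability jump_frac, else one step.
        for i in rng.sample(range(n_entities), n_entities):
            if rng.random() < jump_frac:
                target = rng.choice(all_cells)
            else:
                target = rng.choice(neighbours(position[i]))
            if target not in occupant:                   # move only if empty
                del occupant[position[i]]
                occupant[target] = i
                position[i] = target
        # Infection on contact with a sick nearest neighbour.
        newly_sick = set()
        for i in range(n_entities):
            if state[i] != SICK:
                continue
            for cell in neighbours(position[i]):
                j = occupant.get(cell)
                if (j is not None and state[j] == VULNERABLE
                        and rng.random() < p_infect):
                    newly_sick.add(j)
        # Recovery: the sick become immune once the disease has run its course.
        for i in range(n_entities):
            if state[i] == SICK:
                timer[i] -= 1
                if timer[i] == 0:
                    state[i] = IMMUNE
        for j in newly_sick:
            state[j], timer[j] = SICK, sick_turns
            ever_sick += 1
    return ever_sick / n_vulnerable


for jump in (0.0, 0.05):
    sizes = [run_epidemic(immune_frac=0.7, jump_frac=jump, seed=s)
             for s in range(10)]
    print(f"jump_frac={jump}: mean epidemic size {sum(sizes) / len(sizes):.3f}")
```

Running it for many seeds and histogramming the returned fractions gives a rough analogue of the epidemic size distributions discussed here, though the numbers should not be expected to match the plots.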
It is simpler to consider the situation in which buses do not have a fixed schedule but arrive at a fixed rate per unit time $\lambda$. Intervals between consecutive buses in this situation obey Poisson statistics, which means that no matter when I arrive at the stop the average waiting time before the bus arrives is $1/\lambda$.
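The claim that the mean wait equals the inverse of the arrival rate, regardless of when the traveler shows up, can be checked with a short Monte Carlo experiment. This is only a sketch: the function name `mean_wait` and the sample sizes are hypothetical choices.

```python
import bisect
import random


def mean_wait(rate, n_buses=50_000, n_travelers=20_000, seed=0):
    """Average wait for the next bus when buses form a Poisson process."""
    rng = random.Random(seed)
    # Bus arrival times: cumulative sums of exponential intervals.
    times, t = [], 0.0
    for _ in range(n_buses):
        t += rng.expovariate(rate)
        times.append(t)
    # Travelers arrive at uniformly random moments before the last bus.
    total = 0.0
    for _ in range(n_travelers):
        arrival = rng.uniform(0.0, times[-1])
        next_bus = times[bisect.bisect_left(times, arrival)]
        total += next_bus - arrival
    return total / n_travelers


rate = 0.1  # buses per minute, i.e. one every 10 minutes on average
print(f"simulated mean wait: {mean_wait(rate):.2f} min, 1/rate = {1 / rate:.2f} min")
```

Note that the average wait equals the full mean interval between buses, not half of it, because a randomly arriving traveler is more likely to land in a long gap.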
In what follows I will present a few results without much derivation. If you are interested in the nitty-gritty, contact me for details.
Suppose there are two buses A and B that arrive at a stop with rates $\lambda_A$ and $\lambda_B$. The probability that A arrives before B is
$$P_A = \frac{\lambda_A}{\lambda_A + \lambda_B}.$$
The mean waiting time for bus A provided that A has arrived first is
$$\frac{1}{\lambda_A + \lambda_B}.$$
Now if the travel times to destination on buses A and B are $T_A$ and $T_B$, we can compute the expected travel time if the traveler boards the first bus that comes to the stop. We will call it $\tau_0$ because the strategy is to let zero buses pass (even if they take longer):
$$\tau_0 = \frac{1}{\lambda_A + \lambda_B} + \frac{\lambda_A}{\lambda_A + \lambda_B}\, T_A + \frac{\lambda_B}{\lambda_A + \lambda_B}\, T_B.$$
We can interpret this formula as follows. The total bus arrival rate is $\lambda_A + \lambda_B$ and therefore the mean waiting time for a bus, any bus, is $1/(\lambda_A + \lambda_B)$. Then with probability $\lambda_A/(\lambda_A + \lambda_B)$ the A bus has arrived and the travel time is $T_A$. Likewise, with probability $\lambda_B/(\lambda_A + \lambda_B)$ the B bus arrives so that the travel time is $T_B$.
It is only marginally trickier to derive the mean trip duration (we will call it $\tau_1$) when we are willing to let one A bus pass by in the hopes that the next bus will be the faster B bus. The answer is
$$\tau_1 = \frac{1}{\lambda_A + \lambda_B} + \frac{\lambda_A}{\lambda_A + \lambda_B}\, \tau_0 + \frac{\lambda_B}{\lambda_A + \lambda_B}\, T_B.$$
The explanation of the second term in the above formula is that if A arrives first, we let it pass and we are back to the “let zero buses pass” strategy. The rest of the terms in the equation for $\tau_1$ are the same as before.
In general, for any $n \ge 1$ we have a recursion relation:
$$\tau_n = \frac{1}{\lambda_A + \lambda_B} + \frac{\lambda_A}{\lambda_A + \lambda_B}\, \tau_{n-1} + \frac{\lambda_B}{\lambda_A + \lambda_B}\, T_B.$$
We can now start asking questions like: “Under what conditions does letting the slow bus pass make sense (result in a shorter expected trip)?” What about letting two buses pass? When does that strategy pay off?
When does $\tau_1 < \tau_0$? Comparing the formulas above we arrive at a simple condition on the arrival rate of the fast bus which is independent of the arrival rate of the slow bus:
$$\lambda_B > \frac{1}{T_A - T_B}. \qquad (1)$$
For example, if the slow bus takes 30 minutes and the fast bus takes 20 minutes to arrive at the destination, it makes sense to let the slow bus pass if the fast bus arrives more frequently than once in 10 minutes. No big surprise there; anybody with a modicum of common sense could tell you that.
What is surprising is that condition (1) does not depend on the arrival rate of the slow bus. Did I make a mistake? It turns out that when $\lambda_B = 1/(T_A - T_B)$ the expected travel times for all the other strategies are exactly the same! I will leave the proof to my esteemed reader as homework :)
Therefore, since it does not matter how frequently the slow bus comes, if the fast bus comes frequently enough (condition (1) is satisfied), it makes sense to wait for the fast bus no matter how many slow buses pass.
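These statements are easy to check numerically. The sketch below builds the expected trip time for the “let up to n slow buses pass” strategy by applying the recursion step by step. The function and argument names (`expected_trip`, `rate_slow`, `rate_fast`, and so on) are hypothetical, and the slow bus rate of 0.5 per minute is an arbitrary illustration value.

```python
def expected_trip(n_pass, rate_slow, rate_fast, t_slow, t_fast):
    """Expected wait plus ride time when letting up to n_pass slow buses go by."""
    total_rate = rate_slow + rate_fast
    mean_wait = 1.0 / total_rate            # wait for the next bus of any kind
    p_slow = rate_slow / total_rate         # chance the next bus is the slow one
    p_fast = rate_fast / total_rate
    # Strategy 0: board whichever bus comes first.
    tau = mean_wait + p_slow * t_slow + p_fast * t_fast
    # Each extra slow bus we are willing to skip sends us back to the
    # previous strategy whenever the slow bus shows up first.
    for _ in range(n_pass):
        tau = mean_wait + p_slow * tau + p_fast * t_fast
    return tau


# Slow bus: 30 minute ride. Fast bus: 20 minute ride. Threshold: 1 per 10 minutes.
for rate_fast in (0.05, 0.10, 0.20):
    trips = [expected_trip(n, 0.5, rate_fast, 30.0, 20.0) for n in range(4)]
    print(rate_fast, [round(t, 3) for t in trips])
```

Below the threshold rate the expected trip grows with every slow bus skipped, above it the trip shrinks, and exactly at the threshold all strategies give the same 30 minutes.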
The flight behavior of a prey is instinctual and highly sophisticated–after all, predation is a major selective pressure in a Darwinian universe. The variation of escape strategies reflects the intrinsic abilities of the prey. Some prey, like the antelope, rely on superior speed and safety in numbers. Others, like the rabbit, or the squirrel, rely on superior maneuverability. In this post I will illustrate how superior maneuverability can be used to escape from a faster predator.
What is maneuverability? We will define it in a narrow sense as the ability to quickly change the course or direction of motion. The rate of change of velocity, which includes any change in the direction of motion, is known as acceleration. The word acceleration is used colloquially when the rate of change of velocity is in the same direction as velocity. More generally, the vector of acceleration could point in any direction. If the acceleration vector points in the direction opposite to velocity, we would say that the moving object decelerates. Circular motion is sustained by a centripetal acceleration which points toward the center of the circle, perpendicular to the velocity vector.
To illustrate how a slower animal can successfully evade a faster one, let us consider a simple model of the predator-prey pursuit. Suppose the only constraints on the motion are the caps on velocity and acceleration. The predator’s maximum velocity is greater than that of the prey. Conversely, the prey can accelerate faster (in any direction), which means, among other things, that it can make sharper turns.
Here are the model pursuit strategies.
Predator:
- If traveling slower than the maximum speed, accelerate in the instantaneous direction of the prey at maximum acceleration.
- When traveling at maximum velocity, project the acceleration vector on the direction perpendicular to the instantaneous velocity. This will ensure that speed does not exceed the maximum.
Prey:
- If traveling slower than the maximum speed, accelerate away from the predator at maximum acceleration.
- If traveling at maximum velocity and the predator is more than a certain distance D away, stay the course.
- If traveling at maximum speed and the predator is within the distance D, execute a turn away from the predator at the tightest turning radius possible.
Even without doing the simulations of this model we can foresee the qualitative features of the trajectories it yields. When the prey is further than D away from the predator, it will run along a straight trajectory, which means that the speedier predator will eventually catch up with it and draw within the distance of caution D. At that point the prey will commence a sharp turn away from the predator. The predator, being less agile, will not be able to turn as sharply and will overshoot the prey, and the distance between them will grow and might exceed D. When that happens, the prey will stop turning and run along a straight line again. The cycle will repeat ad infinitum, with the predator never able to get closer to the prey than some finite fraction of D.
The movie below shows the trajectories of the prey (green) and predator (red) produced by the simple model when the predator is 50% faster, but the prey is able to achieve twice the acceleration.
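A rough sketch of how such a simulation could be set up is shown below. It is not the code behind the movie. The time step, the starting positions, the simple Euler integration, and all names (`simulate_pursuit`, `caution_distance`) are assumptions made for illustration. The default caps follow the scenario above: the predator is 50% faster and the prey has twice the acceleration.

```python
import math


def simulate_pursuit(steps=8000, dt=0.005, v_pred=1.5, v_prey=1.0,
                     a_pred=1.0, a_prey=2.0, caution_distance=1.0):
    """Integrate the pursuit rules; report the closest approach and top speeds."""
    pred_pos, pred_vel = [0.0, 0.0], [0.0, 0.0]
    prey_pos, prey_vel = [3.0, 0.0], [0.0, 0.0]
    min_dist, top_pred, top_prey = 3.0, 0.0, 0.0

    def at_cap(vel, v_max):
        return math.hypot(*vel) >= v_max * (1.0 - 1e-9)

    def advance(pos, vel, acc, v_max):
        vel = [vel[0] + acc[0] * dt, vel[1] + acc[1] * dt]
        speed = math.hypot(*vel)
        if speed > v_max:                          # enforce the speed cap
            vel = [vel[0] * v_max / speed, vel[1] * v_max / speed]
        return [pos[0] + vel[0] * dt, pos[1] + vel[1] * dt], vel

    for _ in range(steps):
        dx, dy = prey_pos[0] - pred_pos[0], prey_pos[1] - pred_pos[1]
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            break                                  # caught
        away = (dx / dist, dy / dist)              # unit vector predator -> prey
        # Predator: accelerate toward the prey; at top speed keep only the
        # component of the acceleration perpendicular to the velocity.
        acc_pred = [a_pred * away[0], a_pred * away[1]]
        if at_cap(pred_vel, v_pred):
            s = math.hypot(*pred_vel)
            vhat = (pred_vel[0] / s, pred_vel[1] / s)
            along = acc_pred[0] * vhat[0] + acc_pred[1] * vhat[1]
            acc_pred = [acc_pred[0] - along * vhat[0],
                        acc_pred[1] - along * vhat[1]]
        # Prey: speed up away from the predator, then hold course, and turn
        # as sharply as possible once the predator is within D.
        if not at_cap(prey_vel, v_prey):
            acc_prey = [a_prey * away[0], a_prey * away[1]]
        elif dist > caution_distance:
            acc_prey = [0.0, 0.0]
        else:
            s = math.hypot(*prey_vel)
            perp = (-prey_vel[1] / s, prey_vel[0] / s)
            if perp[0] * away[0] + perp[1] * away[1] < 0.0:
                perp = (-perp[0], -perp[1])        # turn away from the predator
            acc_prey = [a_prey * perp[0], a_prey * perp[1]]
        pred_pos, pred_vel = advance(pred_pos, pred_vel, acc_pred, v_pred)
        prey_pos, prey_vel = advance(prey_pos, prey_vel, acc_prey, v_prey)
        min_dist = min(min_dist, math.hypot(prey_pos[0] - pred_pos[0],
                                            prey_pos[1] - pred_pos[1]))
        top_pred = max(top_pred, math.hypot(*pred_vel))
        top_prey = max(top_prey, math.hypot(*prey_vel))
    return {"min_distance": min_dist, "top_pred_speed": top_pred,
            "top_prey_speed": top_prey}


print(simulate_pursuit())
```

Printing `min_distance` for different values of `caution_distance` is a quick way to explore how close the predator can get under these particular rules.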
Finally, let me point out that the strategies in the simple model are far from optimal. For example, one can imagine that if the predator could anticipate the direction of the prey’s turn (which is possible in the above scenario in which the prey always turns away from the predator), it could potentially intercept the prey. The optimality of particular escape and pursuit strategies is usually hard to prove, and the methods of such proofs are still a subject of current research.
To compute my bike mpg we will need three numbers:
- Extra calories burned per mile of bike travel at roughly 13 miles per hour (my average commuting speed). For relatively flat terrain and my weight this number is roughly 42 food calories per mile (obtained from about.com).
- We need a food equivalent for the ethanol production. Let’s say I get my extra calories from eating sweet corn. According to the same source, sweet corn has 857 food calories per kilogram. So I will need to eat 49 grams of sweet corn per mile traveled at 13 miles per hour on my bike.
- Now we need to know how much ethanol can be made from 49 grams of sweet corn. The Department of Energy’s Biomass Program to the rescue. According to their website, a metric ton of dry corn can theoretically yield 124.4 gallons of ethanol. Since sweet corn is 77% water, this means that up to 0.0014 gallons of ethanol can be made from 49 grams of sweet corn.
Putting these numbers together we arrive at 0.0014 gallons of ethanol per mile or…drumroll please: about 714 miles per gallon (1/0.0014).
This number is not small, but neither is it very large! There exist experimental vehicles that seat four and achieve over 100 mpg. When fully loaded, the effective, per passenger mpg is 400. If my calculations are correct, technology is about to bring motorized transport close to the efficiency of a person on a bike!
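The arithmetic is short enough to reproduce in a few lines. The input figures are the ones quoted above. The variable names are mine, and the small difference from a result based on the rounded 0.0014 gallons per mile comes from carrying the intermediate values unrounded.

```python
# Inputs quoted in the text.
kcal_per_mile = 42.0               # extra food calories burned per mile at ~13 mph
kcal_per_kg_sweet_corn = 857.0     # food calories per kilogram of sweet corn
water_fraction = 0.77              # sweet corn is 77% water
gallons_per_dry_tonne = 124.4      # theoretical ethanol yield per metric ton of dry corn

corn_kg_per_mile = kcal_per_mile / kcal_per_kg_sweet_corn
dry_tonnes_per_mile = corn_kg_per_mile * (1.0 - water_fraction) / 1000.0
gallons_per_mile = dry_tonnes_per_mile * gallons_per_dry_tonne
bike_mpg = 1.0 / gallons_per_mile

print(f"sweet corn per mile: {corn_kg_per_mile * 1000:.0f} g")
print(f"ethanol per mile:    {gallons_per_mile:.4f} gal")
print(f"bike mileage:        {bike_mpg:.0f} mpg")
```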
If you drive like me, you have no patience for bumper to bumper traffic. There’s gotta be a way to beat it somehow, right? Do you sneak into an opening in a neighboring lane if it is moving faster? Do you set goals like: “when I get in front of that van, I’ll switch back?” It doesn’t always seem to work. A lane that was zooming by you comes to a dead stop when you switch into it. If the motion of each lane is random, is there a way to switch lanes and move faster than a car that stays in lane?
It turns out there is a way to beat the traffic. To show this we will use a simple model of traffic flow introduced by Nagel and Schreckenberg (see the previous post). The model consists of a circular track with consecutive slots which can be empty or occupied by cars. Cars have an integer velocity between 0 and vmax. As we saw in the previous post, simple rules for updating the positions and velocities of the cars can reproduce the traffic jam phenomenon, whereby a dense region forms in which the cars are at a standstill for a few turns and then, as the jam clears in front of them, the cars accelerate and zoom around the track only to be stuck in the jam again. The jam itself moves in the direction opposite to that of the cars.
Now imagine that we put two of the circular tracks (or lanes) side by side. For starters, let’s require all cars except one to stay in their respective lanes. One rogue car can switch lanes. Can the rogue with the right lane switching strategy move faster than the rest of the cars on average? The answer is most certainly yes, although finding the best lane switching strategy is a difficult computational problem. What we are going to do here is compare two lane switching strategies that at first sight seem equally good. What we will discover is that the lane changing strategy matters. As you might have suspected, if you don’t do it right, you might actually move slower than the rest of the traffic!
Here are the two simple strategies we will compare (I suggest you read the previous post for the description of the model):
1) “Stop-switch:” if the slot directly ahead is occupied, switch if the space in the other lane directly across is not occupied.
2) “Faster-switch:” if the car directly ahead in the neighboring lane is moving faster, switch if there is space available.
The graph above compares the two strategies. It shows the percent improvement of the rogue’s average speed compared to the average speed of the rest of the cars as a function of the car density. When density is low and traffic jams are rare, switching lanes has almost no effect on your average speed for either strategy. When the density is high and traffic jams abound, switching can make you go slower than the rest of the traffic. The reason is that when a space in the neighboring lane opens up, it is likely to be at the tail end of a jam, whereas the jam in the lane you just switched out of might already be partially cleared. The final remark is that the “Stop-switch” strategy is significantly better, improving the speed by as much as 35%, whereas the best “Faster-switch” can do is a 15% improvement.
Finally let me mention that if all cars switch lanes and use the same strategy, nobody wins. All cars move with the same speed on average. That average speed could be smaller or larger (depending on the car density and the switching strategy) than in the case when everybody stays in lane. The graph below explains why everyone is so keen on the advice “Stay in lane!” It turns out that if everyone uses the “Faster-switch” strategy, the average speed is drastically lower for everyone than if everyone stays in lane! The reason for this dramatic result is that when you change lanes, the car behind is likely to slam on the brakes, which slows everyone down.
Traffic flow is frequently studied because it is an example of a system far from equilibrium. The practical applications are important as well. Many models, from crude to sophisticated, have been advanced. Massive amounts of data exist and are frequently used to estimate model parameters and make predictions. I am not going to attempt to review the vast field here. My goal is simply to elucidate the physiological limitation of the human mind that causes the driving patterns leading to congestion.
Although great progress has been made in modeling traffic as a compressible fluid, a class of models that fall into the category of Cellular Automata are more intuitive and instructive.
Cellular Automata, promoted by Stephen Wolfram of Mathematica fame as the solution to all problems, are indeed quite nifty. It turns out that autonomous agents walking on a lattice and interacting according to a simple set of rules can reproduce a surprising variety of observed macroscopic phenomena. If you want to learn more, the Wikipedia article is a good start.
A pioneering work of Nagel and Schreckenberg published in Journal de Physique in 1992 introduced a simple lattice model of traffic which reproduced the traffic jam phenomenon and came to a surprising conclusion that the essential ingredient was infrequent random slowdowns.
You have probably done so yourself: you change the radio station, or adjust the rear view mirror, or speak to the child in the seat behind you. As you do so, your foot eases off the accelerator ever so slightly, irritating the person behind you who has to disengage the cruise control. You and people like you are responsible for the traffic jams when the volume is heavy but there are no obvious obstructions to traffic.
Allow me to reproduce the authors’ description of the model since it is concise and elegant:
“Our computational model is defined on a one-dimensional array of L sites and with open or periodic boundary conditions. Each site may either be occupied by one vehicle, or it may be empty. Each vehicle has an integer velocity with values between zero and vmax. For an arbitrary configuration, one update of the system consists of the following four consecutive steps, which are performed in parallel for all vehicles:
1) Acceleration: if the velocity v of a vehicle is lower than vmax and if the distance to the next car ahead is larger than v + 1, the speed is increased by one.
2) Slowing down (due to other cars): if a vehicle at site i sees the next vehicle at site i + j (with j ≤ v), it reduces its speed to j − 1.
3) Randomization: with probability p, the velocity of each vehicle (if greater than zero) is decreased by one.
4) Car motion: each vehicle is advanced v sites.”
Without the randomizing step 3) the motion is deterministic: “every initial configuration of vehicles and corresponding velocities reaches very quickly a stationary pattern which is shifted backwards (i.e. opposite the vehicle motion) one site per time step.”
The model exhibits the congestion phenomenon when the mean spacing between the cars is smaller than vmax.
Below are the links to the simulations of the model for a circular track with 100 lattice sites. The cars are colored circles which move along the track. It helps to follow a particular color car with your eyes to see what’s happening.
The two simulations are done with 15 cars (density lower than critical) and with 23 cars (above the critical density–exhibits congestion). As you probably guessed, vmax=5 in these simulations, hence 20 cars correspond to the critical density. The probability of random slowing down is 10% per turn.
The second simulation (above the critical density) shows the development of a jam of 5 cars. Cars zoom around the track and then spend 5 turns not moving at all, before the traffic clears ahead of them and they can accelerate to full velocity again.
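For those who want to play with the model, here is a minimal sketch of one parallel update on a circular track. It is not the code behind the linked simulations. The acceleration and slowing-down steps are written in the common compact form (speed up by one, then cap the speed at the number of empty sites ahead), and the names `nasch_step`, `mean_speed` and `p_slow` are hypothetical. The defaults mirror the parameters quoted above: 100 sites, vmax = 5, and a 10% chance of random slowdown.

```python
import random


def nasch_step(pos, vel, n_sites, v_max, p_slow, rng):
    """One parallel update of all cars on a circular track."""
    n = len(pos)
    order = sorted(range(n), key=lambda i: pos[i])   # cars in track order
    new_vel = [0] * n
    for k, i in enumerate(order):
        ahead = order[(k + 1) % n]
        gap = (pos[ahead] - pos[i] - 1) % n_sites    # empty sites in front
        v = min(vel[i] + 1, v_max)                   # acceleration
        v = min(v, gap)                              # slowing down (no collisions)
        if v > 0 and rng.random() < p_slow:          # random slowdown
            v -= 1
        new_vel[i] = v
    new_pos = [(pos[i] + new_vel[i]) % n_sites for i in range(n)]  # car motion
    return new_pos, new_vel


def mean_speed(n_cars, n_sites=100, v_max=5, p_slow=0.1, steps=2000, seed=0):
    """Average speed per car per turn, starting from random positions at rest."""
    rng = random.Random(seed)
    pos = rng.sample(range(n_sites), n_cars)
    vel = [0] * n_cars
    total = 0
    for _ in range(steps):
        pos, vel = nasch_step(pos, vel, n_sites, v_max, p_slow, rng)
        total += sum(vel)
    return total / (steps * n_cars)


for n_cars in (15, 23):
    print(f"{n_cars} cars: mean speed {mean_speed(n_cars):.2f} sites per turn")
```

Since a car can never move further than the gap in front of it, the mean speed with 23 cars on 100 sites cannot exceed 77/23, or about 3.3 sites per turn, no matter how the jams arrange themselves.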
The moral of the story? People like you and me can be the cause of traffic congestion!
This post is a digression. It’s not about constructing mathematical models of real world phenomena. It’s about sequences of numbers like the famous Fibonacci sequence and the Android pattern unlock screen shown here.
The rationale for a graphical pattern instead of a numeric PIN is clear. Humans are much better at remembering patterns than strings of digits. But is the pattern as secure? Other issues aside, such as traceability of fingerprint smudges that allows an attacker to recover the pattern, are there as many patterns as there are numeric PINs of the same length?
Well that’s where understanding recursion relations is useful. A recursion relation defines a sequence of numbers $a_n$ by a relationship between $a_n$ and all preceding sequence members. For example the Fibonacci sequence is defined via
$$F_n = F_{n-1} + F_{n-2}.$$
Depending on the type of recursion relation (whether it is linear, for example), there are a variety of methods for solving them. Let’s use the power of recursion to find the number of possible Android unlock patterns that connect $N$ dots.
To make the problem seemingly more complicated (but really simpler) let’s separately compute the number of patterns connecting $N$ dots that start at the center, side and corner of the 3×3 grid of dots. Let’s call those numbers $C_N$, $S_N$ and $K_N$. The total number of unlock patterns is
$$T_N = C_N + 4 S_N + 4 K_N$$
because there are 4 corner and 4 side dots in the 3×3 grid.
Let’s now derive a recursion relation for $C_N$, $S_N$ and $K_N$. The first step in the program is to note that when the pattern consists of only one dot (that would not be a very secure pattern) we trivially get $C_1 = 1$, $S_1 = 1$ and $K_1 = 1$.
Now let’s imagine that we managed to compute $C_{N-1}$, $S_{N-1}$ and $K_{N-1}$ for some $N > 1$. We can then quickly compute $C_N$ by noticing that the very first link that is made from the center dot can only go to a side dot (4 possible ways) or to a corner dot (also in 4 possible directions). Therefore
$$C_N = 4 S_{N-1} + 4 K_{N-1}.$$
When starting from the side dot, there are 4 ways to get to a corner dot, one way to get to the center dot and 2 ways to get to other side dots, therefore
$$S_N = C_{N-1} + 2 S_{N-1} + 4 K_{N-1}.$$
And finally when starting from a corner dot, there are 4 ways to get to a side dot and one way to get to the center dot, thus
$$K_N = C_{N-1} + 4 S_{N-1}.$$
The above three relationships can be summarized as
$$\mathbf{v}_N = A \, \mathbf{v}_{N-1} \qquad (1)$$
using matrix notation by defining a vector and a matrix
$$\mathbf{v}_N = \begin{pmatrix} C_N \\ S_N \\ K_N \end{pmatrix}, \qquad A = \begin{pmatrix} 0 & 4 & 4 \\ 1 & 2 & 4 \\ 1 & 4 & 0 \end{pmatrix}.$$
Hang on tight, we are almost there. The simple recursion relation (1) means that we can obtain the numbers of patterns connecting N dots from those for N-1 dots by just applying the matrix A. Applying this reasoning recursively we arrive at
$$\mathbf{v}_N = A^{N-1} \, \mathbf{v}_1.$$
I won’t bore you with the rest of the calculation which is quite routine. As my eccentric math tutor used to say: “After this point it’s just algebra!”
Here is the result of the calculation:
| N | Number of patterns connecting N dots | Number of numeric PINs of N digits |
|---|---|---|
Clearly, the number of patterns grows slower than the number of numeric PINs. This may not matter, however, if patterns are easier to remember and therefore one can comfortably have a longer pattern.
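The counting rules spelled out above (from the center: 4 sides and 4 corners; from a side: the center, 2 sides and 4 corners; from a corner: the center and 4 sides) are easy to iterate directly. The sketch below does just that. The function name `count_patterns` is mine, and, like the recursion itself, the code does not forbid revisiting a dot, so it overcounts the patterns a real phone accepts.

```python
def count_patterns(n_dots):
    """Number of patterns connecting n_dots dots under the rules above.

    Dots may be revisited here, exactly as in the recursion in the text.
    """
    center, side, corner = 1, 1, 1        # patterns consisting of a single dot
    for _ in range(n_dots - 1):
        center, side, corner = (
            4 * side + 4 * corner,            # first link leaves the center
            center + 2 * side + 4 * corner,   # first link leaves a side dot
            center + 4 * side,                # first link leaves a corner dot
        )
    return center + 4 * side + 4 * corner    # 1 center, 4 sides, 4 corners


for n in range(1, 7):
    print(n, count_patterns(n), 10 ** n)
```

Each printed row shows the number of dots, the pattern count, and the number of numeric PINs of the same length.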
The only sure way to determine whether it is indeed easier to remember patterns is through careful experiments with human subjects. Until such data are available, all statements about the memorability and security of patterns vs. PINs are pure speculation…
The restriction that a dot cannot be used more than once in a pattern *dramatically* reduces the number of allowed patterns. Please see comments for the true number of unlock patterns.
What would you do with a time machine? I bet some people would be chomping at the bit to pit two dominant teams from different eras against each other and have a grand old spectacle!
But alas, it is safe to say that a time machine will remain for the foreseeable future in the realm of magic.
Can we get a glimpse at what the outcome of such a magical game might be? Is there a scientifically sound way to rate sports teams in a way that judges their true strength? Most importantly, we need a method that yields ratings whose scale does not change with time, so that a team that got a rating of 2000 thirty years ago is as strong (in some sense) as a team that gets a rating of 2000 today.
We are indeed in luck! Such a system exists. It was proposed in the 1950’s by the Hungarian-born physicist Arpad Elo (read about him on Wikipedia) and bears his name. His system is based on sound mathematical theory, and since then dozens upon dozens of mathematical papers have shown how reliable and reasonable the system is. Although Elo originally proposed his system to rate chess players (it is used by FIDE), it has been adopted by a number of other sports bodies including FIFA, MLB, EGF and others.
At the core of the Elo system is the rating updating scheme which adjusts the ratings of the two teams (or players) after each match depending on the result. Given the ratings before the game, one can compute the probability of each outcome, assuming that the actual performance follows a certain probability distribution. If the stronger team wins, its rating increases by a smaller amount than the weaker team’s rating would if the weaker team won. There are many different specific incarnations of the system. While some are more accurate than others, even in its simplest form the system is quite useful. In fact, using publicly available match data we can resolve the question:
If 1997 Chicago Bulls played a best of 7 series against the 1986 Boston Celtics, what are the chances of each team winning?
After downloading the match data (56,467 games over 64 years that involved a total of 53 franchises, some of which changed names and cities a number of times) and computing the rating history, I came up with the top ten highest rated franchises:
| Rank | Franchise | Year | Rating |
|---|---|---|---|
| 3 | Los Angeles Lakers | 1988 | 2163.3 |
| 8 | San Antonio Spurs | 2007 | 2089.4 |
It is a telling sign that the NBA is a competitively healthy organization that the top 10 highest rated teams of all time are pretty close to each other in rating. Also, it seems, at least superficially, that there is no historical bias, meaning that the objective meaning of a rating does not change with time.
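To make the updating scheme concrete, here is a sketch of the simplest textbook variant of the Elo update. It is not necessarily the variant used to produce the ratings above. The logistic expected-score formula with a scale of 400 is the standard chess convention, while the K-factor of 20 and the function names are assumptions chosen for illustration.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a, rating_b, a_won, k_factor=20.0):
    """Return the new ratings of A and B after one game (no draws)."""
    expected_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = k_factor * (score_a - expected_a)   # what A gains, B loses
    return rating_a + delta, rating_b - delta


# A favourite gains little by winning; an underdog gains a lot.
print(update_ratings(2100.0, 2000.0, a_won=True))
print(update_ratings(2000.0, 2100.0, a_won=True))
```

Because one team’s gain is the other’s loss, the sum of the two ratings is conserved in every game under this variant.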
So, what would happen if the 1997 Bulls played a best of 7 series against the 1986 Celtics?
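Home field advantage aside (the rating system I am using does not take that into account), let $p$ be the probability of the Bulls winning any particular game, which follows from the ratings of the two teams. The probability of winning a best of 7 series (below $q = 1 - p$) is
$$P = p^4 \left(1 + 4q + 10q^2 + 20q^3\right).$$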
The Bulls would have a 57.5% chance of winning the series: an exciting spectacle indeed!
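The step from a single-game probability to a series probability is plain combinatorics: the series winner takes the deciding fourth win after losing 0, 1, 2 or 3 games along the way. The sketch below assumes independent games with a fixed single-game win probability. The function name `series_win_probability` is hypothetical, and the sample values of `p` are for illustration only, not the ones obtained from the ratings.

```python
from math import comb


def series_win_probability(p, wins_needed=4):
    """Chance of winning a first-to-wins_needed series (best of 7 by default)."""
    q = 1.0 - p
    # The last game is always a win; the k losses can fall anywhere
    # among the games that come before it.
    return sum(comb(wins_needed - 1 + k, k) * p ** wins_needed * q ** k
               for k in range(wins_needed))


for p in (0.50, 0.55, 0.60):
    print(f"single game {p:.2f} -> series {series_win_probability(p):.3f}")
```

A best of 7 series amplifies a small per-game edge, which is why a modest rating difference can still translate into a noticeably lopsided series.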
Finally I leave you with a graph of the historical ratings of six teams from large metropolitan areas from 1980 to present day. It seems that it is extremely difficult to maintain a dominant team for more than a few seasons (although the Lakers managed to do so in the 1980’s).