A neat probability rule-of-thumb

Disclaimer: I am NOT a probabilist. Not only have I never taught probability, the last time I took a course in probability was in my sophomore year of college. So if this is well known (or totally wrong), forgive me.

A non-mathematician friend of mine shared this link with me. It compares the lifetime risk of dying by various means: cancer, heart disease, shark attack, etc. There are many problems with the analysis presented on this web page. For example, you are not equally likely to die from the flu in each of the 77.6 years of an average lifespan, and conditional probability would be a more useful measure of risk for some of these causes. But I will ignore all of that. I want to focus on the last line. It says:

Lifetime risk is calculated by dividing 2003 population (290,850,005) by the number of deaths, divided by 77.6, the life expectancy of a person born in 2003.

For example, for drowning the risk is 1 in 290850005/(3306\cdot 77.6)=1133.7

Stated another way, they are claiming that if D people die each year from a given cause, the total population is P, and the life expectancy is L, then the probability of dying from the given cause is DL/P. I saw this and I thought, “Surely this is wrong. Why would that formula give the probability?”

So I tried to calculate it myself. Here is my back-of-the-envelope calculation. The chance of dying from this cause in one year is D/P, so the chance of not dying from it in one year is 1-D/P. Treating the years as independent, the chance of not dying from this cause for L years is (1-D/P)^L, and so the chance of dying from the cause in L years is 1-(1-D/P)^L. (Of course, this leaves open the possibility of dying several times in those L years, but we’ll ignore that.)

Let’s use this formula with the drowning example. I get 1-(1-3306/290850005)^{77.6}=0.000881671\ldots, or 1 in 1134.2.
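For anyone who wants to check the arithmetic, here is a minimal Python sketch of the two calculations. The numbers are the ones quoted above; the function names are my own, made up for illustration.

```python
# Figures quoted in the post: 2003 population, annual drowning deaths,
# and life expectancy in years.
P = 290_850_005
D = 3306
L = 77.6


def lifetime_risk_linear(deaths, population, years):
    """The web page's formula: DL/P."""
    return deaths * years / population


def lifetime_risk_compound(deaths, population, years):
    """The back-of-the-envelope formula: 1 - (1 - D/P)^L."""
    return 1 - (1 - deaths / population) ** years


linear = lifetime_risk_linear(D, P, L)
compound = lifetime_risk_compound(D, P, L)

# Report each risk as "1 in N", the way the web page does.
print(f"DL/P:            1 in {1 / linear:.1f}")
print(f"1 - (1 - D/P)^L: 1 in {1 / compound:.1f}")
```

The two printed values should agree with the 1133.7 and 1134.2 above.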

What?!?! I was shocked to see an answer almost identical to the one using the “wrong” technique. There must be more to this than I first thought. Let’s look a little closer.

First, notice that 1-(1-D/P)^L=1-((1+(-D)/P)^P)^{L/P}. Sitting inside this expression is a sub-expression that looks a lot like the limit definition of e^x. In particular, because P is a large number, this expression is very nearly 1-(e^{-D})^{L/P}=1-e^{-DL/P}. Aha! There’s the DL/P term! But we still don’t quite have what we want.

What we’ve shown is that if the probability someone dies of a given cause in one year is x, then the probability that they will die from it in L years is approximately 1-e^{-Lx}. Now suppose the probability x is small (like the probability of dying by drowning). We will compute the linear approximation to this function at x=0. We see that d(1-e^{-Lx})/dx=Le^{-Lx}. At x=0, that derivative is L. So the linear approximation at x=0 is simply Lx. In particular, if we evaluate it at our specific annual probability value D/P, we obtain DL/P. And there it is! [Update: thank you to the commenters for pointing out that the introduction of the exponential function, while fine, is unnecessary. Quicker: just use the linear approximation for 1-(1-x)^L at x=0.]
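The linear approximation is only good when Lx is small, and a quick numerical comparison makes that visible. This is a rough sketch: the function name and the extra sample values of x are my own choices for illustration, not anything from the web page.

```python
import math


def lifetime_risk_estimates(x, years):
    """Return (exact, exponential, linear) estimates for annual risk x."""
    exact = 1 - (1 - x) ** years             # 1 - (1 - x)^L
    exponential = 1 - math.exp(-years * x)   # 1 - e^{-Lx}
    linear = years * x                       # Lx
    return exact, exponential, linear


L = 77.6
# The drowning rate D/P from the post, then some larger made-up annual risks.
for x in (3306 / 290_850_005, 1e-4, 1e-3, 1e-2):
    exact, exponential, linear = lifetime_risk_estimates(x, L)
    print(f"x={x:.3e}  exact={exact:.6f}  exp={exponential:.6f}  linear={linear:.6f}")
```

For the drowning rate the three columns are nearly identical. By the time the annual risk reaches 1 in 100, the linear estimate Lx is roughly 0.78 while the compound formula gives roughly 0.54, so the rule of thumb really is a small-probability rule.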

Again, I’ve never seen this before. Perhaps it is well known. For example, maybe it is a good rule-of-thumb that all good actuaries know.

I’d be happy to hear people’s thoughts about this formula and my reasoning. Maybe there’s another, different way to see this.

[I’d like to thank my colleague Jeff Forrester for talking through this with me.]


  1. dwees says:

    Here’s an interesting follow up. If the actuaries have to assign a dollar value to a policy based on one of these probabilities, and they have 20,000,000 people signed up to a policy for 50 years each, how much money do they gain or lose depending on which approximation they use to calculate this probability?

  2. Yan Zhang says:

    Everything you’ve done is correct – here’s an equivalent though maybe mentally quicker way:

    (1-x)^t = 1 – (t choose 1) x + (t choose 2) x^2 + … etc. (binomial theorem)

    The interesting thing here is that your x is D/P, which is typically a small number, in your particular case on the order of 10^{-5}. Thus, by the time you get to x^2 you’re already facing a 10^{-10} multiplier, whereas your t=L is only going to give you about 100 at most.

    Therefore 1-(1-x)^t is roughly 1 – (1-tx) = tx. Here your t=L and x = D/P, so you get your desired thing.

    @dwees: since 10^{-10}*100 = 10^{-8}, and you have about 2*10^7 people, I wouldn’t expect your total $ to be off by more than a couple of orders of magnitude from a dollar (to be on the safe side I might go with 100 dollars; I hope I’m not totally wrong about this, since I really need to sleep).

  3. Dude says:

    If x is small, (1-x)^n = 1 – x*n + O(x^2).

    So, (1-D/P)^L can be approximated by 1 – L*D/P. Your expression follows.

  4. Thanks, Yan and Dude. You’re right. The introduction of the exponential function is unnecessary.

  5. Very well known result, and quite useful. I’m surprised your sophomore course didn’t include it.

    1. Well, maybe we did learn it. But that was 20 years ago…
