Yongfu's Blog

# Constructing Life Tables with R

Dec 11, 2017

I have been using the package dplyr to handle with data for a while, and I thought I can use it with ease until I was stuck with my homework on contructing a life table. I found spreadsheets (either Excel or Google Spreadsheets) easy for handling this task, but had a hard time dealing with it in R. I think it was due to my unfamiliarity with the built-in functions and insufficient practice in R. So, I wrote this post as a review and practice of my data-wrangling skills in R.

I will illustrate how I constructed a life table with R, and you’ll find out how easy it is (and wonder how could I stumble on it).

I used these packages to construct a Life table.

1library(readr)
2library(dplyr)
3library(knitr)


Load the csv file to life_table. The raw data contains 3 columns: Age, Survivorship at Age x ($l_x$), and Fecundity at Age x ($m_x$). I added options(scipen=999) to disable scientific notations.

1options(scipen=999)  # Disable Scientific Notation
3life_table

# A tibble: 10 x 3
Age        lx    mx
<int>     <dbl> <int>
1     0 1.0000000     0
2     1 0.0000620  4600
3     2 0.0000340  8700
4     3 0.0000200 11600
5     4 0.0000155 12700
6     5 0.0000110 12700
7     6 0.0000065 12700
8     7 0.0000020 12700
9     8 0.0000020 12700
10     9 0.0000000     0


## Variables to Construct

Here are the variables that need to be calculated.

StatisticNotationCalculation Formula
$l_x m_x$
$x l_x m_x$
$l_x m_x e^{-rx}$
Average survivorship
(age class)
$L_x$$L_x = (l_x + l_{x+1})/2$
Life expectancy$e_x$$e_x = (L_x + L_{x+1} + … + L_{max})/l_x$
Reproductive value$V_x$$\displaystyle \frac{\sum_{y=x}^{max\hspace{0.3mm}x} e^{-ry} l_y m_y}{e^{-rx} l_x} Net reproductive rateR_0$$\sum_{all \hspace{0.3mm} x} l_x m_x$
Generation time$G$$\frac{\sum_{all \hspace{0.3mm} x} x l_x m_x}{R_0} Intrinsic rate of increaseApproximate r$$r \approx \frac{ln(R_0)}{G}$
Intrinsic rate of increase(True) r$$\displaystyle \sum_{all \hspace{0.3mm} x} e^{-rx}l_x m_x = 1 l_xm_xe^{-rx} and V_x will be calculated twice for the approximate and the true r. ## Constructing Variables ### mutate: Creating new columns Using the pipe %>% and the function mutate in package dplyr, I first constructed 7 new variables. Note the dependencies of the variables, so that I couldn’t construct the life tables at once.  1life_table <- life_table %>% 2 mutate("lx*mx"=lx*mx, 3 "x*lx*mx"=Age*lx*mx, 4 "Lx"=(lx+lead(lx))/2, 5 "Lx"=replace(Lx, 10, 0), 6 "ex"=rev(cumsum(rev(Lx)))/lx, 7 "R0"=sum(lx*mx), 8 "G"=sum(Age*lx*mx)/R0, 9 "approx.r"=log(R0)/G 10 )  Two things worth noting in the mutate function: 1. The code "Lx"=(lx+lead(lx))/2, "Lx"=replace(Lx, 10, 0) • lead(lx) shifts the whole column of l_x to its next value, i.e. the column l_x becomes l_{(x+1)}. • Due to lead(lx), the last entry of the new column L_x must be a NA, so I have to assign 0 to it (otherwise all calculations based on it will become NAs). 2. The code "ex"=rev(cumsum(rev(Lx)))/lx (This is where I was stuck) 1. The numerator of ex is calculated by summing over L_x to L_{max}, the maximum age of L_x. This is not so intuitive when working with R. 2. rev(Lx) reverse the order of L_x, and cumsum() is for cummulative sum. cumsum(rev(Lx)) then is equivalent to summing L_x backwards (i.e. from L_{max} to L_x). 3. But since L_x is reversed in the first place, cumsum(rev(Lx)) is also in reverse order. Reversing cumsum(rev(Lx)) with rev() then gives what I want. ### Calculating r by while loop By the Eular-Lotka equation, I can calculate r.$$\displaystyle \sum_{all \hspace{0.3mm} x} e^{-rx}l_x m_x = 1Using while loop and Approximate r calculated earlier as the starting value for r, r is calculated as below.  1df <- as.data.frame(life_table) 2r <- 0.0812198 3x <- sum(exp(-r*dfAge)*dflx*mx) 4while (abs(x-1) >= 0.000001) { 5 if (x-1>0){ 6 r <- r+0.00000001 7 } 8 else{ 9 r <- r-0.00000001 10 } 11 x <- sum(exp(-r*dfAge)*dflx*mx) 12 r 13}  The rest is simple, and the logic applied is the same. 1life_table <- life_table %>% 2 mutate( 3 "approx.r"=log(R0)/G, 4 "r"=r, 5 "Vx"=rev(cumsum(rev(exp(-r*Age)*lx*mx)))/exp(-r*Age)*lx, 6 "lx*mx*e^-rx"= lx*mx*exp(-r*Age), 7 "approx.Vx"=rev(cumsum(rev(exp(-approx.r*Age)*lx*mx)))/exp(-approx.r*Age)*lx, 8 "approx.lx*mx*e^-rx"= lx*mx*exp(-approx.r*Age) 9 )  Now, I’m done constructing a life table. ## Printing pretty Life Table 1kable(life_table, format="markdown", align="c")  Agel_x$$m_x$$l_x m_x$$x l_x m_x$$L_x$$e_x$$R_0$$G$$Approximate
$r$
$r$$V_x$$l_x m_x e^{-rx}$$Approximate$
$V_x$
$Approximate$
$l_x m_x e^{-rx}$
01.000000000.000000.00000.50003100.5001531.28293.067270.08121980.08471171.00000100.00000001.00991030.0000000
10.000062046000.285200.28520.00004801.9677421.28293.067270.08121980.08471170.00006750.26203520.00006790.2629518
20.000034087000.295800.59160.00002702.1764711.28293.067270.08121980.08471170.00002970.24970000.00002990.2514499
30.0000200116000.232000.69600.00001782.3500001.28293.067270.08121980.08471170.00001260.17993620.00001260.1818310
40.0000155127000.196850.78740.00001321.8870971.28293.067270.08121980.08471170.00000670.14027370.00000670.1422467
50.0000110127000.139700.69850.00000871.4545461.28293.067270.08121980.08471170.00000280.09146340.00000280.0930743
60.0000065127000.082550.49530.00000421.1153851.28293.067270.08121980.08471170.00000080.04965670.00000080.0507081
70.0000020127000.025400.17780.00000201.5000001.28293.067270.08121980.08471170.00000010.01403800.00000010.0143853
80.0000020127000.025400.20320.00000100.5000001.28293.067270.08121980.08471170.00000010.01289780.00000010.0132632
90.000000000.000000.00000.0000000NaN1.28293.067270.08121980.08471170.00000000.00000000.00000000.0000000

Note that I edited the column names of the table in a text editor to make it display in LaTeX style. There is no simple knitr function (at least I don’t know) that print out pretty displayed style text in a table generated from a data frame