library(data.table)
q <- c(0.001, 0.002, 0.003, 0.003, 0.004, 0.004, 0.005, 0.007, 0.009, 0.011)  # annual mortality rates
w <- c(0.05, 0.07, 0.08, 0.10, 0.14, 0.20, 0.20, 0.20, 0.10, 0.04)            # annual lapse rates
P <- 100     # annual premium per policy
S <- 25000   # death benefit
r <- 0.02    # flat discount rate
dt <- data.table(q, w)
npv <- function(cf, r, S, P) {
  cf[, inforce := shift(cumprod(1 - q - w), fill = 1)  # policies in force at the start of each year
  ][, lapses := inforce * w                            # expected lapses during the year
  ][, deaths := inforce * q                            # expected deaths during the year
  ][, claims := deaths * S                             # expected death benefit outgo
  ][, premiums := inforce * P                          # expected premium income
  ][, ncf := premiums - claims                         # net cashflow for the year
  ][, d := (1/(1+r))^(.I)                              # end-of-year discount factors
  ][, sum(ncf * d)]                                    # present value of the net cashflows
}
npv(dt, r, S, P)
#> [1] 50.32483
microbenchmark::microbenchmark(npv(dt, r, S, P))
#> Unit: milliseconds
#>              expr    min      lq     mean  median      uq    max neval
#>  npv(dt, r, S, P) 2.5791 2.71035 2.964293 2.85625 3.10385 6.0357   100
Created on 2021-01-15 by the reprex package (v0.3.0)
Yes and no. You could technically byte-compile the base R code to eke out a little more performance, and data.table carries a fair amount of overhead for such a small amount of data. My hunch is that the data.table version scales better once you get into millions and millions of rows.
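For comparison, here is a rough base R sketch of the same calculation (npv_base and its internals are my own naming, not from the original code). It should return the same value as the data.table version, and compiler::cmpfun is one way to byte-compile it, although since R 3.4 the JIT compiler does that for you by default.

npv_base <- function(q, w, S, P, r) {
  inforce <- c(1, head(cumprod(1 - q - w), -1))   # in force at the start of each year
  ncf <- inforce * P - inforce * q * S            # premiums less death claims
  sum(ncf * (1 / (1 + r))^seq_along(ncf))         # discount and sum
}
npv_base(q, w, S, P, r)
npv_cmp <- compiler::cmpfun(npv_base)             # byte-compiled variant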
It's bespoke to this problem and not very readable; it could be cleaned up with proper variable naming, etc. There is also something to be said for generalizing and splitting the code up: I would separate the calculation of the contingent cashflows and of the discount curve from the NPV itself, roughly along the lines of the sketch below.
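A minimal sketch of that split, with function and column names of my own choosing:

cashflows <- function(q, w, S, P) {                          # expected net cashflow by policy year
  data.table(q, w)[
    , inforce := shift(cumprod(1 - q - w), fill = 1)
  ][, .(year = .I, ncf = inforce * P - inforce * q * S)]
}
discount_curve <- function(years, r) (1 / (1 + r))^years     # flat-rate discount factors
npv2 <- function(cf, r) cf[, sum(ncf * discount_curve(year, r))]
npv2(cashflows(q, w, S, P), r)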
My intention was to show a vectorized approach to the “life problem”.