Performance

All benchmarks were run on an Arch Linux x86_64 machine with an Intel(R) Core(TM) i7 CPU (1.90GHz).

Empirical likelihood computation

We illustrate the performance of computing empirical likelihood with el_mean(). We measure the computation speed with simulated data sets in two settings: 1) the number of observations increases while the number of parameters is fixed, and 2) the number of parameters increases while the number of observations is fixed.

Increasing the number of observations

We fix the number of parameters at p = 10 and simulate the parameter value and the n × p matrices using rnorm(). To ensure convergence for large n, we set a large threshold value using el_control().

library(ggplot2)
library(melt)
library(microbenchmark)
set.seed(3175775)
p <- 10
par <- rnorm(p, sd = 0.1)
# Use a large threshold so the optimization does not stop early for large n
ctrl <- el_control(th = 1e+10)
result <- microbenchmark(
  n1e2 = el_mean(matrix(rnorm(100 * p), ncol = p), par = par, control = ctrl),
  n1e3 = el_mean(matrix(rnorm(1000 * p), ncol = p), par = par, control = ctrl),
  n1e4 = el_mean(matrix(rnorm(10000 * p), ncol = p), par = par, control = ctrl),
  n1e5 = el_mean(matrix(rnorm(100000 * p), ncol = p), par = par, control = ctrl)
)

Below are the results:

result
#> Unit: microseconds
#>  expr        min         lq       mean      median         uq        max neval
#>  n1e2    441.173    483.036    521.783    508.2485    558.602    634.944   100
#>  n1e3   1208.063   1414.458   1559.677   1493.6010   1620.021   5489.108   100
#>  n1e4  10956.897  13235.302  14950.700  15245.2465  16225.660  20950.598   100
#>  n1e5 175109.832 196507.813 242930.996 234921.4950 263814.948 401274.523   100
#>  cld
#>  a  
#>  a  
#>   b 
#>    c
autoplot(result)

Increasing the number of parameters

This time we fix the number of observations at n = 1000 and evaluate empirical likelihood at zero vectors of different sizes.

n <- 1000
result2 <- microbenchmark(
  p5 = el_mean(matrix(rnorm(n * 5), ncol = 5),
    par = rep(0, 5),
    control = ctrl
  ),
  p25 = el_mean(matrix(rnorm(n * 25), ncol = 25),
    par = rep(0, 25),
    control = ctrl
  ),
  p100 = el_mean(matrix(rnorm(n * 100), ncol = 100),
    par = rep(0, 100),
    control = ctrl
  ),
  p400 = el_mean(matrix(rnorm(n * 400), ncol = 400),
    par = rep(0, 400),
    control = ctrl
  )
)
result2
#> Unit: microseconds
#>  expr        min         lq        mean     median         uq        max neval
#>    p5    745.561    784.338    865.1821    829.357    885.246   3756.006   100
#>   p25   2918.844   2989.011   3116.0017   3049.378   3086.312   7014.213   100
#>  p100  23608.566  26194.494  30011.3271  26665.306  31352.650 170932.300   100
#>  p400 270006.196 295298.972 329067.8237 318106.680 344939.244 455292.309   100
#>  cld
#>  a  
#>  a  
#>   b 
#>    c
autoplot(result2)

On average, evaluating empirical likelihood with a 100000 × 10 or a 1000 × 400 matrix at a parameter value satisfying the convex hull constraint takes less than a second on this machine.