Performance

All the tests were done on an Arch Linux x86_64 machine with an Intel(R) Core(TM) i7 CPU (1.90GHz).

Empirical likelihood computation

We show the performance of computing empirical likelihood with el_mean(). We test the computation speed with simulated data sets in two different settings: 1) the number of observations increases with the number of parameters fixed, and 2) the number of parameters increases with the number of observations fixed.

Increasing the number of observations

We fix the number of parameters at \(p = 10\), and simulate the parameter value and \(n \times p\) matrices using rnorm(). In order to ensure convergence with a large \(n\), we set a large threshold value using el_control().

library(ggplot2)
library(microbenchmark)
set.seed(3175775)
p <- 10
par <- rnorm(p, sd = 0.1)
ctrl <- el_control(th = 1e+10)
result <- microbenchmark(
  n1e2 = el_mean(matrix(rnorm(100 * p), ncol = p), par = par, control = ctrl),
  n1e3 = el_mean(matrix(rnorm(1000 * p), ncol = p), par = par, control = ctrl),
  n1e4 = el_mean(matrix(rnorm(10000 * p), ncol = p), par = par, control = ctrl),
  n1e5 = el_mean(matrix(rnorm(100000 * p), ncol = p), par = par, control = ctrl)
)

Below are the results:

result
#> Unit: microseconds
#>  expr        min         lq        mean     median          uq        max neval
#>  n1e2    336.459    416.823    462.5418    445.255    506.0955    631.787   100
#>  n1e3    994.035   1184.832   1304.1584   1271.085   1369.9865   4017.478   100
#>  n1e4   8963.925  10168.491  13310.1612  12518.407  13313.3855 123961.193   100
#>  n1e5 141636.297 156886.766 182604.0573 177529.987 211970.9835 295406.150   100
#>  cld
#>  a  
#>  a  
#>   b 
#>    c
autoplot(result)
#> Warning: `aes_string()` was deprecated in ggplot2 3.0.0.
#> ℹ Please use tidy evaluation idioms with `aes()`.
#> ℹ See also `vignette("ggplot2-in-packages")` for more information.
#> ℹ The deprecated feature was likely used in the microbenchmark package.
#>   Please report the issue at
#>   <https://github.com/joshuaulrich/microbenchmark/issues/>.
#> This warning is displayed once per session.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.

Increasing the number of parameters

This time we fix the number of observations at \(n = 1000\), and evaluate empirical likelihood at zero vectors of different sizes.

n <- 1000
result2 <- microbenchmark(
  p5 = el_mean(matrix(rnorm(n * 5), ncol = 5),
    par = rep(0, 5),
    control = ctrl
  ),
  p25 = el_mean(matrix(rnorm(n * 25), ncol = 25),
    par = rep(0, 25),
    control = ctrl
  ),
  p100 = el_mean(matrix(rnorm(n * 100), ncol = 100),
    par = rep(0, 100),
    control = ctrl
  ),
  p400 = el_mean(matrix(rnorm(n * 400), ncol = 400),
    par = rep(0, 400),
    control = ctrl
  )
)
result2
#> Unit: microseconds
#>  expr        min          lq        mean      median          uq        max
#>    p5    570.476    605.4985    658.0006    634.0155    713.0225    778.715
#>   p25   2283.788   2320.2015   2500.9673   2371.8330   2447.3895   6010.794
#>  p100  18035.900  20152.1035  22018.3248  20473.9360  24513.1465  37505.215
#>  p400 199887.492 220927.8280 250715.7387 238367.9740 269762.6575 428738.301
#>  neval cld
#>    100 a  
#>    100 a  
#>    100  b 
#>    100   c
autoplot(result2)

On average, evaluating empirical likelihood with a 100000×10 or 1000×400 matrix at a parameter value satisfying the convex hull constraint takes less than a second.