Title: | High-Dimensional Cure Models |
---|---|
Description: | Provides functions for fitting various penalized parametric and semi-parametric mixture cure models with different penalty functions, testing for a significant cure fraction, and testing for sufficient follow-up as described in Fu et al (2022) <doi:10.1002/sim.9513> and Archer et al (2024) <doi:10.1186/s13045-024-01553-6>. False discovery rate controlled variable selection is provided using model-X knockoffs. |
Authors: | Han Fu [aut], Kellie J. Archer [aut, cre] (ORCID: <https://orcid.org/0000-0003-1555-5781>), Tung Lam Nguyen [rev] (Reviewed the package for ROpenSci), Panagiotis Papastamoulis [rev] (Reviewed the package for ROpenSci) |
Maintainer: | Kellie J. Archer <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.0.5 |
Built: | 2025-10-02 12:44:03 UTC |
Source: | https://github.com/ropensci/hdcuremodels |
Duration of complete response for 40 cytogenetically normal AML patients together with normalized expression of a subset of 320 transcripts from RNA-sequencing.
amltest
A data frame with 40 rows (subjects) and 322 columns:
duration of complete response in years
censoring indicator: 1 = relapsed or died; 0 = alive at last follow-up
remaining 320 columns: normalized expression for the indicated transcript
doi:10.1186/s13045-024-01553-6
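A brief sketch of loading and inspecting the dataset (the indexing below simply assumes the column order described above, with the response duration and censoring indicator in the first two columns):
library(hdcuremodels)
data(amltest)
dim(amltest)          # 40 subjects by 322 columns
str(amltest[, 1:4])   # response duration, censoring indicator, then transcript columns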
Duration of complete response for 306 cytogenetically normal AML patients together with normalized expression of a subset of 320 transcripts from RNA-sequencing.
amltrain
A data frame with 306 rows (subjects) and 322 columns:
duration of complete response in years
censoring indicator: 1 = relapsed or died; 0 = alive at last follow-up
remaining 320 columns: normalized expression for the indicated transcript
doi:10.1186/s13045-024-01553-6
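As a hedged illustration of a typical first look at the training data (again assuming the column order described above, with the response duration first and the censoring indicator second), a Kaplan-Meier estimate of the duration of complete response can be obtained with:
library(survival)
library(hdcuremodels)
data(amltrain)
# first column = duration of complete response in years, second = censoring indicator
km <- survfit(Surv(amltrain[[1]], amltrain[[2]]) ~ 1)
plot(km, xlab = "Years since complete response", ylab = "Relapse-free survival probability")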
This function calculates the AUC for cure prediction using the mean score imputation (MSI) method proposed by Asano et al (2014).
auc_mcm(object, newdata, cure_cutoff = 5, model_select = "AIC")
object |
a mixturecure model object fit using curegmifs, cureem, cv_curegmifs, or cv_cureem. |
newdata |
an optional data.frame that minimally includes the incidence and/or latency variables to use for predicting the response. If omitted, the training data are used. |
cure_cutoff |
cutoff value for cure, used to produce a proxy for the unobserved cure status (default is 5 representing 5 years). Users should be careful to note the time scale of their data and adjust this according to the time scale and clinical application. |
model_select |
either a case-sensitive parameter for models fit using curegmifs or cureem, specifying the criterion used to select a step along the solution path (for example, "AIC" or "cAIC"; default is "AIC").
This option has no effect for objects fit using cv_curegmifs or cv_cureem. |
Returns the AUC value for cure prediction using the mean score imputation (MSI) method.
Asano, J., Hirakawa, H., Hamada, C. (2014) Assessing the prediction accuracy of cure in the Cox proportional hazards cure model: an application to breast cancer data. Pharmaceutical Statistics, 13:357–363.
library(survival)
withr::local_seed(1234)
temp <- generate_cure_data(n = 100, j = 10, n_true = 10, a = 1.8)
training <- temp$training
testing <- temp$testing
fit <- curegmifs(Surv(Time, Censor) ~ ., data = training, x_latency = training,
  model = "weibull", thresh = 1e-4, maxit = 2000, epsilon = 0.01, verbose = FALSE)
auc_mcm(fit, model_select = "cAIC")
auc_mcm(fit, newdata = testing)
coef.mixturecure extracts the model coefficients from a fitted mixturecure model object fit using curegmifs, cureem, cv_curegmifs, or cv_cureem.
## S3 method for class 'mixturecure'
coef(object, model_select = "AIC", ...)
object |
a mixturecure model object fit using curegmifs, cureem, cv_curegmifs, or cv_cureem. |
model_select |
either a case-sensitive parameter for models fit using curegmifs or cureem, specifying the criterion used to select a step along the solution path (for example, "AIC" or "cAIC"; default is "AIC").
This option has no effect for objects fit using cv_curegmifs or cv_cureem. |
... |
other arguments. |
rate |
estimated rate parameter when fitting a Weibull or exponential mixture cure model. |
shape |
estimated shape parameter when fitting a Weibull mixture cure model. |
b0 |
estimated intercept for the incidence portion of the mixture cure model. |
beta_inc |
the vector of coefficient estimates for the incidence portion of the mixture cure model. |
beta_lat |
the vector of coefficient estimates for the latency portion of the mixture cure model. |
p_uncured |
a vector of probabilities from the incidence portion of the fitted model representing the P(uncured). |
curegmifs, cureem, summary.mixturecure, plot.mixturecure, predict.mixturecure
library(survival)
withr::local_seed(1234)
temp <- generate_cure_data(n = 100, j = 10, n_true = 10, a = 1.8)
training <- temp$training
fit <- curegmifs(Surv(Time, Censor) ~ ., data = training, x_latency = training,
  model = "weibull", thresh = 1e-4, maxit = 2000, epsilon = 0.01, verbose = FALSE)
coef(fit)
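As a usage note, coefficients at a different point along the solution path can be extracted by changing model_select; "cAIC" below is one of the criteria already used in the examples on these pages:
coef(fit, model_select = "cAIC")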
This function calculates the C-statistic using the cure status weighting (CSW) method proposed by Asano and Hirakawa (2017).
concordance_mcm(object, newdata, cure_cutoff = 5, model_select = "AIC")
object |
a mixturecure model object fit using curegmifs, cureem, cv_curegmifs, or cv_cureem. |
newdata |
an optional data.frame that minimally includes the incidence and/or latency variables to use for predicting the response. If omitted, the training data are used. |
cure_cutoff |
cutoff value for cure, used to produce a proxy for the unobserved cure status (default is 5 representing 5 years). Users should be careful to note the time scale of their data and adjust this according to the time scale and clinical application. |
model_select |
either a case-sensitive parameter for models fit using curegmifs or cureem, specifying the criterion used to select a step along the solution path (for example, "AIC" or "cAIC"; default is "AIC").
This option has no effect for objects fit using cv_curegmifs or cv_cureem. |
Returns the value of the C-statistic for the mixture cure model.
Asano, J. and Hirakawa, H. (2017) Assessing the prediction accuracy of a cure model for censored survival data with long-term survivors: Application to breast cancer data. Journal of Biopharmaceutical Statistics, 27:6, 918–932.
library(survival)
withr::local_seed(1234)
temp <- generate_cure_data(n = 100, j = 10, n_true = 10, a = 1.8)
training <- temp$training
testing <- temp$testing
fit <- curegmifs(Surv(Time, Censor) ~ ., data = training, x_latency = training,
  model = "weibull", thresh = 1e-4, maxit = 2000, epsilon = 0.01, verbose = FALSE)
concordance_mcm(fit, model_select = "cAIC")
concordance_mcm(fit, newdata = testing, model_select = "cAIC")
Estimates the cured fraction using a Kaplan-Meier fitted object.
cure_estimate(object)
object |
a survfit object, such as a Kaplan-Meier curve fit using survfit. |
estimated proportion of cured observations
survfit, sufficient_fu_test, nonzerocure_test
library(survival)
withr::local_seed(1234)
temp <- generate_cure_data(n = 100, j = 10, n_true = 10, a = 1.8)
training <- temp$training
km_fit <- survfit(Surv(Time, Censor) ~ 1, data = training)
cure_estimate(km_fit)
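Building on the same km_fit object, the related tests listed under See Also can be applied to the Kaplan-Meier fit. This is a minimal sketch that assumes both functions accept the survfit object as their first argument; see their help pages for additional arguments:
nonzerocure_test(km_fit)    # test for a significant (non-zero) cure fraction
sufficient_fu_test(km_fit)  # test for sufficient follow-up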
Fits penalized parametric and semi-parametric mixture cure models (MCM) using the E-M algorithm with user-specified penalty parameters. The lasso (L1), MCP, and SCAD penalties are supported for the Cox MCM, while only the lasso is currently supported for parametric MCMs.
cureem(formula, data, subset, x_latency = NULL, model = c("cox", "weibull", "exponential"),
  penalty = c("lasso", "MCP", "SCAD"), penalty_factor_inc = NULL, penalty_factor_lat = NULL,
  thresh = 0.001, scale = TRUE, maxit = NULL, inits = NULL, lambda_inc = 0.1,
  lambda_lat = 0.1, gamma_inc = 3, gamma_lat = 3, na.action = na.omit, ...)
formula |
an object of class "formula": a symbolic description of the model to be fitted. The response must be a survival object as returned by Surv, and the right-hand side specifies the covariates included in the incidence portion of the model. |
data |
a data.frame in which to interpret the variables named in the formula or subset arguments. |
subset |
an optional expression indicating the subset of observations to use in the fitting process; use a numeric or factor variable (not a character variable) in subset. All observations are included by default. |
x_latency |
specifies the variables to be included in the latency portion of the model and can be either a matrix of predictors, a model formula with the right-hand side specifying the latency variables, or the same data.frame passed to the data argument. |
model |
type of regression model to use for the latency portion of the mixture cure model. Can be "cox", "weibull", or "exponential" (default is "cox"). |
penalty |
type of penalty function. Can be "lasso", "MCP", or "SCAD" (default is "lasso"). |
penalty_factor_inc |
vector of binary indicators representing the penalty to apply to each incidence coefficient: 0 implies no shrinkage and 1 implies shrinkage. If not supplied, 1 is applied to all incidence variables. |
penalty_factor_lat |
vector of binary indicators representing the penalty to apply to each latency coefficient: 0 implies no shrinkage and 1 implies shrinkage. If not supplied, 1 is applied to all latency variables. |
thresh |
small numeric value. The iterative process stops when the differences between successive expected penalized complete-data log-likelihoods for both incidence and latency components are less than this specified level of tolerance (default is 10^-3). |
scale |
logical, if TRUE the predictors are centered and scaled. |
maxit |
integer specifying the maximum number of passes over the data
for each lambda. If not specified, 100 is applied when
|
inits |
an optional list specifying the initial values. Penalized coefficients are initialized to zero. If not supplied or improperly supplied, initialization is automatically provided by the function. |
lambda_inc |
numeric value for the penalization parameter lambda for the incidence portion of the model (default is 0.1). |
lambda_lat |
numeric value for the penalization parameter lambda for the latency portion of the model (default is 0.1). |
gamma_inc |
numeric value for the penalization parameter gamma for the incidence portion of the model, used when penalty = "MCP" or "SCAD" (default is 3). |
gamma_lat |
numeric value for the penalization parameter gamma for the latency portion of the model, used when penalty = "MCP" or "SCAD" (default is 3). |
na.action |
this function requires complete data, so na.omit (the default) is used to remove observations with missing values. |
... |
additional arguments. |
b_path |
Matrix representing the solution path of the coefficients in the incidence portion of the model; rows are steps and columns are variables. |
beta_path |
Matrix representing the solution path of the coefficients in the latency portion of the model; rows are steps and columns are variables. |
b0_path |
Vector representing the solution path of the intercept in the incidence portion of the model. |
logLik_inc |
Vector representing the expected penalized complete-data log-likelihood for the incidence portion of the model for each step in the solution path. |
logLik_lat |
Vector representing the expected penalized complete-data log-likelihood for the latency portion of the model for each step in the solution path. |
x_incidence |
Matrix representing the design matrix of the incidence predictors. |
x_latency |
Matrix representing the design matrix of the latency predictors. |
y |
Vector representing the survival object response as returned by the Surv function. |
model |
Character string indicating the type of regression model used for the latency portion of the mixture cure model ("weibull" or "exponential"). |
scale |
Logical value indicating whether the predictors were centered and scaled. |
method |
Character string indicating the EM algorithm was used in fitting the mixture cure model. |
rate_path |
Vector representing the solution path of the rate parameter for the Weibull or exponential density in the latency portion of the model. |
alpha_path |
Vector representing the solution path of the shape parameter for the Weibull density in the latency portion of the model. |
call |
the matched call. |
Archer, K. J., Fu, H., Mrozek, K., Nicolet, D., Mims, A. S., Uy, G. L., Stock, W., Byrd, J. C., Hiddemann, W., Braess, J., Spiekermann, K., Metzeler, K. H., Herold, T., Eisfeld, A.-K. (2024) Identifying long-term survivors and those at higher or lower risk of relapse among patients with cytogenetically normal acute myeloid leukemia using a high-dimensional mixture cure model. Journal of Hematology & Oncology, 17:28.
library(survival)
withr::local_seed(1234)
temp <- generate_cure_data(n = 80, j = 100, n_true = 10, a = 1.8)
training <- temp$training
fit <- cureem(Surv(Time, Censor) ~ ., data = training, x_latency = training,
  model = "cox", penalty = "lasso", lambda_inc = 0.1, lambda_lat = 0.1,
  gamma_inc = 6, gamma_lat = 10)
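A brief follow-up sketch: the fitted mixturecure object can be passed to the methods documented elsewhere on these pages (coef, summary.mixturecure, plot.mixturecure); the default settings of those methods are assumed here:
coef(fit)     # coefficient estimates from the fitted model
summary(fit)  # summary of the fitted mixture cure model
plot(fit)     # see plot.mixturecure for plotting options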
Fits a penalized Weibull or exponential mixture cure model using the generalized monotone incremental forward stagewise (GMIFS) algorithm (Hastie et al 2007) and yields solution paths for parameters in the incidence and latency portions of the model.
curegmifs(formula, data, subset, x_latency = NULL, model = c("weibull", "exponential"),
  penalty_factor_inc = NULL, penalty_factor_lat = NULL, epsilon = 0.001, thresh = 1e-05,
  scale = TRUE, maxit = 10000, inits = NULL, verbose = TRUE, suppress_warning = FALSE,
  na.action = na.omit, ...)
formula |
an object of class "formula": a symbolic description of the model to be fitted. The response must be a survival object as returned by Surv, and the right-hand side specifies the covariates included in the incidence portion of the model. |
data |
a data.frame in which to interpret the variables named in the formula or subset arguments. |
subset |
an optional expression indicating the subset of observations to use in the fitting process; use a numeric or factor variable (not a character variable) in subset. All observations are included by default. |
x_latency |
specifies the variables to be included in the latency portion of the model and can be either a matrix of predictors, a model formula with the right-hand side specifying the latency variables, or the same data.frame passed to the data argument. |
model |
type of regression model to use for the latency portion of the mixture cure model. Can be "weibull" or "exponential"; default is "weibull". |
penalty_factor_inc |
vector of binary indicators representing the penalty to apply to each incidence coefficient: 0 implies no shrinkage and 1 implies shrinkage. If not supplied, 1 is applied to all incidence variables. |
penalty_factor_lat |
vector of binary indicators representing the penalty to apply to each latency coefficient: 0 implies no shrinkage and 1 implies shrinkage. If not supplied, 1 is applied to all latency variables. |
epsilon |
small numeric value reflecting the incremental value used to update a coefficient at a given step (default is 0.001). |
thresh |
small numeric value. The iterative process stops when the differences between successive expected penalized log-likelihoods for both incidence and latency components are less than this specified level of tolerance (default is 10^-5). |
scale |
logical, if TRUE the predictors are centered and scaled. |
maxit |
integer specifying the maximum number of steps to run in the iterative algorithm (default is 10^4). |
inits |
an optional list specifying the initial values. If not supplied or improperly supplied, initialization is automatically provided by the function. |
verbose |
logical, if TRUE running information is printed to the console (default is TRUE). |
suppress_warning |
logical, if TRUE, suppresses the warning that the maximum number of iterations was reached and that the algorithm may not have converged; instead, the warning is returned as part of the output. |
na.action |
this function requires complete data, so na.omit (the default) is used to remove observations with missing values. |
... |
additional arguments. |
b_path |
Matrix representing the solution path of the coefficients in the incidence portion of the model; rows are steps and columns are variables. |
beta_path |
Matrix representing the solution path of the coefficients in the latency portion of the model; rows are steps and columns are variables. |
b0_path |
Vector representing the solution path of the intercept in the incidence portion of the model. |
rate_path |
Vector representing the solution path of the rate parameter for the Weibull or exponential density in the latency portion of the model. |
logLik |
Vector representing the log-likelihood for each step in the solution path. |
x_incidence |
Matrix representing the design matrix of the incidence predictors. |
x_latency |
Matrix representing the design matrix of the latency predictors. |
y |
Vector representing the survival object response as returned by the Surv function. |
model |
Character string indicating the type of regression model used for the latency portion of the mixture cure model ("weibull" or "exponential"). |
scale |
Logical value indicating whether the predictors were centered and scaled. |
alpha_path |
Vector representing the solution path of the shape parameter for the Weibull density in the latency portion of the model. |
call |
the matched call. |
warning |
Message indicating whether the maximum number of iterations was reached, which may indicate that the model did not converge. |
Fu, H., Nicolet, D., Mrozek, K., Stone, R. M., Eisfeld, A. K., Byrd, J. C., Archer, K. J. (2022) Controlled variable selection in Weibull mixture cure models for high-dimensional data. Statistics in Medicine, 41(22), 4340–4366.
Hastie, T., Taylor J., Tibshirani R., Walther G. (2007) Forward stagewise regression and the monotone lasso. Electron J Stat, 1:1–29.
library(survival)
withr::local_seed(1234)
temp <- generate_cure_data(n = 100, j = 10, n_true = 10, a = 1.8)
training <- temp$training
fit <- curegmifs(Surv(Time, Censor) ~ ., data = training, x_latency = training,
  model = "weibull", thresh = 1e-4, maxit = 2000, epsilon = 0.01, verbose = FALSE)
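As a hedged usage sketch, predictions for held-out data could then be obtained with predict.mixturecure (listed as a related method above); the newdata and model_select arguments are assumed here to follow the conventions of the other methods on these pages:
testing <- temp$testing
predict(fit, newdata = testing, model_select = "AIC")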
Fits penalized parametric and semi-parametric mixture cure models (MCM) using the E-M algorithm with k-fold cross-validation for parameter tuning. The lasso (L1), MCP, and SCAD penalties are supported for the Cox MCM, while only the lasso is currently supported for parametric MCMs. When FDR-controlled variable selection is used, the model-X knockoffs method is applied and indices of selected variables are returned.
cv_cureem(formula, data, subset, x_latency = NULL, model = c("cox", "weibull", "exponential"),
  penalty = c("lasso", "MCP", "SCAD"), penalty_factor_inc = NULL, penalty_factor_lat = NULL,
  fdr_control = FALSE, fdr = 0.2, grid_tuning = FALSE, thresh = 0.001, scale = TRUE,
  maxit = NULL, inits = NULL, lambda_inc_list = NULL, lambda_lat_list = NULL,
  nlambda_inc = NULL, nlambda_lat = NULL, gamma_inc = 3, gamma_lat = 3,
  lambda_min_ratio_inc = 0.1, lambda_min_ratio_lat = 0.1, n_folds = 5,
  measure_inc = c("c", "auc"), one_se = FALSE, cure_cutoff = 5, parallel = FALSE,
  seed = NULL, verbose = TRUE, na.action = na.omit, ...)
formula |
an object of class "formula": a symbolic description of the model to be fitted. The response must be a survival object as returned by Surv, and the right-hand side specifies the covariates included in the incidence portion of the model. |
data |
a data.frame in which to interpret the variables named in the formula or subset arguments. |
subset |
an optional expression indicating the subset of observations to use in the fitting process; use a numeric or factor variable (not a character variable) in subset. All observations are included by default. |
x_latency |
specifies the variables to be included in the latency portion of the model and can be either a matrix of predictors, a model formula with the right-hand side specifying the latency variables, or the same data.frame passed to the data argument. |
model |
type of regression model to use for the latency portion of the mixture cure model. Can be "cox", "weibull", or "exponential" (default is "cox"). |
penalty |
type of penalty function. Can be "lasso", "MCP", or "SCAD" (default is "lasso"). |
penalty_factor_inc |
vector of binary indicators representing the penalty to apply to each incidence coefficient: 0 implies no shrinkage and 1 implies shrinkage. If not supplied, 1 is applied to all incidence variables. |
penalty_factor_lat |
vector of binary indicators representing the penalty to apply to each latency coefficient: 0 implies no shrinkage and 1 implies shrinkage. If not supplied, 1 is applied to all latency variables. |
fdr_control |
logical, if TRUE, model-X knockoffs are used for FDR-controlled variable selection and indices of selected variables are returned (default is FALSE). |
fdr |
numeric value in the (0, 1) range specifying the target FDR level to use for variable selection when fdr_control = TRUE (default is 0.2). |
grid_tuning |
logical, if TRUE a 2-D grid tuning approach is used to select the optimal pair of lambda_inc and lambda_lat (default is FALSE). |
thresh |
small numeric value. The iterative process stops when the differences between successive expected penalized complete-data log-likelihoods for both incidence and latency components are less than this specified level of tolerance (default is 10^-3). |
scale |
logical, if TRUE the predictors are centered and scaled. |
maxit |
maximum number of passes over the data for each lambda. If not
specified, 100 is applied when |
inits |
an optional list specifying the initial values to be used for model fitting. Penalized coefficients are initialized to zero. If not supplied or improperly supplied, initialization is automatically provided by the function. |
lambda_inc_list |
a numeric vector of candidate values used to search for the optimal lambda_inc; if not supplied, candidates are generated from nlambda_inc and lambda_min_ratio_inc. |
lambda_lat_list |
a numeric vector of candidate values used to search for the optimal lambda_lat; if not supplied, candidates are generated from nlambda_lat and lambda_min_ratio_lat. |
nlambda_inc |
an integer specifying the number of candidate values to search when selecting the optimal lambda_inc. |
nlambda_lat |
an integer specifying the number of candidate values to search when selecting the optimal lambda_lat. |
gamma_inc |
numeric value for the penalization parameter gamma for the incidence portion of the model, used when penalty = "MCP" or "SCAD" (default is 3). |
gamma_lat |
numeric value for the penalization parameter gamma for the latency portion of the model, used when penalty = "MCP" or "SCAD" (default is 3). |
lambda_min_ratio_inc |
numeric value in (0, 1) representing the smallest value for lambda_inc as a fraction of the maximum lambda (default is 0.1). |
lambda_min_ratio_lat |
numeric value in (0, 1) representing the smallest value for lambda_lat as a fraction of the maximum lambda (default is 0.1). |
n_folds |
an integer specifying the number of folds for the k-fold cross-validation procedure (default is 5). |
measure_inc |
character string specifying the evaluation criterion used in selecting the optimal lambda_inc; can be "c" (C-statistic) or "auc" (AUC for cure prediction); default is "c". |
one_se |
logical, if TRUE then the one standard error rule is applied for selecting the optimal parameters. The one standard error rule selects the most parsimonious model having evaluation criterion no more than one standard error worse than that of the best evaluation criterion (default is FALSE). |
cure_cutoff |
numeric value representing the cutoff time value that represents subjects not experiencing the event by this time are cured. This value is used to produce a proxy for the unobserved cure status when calculating C-statistic and AUC (default is 5 representing 5 years). Users should be careful to note the time scale of their data and adjust this according to the time scale and clinical application. |
parallel |
logical. If TRUE, parallel processing is performed for K-fold
CV using |
seed |
optional integer representing the random seed. Setting the random seed fosters reproducibility of the results. |
verbose |
logical, if TRUE running information is printed to the console (default is TRUE). |
na.action |
this function requires complete data, so na.omit (the default) is used to remove observations with missing values. |
... |
additional arguments. |
b0 |
Estimated intercept for the incidence portion of the model. |
b |
Estimated coefficients for the incidence portion of the model. |
beta |
Estimated coefficients for the latency portion of the model. |
alpha |
Estimated shape parameter if the Weibull model is fit. |
rate |
Estimated rate parameter if the Weibull or exponential model is fit. |
logLik_inc |
Expected penalized complete-data log-likelihood for the incidence portion of the model. |
logLik_lat |
Expected penalized complete-data log-likelihood for the latency portion of the model. |
selected_lambda_inc |
Value of lambda_inc selected by cross-validation for the incidence portion of the model. |
selected_lambda_lat |
Value of lambda_lat selected by cross-validation for the latency portion of the model. |
max_c |
Maximum C-statistic achieved. |
max_auc |
Maximum AUC for cure prediction achieved; only output when measure_inc = "auc". |
selected_index_inc |
Indices of selected variables for the incidence portion of the model when fdr_control = TRUE. |
selected_index_lat |
Indices of selected variables for the latency portion of the model when fdr_control = TRUE. |
call |
the matched call. |
Archer, K. J., Fu, H., Mrozek, K., Nicolet, D., Mims, A. S., Uy, G. L., Stock, W., Byrd, J. C., Hiddemann, W., Braess, J., Spiekermann, K., Metzeler, K. H., Herold, T., Eisfeld, A.-K. (2024) Identifying long-term survivors and those at higher or lower risk of relapse among patients with cytogenetically normal acute myeloid leukemia using a high-dimensional mixture cure model. Journal of Hematology & Oncology, 17:28.
library(survival) withr::local_seed(1234) temp <- generate_cure_data(n = 200, j = 25, n_true = 5, a = 1.8) training <- temp$training fit.cv <- cv_cureem(Surv(Time, Censor) ~ ., data = training, x_latency = training, fdr_control = FALSE, grid_tuning = FALSE, nlambda_inc = 10, nlambda_lat = 10, n_folds = 2, seed = 23, verbose = TRUE ) fit.cv.fdr <- cv_cureem(Surv(Time, Censor) ~ ., data = training, x_latency = training, model = "weibull", penalty = "lasso", fdr_control = TRUE, grid_tuning = FALSE, nlambda_inc = 10, nlambda_lat = 10, n_folds = 2, seed = 23, verbose = TRUE)
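As a brief follow-up to the example above, the variables selected when fdr_control = TRUE can be inspected directly; the component names are those documented in the Value section.

# Indices of the variables selected under FDR control
fit.cv.fdr$selected_index_inc
fit.cv.fdr$selected_index_lat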
Fits a penalized Weibull or exponential mixture cure model using the generalized monotone incremental forward stagewise (GMIFS) algorithm with k-fold cross-validation to select the optimal iteration step along the solution path. When FDR-controlled variable selection is used, the model-X knockoffs method is applied and the indices of selected variables are returned.
cv_curegmifs(
  formula,
  data,
  subset,
  x_latency = NULL,
  model = c("weibull", "exponential"),
  penalty_factor_inc = NULL,
  penalty_factor_lat = NULL,
  fdr_control = FALSE,
  fdr = 0.2,
  epsilon = 0.001,
  thresh = 1e-05,
  scale = TRUE,
  maxit = 10000,
  inits = NULL,
  n_folds = 5,
  measure_inc = c("c", "auc"),
  one_se = FALSE,
  cure_cutoff = 5,
  parallel = FALSE,
  seed = NULL,
  verbose = TRUE,
  na.action = na.omit,
  ...
)
formula |
an object of class "formula": a symbolic description of the model to be fitted. The response must be a survival object, as returned by Surv. |
data |
a data.frame in which to interpret the variables named in the formula. |
subset |
an optional expression indicating which subset of observations should be used in the fitting process. Either a numeric or factor variable should be used in subset, not a character variable. All observations are included by default. |
x_latency |
specifies the variables to be included in the latency portion of the model and can be either a matrix of predictors, a model formula with the right-hand side specifying the latency variables, or the same data.frame passed to the data argument. |
model |
type of regression model to use for the latency portion of mixture cure model. Can be "weibull" or "exponential"; default is "weibull". |
penalty_factor_inc |
vector of binary indicators representing the penalty to apply to each incidence coefficient: 0 implies no shrinkage and 1 implies shrinkage. If not supplied, 1 is applied to all incidence variables. |
penalty_factor_lat |
vector of binary indicators representing the penalty to apply to each latency coefficient: 0 implies no shrinkage and 1 implies shrinkage. If not supplied, 1 is applied to all latency variables. |
fdr_control |
logical, if TRUE, model-X knockoffs are used for FDR-controlled variable selection and indices of selected variables are returned (default is FALSE). |
fdr |
numeric value in (0, 1) specifying the target FDR level to use for variable selection when fdr_control = TRUE (default is 0.2). |
epsilon |
small numeric value reflecting the incremental value used to update a coefficient at a given step (default is 0.001). |
thresh |
small numeric value. The iterative process stops when the differences between successive expected penalized complete-data log-likelihoods for both incidence and latency components are less than this specified level of tolerance (default is 10^-5). |
scale |
logical, if TRUE the predictors are centered and scaled. |
maxit |
integer specifying the maximum number of steps to run in the iterative algorithm (default is 10^4). |
inits |
an optional list specifying initial values for the model parameters. |
n_folds |
an integer specifying the number of folds for the k-fold cross-validation procedure (default is 5). |
measure_inc |
character string specifying the evaluation criterion used in selecting the optimal model for the incidence portion: either "c" (C-statistic) or "auc" (AUC for cure prediction); default is "c". |
one_se |
logical, if TRUE then the one standard error rule is applied for selecting the optimal parameters. The one standard error rule selects the most parsimonious model having evaluation criterion no more than one standard error worse than that of the best evaluation criterion (default is FALSE). |
cure_cutoff |
numeric value representing the cutoff time; subjects who have not experienced the event by this time are treated as cured. This value is used to produce a proxy for the unobserved cure status when calculating the C-statistic and AUC (default is 5, representing 5 years). Users should note the time scale of their data and adjust this value according to the time scale and clinical application. |
parallel |
logical. If TRUE, parallel processing is used for the k-fold cross-validation procedure. |
seed |
optional integer representing the random seed. Setting the random seed fosters reproducibility of the results. |
verbose |
logical, if TRUE running information is printed to the console (default is TRUE). |
na.action |
this function requires complete data; the default is na.omit, so observations with missing values are removed. |
... |
additional arguments. |
b0 |
Estimated intercept for the incidence portion of the model. |
b |
Estimated coefficients for the incidence portion of the model. |
beta |
Estimated coefficients for the latency portion of the model. |
alpha |
Estimated shape parameter if the Weibull model is fit. |
rate |
Estimated rate parameter if the Weibull or exponential model is fit. |
logLik |
Log-likelihood value. |
selected_step_inc |
Iteration step selected for the incidence portion of the model using cross-validation. NULL when fdr_control is TRUE. |
selected_step_lat |
Iteration step selected for the latency portion of the model using cross-validation. NULL when fdr_control is TRUE. |
max_c |
Maximum C-statistic achieved. |
max_auc |
Maximum AUC for cure prediction achieved; only output when measure_inc = "auc". |
selected_index_inc |
Indices of selected variables for the incidence portion of the model when fdr_control = TRUE. |
selected_index_lat |
Indices of selected variables for the latency portion of the model when fdr_control = TRUE. |
call |
the matched call. |
Fu, H., Nicolet, D., Mrozek, K., Stone, R. M., Eisfeld, A. K., Byrd, J. C., Archer, K. J. (2022) Controlled variable selection in Weibull mixture cure models for high-dimensional data. Statistics in Medicine, 41(22), 4340–4366.
Hastie, T., Taylor, J., Tibshirani, R., Walther, G. (2007) Forward stagewise regression and the monotone lasso. Electronic Journal of Statistics, 1, 1–29.
library(survival)
withr::local_seed(123)
temp <- generate_cure_data(n = 100, j = 15, n_true = 3, a = 1.8, rho = 0.2)
training <- temp$training
fit.cv <- cv_curegmifs(Surv(Time, Censor) ~ .,
  data = training, x_latency = training,
  fdr_control = FALSE, maxit = 450, epsilon = 0.01,
  n_folds = 2, seed = 23, verbose = TRUE
)
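For comparison with the cv_cureem example, a hedged sketch of an FDR-controlled cv_curegmifs fit is shown below; the settings fdr = 0.2, maxit = 450, and epsilon = 0.01 are illustrative choices, not recommendations.

# FDR-controlled variable selection via model-X knockoffs (illustrative settings)
fit.cv.fdr <- cv_curegmifs(Surv(Time, Censor) ~ .,
  data = training, x_latency = training,
  fdr_control = TRUE, fdr = 0.2,
  maxit = 450, epsilon = 0.01,
  n_folds = 2, seed = 23, verbose = TRUE
)
# Indices of the selected variables
fit.cv.fdr$selected_index_inc
fit.cv.fdr$selected_index_lat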
Dimension method for mixturecure objects.
## S3 method for class 'mixturecure'
dim(x)
x |
an object of class mixturecure. |
nobs |
number of subjects in the dataset. |
p_incidence |
number of variables in the incidence portion of the model. |
p_latency |
number of variables in the latency portion of the model. |
library(survival)
withr::local_seed(1234)
temp <- generate_cure_data(n = 100, j = 10, n_true = 10, a = 1.8)
training <- temp$training
fit <- curegmifs(Surv(Time, Censor) ~ .,
  data = training, x_latency = training,
  model = "weibull", thresh = 1e-4, maxit = 2000,
  epsilon = 0.01, verbose = FALSE
)
dim(fit)
Return the model family and fitting algorithm for mixturecure model fits.
## S3 method for class 'mixturecure'
family(object, ...)
object |
an object of class mixturecure. |
... |
other arguments. |
the parametric or semi-parametric model fit and the fitting algorithm.
library(survival)
withr::local_seed(1234)
temp <- generate_cure_data(n = 100, j = 10, n_true = 10, a = 1.8)
training <- temp$training
fit <- curegmifs(Surv(Time, Censor) ~ .,
  data = training, x_latency = training,
  model = "weibull", thresh = 1e-4, maxit = 2000,
  epsilon = 0.01, verbose = FALSE
)
family(fit)
Extract the model formula for a mixturecure object.
## S3 method for class 'mixturecure'
formula(x, ...)
x |
an object of class mixturecure. |
... |
other arguments. |
a formula representing the incidence portion of the model and the variables included in the latency portion of the model
library(survival)
withr::local_seed(1234)
temp <- generate_cure_data(n = 100, j = 10, n_true = 10, a = 1.8)
training <- temp$training
fit <- curegmifs(Surv(Time, Censor) ~ .,
  data = training, x_latency = training,
  model = "weibull", thresh = 1e-4, maxit = 2000,
  epsilon = 0.01, verbose = FALSE
)
formula(fit)
Simulate data under a mixture cure model.
generate_cure_data(
  n = 400,
  j = 500,
  nonp = 2,
  train_prop = 0.75,
  n_true = 10,
  a = 1,
  rho = 0.5,
  itct_mean = 0.5,
  cens_ub = 20,
  alpha = 1,
  lambda = 2,
  same_signs = FALSE,
  model = "weibull"
)
n |
an integer denoting the total sample size. |
j |
an integer denoting the number of penalized predictors which is the same for both the incidence and latency portions of the model. |
nonp |
an integer denoting the number of unpenalized predictors (which is the same for both the incidence and latency portions of the model). |
train_prop |
a numeric value in [0, 1) representing the fraction of the n observations allocated to the training set; the remaining observations form the testing set. |
n_true |
an integer less than j denoting the number of penalized predictors that are truly associated with the outcome. |
a |
a numeric value denoting the effect size (signal amplitude) which is the same for both the incidence and latency portions of the model. |
rho |
a numeric value in [0, 1) representing the correlation between adjacent covariates in the same block. |
itct_mean |
a numeric value representing the expectation of the incidence intercept which controls the cure rate. |
cens_ub |
a numeric value representing the upper bound on the censoring time distribution, which follows a uniform distribution on (0, cens_ub). |
alpha |
a numeric value representing the shape parameter in the Weibull density. |
lambda |
a numeric value representing the rate parameter in the Weibull density. |
same_signs |
logical, if TRUE the incidence and latency coefficients have the same signs. |
model |
type of regression model to use for the latency portion of the mixture cure model (default is "weibull"). |
training |
training data.frame which includes Time, Censor, and covariates. Variables prefixed with X are the penalized predictors. |
training_y |
the true status for the training set: uncured = 1; cured = 0 |
testing |
testing data.frame which includes Time, Censor, Y (the true status: uncured = 1, cured = 0), and covariates. Variables prefixed with X are the penalized predictors. |
testing_y |
the true status for the testing set: uncured = 1; cured = 0 |
parameters |
a list including the indices of the true incidence signals (nonzero_b) and the indices of the true latency signals (nonzero_beta). |
library(survival)
withr::local_seed(1234)
# This dataset has 2 penalized features associated with the outcome,
# 3 penalized features not associated with the outcome (noise features),
# and 1 unpenalized noise feature.
data <- generate_cure_data(n = 1000, j = 5, n_true = 2, nonp = 1, a = 2)
# Extract the training data
training <- data$training
# Extract the testing data
testing <- data$testing
# To identify the features truly associated with incidence
names(training)[grep("^X", names(training))][data$parameters$nonzero_b]
# To identify the features truly associated with latency
names(training)[grep("^X", names(training))][data$parameters$nonzero_beta]
# Fit the model to the training data
fitem <- cureem(Surv(Time, Censor) ~ ., data = training, x_latency = training)
# Examine the estimated coefficients at the (default) minimum AIC
coef(fitem)
# As the penalty increases, the coefficients for the noise variables shrink
# to or remain at zero, while the truly associated features have coefficient
# paths that depart from zero. This shows the model's ability to distinguish
# signal from noise.
plot(fitem, label = TRUE)
This function returns the log-likelihood for a user-specified model criterion or step for a curegmifs, cureem, cv_curegmifs, or cv_cureem fitted object.
## S3 method for class 'mixturecure'
logLik(object, model_select = "AIC", ...)
object |
a mixturecure object fit using curegmifs, cureem, cv_curegmifs, or cv_cureem. |
model_select |
either a case-sensitive character string naming a model selection criterion ("AIC", "mAIC", "cAIC", "BIC", "mBIC", "EBIC", or "logLik") or a numeric value representing a step along the solution path, for models fit using curegmifs or cureem (default is "AIC"). This option has no effect for objects fit using cv_curegmifs or cv_cureem. |
... |
other arguments. |
log-likelihood of the fitted mixture cure model using the specified criteria.
library(survival)
withr::local_seed(1234)
temp <- generate_cure_data(n = 100, j = 10, n_true = 10, a = 1.8)
training <- temp$training
fit <- curegmifs(Surv(Time, Censor) ~ .,
  data = training, x_latency = training,
  model = "weibull", thresh = 1e-4, maxit = 2000,
  epsilon = 0.01, verbose = FALSE
)
logLik(fit, model_select = "AIC")
Number of observations in a fitted mixturecure object.
## S3 method for class 'mixturecure'
nobs(object, ...)
object |
an object of class mixturecure. |
... |
other arguments. |
number of subjects in the dataset.
library(survival)
withr::local_seed(1234)
temp <- generate_cure_data(n = 100, j = 10, n_true = 10, a = 1.8)
training <- temp$training
fit <- curegmifs(Surv(Time, Censor) ~ .,
  data = training, x_latency = training,
  model = "weibull", thresh = 1e-4, maxit = 2000,
  epsilon = 0.01, verbose = FALSE
)
nobs(fit)
Tests the null hypothesis that the proportion of observations susceptible to the event = 1 against the alternative that the proportion of observations susceptible to the event is < 1. If the null hypothesis is rejected, there is a significant cured fraction.
nonzerocure_test(object, reps = 1000, seed = NULL, plot = FALSE, b = NULL)
object |
a survfit object. |
reps |
number of simulations on which to base the p-value (default = 1000). |
seed |
optional random seed. |
plot |
logical. If TRUE a histogram of the estimated susceptible proportions over all simulations is produced. |
b |
optional. If specified, this value is used as the upper bound of the uniform distribution for generating the censoring times. If not specified, an exponential model is used for generating the censoring times (default). |
proportion_susceptible |
estimated proportion of susceptibles |
proportion_cured |
estimated proportion of those cured |
p_value |
p-value testing the null hypothesis that the proportion of susceptibles = 1 (cured fraction = 0) against the alternative that the proportion of susceptibles < 1 (non-zero cured fraction) |
time_95_percent_of_events |
estimated time at which 95% of events should have occurred |
Maller, R. A. and Zhou, X. (1996) Survival Analysis with Long-Term Survivors. John Wiley & Sons.
survfit, cure_estimate, sufficient_fu_test
library(survival)
withr::local_seed(1234)
temp <- generate_cure_data(n = 100, j = 10, n_true = 10, a = 1.8)
training <- temp$training
km_fit <- survfit(Surv(Time, Censor) ~ 1, data = training)
nonzerocure_test(km_fit)
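A short follow-up to the example: the components listed in the Value section can be extracted from the returned object.

test <- nonzerocure_test(km_fit)
test$proportion_susceptible     # estimated proportion of susceptibles
test$proportion_cured           # estimated proportion cured
test$p_value                    # test of a non-zero cured fraction
test$time_95_percent_of_events  # time by which 95% of events should have occurred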
This function returns the number of parameters in a user-specified model criterion or step for a curegmifs, cureem, cv_curegmifs, or cv_cureem fitted object.
npar_mixturecure(object, model_select = "AIC")
object |
a mixturecure object fit using curegmifs, cureem, cv_curegmifs, or cv_cureem. |
model_select |
either a case-sensitive character string naming a model selection criterion ("AIC", "mAIC", "cAIC", "BIC", "mBIC", "EBIC", or "logLik") or a numeric value representing a step along the solution path, for models fit using curegmifs or cureem (default is "AIC"). This option has no effect for objects fit using cv_curegmifs or cv_cureem. |
number of parameters of the fitted mixture cure model using the specified criterion.
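A usage sketch mirroring the examples shown for the other extractor functions; the simulated data and fitting settings are illustrative only.

library(survival)
withr::local_seed(1234)
temp <- generate_cure_data(n = 100, j = 10, n_true = 10, a = 1.8)
training <- temp$training
fit <- curegmifs(Surv(Time, Censor) ~ .,
  data = training, x_latency = training,
  model = "weibull", thresh = 1e-4, maxit = 2000,
  epsilon = 0.01, verbose = FALSE
)
# Number of parameters in the model selected by the minimum AIC
npar_mixturecure(fit, model_select = "AIC")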
This function plots either the coefficient path, the AIC, the cAIC, the BIC, or the log-likelihood for a fitted curegmifs or cureem object.
This function produces a lollipop plot of the coefficient estimates for a fitted cv_curegmifs or cv_cureem object.
## S3 method for class 'mixturecure'
plot(
  x,
  type = c("trace", "AIC", "BIC", "logLik", "cAIC", "mAIC", "mBIC", "EBIC"),
  xlab = NULL,
  ylab = NULL,
  label = FALSE,
  main = NULL,
  ...
)
x |
a mixturecure object fit using curegmifs, cureem, cv_curegmifs, or cv_cureem. |
type |
a case-sensitive parameter indicating what to plot on the y-axis. The complete list of options is: "trace" (coefficient path), "AIC", "BIC", "logLik", "cAIC", "mAIC", "mBIC", and "EBIC" (default is "trace"). This option has no effect for objects fit using cv_curegmifs or cv_cureem. |
xlab |
a default x-axis label is used, which can be overridden by specifying a user-defined x-axis label. |
ylab |
a default y-axis label is used, which can be overridden by specifying a user-defined y-axis label. |
label |
logical. If TRUE the variable names will appear in a legend. Applicable only when type = "trace". |
main |
a default main title will be used which can be changed by
specifying a user-defined main title. This option is not used for
|
... |
other arguments. |
this function has no returned value but is called for its side effects
curegmifs, cureem, coef.mixturecure, summary.mixturecure, predict.mixturecure
library(survival)
withr::local_seed(1234)
temp <- generate_cure_data(n = 100, j = 10, n_true = 10, a = 1.8)
training <- temp$training
fit <- curegmifs(Surv(Time, Censor) ~ .,
  data = training, x_latency = training,
  model = "weibull", thresh = 1e-4, maxit = 2000,
  epsilon = 0.01, verbose = FALSE
)
plot(fit)
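As a follow-up to the example, the type argument described above controls what is plotted on the y-axis; for instance, the AIC along the solution path can be plotted instead of the coefficient trace.

plot(fit, type = "AIC")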
This function returns a list that includes the predicted probabilities for susceptibles as well as the linear predictor for the latency distribution and a dichotomous risk for latency for a curegmifs, cureem, cv_curegmifs, or cv_cureem fitted object.
## S3 method for class 'mixturecure'
predict(object, newdata, model_select = "AIC", ...)
object |
a mixturecure object fit using curegmifs, cureem, cv_curegmifs, or cv_cureem. |
newdata |
an optional data.frame that minimally includes the incidence and/or latency variables to use for predicting the response. If omitted, the training data are used. |
model_select |
either a case-sensitive character string naming a model selection criterion ("AIC", "mAIC", "cAIC", "BIC", "mBIC", "EBIC", or "logLik") or a numeric value representing a step along the solution path, for models fit using curegmifs or cureem (default is "AIC"). This option has no effect for objects fit using cv_curegmifs or cv_cureem. |
... |
other arguments |
p_uncured |
a vector of probabilities from the incidence portion of the fitted model representing the P(uncured). |
linear_latency |
a vector for the linear predictor from the latency portion of the model. |
latency_risk |
a dichotomous class representing low (below the median) versus high risk for the latency portion of the model. |
curegmifs, cureem, coef.mixturecure, summary.mixturecure, plot.mixturecure
library(survival)
withr::local_seed(1234)
temp <- generate_cure_data(n = 100, j = 10, n_true = 10, a = 1.8)
training <- temp$training
fit <- curegmifs(Surv(Time, Censor) ~ .,
  data = training, x_latency = training,
  model = "weibull", thresh = 1e-4, maxit = 2000,
  epsilon = 0.01, verbose = FALSE
)
predict_train <- predict(fit)
names(predict_train)
testing <- temp$testing
predict_test <- predict(fit, newdata = testing)
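A short follow-up to the example showing the components described in the Value section for the test-set predictions.

head(predict_test$p_uncured)       # P(uncured) from the incidence portion
head(predict_test$linear_latency)  # linear predictor from the latency portion
table(predict_test$latency_risk)   # dichotomized low/high latency risk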
This function prints the first several incidence and latency coefficients, the rate (when fitting an exponential or Weibull mixture cure model), and alpha (when fitting a Weibull mixture cure model). This function returns the fitted object invisibly.
## S3 method for class 'mixturecure'
print(x, max = 6, ...)
x |
a mixturecure object fit using cureem, curegmifs, cv_cureem, or cv_curegmifs. |
max |
maximum number of rows in a matrix or elements in a vector to display |
... |
other arguments. |
prints the coefficient estimates for the incidence portion of the model and, if included, the coefficient estimates for the latency portion of the model. Also prints the rate for exponential and Weibull models and the shape (alpha) for the Weibull mixture cure model. Returns all objects fit using cureem, curegmifs, cv_cureem, or cv_curegmifs.
The contents of a mixturecure fitted object differ depending upon whether the EM (cureem) or GMIFS (curegmifs) algorithm is used for model fitting and whether cross-validation is used. The output also differs depending upon whether x_latency is specified in the model (i.e., variables are included in the latency portion of the model fit) or only terms on the right-hand side of the equation are included (i.e., variables are included only in the incidence portion of the model).
curegmifs, cureem, coef.mixturecure, summary.mixturecure, plot.mixturecure, predict.mixturecure
library(survival)
withr::local_seed(1234)
temp <- generate_cure_data(n = 100, j = 10, n_true = 10, a = 1.8)
training <- temp$training
fit <- curegmifs(Surv(Time, Censor) ~ .,
  data = training, x_latency = training,
  model = "weibull", thresh = 1e-4, maxit = 2000,
  epsilon = 0.01, verbose = FALSE
)
print(fit)
Tests for sufficient follow-up using a Kaplan-Meier fitted object.
sufficient_fu_test(object)
object |
a survfit object. |
p_value |
p-value from testing the null hypothesis that there was not sufficient follow-up against the alternative that there was sufficient follow-up |
n_n |
total number of events that occurred in the open interval (max(0, 2*t_event - t_max), t_event), where t_event is the last observed event time and t_max is the last observed time (event or censored) |
N |
number of observations in the dataset |
Maller, R. A. and Zhou, X. (1996) Survival Analysis with Long-Term Survivors. John Wiley & Sons.
survfit, cure_estimate, nonzerocure_test
library(survival)
withr::local_seed(1234)
temp <- generate_cure_data(n = 100, j = 10, n_true = 10, a = 1.8)
training <- temp$training
km_fit <- survfit(Surv(Time, Censor) ~ 1, data = training)
sufficient_fu_test(km_fit)
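For intuition, a minimal sketch of the event count n_n described above, computed by hand from the simulated training data; sufficient_fu_test() performs this calculation internally.

# Events in the interval (max(0, 2*t_event - t_max), t_event), where t_event is
# the last observed event time and t_max is the last observed time overall
event_times <- training$Time[training$Censor == 1]
t_event <- max(event_times)
t_max <- max(training$Time)
lower <- max(0, 2 * t_event - t_max)
n_n <- sum(event_times > lower & event_times < t_event)
n_n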
summary method for a mixturecure object fit using curegmifs, cureem, cv_curegmifs, or cv_cureem.
## S3 method for class 'mixturecure'
summary(object, ...)
object |
a mixturecure object fit using curegmifs, cureem, cv_curegmifs, or cv_cureem. |
... |
other arguments. |
prints the number of non-zero coefficients from the incidence and latency portions of the fitted mixture cure model when using the minimum AIC to select the final model. When fitting a model using curegmifs or cureem, the summary function additionally prints results associated with the following model selection methods: the step and value that maximizes the log-likelihood; and the step and value that minimizes the AIC, modified AIC (mAIC), corrected AIC (cAIC), BIC, modified BIC (mBIC), and extended BIC (EBIC). This information can be used to guide the user in the selection of a final model from the solution path.
curegmifs, cureem, coef.mixturecure, plot.mixturecure, predict.mixturecure
library(survival)
withr::local_seed(1234)
temp <- generate_cure_data(n = 100, j = 10, n_true = 10, a = 1.8)
training <- temp$training
fit <- curegmifs(Surv(Time, Censor) ~ .,
  data = training, x_latency = training,
  model = "weibull", thresh = 1e-4, maxit = 2000,
  epsilon = 0.01, verbose = FALSE
)
summary(fit)