
A ModelBoot object contains full-model bootstrapped statistics and ALE data for a trained model. Full-model bootstrapping (as distinct from data-only bootstrapping) retrains the model on each bootstrap iteration; it is therefore much slower, though more reliable. However, for obtaining bootstrapped ALE data, plots, and statistics, full-model bootstrapping with ModelBoot is only necessary for models that have not been developed by cross-validation. For cross-validated models, it is sufficient (and much faster) to create a regular ALE object with bootstrapping by setting its boot_it argument. In fact, full-model bootstrapping with ModelBoot is often infeasible for slow machine-learning models trained on large datasets; such models should rather be cross-validated to assure their reliability. For models that have not been cross-validated, however, full-model bootstrapping with ModelBoot is necessary for reliable results. Further details follow below; see also vignette('ale-statistics').

Usage

ModelBoot(
  model,
  data = NULL,
  ...,
  model_call_string = NULL,
  model_call_string_vars = character(),
  parallel = future::availableCores(logical = FALSE, omit = 1),
  model_packages = NULL,
  y_col = NULL,
  binary_true_value = TRUE,
  pred_fun = function(object, newdata, type = pred_type) {
    stats::predict(object = object, newdata = newdata, type = type)
  },
  pred_type = "response",
  boot_it = 100,
  boot_alpha = 0.05,
  boot_centre = "mean",
  seed = 0,
  output_model_stats = TRUE,
  output_model_coefs = TRUE,
  output_ale = TRUE,
  output_boot_data = FALSE,
  ale_options = list(),
  tidy_options = list(),
  glance_options = list(),
  silent = FALSE
)

Arguments

model

Required. See documentation for ALE()

data

dataframe. Dataset that will be bootstrapped. This must be the same data on which the model was trained. If not provided, ModelBoot() will try to detect it automatically. For non-standard models, data should be provided.

...

not used. Inserted to require explicit naming of subsequent arguments.

model_call_string

character(1). If NULL (default), ModelBoot() tries to automatically detect and construct the call for the bootstrapped datasets. If it cannot, the function fails early. In that case, a character string of the full call for the model must be provided, with boot_data as the data argument of the call. See examples.

model_call_string_vars

character. Names of variables included in model_call_string that are not columns in data. If any such variables exist, they must be specified here or parallel processing will produce an error. If parallelization is disabled with parallel = 0, this is not a concern. See the documentation for the model_packages argument in ALE().

parallel

See documentation for ALE()

model_packages

See documentation for ALE()

y_col, pred_fun, pred_type

See documentation for ALE(). Used to calculate bootstrapped performance measures. If NULL (default), then the relevant performance measures are calculated only if these arguments can be automatically detected.

binary_true_value

any single atomic value. If the model represented by model or model_call_string is a binary classification model, binary_true_value specifies the value of y_col (the target outcome) that is considered TRUE; any other value of y_col is considered FALSE. This argument is ignored if the model is not a binary classification model. For example, if 2 means TRUE and 1 means FALSE, then set binary_true_value = 2.

boot_it

non-negative integer(1). Number of bootstrap iterations for full-model bootstrapping. For bootstrapping of ALE values, see the details to verify whether ALE() with bootstrapping might be more appropriate than ModelBoot(). If boot_it = 0, the model is run once as normal on the full data with no bootstrapping.

boot_alpha

numeric(1) from 0 to 1. Alpha for the percentile-based confidence interval range of the bootstrap intervals; the bootstrap confidence intervals will be the lowest and highest (boot_alpha / 2) percentiles. For example, if boot_alpha = 0.05 (default), the intervals will be from the 2.5 and 97.5 percentiles.
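As a minimal illustration of how boot_alpha maps to percentile limits (not the package's internal code):

```r
# How boot_alpha determines the percentile confidence limits (sketch only)
set.seed(0)
boot_estimates <- rnorm(1000)  # stand-in for a bootstrapped statistic
boot_alpha <- 0.05
ci <- quantile(boot_estimates, probs = c(boot_alpha / 2, 1 - boot_alpha / 2))
names(ci)  # "2.5%" "97.5%"
```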

boot_centre

character(1) in c('mean', 'median'). When bootstrapping, the main estimate for the ALE y value is considered to be boot_centre. Regardless of the value specified here, both the mean and median will be available.

seed

integer. Random seed. Supply this between runs to assure identical bootstrap samples are generated each time on the same data.

output_model_stats

logical(1). If TRUE (default), return overall model statistics.

output_model_coefs

logical(1). If TRUE (default), return model coefficients.

output_ale

logical(1). If TRUE (default), return ALE data and statistics.

output_boot_data

logical(1). If TRUE, return the full raw data for each bootstrap iteration, specifically, the bootstrapped models and the model row indices. Default is FALSE.

ale_options, tidy_options, glance_options

list of named arguments. Arguments to pass to ALE() when output_ale = TRUE, broom::tidy() when output_model_coefs = TRUE, or broom::glance() when output_model_stats = TRUE, respectively, beyond (or overriding) their defaults. In particular, to obtain p-values for ALE statistics, see the details.

silent

See documentation for ALE()

Value

An object of class ModelBoot with properties model_stats, model_coefs, ale, boot_data, and params.

Properties

model_stats

tibble of bootstrapped results from broom::glance(). NULL if the output_model_stats argument is FALSE. In general, only broom::glance() results that make sense when bootstrapped are included, such as df and adj.r.squared. Results that are incomparable across bootstrapped datasets (such as aic) are excluded. In addition, certain model performance measures are included; these are bootstrap-validated with the .632 correction (NOT the .632+ correction):

  • For regression (numeric prediction) models:

    • mae: mean absolute error (MAE)

    • sa_mae_mad: standardized accuracy of the MAE referenced on the mean absolute deviation

    • rmse: root mean squared error (RMSE)

    • sa_rmse_sd: standardized accuracy of the RMSE referenced on the standard deviation

  • For classification (probability) models:

    • auc: area under the ROC curve
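As a rough sketch of the regression measures listed above, assuming "standardized accuracy" means one minus the ratio of the model's error to that of a naive baseline (always predicting mean(y)); see the reference below for the exact definitions:

```r
# Hypothetical data; the sa_* formulas are an assumption based on the
# descriptions above, not the package's exact implementation.
y    <- c(43, 63, 71, 61, 81, 43)  # actual outcomes
yhat <- c(50, 60, 70, 60, 75, 50)  # hypothetical model predictions

mae   <- mean(abs(y - yhat))       # mean absolute error
rmse  <- sqrt(mean((y - yhat)^2))  # root mean squared error
mad_y <- mean(abs(y - mean(y)))    # mean absolute deviation of y

sa_mae_mad <- 1 - mae / mad_y   # > 0: better than always predicting mean(y)
sa_rmse_sd <- 1 - rmse / sd(y)
```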

model_coefs

A tibble of bootstrapped results from broom::tidy(). NULL if the output_model_coefs argument is FALSE.

ale

A list of bootstrapped ALE results, using default ALE() settings unless overridden with ale_options. NULL if the output_ale argument is FALSE. Elements are:

  * `single`: an `ALE` object of ALE calculations on the full dataset without bootstrapping.
  * `boot`: a list of bootstrapped ALE data and statistics. This element is not an `ALE` object; it is in a special internal format.

boot_data

A tibble of bootstrap results. Each row represents a bootstrap iteration. NULL if the output_boot_data argument is FALSE. The columns are:

  * `it`: the specific bootstrap iteration from 0 to `boot_it` iterations. Iteration 0 is the results from the full dataset (not bootstrapped).
  * `row_idxs`: the row indexes of the bootstrapped sample for that iteration. To save space, the row indexes are returned rather than the full datasets. So, for example, iteration i's bootstrap sample can be reproduced by `data[ModelBoot_obj@boot_data$row_idxs[[i + 1]], ]`, where `data` is the dataset and `ModelBoot_obj` is the result of `ModelBoot()` (the `+ 1` offset is needed because iteration 0 occupies the first element).
  * `model`: the model object trained on that iteration.
  * `tidy`: the results of `broom::tidy(model)` on that iteration.
  * `stats`: the results of `broom::glance(model)` on that iteration.
  * `perf`: performance measures on the entire dataset. These are the measures specified above for regression and classification models.

params

Parameters used to calculate the bootstrapped data. Most of these repeat the arguments passed to ModelBoot(): either the values provided by the user or the defaults if the user did not change them. The following additional objects, created internally, are also provided:

* `y_cats`: same as `ALE@params$y_cats` (see documentation there).
* `y_type`: same as `ALE@params$y_type` (see documentation there).
* `model`: same as `ALE@params$model` (see documentation there).
* `data`: same as `ALE@params$data` (see documentation there).

Full-model bootstrapping

No modelling results, with or without ALE, should be considered reliable without appropriate validation. For ALE, both the trained model itself and the ALE that explains the trained model must be validated. ALE must be validated by bootstrapping. The trained model might be validated either by cross-validation or by bootstrapping. For ALE that explains trained models that have been developed by cross-validation, it is sufficient to bootstrap just the training data. That is what the ALE object does with its boot_it argument. However, unvalidated models must be validated by bootstrapping them along with the calculation of ALE; this is what the ModelBoot object does with its boot_it argument.

ModelBoot() carries out full-model bootstrapping to validate models. Specifically, it:

  • Creates multiple bootstrap samples (default 100; the user can specify any number);

  • Creates a model on each bootstrap sample;

  • Calculates overall model statistics, variable coefficients, and ALE values for each model on each bootstrap sample;

  • Calculates the mean, median, and lower and upper confidence intervals for each of those values across all bootstrap samples.
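The steps above can be sketched in plain base R with lm() on the built-in attitude dataset (an illustration of the procedure, not the package's internal code):

```r
# Illustrative full-model bootstrap loop (not the package implementation)
set.seed(0)
boot_it <- 10
n <- nrow(attitude)

# Steps 1-2: resample the data and retrain a model on each bootstrap sample
boot_coefs <- sapply(seq_len(boot_it), function(it) {
  boot_data <- attitude[sample(n, n, replace = TRUE), ]
  # Step 3: collect statistics of interest from each bootstrapped model
  coef(lm(rating ~ complaints + learning, data = boot_data))
})

# Step 4: summarize each statistic across all bootstrap samples
summary_stats <- apply(boot_coefs, 1, function(x) {
  c(mean = mean(x), median = median(x),
    quantile(x, probs = c(0.025, 0.975)))
})
summary_stats  # one column per coefficient
```

ModelBoot() additionally computes ALE values for each bootstrapped model, which this sketch omits.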

p-values

The broom::tidy() summary statistics provide p-values. However, the procedure for obtaining p-values for ALE statistics is very slow: it involves retraining the model 1000 times. Thus, it is not efficient to calculate p-values every time a ModelBoot object is created. Although the ALE() function provides an 'auto' option for creating p-values, that option is disabled when creating a ModelBoot because it would be far too slow: it would involve retraining the model 1000 times per bootstrap iteration. Rather, you must first create a p-values distribution object using the procedure described in help(ALEpDist). If the name of your p-values object is p_dist, you can then request p-values each time you create a ModelBoot by passing it the argument ale_options = list(p_values = p_dist).
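A hedged sketch of that workflow (not run here because creating the distribution is slow; the ALEpDist() call signature is an assumption — see help(ALEpDist) for the actual procedure):

```r
# Sketch only: the exact ALEpDist() arguments are an assumption
# p_dist <- ALEpDist(gam_attitude, attitude)  # slow: ~1000 model retrains; do once
# mb_gam_p <- ModelBoot(
#   gam_attitude,
#   boot_it = 100,
#   ale_options = list(p_values = p_dist)  # request p-values for ALE statistics
# )
```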

References

Okoli, Chitu. 2023. “Statistical Inference Using Machine Learning and Classical Techniques Based on Accumulated Local Effects (ALE).” arXiv. https://arxiv.org/abs/2310.09877.

Examples


# attitude dataset
attitude
#>    rating complaints privileges learning raises critical advance
#> 1      43         51         30       39     61       92      45
#> 2      63         64         51       54     63       73      47
#> 3      71         70         68       69     76       86      48
#> 4      61         63         45       47     54       84      35
#> 5      81         78         56       66     71       83      47
#> 6      43         55         49       44     54       49      34
#> 7      58         67         42       56     66       68      35
#> 8      71         75         50       55     70       66      41
#> 9      72         82         72       67     71       83      31
#> 10     67         61         45       47     62       80      41
#> 11     64         53         53       58     58       67      34
#> 12     67         60         47       39     59       74      41
#> 13     69         62         57       42     55       63      25
#> 14     68         83         83       45     59       77      35
#> 15     77         77         54       72     79       77      46
#> 16     81         90         50       72     60       54      36
#> 17     74         85         64       69     79       79      63
#> 18     65         60         65       75     55       80      60
#> 19     65         70         46       57     75       85      46
#> 20     50         58         68       54     64       78      52
#> 21     50         40         33       34     43       64      33
#> 22     64         61         52       62     66       80      41
#> 23     53         66         52       50     63       80      37
#> 24     40         37         42       58     50       57      49
#> 25     63         54         42       48     66       75      33
#> 26     66         77         66       63     88       76      72
#> 27     78         75         58       74     80       78      49
#> 28     48         57         44       45     51       83      38
#> 29     85         85         71       71     77       74      55
#> 30     82         82         39       59     64       78      39

## ALE for general additive models (GAM)
## GAM is tweaked to work on the small dataset.
gam_attitude <- mgcv::gam(rating ~ complaints + privileges + s(learning) +
                            raises + s(critical) + advance,
                          data = attitude)
summary(gam_attitude)
#> 
#> Family: gaussian 
#> Link function: identity 
#> 
#> Formula:
#> rating ~ complaints + privileges + s(learning) + raises + s(critical) + 
#>     advance
#> 
#> Parametric coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) 36.97245   11.60967   3.185 0.004501 ** 
#> complaints   0.60933    0.13297   4.582 0.000165 ***
#> privileges  -0.12662    0.11432  -1.108 0.280715    
#> raises       0.06222    0.18900   0.329 0.745314    
#> advance     -0.23790    0.14807  -1.607 0.123198    
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Approximate significance of smooth terms:
#>               edf Ref.df     F p-value  
#> s(learning) 1.923  2.369 3.761  0.0312 *
#> s(critical) 2.296  2.862 3.272  0.0565 .
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> R-sq.(adj) =  0.776   Deviance explained = 83.9%
#> GCV = 47.947  Scale est. = 33.213    n = 30

# \donttest{
# Full model bootstrapping
# Only 4 bootstrap iterations for a rapid example; default is 100
# Increase value of boot_it for more realistic results
mb_gam <- ModelBoot(
  gam_attitude,
  boot_it = 4
)

# If the model is not standard, supply model_call_string with 'data = boot_data'
# in the string instead of the actual dataset name (in addition to passing the
# actual dataset as the 'data' argument directly to the `ModelBoot` constructor).
mb_gam <- ModelBoot(
  gam_attitude,
  data = attitude,  # the actual dataset
  model_call_string = 'mgcv::gam(
    rating ~ complaints + privileges + s(learning) +
      raises + s(critical) + advance,
    data = boot_data  # required for model_call_string
  )',
  boot_it = 4
)

# Model statistics and coefficients
mb_gam@model_stats
#> # A tibble: 9 × 7
#>   name          boot_valid conf.low median  mean conf.high       sd
#>   <chr>              <dbl>    <dbl>  <dbl> <dbl>     <dbl>    <dbl>
#> 1 df               NA         15.2   18.5  18.3     20.9   2.50e+ 0
#> 2 df.residual      NA          9.15  11.5  11.7     14.8   2.50e+ 0
#> 3 nobs             NA         30     30    30       30     0       
#> 4 adj.r.squared    NA          1.00   1.00  1.00     1.00  1.93e-15
#> 5 npar             NA         23     23    23       23     0       
#> 6 mae              19.7       13.7   NA    NA       58.4   2.20e+ 1
#> 7 sa_mae_mad        0.0639    -1.42  NA    NA        0.268 8.09e- 1
#> 8 rmse             24.9       17.8   NA    NA       76.9   2.98e+ 1
#> 9 sa_rmse_sd        0.0523    -1.66  NA    NA        0.240 9.46e- 1
mb_gam@model_coefs
#> # A tibble: 2 × 6
#>   term        conf.low median  mean conf.high std.error
#>   <chr>          <dbl>  <dbl> <dbl>     <dbl>     <dbl>
#> 1 s(learning)     6.92   7.63  7.77      8.85     0.874
#> 2 s(critical)     2.80   6.10  5.49      7.13     2.13 

# Plot ALE
plot(mb_gam)

# }