
Standardized accuracy (staccuracy) expresses error or accuracy measures on a scale where 1 (100%) means perfect prediction and 0.5 (50%) corresponds to the performance of a specified reference standard. Scores above 0.5 are better than the reference and scores below 0.5 are worse. A score of 0 may or may not have a special meaning; negative scores are sometimes possible, but they often indicate modelling errors.
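To make the scale concrete, here is a minimal sketch of the arithmetic in base R. The helper sa_scale_sketch() is hypothetical and not part of the package, and the formula 1 - 0.5 * error / reference is an assumption inferred from the worked example in the Examples section (MAE 0.2 against MAD 1.736 gives 0.9423963), not taken from the package source.

```r
# Hypothetical helper (not part of the package): maps an error value and a
# reference value onto the staccuracy scale.
# Assumption: this formula is inferred from the worked example on this page.
sa_scale_sketch <- function(error, reference) {
  1 - 0.5 * error / reference
}

sa_scale_sketch(0, 2)  # no error at all: 1
sa_scale_sketch(2, 2)  # error equal to the reference: 0.5
sa_scale_sketch(4, 2)  # error twice the reference: 0
sa_scale_sketch(6, 2)  # error three times the reference: -0.5
```

Under this assumed formula, a score turns negative only when the error is more than twice the reference value, which is consistent with negative scores usually signalling a modelling problem.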

The core function is staccuracy(), which receives as input a generic error function and a reference function against which to compare the error function's performance. In addition, the following recommended staccuracy functions are provided:

  • sa_mae_mad: standardized accuracy of the mean absolute error (MAE) based on the mean absolute deviation (MAD)

  • sa_rmse_sd: standardized accuracy of the root mean squared error (RMSE) based on the standard deviation (SD)

  • sa_wmae_mad: standardized accuracy of the winsorized mean absolute error (MAE) based on the mean absolute deviation (MAD)

  • sa_wrmse_sd: standardized accuracy of the winsorized root mean squared error (RMSE) based on the standard deviation (SD)

Usage

staccuracy(error_fun, ref_fun)

sa_mae_mad(actual, pred, na.rm = FALSE)

sa_wmae_mad(actual, pred, na.rm = FALSE)

sa_rmse_sd(actual, pred, na.rm = FALSE)

sa_wrmse_sd(actual, pred, na.rm = FALSE)

Arguments

error_fun

function. The unquoted name of the function that calculates the error (or accuracy) measure. This function must have the signature function(actual, pred, na.rm = FALSE).

ref_fun

function. The unquoted name of the function that calculates the reference error, accuracy, or deviation measure. This function must have the signature function(actual, na.rm = FALSE).

actual

numeric. The true (actual) labels.

pred

numeric. The predicted estimates. Must be the same length as actual.

na.rm

logical(1). Whether NA values should be removed (TRUE) or not (FALSE, default).
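As an illustration of the signatures that error_fun and ref_fun are expected to have, the following base R sketch defines one function of each kind and shows how na.rm affects the result. The names rmse_sketch() and sd_sketch() are hypothetical and are not the package's own implementations; in particular, the use of the population (divide by n) standard deviation is an assumption made only for this example.

```r
# Hypothetical error function with the signature function(actual, pred, na.rm = FALSE).
# Returns a single number: the root mean squared error.
rmse_sketch <- function(actual, pred, na.rm = FALSE) {
  stopifnot(length(actual) == length(pred))
  sqrt(mean((actual - pred)^2, na.rm = na.rm))
}

# Hypothetical reference function with the signature function(actual, na.rm = FALSE).
# Assumption: population standard deviation (divides by n, not n - 1).
sd_sketch <- function(actual, na.rm = FALSE) {
  sqrt(mean((actual - mean(actual, na.rm = na.rm))^2, na.rm = na.rm))
}

actual_na <- c(1, 2, NA, 4)
pred_na   <- c(1.5, 2.5, 3, 3)

rmse_sketch(actual_na, pred_na)                # NA, because na.rm defaults to FALSE
rmse_sketch(actual_na, pred_na, na.rm = TRUE)  # a single non-missing number
sd_sketch(actual_na, na.rm = TRUE)             # a single non-missing number
```

Both functions return a scalar numeric and accept na.rm by name, which is what allows them to be passed unquoted as error_fun and ref_fun.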

Value

staccuracy() returns a function with signature function(actual, pred, na.rm = FALSE) that receives an actual and a pred vector as inputs and returns the staccuracy of the originally input error function based on the input reference function.

The convenience sa_*() functions return the staccuracy measures specified above.

Details

The core function staccuracy() receives as input a generic error function and a reference function against which to compare the error function's performance. These input functions must have the following signatures (see the argument specifications for details of the arguments):

  • error_fun: function(actual, pred, na.rm = na.rm); the output must be a scalar numeric (that is, a single number).

  • ref_fun: function(actual, na.rm = na.rm); the output must be a scalar numeric (that is, a single number).
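The way these two input functions could be combined can be sketched as a small function factory in base R. Everything named *_sketch below is hypothetical, and the combining formula 1 - 0.5 * error / reference is an assumption inferred from the output shown in the Examples section rather than the package's actual implementation.

```r
# Hypothetical stand-ins for an error function and a reference function.
mae_sketch <- function(actual, pred, na.rm = FALSE) {
  mean(abs(actual - pred), na.rm = na.rm)
}
mad_sketch <- function(actual, na.rm = FALSE) {
  # Mean absolute deviation around the mean of the actual values
  mean(abs(actual - mean(actual, na.rm = na.rm)), na.rm = na.rm)
}

# Hypothetical factory: takes the two functions and returns a new function
# with the signature function(actual, pred, na.rm = FALSE).
# Assumption: the combining formula is inferred from the documented example.
staccuracy_sketch <- function(error_fun, ref_fun) {
  function(actual, pred, na.rm = FALSE) {
    error     <- error_fun(actual, pred, na.rm = na.rm)
    reference <- ref_fun(actual, na.rm = na.rm)
    1 - 0.5 * error / reference
  }
}

sa_fun <- staccuracy_sketch(mae_sketch, mad_sketch)

actual_1    <- c(2.3, 4.5, 1.8, 7.6, 3.2)
predicted_1 <- c(2.5, 4.2, 1.9, 7.4, 3.0)

sa_value <- sa_fun(actual_1, predicted_1)
round(sa_value, 7)
#> [1] 0.9423963
```

Returning a function rather than a value means the error and reference pairing is fixed once and can then be reused across many sets of predictions.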

Examples

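# Attach the package so that its mae() and mad() are used
# (otherwise mad() resolves to stats::mad(), the median absolute deviation)
library(staccuracy)

# Here's some data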
actual_1 <- c(2.3, 4.5, 1.8, 7.6, 3.2)

# Here are some predictions of that data
predicted_1 <- c(2.5, 4.2, 1.9, 7.4, 3.0)

# MAE measures the average error in the predictions
mae(actual_1, predicted_1)
#> [1] 0.2

# But how good is that?
# MAD gives the natural variation in the actual data; this is a point of comparison.
mad(actual_1)
#> [1] 1.736

# So, our predictions are better (lower) than the MAD, but how good, really?
# Create a standardized accuracy function to give us an easily interpretable metric:
my_mae_vs_mad_sa <- staccuracy(mae, mad)

# Now use it
my_mae_vs_mad_sa(actual_1, predicted_1)
#> [1] 0.9423963

# That's 94.2% standardized accuracy compared to the MAD. Pretty good!