Standardized accuracy (staccuracy) represents error or accuracy measures on a scale where 1 or 100% means perfect prediction and 0.5 or 50% is a reference comparison of some specified standard performance. Higher than 0.5 is better than the reference and below 0.5 is worse. 0 might or might not have a special meaning; sometimes negative scores are possible, but these often indicate modelling errors.
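To make the scale concrete, the sketch below uses one mapping that is consistent with the description above and with the worked example on this page: 1 minus the error divided by twice the reference. Both the formula and the helper name sa_scale_sketch are assumptions made for illustration; they are not taken from the package's source.

```r
# Hypothetical helper (not part of the package): maps an error and a
# reference value onto the standardized accuracy scale, ASSUMING
# score = 1 - error / (2 * reference).
sa_scale_sketch <- function(error, reference) {
  1 - error / (2 * reference)
}

reference <- 2  # some reference level of error

sa_scale_sketch(0, reference)  # no error: 1 (perfect)
sa_scale_sketch(1, reference)  # half the reference error: 0.75
sa_scale_sketch(2, reference)  # same as the reference: 0.5
sa_scale_sketch(4, reference)  # twice the reference error: 0
sa_scale_sketch(6, reference)  # worse still: -0.5 (negative)
```

Under this assumed mapping, a negative score arises only when the error is more than twice the reference, which is why such scores tend to signal a modelling problem.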
The core function is staccuracy(), which receives as input a generic error function and a reference function against which to compare the error function performance. In addition, the following recommended staccuracy functions are provided:
- sa_mae_mad: standardized accuracy of the mean absolute error (MAE) based on the mean absolute deviation (MAD)
- sa_rmse_sd: standardized accuracy of the root mean squared error (RMSE) based on the standard deviation (SD)
- sa_wmae_mad: standardized accuracy of the winsorized mean absolute error (MAE) based on the mean absolute deviation (MAD)
- sa_wrmse_sd: standardized accuracy of the winsorized root mean squared error (RMSE) based on the standard deviation (SD)
Usage
staccuracy(error_fun, ref_fun)
sa_mae_mad(actual, pred, na.rm = FALSE)
sa_wmae_mad(actual, pred, na.rm = FALSE)
sa_rmse_sd(actual, pred, na.rm = FALSE)
sa_wrmse_sd(actual, pred, na.rm = FALSE)
Arguments
- error_fun
function. The unquoted name of the function that calculates the error (or accuracy) measure. This function must be of the signature function(actual, pred, na.rm = FALSE).
- ref_fun
function. The unquoted name of the function that calculates the reference error, accuracy, or deviation measure. This function must be of the signature ref_fun(actual, na.rm = FALSE).
- actual
numeric. The true (actual) labels.
- pred
numeric. The predicted estimates. Must be the same length as actual.
- na.rm
logical(1). Whether NA values should be removed (TRUE) or not (FALSE, default).
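As an illustration of the two signatures, here is a minimal sketch of a custom error function and a custom reference function written in base R. The names my_medae and my_medad are hypothetical and are not provided by the package; they only show the expected argument lists and the single-number return value.

```r
# Hypothetical error function: median absolute error.
# Signature matches function(actual, pred, na.rm = FALSE).
my_medae <- function(actual, pred, na.rm = FALSE) {
  stopifnot(length(actual) == length(pred))
  median(abs(actual - pred), na.rm = na.rm)
}

# Hypothetical reference function: unscaled median absolute deviation
# from the median. Signature matches function(actual, na.rm = FALSE).
my_medad <- function(actual, na.rm = FALSE) {
  median(abs(actual - median(actual, na.rm = na.rm)), na.rm = na.rm)
}

actual <- c(2.3, 4.5, 1.8, 7.6, 3.2)
pred   <- c(2.5, 4.2, 1.9, 7.4, 3.0)

my_medae(actual, pred)  # a single number
my_medad(actual)        # a single number
```

Functions shaped like these could then be passed unquoted, as in staccuracy(my_medae, my_medad).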
Value
staccuracy() returns a function with signature function(actual, pred, na.rm = FALSE) that receives an actual and a pred vector as inputs and returns the staccuracy of the originally input error function based on the input reference function.
The convenience sa_*() functions return the staccuracy measures specified above.
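The "function that returns a function" pattern described here can be sketched in base R as follows. This is a stand-in for illustration only: staccuracy_sketch, mae_sketch and mad_sketch are hypothetical names, and the formula 1 - error / (2 * reference) is an assumption chosen because it reproduces the output shown in the Examples section, not a quotation of the package's implementation.

```r
# Hypothetical stand-in for the factory pattern; not the package's source.
staccuracy_sketch <- function(error_fun, ref_fun) {
  # Return a new function with the documented signature.
  function(actual, pred, na.rm = FALSE) {
    err <- error_fun(actual, pred, na.rm = na.rm)
    ref <- ref_fun(actual, na.rm = na.rm)
    1 - err / (2 * ref)  # ASSUMED formula, see the note above
  }
}

# Base R stand-ins for a mean absolute error and a mean absolute deviation.
mae_sketch <- function(actual, pred, na.rm = FALSE) {
  mean(abs(actual - pred), na.rm = na.rm)
}
mad_sketch <- function(actual, na.rm = FALSE) {
  mean(abs(actual - mean(actual, na.rm = na.rm)), na.rm = na.rm)
}

sa_sketch <- staccuracy_sketch(mae_sketch, mad_sketch)
is.function(sa_sketch)  # TRUE: the factory returned a function

result <- sa_sketch(c(2.3, 4.5, 1.8, 7.6, 3.2), c(2.5, 4.2, 1.9, 7.4, 3.0))
result  # approximately 0.9424
```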
Details
The core function staccuracy() receives as input a generic error function and a reference function against which to compare the error function's performance. These input functions must have the following signatures (see the argument specifications for details of the arguments):
- error_fun: function(actual, pred, na.rm = FALSE); the output must be a scalar numeric (that is, a single number).
- ref_fun: function(actual, na.rm = FALSE); the output must be a scalar numeric (that is, a single number).
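The scalar-output requirement can be checked before handing a custom function to staccuracy(). The sketch below is one way to do that in base R; is_scalar_numeric, ok_fun and bad_fun are hypothetical names used only for this illustration.

```r
# Hypothetical checker: TRUE if x is a single numeric value.
is_scalar_numeric <- function(x) {
  is.numeric(x) && length(x) == 1L
}

# A mean absolute error collapses the errors to one number.
ok_fun <- function(actual, pred, na.rm = FALSE) {
  mean(abs(actual - pred), na.rm = na.rm)
}

# Returning the per-observation errors does not yield a scalar.
bad_fun <- function(actual, pred, na.rm = FALSE) {
  abs(actual - pred)
}

actual <- c(1, 2, 3)
pred   <- c(1.5, 2, 2.5)

is_scalar_numeric(ok_fun(actual, pred))   # TRUE
is_scalar_numeric(bad_fun(actual, pred))  # FALSE
```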
Examples
library(staccuracy)

# Here's some data
actual_1 <- c(2.3, 4.5, 1.8, 7.6, 3.2)
# Here are some predictions of that data
predicted_1 <- c(2.5, 4.2, 1.9, 7.4, 3.0)
# MAE measures the average error in the predictions
mae(actual_1, predicted_1)
#> [1] 0.2
# But how good is that?
# MAD (here the mean absolute deviation, not the median absolute deviation
# computed by stats::mad()) gives the natural variation in the actual data;
# this is a point of comparison.
mad(actual_1)
#> [1] 1.736
# So, our prediction error (MAE) is lower (better) than the MAD, but how good, really?
# Create a standardized accuracy function to give us an easily interpretable metric:
my_mae_vs_mad_sa <- staccuracy(mae, mad)
# Now use it
my_mae_vs_mad_sa(actual_1, predicted_1)
#> [1] 0.9423963
# That's 94.2% standardized accuracy compared to the MAD. Pretty good!