
Calculates the Geometric Mean (G-mean) for the provided confusion matrix. G-mean is a measure of a model's performance that considers both sensitivity (True Positive Rate) and specificity (True Negative Rate), which makes it especially useful for imbalanced datasets.

Usage

dx_g_mean(cm, detail = "full", boot = FALSE, bootreps = 1000)

Arguments

cm

A dx_cm object created by dx_cm().

detail

Character specifying the level of detail in the output: "simple" for raw estimate, "full" for detailed estimate including 95% confidence intervals.

boot

Logical specifying whether confidence intervals should be generated via bootstrapping. Note that this can be slow.

bootreps

The number of bootstrap replications for calculating confidence intervals.
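
To give a sense of what boot and bootreps control, and why bootstrapping can be slow, here is a minimal base R sketch of a percentile bootstrap for G-mean. It is an illustration under stated assumptions, not the package's implementation: the resampling scheme used internally by dx_g_mean() is not described on this page, and the data, the g_mean() helper, and the percentile method are all hypothetical choices made for this example.

```r
set.seed(42)

# Hypothetical data: 200 cases, roughly 30% positive, with predictions
# that agree with the truth about 80% of the time.
n <- 200
truth <- rbinom(n, 1, 0.3)
predicted <- ifelse(runif(n) < 0.8, truth, 1 - truth)

# Hypothetical helper (not part of the package API): G-mean from 0/1 vectors.
g_mean <- function(truth, predicted) {
  sensitivity <- mean(predicted[truth == 1] == 1)  # true positive rate
  specificity <- mean(predicted[truth == 0] == 0)  # true negative rate
  sqrt(sensitivity * specificity)
}

# Each replication resamples cases with replacement and recomputes the
# metric, so the cost grows linearly with the number of replications.
bootreps <- 1000
boot_estimates <- replicate(bootreps, {
  idx <- sample.int(n, replace = TRUE)
  g_mean(truth[idx], predicted[idx])
})

# Percentile 95% interval (an assumption; other interval types exist).
ci <- quantile(boot_estimates, probs = c(0.025, 0.975), na.rm = TRUE)

print(g_mean(truth, predicted))
print(ci)
```

Lowering bootreps shortens the run time at the cost of a noisier interval.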

Value

Depending on the detail parameter, returns either a single numeric value with the calculated metric ("simple") or a data frame/tibble ("full") containing the estimate, its confidence interval bounds, and descriptive columns such as the measure name and notes.

Details

G-mean is the geometric mean of sensitivity and specificity. It rewards high accuracy on each of the two classes while penalizing imbalance between them: because the two rates are multiplied, a low value for either one pulls the score down, and a value of zero for either one makes G-mean zero. For a classifier to achieve a high G-mean score, it must perform well on both positive and negative classes.

The formula for G-mean is: $$\text{G-mean} = \sqrt{\text{Sensitivity} \times \text{Specificity}}$$
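
As a worked illustration of the formula, the following base R sketch computes G-mean directly from confusion matrix counts. The g_mean_from_counts() helper and all counts are hypothetical and are not part of the package; the second case shows why G-mean is informative on imbalanced data, where accuracy alone can look strong.

```r
# Hypothetical helper (not part of the package API): G-mean from raw counts.
g_mean_from_counts <- function(tp, fn, tn, fp) {
  sensitivity <- tp / (tp + fn)  # true positive rate
  specificity <- tn / (tn + fp)  # true negative rate
  sqrt(sensitivity * specificity)
}

# Made-up counts: sensitivity = 0.80, specificity = 0.75,
# so G-mean = sqrt(0.60), about 0.775.
balanced <- g_mean_from_counts(tp = 80, fn = 20, tn = 150, fp = 50)
print(balanced)

# A classifier that labels every case negative on a 1% prevalence dataset:
# accuracy is 0.99, but sensitivity is 0, so G-mean is 0.
all_negative <- g_mean_from_counts(tp = 0, fn = 10, tn = 990, fp = 0)
accuracy <- (0 + 990) / (0 + 10 + 990 + 0)
print(c(accuracy = accuracy, g_mean = all_negative))
```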

See also

dx_cm() to understand how to create and interact with a 'dx_cm' object.

dx_sensitivity(), dx_specificity() for components of G-mean.

Examples

cm <- dx_cm(dx_heart_failure$predicted, dx_heart_failure$truth, threshold = 0.5, poslabel = 1)
simple_g_mean <- dx_g_mean(cm, detail = "simple")
detailed_g_mean <- dx_g_mean(cm)
print(simple_g_mean)
#> [1] 0.7990855
print(detailed_g_mean)
#> # A tibble: 1 × 8
#>   measure summary estimate conf_low conf_high fraction conf_type notes          
#>   <chr>   <chr>      <dbl>    <dbl>     <dbl> <chr>    <chr>     <chr>          
#> 1 G-mean  0.8        0.799       NA        NA ""       NA        Specify `boot …