Skip to contents

Generates and plots the Receiver Operating Characteristic (ROC) curve for the binary classification model represented by the given dx object. The ROC curve is a graphical representation of the trade-off between the true positive rate (sensitivity) and the false positive rate (1 - specificity) at various threshold settings.

Usage

dx_plot_roc(
  dx_obj,
  curve_color = "#0057B8",
  fill_color = "#cfcdcb",
  text_color = "black",
  add_text = TRUE,
  add_ref_lines = TRUE,
  add_fractions = TRUE,
  axis_color = "#333333",
  add_ref_circle = TRUE,
  ref_lines_color = "#8a8887",
  circle_ref_color = "#E4002B",
  summary_stats = c(1, 2, 3, 4, 5, 6, 7, 8),
  filename = NA
)

Arguments

dx_obj

An object of class dx containing the necessary data and statistics for generating the ROC curve.

curve_color

Color of the ROC curve. Default is "#0057B8".

fill_color

Color filled under the ROC curve. Use "transparent" for no fill. Default is "#cfcdcb".

text_color

Color of the text included on the ROC curve. Default is "black".

add_text

Logical; if TRUE, includes statistical annotations on the ROC curve. Default is TRUE.

add_ref_lines

Logical; if TRUE, includes reference lines on the ROC curve. Default is TRUE.

add_fractions

Logical; if TRUE, includes fraction details in text annotations. Default is TRUE.

axis_color

Color of the x and y axis. Default is "#333333".

add_ref_circle

Logical; if TRUE, includes a reference circle around the point of specified threshold. Default is TRUE.

ref_lines_color

Color for reference lines. Default is "#8a8887".

circle_ref_color

Color of the reference circle. Default is "#E4002B".

summary_stats

A vector of integers indicating which statistics to include on the ROC curve. Default is c(1, 2, 3, 4, 5, 6, 7, 8).

filename

File name to create on disk using ggplot2::ggsave. If left NA, no file will be created.

Value

A ggplot object representing the ROC curve, allowing for further customization if desired.

Details

The ROC curve is a widely used tool for diagnosing the performance of binary classification models. It plots the true positive rate (sensitivity) against the false positive rate (1 - specificity) for various thresholds. A model with perfect discrimination (no overlap in the two distributions) has an ROC curve that passes through the upper left corner (100% sensitivity, 100% specificity). Therefore the closer the ROC curve is to the upper left corner, the higher the overall accuracy of the test.

The dx_roc function allows for extensive customization of the ROC curve, including color schemes, reference lines, text annotations, and more, to accommodate a variety of visualization needs and preferences.

The area under the ROC curve (AUC) provides a single scalar value summarizing the performance of the test. The AUC can be interpreted as the probability that a classifier will rank a randomly chosen positive instance higher than a randomly chosen negative instance.

The function also provides options to include a reference circle at a specific threshold point, reference lines indicating a 45-degree line (chance line) and lines for the chosen threshold's specificity and sensitivity, and fill color under the curve.

Examples

dx_obj <- dx(
  data = dx_heart_failure,
  true_varname = "truth",
  pred_varname = "predicted",
  outcome_label = "Heart Attack",
  threshold_range = c(.1, .2, .3),
  setthreshold = .3,
  grouping_variables = c("AgeGroup", "Sex", "AgeSex")
)
dx_plot_roc(dx_obj)
#> Warning: All aesthetics have length 1, but the data has 263 rows.
#>  Please consider using `annotate()` or provide this layer with data containing
#>   a single row.
#> Warning: All aesthetics have length 1, but the data has 263 rows.
#>  Please consider using `annotate()` or provide this layer with data containing
#>   a single row.
#> Warning: All aesthetics have length 1, but the data has 263 rows.
#>  Please consider using `annotate()` or provide this layer with data containing
#>   a single row.
#> Warning: All aesthetics have length 1, but the data has 263 rows.
#>  Please consider using `annotate()` or provide this layer with data containing
#>   a single row.