Performs McNemar's test to evaluate the difference between two paired proportions. This is typically used in the context of binary classification to test whether the proportion of correct classifications significantly differs between two classifiers on the same set of instances.

Usage

dx_mcnemars(dx1, dx2, detail = "full")

Arguments

dx1

The first dx object containing predictions and truth values.

dx2

The second dx object containing predictions and truth values.

detail

A string indicating the level of detail for the output: "simple" for just the p-value, "full" for a comprehensive result including test statistics.

Value

Depending on the detail parameter, returns either the p-value alone ("simple") or a more comprehensive result including the test statistic and p-value ("full").

Details

McNemar's test is appropriate when comparing the classification results of two algorithms on the same data set (paired design). It specifically tests the null hypothesis that the marginal probabilities of the row and column variables are the same.

This test is suitable for binary classification tasks where you are comparing the performance of two classifiers over the same instances. It is not appropriate for independent samples or for comparing more than two classifiers.

The function expects two dx objects as input, each containing the predictions and truth values of one of the classifiers being compared. It builds the paired contingency table from the classifiers' agreements and disagreements on each instance and applies McNemar's test to that table, as illustrated by the sketch below.
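
Conceptually, the table cross-tabulates per-instance correctness of the two classifiers, and only the discordant (off-diagonal) cells drive the test statistic. The following is a minimal sketch of that construction using base R's mcnemar.test(); the vectors truth, pred1, and pred2 are hypothetical and this is not the package's internal code.

# Hypothetical data: shared truth labels and two sets of predictions
truth <- c(1, 0, 1, 1, 0, 1, 0, 1)
pred1 <- c(1, 0, 0, 1, 1, 1, 0, 1)  # classifier 1 predictions
pred2 <- c(1, 1, 0, 1, 0, 0, 0, 1)  # classifier 2 predictions

correct1 <- pred1 == truth  # per-instance correctness, classifier 1
correct2 <- pred2 == truth  # per-instance correctness, classifier 2

# 2 x 2 paired contingency table of correctness agreement/disagreement;
# the off-diagonal (discordant) cells determine McNemar's statistic
tab <- table(correct1, correct2)
mcnemar.test(tab)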

Examples

dx_glm <- dx(data = dx_heart_failure, true_varname = "truth", pred_varname = "predicted")
dx_rf <- dx(data = dx_heart_failure, true_varname = "truth", pred_varname = "predicted_rf")
dx_mcnemars(dx_glm, dx_rf)
#> # A tibble: 1 × 9
#>   models test        summary p_value estimate conf_low conf_high statistic notes
#>   <chr>  <chr>       <chr>     <dbl> <chr>    <lgl>    <lgl>         <dbl> <chr>
#> 1 ""     McNemar's … p=0.02   0.0180 ""       NA       NA              5.6 ""