Create a table of missing-value counts and percentages for each variable. Counting a variable's missing values may be undesirable when that variable depends on another one. To account for this, pass a named vector to depends in which each name is the dependent variable and each value is the independent variable. For example, c("B" = "A") can be read as "B depends on A": each entry in B is evaluated only if the corresponding entry in A is not missing. If the input is a labeled data frame, the output also includes a column of labels.

missing_tbl(data, depends = NA)

Arguments

data

A tbl.

depends

A named vector representing variable dependencies. For example, c("B" = "A", "C" = "B") can be read as "B depends on A and C depends on B". If a value for B is missing, it will only be counted if A is not missing, and similarly for C and B.
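
As a small illustration of how such a named vector reads, the sketch below splits it into its dependent and independent parts using base R only. The variable names dependent, independent, and rules are made up for this example, and nothing here is taken from the package's internals.

```r
# Illustration only; not the package's internal code.
depends <- c("B" = "A", "C" = "B")

# names() holds the dependent variables; the values hold what they depend on.
dependent   <- names(depends)
independent <- unname(depends)

# Spell each pair out as a sentence.
rules <- paste(dependent, "depends on", independent)
print(rules)
#> [1] "B depends on A" "C depends on B"
```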

Examples

data <- tibble::tribble(
  ~A, ~B, ~C,
   1,  2,  3,
  NA, NA,  6,
   7, NA,  9
)
data
#> # A tibble: 3 x 3
#>       A     B     C
#>   <dbl> <dbl> <dbl>
#> 1     1     2     3
#> 2    NA    NA     6
#> 3     7    NA     9
missing_tbl(data)
#>   Variable N Non-Missing   Missing
#> 1        A 3   2.0 (67%) 1.0 (33%)
#> 2        B 3   1.0 (33%) 2.0 (67%)
#> 3        C 3  3.0 (100%)  0.0 (0%)
missing_tbl(data, depends = c("B" = "A"))
#>   Variable Depends   N Non-Missing   Missing
#> 1        A    <NA> 3.0   2.0 (67%) 1.0 (33%)
#> 2        B       A 2.0   1.0 (50%) 1.0 (50%)
#> 3        C    <NA> 3.0  3.0 (100%)  0.0 (0%)
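
The row for B in the dependency-aware table can be checked by hand. The following base R sketch assumes the counting rule described above (B is evaluated only in rows where A is observed). It is a plausible reading of that rule, not missing_tbl()'s actual implementation, and the names eligible, n_eligible, n_missing, and pct_missing are made up for the example.

```r
# Hand-rolled check in base R; an assumption about the counting rule,
# not the package's own code.
data <- data.frame(
  A = c(1, NA, 7),
  B = c(2, NA, NA),
  C = c(3, 6, 9)
)

# Keep only the rows where the independent variable A is observed.
eligible <- !is.na(data$A)

n_eligible  <- sum(eligible)                  # rows in which B is evaluated
n_missing   <- sum(is.na(data$B[eligible]))   # missing B among those rows
pct_missing <- round(100 * n_missing / n_eligible)

cat(sprintf("B: N = %d, Missing = %d (%.0f%%)\n",
            n_eligible, n_missing, pct_missing))
#> B: N = 2, Missing = 1 (50%)
```

Row 2 drops out of B's denominator because A is missing there, which is why N falls from 3 to 2 and the missing share from 67% to 50%.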