
This function de-identifies Muse ECG XML files by replacing the text of specified elements with anonymized values. Each replacement is driven by a regular expression.

Usage

muse_deidentify(file, output_file, replace = list())

Arguments

file

Path to the input XML file to be de-identified.

output_file

Path where the de-identified XML file will be saved.

replace

A named list where each name is an XPath to an XML node and each value is itself a named list mapping regex patterns to replacement strings. For example, `list("//PatientDemographics/PatientLastName" = list(".*" = "LastName"))` replaces the text of the `PatientLastName` element within `PatientDemographics`.
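To make the shape of a single `replace` entry concrete, the sketch below applies a pattern/replacement list to a plain string using only base R. The helper `apply_patterns()` is hypothetical and is not part of the package; it also assumes that patterns are applied in list order, which the documentation above does not state.

```r
# Hypothetical helper (not the package's implementation): applies each
# regex -> replacement pair to `text`, in the order the list is given.
apply_patterns <- function(text, patterns) {
  for (pattern in names(patterns)) {
    text <- gsub(pattern, patterns[[pattern]], text)
  }
  text
}

patterns <- list(
  "\\d{2}-[A-Z]{3}-\\d{4} \\d{2}:\\d{2}" = "XX-XXX-XXXX XX:XX",  # date-time
  "\\d{2}-[A-Z]{3}-\\d{4}"               = "XX-XXX-XXXX"         # date only
)

result <- apply_patterns("Reviewed 01-JAN-2020 10:30", patterns)
print(result)
# "Reviewed XX-XXX-XXXX XX:XX"
```

Under that ordering assumption, the more specific date-time pattern is listed first. If the date-only pattern ran first, the time `10:30` would be left behind in the text.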

Value

Invisibly returns the path to the output file containing the de-identified XML document. The function primarily operates through side effects (reading an input file, modifying its content, and writing the result to a new file).

Note

Ensure the paths in `replace` accurately reflect your XML structure.
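One way to check the paths before de-identifying is to count how many nodes each XPath matches. The sketch below uses `xml2::read_xml()` and `xml2::xml_find_all()` on a small inline document; the XML snippet is an invented stand-in, not a real Muse export, and the variable names are illustrative.

```r
library(xml2)

# Invented stand-in document; a real Muse export has many more elements.
doc <- read_xml(
  "<RestingECG><PatientDemographics><PatientLastName>Smith</PatientLastName></PatientDemographics></RestingECG>"
)

xpaths <- c(
  "/RestingECG/PatientDemographics/PatientLastName",
  "/RestingECG/PatientDemographics/PatientFirstName"
)

# Count the nodes matched by each XPath; zero means the path does not
# exist in this document.
matches <- vapply(xpaths, function(xp) length(xml_find_all(doc, xp)), integer(1))
missing_paths <- names(matches)[matches == 0]
print(missing_paths)
```

Any path reported in `missing_paths` is worth checking for a typo or a difference in the file's structure.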

See also

read_xml, write_xml for the underlying XML manipulation functions used.

Examples

# De-identify a sample Muse ECG XML file


# For diagnosis statements, we will remove only specific text (like dates)
# rather than the full nodes

# Create a string of valid month abbreviations
months <- paste(toupper(month.abb), collapse = "|")

dx_replace <- setNames(
  list(
    "XX-XXX-XXXX XX:XX",  # Replacement for date-time format
    "XX-XXX-XXXX",        # Replacement for date format
    "Confirmed by XXX (XX) on XX/XX/XXXX XX:XX:XX XM" # Replacement for confirmation format
  ),
  c(
    paste0("\\d{2}-(", months, ")-\\d{4} \\d{2}:\\d{2}"),
    paste0("\\d{2}-(", months, ")-\\d{4}"),
    "Confirmed by [A-Za-z]+, [A-Za-z]+ \\(\\d+\\) on \\d{1,2}/\\d{1,2}/\\d{4} \\d{1,2}:\\d{2}:\\d{2} [APM]+"
  )
)

replace <- list(
  "/RestingECG/PatientDemographics/PatientLastName" = list(".*" = "LastName"),
  "/RestingECG/PatientDemographics/PatientFirstName" = list(".*" = "FirstName"),
  "/RestingECG/PatientDemographics/PatientID" = list(".*" = "PatientID"),
  "/RestingECG/PatientDemographics/DateofBirth" = list(".*" = "XXXX"),
  "/RestingECG/Diagnosis/DiagnosisStatement/StmtText" = dx_replace,
  "/RestingECG/OriginalDiagnosis/DiagnosisStatement/StmtText" = dx_replace
)
file <- muse_example("muse/muse_ecg1.xml")
output_file <- tempfile(fileext = ".xml")
muse_deidentify(file, output_file, replace)
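
After writing the output, it can be worth scanning the file for date stamps that the patterns missed. The sketch below is a rough base R check, not a feature of the package: `find_residual_dates()` is a hypothetical helper, and the demo file is a hand-written stand-in rather than real `muse_deidentify()` output.

```r
# Hypothetical helper: returns the lines of a file that still contain a
# DD-MON-YYYY date stamp (e.g. 03-MAR-2019).
find_residual_dates <- function(path) {
  months <- paste(toupper(month.abb), collapse = "|")
  pattern <- paste0("\\d{2}-(", months, ")-\\d{4}")
  lines <- readLines(path, warn = FALSE)
  lines[grepl(pattern, lines)]
}

# Stand-in file: one masked statement and one that was missed.
demo_file <- tempfile(fileext = ".xml")
writeLines(c(
  "<StmtText>Compared with XX-XXX-XXXX XX:XX</StmtText>",
  "<StmtText>Compared with 03-MAR-2019 14:05</StmtText>"
), demo_file)

leftover <- find_residual_dates(demo_file)
print(leftover)
```

In practice the same call would be pointed at `output_file`. A line-based scan like this only catches the formats it is told about, so it complements a manual review rather than replacing one.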