De-identifies a Muse ECG XML file by replacing the text of specified elements with anonymized values, matched using regular expressions.
Usage
muse_deidentify(file, output_file, replace = list())
Arguments
- file
Path to the input XML file to be de-identified.
- output_file
Path where the de-identified XML file will be saved.
- replace
A named list where each name is an XPath expression selecting an XML node and each value is itself a named list mapping regex patterns (names) to replacement strings (values). For example, `list("//PatientDemographics/PatientLastName" = list(".*" = "LastName"))` replaces the text of the `PatientLastName` element within `PatientDemographics`.
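As a minimal sketch of how one entry of `replace` could be applied to a node's text, assuming the patterns are applied in list order with base R's `gsub()`. The helper `apply_patterns()` is hypothetical and not part of the package, whose internals may differ.

```r
# Hypothetical helper for illustration only; not the package's implementation.
# Applies each regex (list name) and its replacement (list value) in order.
apply_patterns <- function(text, patterns) {
  for (pattern in names(patterns)) {
    text <- gsub(pattern, patterns[[pattern]], text)
  }
  text
}

# The date-time pattern comes before the plain date pattern so that the
# longer match is masked first
patterns <- list(
  "\\d{2}-[A-Z]{3}-\\d{4} \\d{2}:\\d{2}" = "XX-XXX-XXXX XX:XX",
  "\\d{2}-[A-Z]{3}-\\d{4}" = "XX-XXX-XXXX"
)

apply_patterns("Compared with ECG of 03-FEB-2021 14:05", patterns)
apply_patterns("Compared with ECG of 03-FEB-2021", patterns)
```

Under this assumption, order matters: if the plain date pattern ran first, the first string would become `XX-XXX-XXXX 14:05` and the time would be left in place.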
Value
Invisibly returns the path to the output file containing the de-identified XML document. The function primarily operates through side effects (reading an input file, modifying its content, and writing the result to a new file).
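The read, modify, write flow described above can be sketched roughly as follows. This assumes the xml2 package is installed. `deidentify_sketch()` is a hypothetical stand-in written for illustration, not the package's actual implementation.

```r
library(xml2)

# Hypothetical stand-in for illustration; the real function may work differently.
deidentify_sketch <- function(file, output_file, replace = list()) {
  doc <- read_xml(file)
  for (xpath in names(replace)) {
    nodes <- xml_find_all(doc, xpath)
    if (length(nodes) == 0) next  # nothing matched this XPath
    for (pattern in names(replace[[xpath]])) {
      # Rewrite the text of every matched node
      xml_text(nodes) <- gsub(pattern, replace[[xpath]][[pattern]], xml_text(nodes))
    }
  }
  write_xml(doc, output_file)
  invisible(output_file)  # path is returned but not printed
}

# Tiny made-up document standing in for a Muse export
input <- tempfile(fileext = ".xml")
writeLines(
  "<RestingECG><PatientDemographics><PatientLastName>Smith</PatientLastName></PatientDemographics></RestingECG>",
  input
)
output <- tempfile(fileext = ".xml")

path <- deidentify_sketch(
  input, output,
  list("//PatientDemographics/PatientLastName" = list(".*" = "LastName"))
)
xml_text(xml_find_first(read_xml(path), "//PatientLastName"))
```

Because the path is returned invisibly, nothing is printed when the function is called on its own, but the value can still be assigned and passed to later steps, as `path` is here.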
Examples
# De-identify a sample Muse ECG XML file
# For diagnosis statements, we will remove only specific text (like dates)
# rather than the full nodes
# Create a string of valid month abbreviations
months <- paste(toupper(month.abb), collapse = "|")
dx_replace <- setNames(
  list(
    "XX-XXX-XXXX XX:XX", # Replacement for date-time format
    "XX-XXX-XXXX", # Replacement for date format
    "Confirmed by XXX (XX) on XX/XX/XXXX XX:XX:XX XM" # Replacement for confirmation format
  ),
  c(
    paste0("\\d{2}-(", months, ")-\\d{4} \\d{2}:\\d{2}"),
    paste0("\\d{2}-(", months, ")-\\d{4}"),
    "Confirmed by [A-Za-z]+, [A-Za-z]+ \\(\\d+\\) on \\d{1,2}/\\d{1,2}/\\d{4} \\d{1,2}:\\d{2}:\\d{2} [APM]+"
  )
)
replace <- list(
  "/RestingECG/PatientDemographics/PatientLastName" = list(".*" = "LastName"),
  "/RestingECG/PatientDemographics/PatientFirstName" = list(".*" = "FirstName"),
  "/RestingECG/PatientDemographics/PatientID" = list(".*" = "PatientID"),
  "/RestingECG/PatientDemographics/DateofBirth" = list(".*" = "XXXX"),
  "/RestingECG/Diagnosis/DiagnosisStatement/StmtText" = dx_replace,
  "/RestingECG/OriginalDiagnosis/DiagnosisStatement/StmtText" = dx_replace
)
file <- muse_example("muse/muse_ecg1.xml")
output_file <- tempfile(fileext = ".xml")
muse_deidentify(file, output_file, replace)