Open tools for quantitative anonymization of tabular phenotype data: literature review

2022 | journal article. A publication with affiliation to the University of Göttingen.

Cite this publication

Open tools for quantitative anonymization of tabular phenotype data: literature review
Haber, A. C.; Sax, U. & Prasser, F. (2022)
Briefings in Bioinformatics, art. bbac440. DOI: https://doi.org/10.1093/bib/bbac440

Documents & Media

bbac440.pdf (856.39 kB, Adobe PDF)

License

Published Version

Attribution 4.0 (CC BY 4.0)

Details

Authors Group
the NFDI4Health Consortium
The author list is incomplete:
Authors
Haber, Anna C.; Sax, Ulrich; Prasser, Fabian
Abstract
Precision medicine relies on molecular and systems biology methods as well as bidirectional association studies of phenotypes and (high-throughput) genomic data. However, the integrated use of such data often faces obstacles, especially with regard to data protection. An important prerequisite for research data processing is usually informed consent. But collecting consent is not always feasible, in particular when data are to be analyzed retrospectively. For phenotype data, anonymization, i.e. the altering of data in such a way that individuals cannot be identified, can provide an alternative. Several re-identification attacks have shown that this is a complex task and that simply removing directly identifying attributes such as names is usually not enough. More formal approaches are needed that use mathematical models to quantify risks and guide their reduction. Due to the complexity of these techniques, it is challenging and not advisable to implement them from scratch. Open software libraries and tools can provide a robust alternative. However, the range of available anonymization tools is also heterogeneous, and obtaining an overview of their strengths and weaknesses is difficult due to the complexity of the problem space. We therefore performed a systematic review of open anonymization tools for structured phenotype data described in the literature between 1990 and 2021. Through a two-step eligibility assessment process, we selected 13 tools for an in-depth analysis. By comparing the supported anonymization techniques and further aspects, such as maturity, we derive recommendations for tools to use for anonymizing phenotype datasets with different properties.
Issue Date
2022
Journal
Briefings in Bioinformatics 
Organization
Campus-Institut Data Science ; Institut für Medizinische Informatik ; Universitätsmedizin Göttingen 
ISSN
1467-5463
eISSN
1477-4054
Language
English
