Systematic review suggests small amounts of data can be used to re-identify individuals.
While wearable devices have opened up a new paradigm in healthcare research due to their ability to generate huge amounts of health-related data, such research is predicated on the idea that the data has been “de-identified” to preserve the privacy of the individuals wearing the devices.
However, a new report in The Lancet Digital Health suggests the current process of de-identification may not be as protective of privacy as the name implies. The authors of the report conducted a systematic review of studies that attempted to re-identify patients from de-identified data sets using biometric signals from wearable devices. They found that correct identification was possible in between 86% and 100% of cases.
“Although data sharing provides tremendous benefits, it also poses many crucial questions around privacy risks to patients and study participants that remain unanswered,” wrote corresponding author Jessilyn Dunn, Ph.D., of Duke University, and colleagues.
The investigators noted that wearables, such as smartwatches, have an increasing array of capabilities, including the ability to track a patient’s steps, heart rate and location. The data generated by the devices can be used by the individual wearing the device, but it can also be used by software-makers to improve their algorithms, and by scientific researchers to study population health or the impact of specific medical interventions.
The National Institutes of Health has adopted guidelines aimed at promoting de-identified data-sharing, but the investigators said the possibility of re-identification, in which other data is used to link wearable device data with a person’s identity, could open the door to data misuse by governments, corporations, or other individuals.
For example, the investigators posited a scenario in which a patient participates in an employee wellness program that tracks her steps and heart rate and also requires the collection of demographic and identity information. In their scenario, the patient had previously participated in a stroke prevention study that tracked the same metrics, but which also included other health information that was inaccessible to her employer (the scenario used an HIV diagnosis as an example). If the study’s data were made publicly available in a de-identified manner, the patient’s employer might be able to match the study’s step and heart rate data against the wellness program’s data for its own employee, thereby learning of her HIV diagnosis. The employer could then theoretically use that information to discriminate against the patient, for instance, by curtailing its contribution to her health coverage.
In an effort to better understand the scope of the potential “re-identification” problem, Dunn and colleagues performed a literature search that ultimately yielded 72 studies that met their inclusion criteria, 64 of which were classified as high-quality and 8 of which were classified as moderate quality, according to the investigators’ custom study-quality assessment tool.
In most of the studies included in the analysis (57), the metric used to assess re-identification was “correct identification rate” (CIR). Those studies found CIR values ranging from 86% to 100%, Dunn and colleagues said, “suggesting that reidentification risks from wearable device data are higher than previously appreciated.”
The authors cautioned that most of the studies are small, with fewer than 100 participants each. However, the four larger studies showed results consistent with the smaller studies.
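As a rough illustration of what a correct identification rate measures, the Python sketch below matches each "de-identified" sample to the closest enrolled profile and reports the share of samples matched to the right person. It is not the method of any study in the review: the feature vectors, the participant labels and the nearest-match rule are all hypothetical, chosen only to make the arithmetic concrete.

```python
from math import dist


def correct_identification_rate(true_ids, predicted_ids):
    """Fraction of samples that were assigned to the right person."""
    if len(true_ids) != len(predicted_ids):
        raise ValueError("inputs must have the same length")
    if not true_ids:
        raise ValueError("at least one sample is required")
    hits = sum(t == p for t, p in zip(true_ids, predicted_ids))
    return hits / len(true_ids)


def nearest_enrolled_id(probe, enrolled):
    """Return the ID whose enrolled feature vector is closest to the probe."""
    return min(enrolled, key=lambda pid: dist(probe, enrolled[pid]))


# Hypothetical enrolled profiles: (mean heart rate, mean step cadence).
enrolled = {"A": (62.0, 1.8), "B": (75.0, 2.1), "C": (88.0, 1.5)}

# Hypothetical "de-identified" samples, paired with the true ID for scoring.
probes = [
    ("A", (63.0, 1.7)),
    ("B", (74.0, 2.2)),
    ("C", (80.0, 1.9)),  # lies closer to B's profile, so it is misidentified
    ("A", (61.0, 1.9)),
]

true_ids = [pid for pid, _ in probes]
predicted_ids = [nearest_enrolled_id(vec, enrolled) for _, vec in probes]
cir = correct_identification_rate(true_ids, predicted_ids)
print(f"CIR: {cir:.0%}")
```

In this toy data three of the four samples are matched to the right person, so the CIR is 75%. The studies in the review reported rates of 86% to 100%.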
Adding to the concern, the investigators said re-identification was possible even with very little data.
For example, the investigators cited one study that found 50 seconds’ worth of accelerometer and gyroscope data from people who brushed their teeth while wearing an LG G smartwatch could be used to identify individuals with a CIR of 96%.
“This discovery is concerning since publicly identified data is becoming increasingly abundant, given data-sharing advocacy and policy by influential bodies, such as the U.S. Food and Drug Administration and National Institutes of Health,” Dunn and colleagues said.
The investigators said it is still necessary to have identifiers in order to re-identify someone—simply having two de-identified data sets including the same person is not enough to identify that person. However, the authors noted that the availability of identifiers is on the rise because “an increasing number of companies are entering third-party data-sharing agreements, some of which are ethically tenuous.”
Still, Dunn and colleagues were clear that they are not suggesting blocking the sharing of biometric data.
“On the contrary, this systematic review exposes the need for more careful consideration of how data should be shared since the risk of not sharing data (e.g., algorithmic bias and failure to develop new algorithmic tools that could save lives) might even be greater than the risk of reidentification,” they wrote.
Rather, they said, their study is a warning that in order for open science to flourish, better measures are needed to preserve privacy.
“For example, an emphasis on research directions for developing privacy-protecting methods… could allow the biomedical research community to continue to reap the many benefits of data sharing while protecting the privacy of individuals,” they concluded.