Identifying problems with national identifiers

September 28, 2015

In a pair of experiments that raise questions about the use of national identifying numbers, Harvard researchers have shown that the Resident Registration Numbers (RRNs) used in South Korea can be decrypted to reveal a host of personal information.

Led by Professor of Government and Technology in Residence Latanya Sweeney, the researchers decrypted more than 23,000 RRNs across the two experiments, using both computation and logical reasoning. The findings suggest that, although such identifiers are encrypted to protect privacy, they remain vulnerable to attack and must be designed to avoid such weaknesses. The studies are described in a September 29 paper published in Technology Science.

"Like most data-driven, highly networked societies, South Korea uses personally identifying numbers as a linchpin to personal identity in employment, banking, taxation, and for social and medical services. In the United States, we use Social Security numbers similarly. When these numbers become easily accessible to others, whether through breaches or poor encryption in data sold or given away, the major institutions that rely on them become vulnerable."

Sweeney and Ji Su Yoo, a research assistant at the Data Privacy Lab at Harvard and an author of the study, were able to show that each digit in an encrypted RRN had been replaced with a letter according to a recognizable pattern. Once they had worked out that pattern, they were able to decrypt thousands of RRNs, which in turn could reveal personal information about the people they belong to.
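
To make the idea of a letter-for-digit pattern concrete, here is a minimal Python sketch of reversing such a substitution once the pattern is known. The table below is invented purely for illustration; the article does not describe the actual mapping used in the data, and `HYPOTHETICAL_KEY` and `decode_token` are made-up names.

```python
# HYPOTHETICAL letter-to-digit table, invented for illustration only.
# The real pattern recovered by the researchers is not given in the article.
HYPOTHETICAL_KEY = dict(zip("qwertyuiop", "0123456789"))


def decode_token(token: str, key: dict[str, str]) -> str:
    """Replace each letter of an encoded token with the digit it stands for."""
    try:
        return "".join(key[ch] for ch in token)
    except KeyError as exc:
        raise ValueError(f"letter {exc.args[0]!r} is not in the key") from None


# Under the made-up table above, "pqqwqw" decodes to the digits 900101.
print(decode_token("pqqwqw", HYPOTHETICAL_KEY))
```

A fixed table like this offers little protection: anyone who recovers it can decode every record at once.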

They also found that, as with credit card numbers, the final digit of an RRN is a check digit derived from a weighted sum of the preceding digits. This meant the researchers could decrypt the numbers and then use arithmetic to confirm the accuracy of the information they uncovered.
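
The article does not spell out the formula. As a rough sketch, assuming the commonly documented RRN scheme (weights 2 through 9 and then 2 through 5 applied to the first 12 digits, with the check digit equal to 11 minus the sum modulo 11, reduced modulo 10), the arithmetic check could look like this in Python. The function names and the sample number are made up for illustration and do not belong to any real person.

```python
# Weights commonly documented for the RRN check digit. This is an assumption:
# the article only says the final digit is a weighted sum of prior digits.
WEIGHTS = (2, 3, 4, 5, 6, 7, 8, 9, 2, 3, 4, 5)


def rrn_check_digit(first12: str) -> int:
    """Compute the expected 13th digit from the first 12 digits."""
    if len(first12) != 12 or not first12.isdecimal():
        raise ValueError("expected exactly 12 digits")
    total = sum(int(d) * w for d, w in zip(first12, WEIGHTS))
    return (11 - total % 11) % 10


def is_valid_rrn(rrn: str) -> bool:
    """Return True if a 13-digit candidate (hyphen optional) passes the check."""
    digits = rrn.replace("-", "")
    if len(digits) != 13 or not digits.isdecimal():
        return False
    return rrn_check_digit(digits[:12]) == int(digits[12])


# Synthetic example: the weighted sum is 38, 38 % 11 = 5, and (11 - 5) % 10 = 6.
print(is_valid_rrn("900101-1000006"))  # True
print(is_valid_rrn("900101-1000007"))  # False
```

A check like this lets an attacker discard wrong guesses: a candidate decryption that fails the arithmetic cannot be a valid number.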

"South Koreans depend on personally identifying numbers for numerous economic transactions and it is inconvenient for businesses and individuals alike to verify identities and track clients without these numbers. But in the end, it is the South Korean population that is receiving the short end of the stick. That is, when data is so easily de-anonymized, individual privacy, not company profits, are compromised. Our study shows that weak encoding systems, which refer to the very design of the number, render encryptions as poor methods of protecting privacy. South Koreans are aware of the vulnerabilities of the RRN encoding system - our study therefore urges a more robust redesign of these personally identifying numbers not only for the sake of the institutions and system that depend on them but also for the individuals who use them."

Sweeney and Yoo conducted the study using prescription data that was presumed to be anonymous because it did not include patients' names or addresses and the RRNs had been encrypted. Similar data, believed to be anonymous, is often shared with corporations around the world that track health data on millions of South Koreans.

"Administrators often use simple schemes to encrypt personal information because it passes a face test - if it looks okay, it must be okay. Sometimes they use strong encryption but in a wrong way, leading to the same vulnerable outcome. If researchers like us don't provide scientific facts and insights into these practices, who will? Companies that receive the data may exploit these same vulnerabilities to advantage. If so, they would hardly then turn around and tell administrators. It is up to researchers to give administrators and society the scientific knowledge needed to make better choices."

The findings are particularly timely, Sweeney said, because South Korea is currently debating a redesign of RRNs, and other nations, including the United States, have discussed the use of a single identifier for medical records.

The study, she said, reveals that such identifiers - if not carefully designed and monitored - can be vulnerable to leaks, and must be carefully considered going forward.

"The problem is not unique to South Korea, it's a worldwide concern because we all rely on credit card numbers and other identifiers to function."

Harvard University
