Beyond encryption: Protecting consumer privacy while keeping survey results accurate

April 17, 2020

It comes as no surprise that consumer data is continuously being collected by various organizations, including local governments, marketing agencies and social media companies. These organizations assure anonymity and confidentiality when collecting this data; however, existing data privacy laws don't guarantee that data breaches won't occur. According to a recent report, more than 2,000 confirmed data breaches occurred in 2019 alone, with 34% of those executed by internal actors such as employees. In addition, city and state agencies collect sensitive data that they are required by law to share with the public -- courtesy of Open Data movements and the Freedom of Information Act.

Data privacy laws require encryption and, in some cases, transforming the original data to "protected data" before it's released to external parties. But for researchers like Matthew Schneider, PhD, an assistant professor of Decision Sciences and Management Information Systems at Drexel University's LeBow College of Business, this isn't adequate.

"Encryption definitely helps, but it does not prevent a data breach," he said. "It's similar to safeguarding your email password. An internal actor with access to the encryption key could easily cause a data breach. It's more conservative from a risk perspective to assume that all data will eventually get out and should be transformed prior to sharing anywhere within the organization."

In a recent paper published in the Journal of Marketing Analytics, Schneider and Dawn Iacobucci, PhD, of Vanderbilt University, proposed a new methodology that permanently alters survey datasets to protect consumers' privacy when the data is shared, while still preserving a reasonable level of accuracy for these datasets.

According to the authors, survey data is often held within organizations and used for purposes beyond the original reason for collecting the data. "Databases and customer information have become a contemporary asset that makes one business attractive to another when forging alliances," Schneider said. "Even firms with high standards of data security can find it challenging to protect the privacy of consumer data."

Another less common but all-too-real threat, according to the authors, arises when employees illegally take data from a former company to a position with a new employer -- for reasons ranging from making a favorable impression on the new company, to harming the old company, to having to provide the data as a condition of the job offer.

For Schneider, the solution to fulfilling data privacy promises turns out to be a technological one.

"Survey data are increasingly used for respondent-level analytics, such as in linkage to other proprietary datasets, and promises of privacy may not be guaranteed in the myriad of subsequent uses of the data," said Schneider. "Confidentiality does not guarantee anonymity. It takes about three or four carefully posed questions in a survey to uniquely identify anyone."

In the paper, the authors analyzed a survey data set that was collected in 2015 by the city of Austin, Texas and released to the public following an Open Data movement. Other cities have similar movements, including New York and Philadelphia.

"There are lots of privacy risks in Open Data since they don't do privacy as well as the federal government that has the large budget and resources to hire statisticians, economists or computer scientists to address this technological problem," said Schneider. "Protection often depends on how the data is used."

The city of Austin administered a survey to 2,614 Asian Americans living in the city to explore the health and service needs of one of the city's fastest-growing populations. The aims were to create higher levels of community engagement, to inform policies and to identify resources to address the needs of the Asian American community. Officials in Austin posted their data sets, as required, to make them readily available for users.

In one survey dataset, each respondent was asked their ethnic origin, which had 32 categories; age, which had 77 categories; zip code, which had 61 categories; and gender.

"Nearly everyone is identifiable with these four variables, some more so than others," said Schneider. "Once you identify them, this survey revealed other sensitive responses such as employment status, religious affiliation, household income, housing affordability and many attitudinal questions."
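
To see why four variables can be enough, it helps to compare the number of possible answer combinations with the number of respondents. The sketch below is illustrative only: the category counts for ethnic origin, age and zip code come from the survey description above, while the two gender categories and the uniformly random synthetic respondents are assumptions made for the example (real answers are unevenly distributed, so real uniqueness rates would differ).

```python
import random
from collections import Counter

# Category counts as described for the Austin survey.
# ASSUMPTION: gender is treated as 2 categories; the article gives no count.
CATEGORY_COUNTS = {"ethnic_origin": 32, "age": 77, "zip_code": 61, "gender": 2}
N_RESPONDENTS = 2614


def possible_cells(category_counts):
    """Number of distinct combinations of the identifying variables."""
    total = 1
    for k in category_counts.values():
        total *= k
    return total


def unique_fraction(records):
    """Share of records whose combination of values appears exactly once."""
    counts = Counter(records)
    return sum(1 for r in records if counts[r] == 1) / len(records)


def simulate_records(category_counts, n, seed=0):
    """Synthetic respondents drawn uniformly at random.

    ASSUMPTION: uniform draws are a simplification, not the real survey data.
    """
    rng = random.Random(seed)
    sizes = list(category_counts.values())
    return [tuple(rng.randrange(k) for k in sizes) for _ in range(n)]


cells = possible_cells(CATEGORY_COUNTS)
records = simulate_records(CATEGORY_COUNTS, N_RESPONDENTS)

print("possible combinations:", cells)  # 32 * 77 * 61 * 2 = 300608
print("respondents:", N_RESPONDENTS)
print("share of synthetic respondents with a unique combination:",
      round(unique_fraction(records), 3))
```

With roughly 300,000 possible combinations and only 2,614 respondents, most combinations that occur at all occur once, which is the intuition behind the statement that nearly everyone is identifiable.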

Similarly, New York City ran into an Open Data problem when data released by the New York City Taxi and Limousine Commission allowed 124 million driving routes to be traced to a driver's home address.

One major challenge when considering methodologies to alter participant data effectively is to do this in a way that doesn't greatly change the accuracy of the survey results. The methodology proposed by the authors was built upon a technique found in genomic sequencing applications, and it was able to disguise the identity of consumers while maintaining the accuracy of insights within 5%.

"Our method would essentially 'shuffle' the demographic data in a survey dataset," said Schneider. "But, unlike previous methods, ours only shuffles data when it maintains the correlations between important variables that are essential to analysts. The protected data is simulated on a consumer level but still valuable to the end user. If this dataset got out, then only the organization's insights would be known."
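
The general idea of shuffling only when key relationships survive can be illustrated with a simple accept-or-undo loop. This is not the authors' algorithm, which the article describes only at a high level; it is a minimal sketch that swaps a demographic value between two respondents and keeps the swap only if the Pearson correlation with one outcome variable stays within a chosen tolerance of its original value. The function names, the tolerance of 0.05, the number of attempts and the synthetic age and income data are all assumptions for the example.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def constrained_shuffle(demo, outcome, tolerance=0.05, attempts=2000, seed=0):
    """Swap demographic values between respondents, keeping a swap only if
    the correlation with `outcome` stays within `tolerance` of the original.

    Hypothetical helper for illustration; not the published method.
    """
    rng = random.Random(seed)
    target = pearson(demo, outcome)
    shuffled = list(demo)
    accepted = 0
    for _ in range(attempts):
        i = rng.randrange(len(shuffled))
        j = rng.randrange(len(shuffled))
        if shuffled[i] == shuffled[j]:
            continue  # swapping equal values changes nothing
        shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
        if abs(pearson(shuffled, outcome) - target) <= tolerance:
            accepted += 1
        else:
            # Undo the swap: it would distort the relationship too much.
            shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
    return shuffled, accepted


# Synthetic data (ASSUMPTION): age loosely related to an income-like score.
data_rng = random.Random(1)
ages = [data_rng.randint(18, 90) for _ in range(200)]
incomes = [20 + 0.5 * a + data_rng.gauss(0, 10) for a in ages]

protected_ages, n_accepted = constrained_shuffle(ages, incomes)

print("accepted swaps:", n_accepted)
print("correlation before:", round(pearson(ages, incomes), 3))
print("correlation after: ", round(pearson(protected_ages, incomes), 3))
```

Because values are only swapped, the overall distribution of ages is unchanged, while individual respondents no longer carry their own age. A real method would need to track several variables and relationships at once, which this single-correlation sketch does not attempt.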

The paper, "Protecting Survey Data on a Consumer Level," was published in the Journal of Marketing Analytics. Details about the new methodology are included in the paper.


Drexel University
