Researchers lift the curtain behind the 'black box' of data broker records

November 04, 2019

CATONSVILLE, MD, November 4, 2019 - It's no longer news that our data is for sale. Data brokers often use online browsing records to create digital consumer profiles that are then sold to marketers as pre-defined audiences for targeted advertising.

It is often assumed that the tools used to analyze and categorize customer data are so sophisticated that marketers can reliably fine-tune messaging and targeting. But new research from the INFORMS journal Marketing Science (Editor's note: The source of this research is INFORMS) has revealed that the process for creating those digital profiles may not be as reliable as many may assume.

The study, to be published in the November edition of the INFORMS journal Marketing Science, is titled "Frontiers: How Effective Is Third-Party Consumer Profiling? Evidence from Field Studies." It is authored by Nico Neumann of Melbourne Business School, Catherine Tucker of MIT and the National Bureau of Economic Research, and Timothy Whitfield of Burst SMS in Australia.

The researchers examined two basic demographic attributes (age and gender), and three distinct Internet user interest areas (sports, travel and fitness). They analyzed data from more than 19 different data brokers, which resulted in more than 90 validated digital audiences of Internet users. And they conducted three distinct field tests.

"In general, the process which underlies the creation of user profiles and segments for targeting is a 'black box,' which creates challenges for understanding the reliability and the accuracy of digital profiles," said Tucker. "Furthermore, advertisers have little chance of assessing how accurate the profiles they are buying are."

"In our first field test, we ran an online campaign in much the same way as an advertiser would run a campaign and assessed whether the ad was seen by the requested demographic segment," said Tim Whitfield. "In our second field test, we narrowed our focus and looked directly at whether data brokers are able to accurately determine the age and gender of a specific pair of eyeballs. And in our third field test, we extended our data quality assessment from demographics to audience-interest segments."

"In our first field test, we found that our ad was shown to the right demographic segment 59 percent of the time," said Neumann. "In our second field test, we found that data brokers basically were able to identify gender about the same as random chance. The third field test revealed that the accuracy of interest-based audiences is higher (72.8 to 87.4 percent on average). However, this greater classification percentage seemed rather linked to the fact that the tested attributes occur very often in the population. For example, there are many people who like sports in Australia and the US, so identifying someone who is interested in sports is not that hard."

"The relative improvement of using audience data versus randomly picking people is still overall disappointing across all our tests," added Neumann.

The three studies combined illustrate that it is important to consider the costs and benefits of using audience data for ad targeting. Because audience data leads to large extra expense, it may not provide a useful business case for every situation relative to untargeted advertising. For example, the average extra costs for display ad targeting based on purchased audience data are around 151%. However, in a best-case scenario the relative improvement in finding the right customer was only 123% (when comparing audience targeting versus random people selection).
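
To make the comparison between those two percentages concrete, here is a minimal sketch of the arithmetic. It assumes that "151% extra cost" means targeted impressions cost 2.51 times as much as untargeted ones, and that a "123% relative improvement" means the hit rate is 2.23 times higher. Both readings, and the function name, are illustrative assumptions rather than the method used in the study.

```python
def cost_per_on_target_ratio(extra_cost_pct: float, improvement_pct: float) -> float:
    """Cost per on-target impression with targeting, relative to no targeting.

    Assumes (hypothetically) that both percentages are multiplicative uplifts
    over an untargeted baseline. A result above 1.0 means targeting costs more
    per correctly reached person than untargeted advertising.
    """
    cost_multiplier = 1 + extra_cost_pct / 100        # e.g. 151% extra -> 2.51x
    hit_rate_multiplier = 1 + improvement_pct / 100   # e.g. 123% better -> 2.23x
    return cost_multiplier / hit_rate_multiplier


# Figures quoted in the text for display advertising (best-case improvement).
ratio = cost_per_on_target_ratio(151, 123)
print(f"{ratio:.3f}")  # prints 1.126
```

Under these assumptions, each correctly targeted impression costs roughly 13 percent more than it would without audience data, even in the best case. Targeting only pays off on this measure when the improvement percentage exceeds the extra-cost percentage.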

However, the business case depends on the individual organization's expertise and technology costs, the selected data brokers and media used. In particular, more expensive media (e.g., video advertising) is much more likely to result in positive benefit-cost trade-offs for the use of audience information purchased from data brokers.

About INFORMS and Marketing Science

Marketing Science is a premier peer-reviewed scholarly marketing journal focused on research using quantitative approaches to study all aspects of the interface between consumers and firms. It is published by INFORMS, the leading international association for operations research and analytics professionals. More information is available at or @informs.


Tim O'Brien

Institute for Operations Research and the Management Sciences
