
Is your big data messy? We're making an app for that

February 16, 2017

BUFFALO, N.Y. -- Like a teenager's bedroom, big data is often messy.

Malfunctioning computers, data entry errors and other hard-to-spot problems can skew datasets and mislead people -- everyone from data scientists to data hobbyists -- trying to draw conclusions from raw data.

Vizier, a software tool under development by a University at Buffalo-led research team, aims to proactively catch those errors.

The project, backed by a $2.7 million National Science Foundation grant, launched in January. Like Excel and other spreadsheet software, Vizier will allow users to interactively work with datasets. For example, it will help people explore, clean, curate and visualize data in meaningful ways, as well as spot errors and offer solutions.

But unlike spreadsheet software, Vizier is intended for much larger datasets; it will be used to examine millions or billions of data points, as opposed to the hundreds or thousands typically plugged into a spreadsheet.

"We are creating a tool that'll let you work with the data you have, and also unobtrusively make helpful observations like 'Hmm... have you noticed that two out of a million records make a 10 percent difference in this average?'" said Oliver Kennedy, PhD, assistant professor of computer science and engineering at UB, and the grant's principal investigator.
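
To see how two records in a million can move an average that much, consider the following rough illustration. It is not Vizier's code or method; the dataset, the values and the helper name `outlier_impact` are all made up for the example. It compares the mean of a dataset with and without its two largest values.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def outlier_impact(values, k=2):
    """Relative change in the mean caused by the k largest values.

    Hypothetical helper for illustration only: it drops the k largest
    records and reports how much the full mean differs from the
    trimmed mean, as a fraction of the trimmed mean.
    """
    ordered = sorted(values)
    trimmed = ordered[:-k]  # everything except the k largest records
    return (mean(values) - mean(trimmed)) / mean(trimmed)


# Made-up data: 999,998 ordinary records plus two bad ones
# (for instance, a data entry error that added extra digits).
records = [100] * 999_998 + [5_000_100] * 2

impact = outlier_impact(records)
print(f"Two records shift the average by {impact:.1%}")
```

In this made-up dataset the two bad records raise the average from 100 to 110, a 10 percent difference, even though they account for only two millionths of the rows.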

Co-principal investigators include Juliana Freire, professor of computer science and engineering at New York University, and Boris Glavic, assistant professor in the Department of Computer Science at the Illinois Institute of Technology. The award is from NSF's Data Infrastructure Building Blocks (DIBBs) program.

For years, companies like Google, Microsoft and Apple have used big data to improve their products and services. That same power is now spreading to the masses as government agencies in the United States and elsewhere publish massive amounts of public data on the internet.

For example, New York City and the federal government have open data portals making it possible for anyone with an internet connection to download information and ask questions about their government. When properly used, these portals can shed light on issues relating to health code violations, discrimination, bias and other matters, Kennedy said. Vizier will be released as free, open-source software.

"We want to make it easier for data scientists -- and eventually data hobbyists -- to discover and communicate not only what the data says, but why the data says that," he said.
-end-
Contact: Cory Nealon, cmnealon@buffalo.edu, University at Buffalo

University at Buffalo
