Is your big data messy? We're making an app for that

February 16, 2017

BUFFALO, N.Y. -- Like a teenager's bedroom, big data is often messy.

Malfunctioning computers, data entry errors and other hard-to-spot problems can skew datasets and mislead people -- everyone from data scientists to data hobbyists -- trying to draw conclusions from raw data.

Vizier, a software tool under development by a University at Buffalo-led research team, aims to proactively catch those errors.

The project, backed by a $2.7 million National Science Foundation grant, launched in January. Like Excel and other spreadsheet software, Vizier will allow users to interactively work with datasets. For example, it will help people explore, clean, curate and visualize data in meaningful ways, as well as spot errors and offer solutions.

But unlike spreadsheet software, Vizier is intended for much larger datasets; it will be used to examine millions or billions of data points, as opposed to the hundreds or thousands typically entered into a spreadsheet.

"We are creating a tool that'll let you work with the data you have, and also unobtrusively make helpful observations like 'Hmm... have you noticed that two out of a million records make a 10 percent difference in this average?'" said Oliver Kennedy, PhD, assistant professor of computer science and engineering at UB, and the grant's principal investigator.
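
To make the arithmetic behind that kind of observation concrete, here is a minimal sketch in Python. It is not Vizier's code or method: the function name, the made-up dataset and the choice to flag the records farthest from the median are all assumptions made for illustration.

```python
import heapq
from statistics import median


def mean_shift_from_top_outliers(values, k=2):
    """Hypothetical helper: compare the mean of all records with the mean
    after dropping the k records farthest from the median.

    Returns (mean_all, mean_trimmed, relative_shift).
    """
    center = median(values)
    # Positions of the k records with the largest absolute deviation.
    suspects = set(
        heapq.nlargest(k, range(len(values)), key=lambda i: abs(values[i] - center))
    )
    kept = [v for i, v in enumerate(values) if i not in suspects]
    mean_all = sum(values) / len(values)
    mean_trimmed = sum(kept) / len(kept)
    return mean_all, mean_trimmed, (mean_all - mean_trimmed) / mean_trimmed


# Illustrative data (invented): 999,998 ordinary records plus two
# data-entry errors that are far too large.
records = [100] * 999_998 + [5_000_100, 5_000_100]

mean_all, mean_trimmed, shift = mean_shift_from_top_outliers(records, k=2)
print(f"mean with all records: {mean_all:.1f}")      # 110.0
print(f"mean without the two:  {mean_trimmed:.1f}")  # 100.0
print(f"relative difference:   {shift:.0%}")         # 10%
```

In this invented dataset, two records out of a million move the average from 100 to 110, which is the sort of hard-to-spot skew the quote describes.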

Co-principal investigators include Juliana Freire, professor of computer science and engineering at New York University, and Boris Glavic, assistant professor in the Department of Computer Science at the Illinois Institute of Technology. The award is from NSF's Data Infrastructure Building Blocks (DIBBs) program.

For years, companies like Google, Microsoft and Apple have utilized big data to improve their products and services. That same power is now spreading to the masses as government agencies in the United States and elsewhere publish massive amounts of public data on the internet.

For example, New York City and the federal government have open data portals making it possible for anyone with an internet connection to download information and ask questions about their government. When properly used, these portals can shed light on issues relating to health code violations, discrimination, bias and other matters, Kennedy said. Vizier will be released as free, open-source software.

"We want to make it easier for data scientists -- and eventually data hobbyists -- to discover and communicate not only what the data says, but why the data says that," he said.
-end-
Contact: Cory Nealon, cmnealon@buffalo.edu, University at Buffalo
