
Information theory holds surprises for machine learning

January 24, 2019

New research challenges a popular conception of how machine learning algorithms "think" about certain tasks.

The conception goes something like this: because of their ability to discard useless information, a class of machine learning algorithms called deep neural networks can learn general concepts from raw data, such as identifying cats generally after encountering tens of thousands of images of different cats in different situations. This seemingly human ability is said to arise as a byproduct of the networks' layered architecture. Early layers encode the "cat" label along with all of the raw information needed for prediction. Subsequent layers then compress the information, as if through a bottleneck. Irrelevant data, like the color of the cat's coat or the saucer of milk beside it, is forgotten, leaving only general features behind. Information theory provides bounds on just how optimal each layer is, in terms of how well it can balance the competing demands of compression and prediction.
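The balance between those two demands can be made precise. In the standard information bottleneck formulation (the usual textbook objective, not spelled out in this article), one seeks a representation T of the input X that minimizes

```latex
\min_{p(t \mid x)} \; I(X;T) - \beta \, I(T;Y)
```

where I(X;T) measures how much of the input the representation retains (compression), I(T;Y) measures how much the representation says about the label Y (prediction), and the multiplier β sets the trade-off between the two.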

"A lot of times when you have a neural network and it learns to map faces to names, or pictures to numerical digits, or amazing things like French text to English text, it has a lot of intermediate hidden layers that information flows through," says Artemy Kolchinsky, an SFI Postdoctoral Fellow and the study's lead author. "So there's this long-standing idea that as raw inputs get transformed to these intermediate representations, the system is trading prediction for compression, and building higher-level concepts through this information bottleneck."

However, Kolchinsky and his collaborators Brendan Tracey (SFI, MIT) and Steven Van Kuyk (University of Wellington) uncovered a surprising weakness when they applied this explanation to common classification problems, where each input has one correct output (e.g., each picture is of either a cat or a dog). In such cases, they found that classifiers with many layers generally do not give up some prediction for improved compression. They also found that there are many "trivial" representations of the inputs which are, from the point of view of information theory, optimal in terms of their balance between prediction and compression.

"We found that this information bottleneck measure doesn't see compression in the same way you or I would. Given the choice, it is just as happy to lump 'martini glasses' in with 'Labradors', as it is to lump them in with 'champagne flutes,'" Tracey explains. "This means we should keep searching for compression measures that better match our notions of compression."
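The symmetry Tracey describes can be checked directly. Below is a minimal sketch with made-up inputs (not from the paper): for four equally likely inputs that each carry their own label, a representation that merges "martini glass" with "champagne flute" and one that merges it with "Labrador" score exactly the same on both the compression term I(X;T) and the prediction term I(T;Y).

```python
from collections import Counter
from math import log2

def mutual_information(pairs):
    """I(A;B) in bits, estimated from a list of equally likely (a, b) samples."""
    n = len(pairs)
    p_ab = Counter(pairs)
    p_a = Counter(a for a, _ in pairs)
    p_b = Counter(b for _, b in pairs)
    return sum((c / n) * log2((c / n) / ((p_a[a] / n) * (p_b[b] / n)))
               for (a, b), c in p_ab.items())

# Four equally likely inputs; each input is its own class, so the label Y
# is perfectly determined by the input X (a deterministic classification task).
inputs = ["labrador", "poodle", "martini_glass", "champagne_flute"]

# Two compressed representations T, each merging one pair of inputs:
semantic = {"labrador": "labrador", "poodle": "poodle",
            "martini_glass": "glassware", "champagne_flute": "glassware"}
nonsense = {"labrador": "blob", "poodle": "poodle",
            "martini_glass": "blob", "champagne_flute": "champagne_flute"}

for name, T in [("semantic merge", semantic), ("nonsense merge", nonsense)]:
    compression = mutual_information([(x, T[x]) for x in inputs])  # I(X;T)
    prediction = mutual_information([(T[x], x) for x in inputs])   # I(T;Y); Y == X here
    print(f"{name}: I(X;T) = {compression:.2f} bits, I(T;Y) = {prediction:.2f} bits")
```

Both merges print 1.50 bits for both terms: the information-theoretic objective cannot distinguish a semantically sensible grouping from a nonsensical one.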

While the idea of compressing inputs may still play a useful role in machine learning, this research suggests it is not sufficient for evaluating the internal representations used by different machine learning algorithms.

At the same time, Kolchinsky says that the concept of a trade-off between compression and prediction will still hold for less deterministic tasks, like predicting the weather from a noisy dataset. "We're not saying that information bottleneck is useless for supervised [machine] learning," Kolchinsky stresses. "What we're showing here is that it behaves counter-intuitively on many common machine learning problems, and that's something people in the machine learning community should be aware of."
The paper has been accepted to the 2019 International Conference on Learning Representations (ICLR 2019).

A copy of the preprint is available on arXiv.

Santa Fe Institute
