More-flexible digital communication

December 12, 2014

Communication protocols for digital devices are very efficient but also very brittle: They require information to be specified in a precise order with a precise number of bits. If sender and receiver -- say, a computer and a printer -- are off by even a single bit relative to each other, communication between them breaks down entirely.

Humans are much more flexible. Two strangers may come to a conversation with wildly differing vocabularies and frames of reference, but they will quickly assess the extent of their mutual understanding and tailor their speech accordingly.

Madhu Sudan, an adjunct professor of electrical engineering and computer science at MIT and a principal researcher at Microsoft Research New England, wants to bring that type of flexibility to computer communication. In a series of recent papers, he and his colleagues have begun to describe theoretical limits on the degree of imprecision that communicating computers can tolerate, with very real implications for the design of communication protocols.

"Our goal is not to understand how human communication works," Sudan says. "Most of the work is really in trying to abstract, 'What is the kind of problem that human communication tends to solve nicely, [and] designed communication doesn't?' -- and let's now see if we can come up with designed communication schemes that do the same thing."

One thing that humans do well is gauging the minimum amount of information they need to convey in order to get a point across. Depending on the circumstances, for instance, one co-worker might ask another, "Who was that guy?"; "Who was that guy in your office?"; "Who was that guy in your office this morning?"; or "Who was that guy in your office this morning with the red tie and glasses?"

Similarly, the first topic Sudan and his colleagues began investigating is compression, or the minimum number of bits that one device would need to send another in order to convey all the information in a data file.

Uneven odds

In a paper presented in 2011, at the ACM Symposium on Innovations in Computer Science (now known as Innovations in Theoretical Computer Science, or ITCS), Sudan and colleagues at Harvard University, Microsoft, and the University of Pennsylvania considered a hypothetical case in which the devices shared an almost infinite codebook that assigned a random string of symbols -- a kind of serial number -- to every possible message that either might send.

Of course, such a codebook is entirely implausible, but it allowed the researchers to get a statistical handle on the problem of compression. Indeed, it's an extension of one of the concepts that longtime MIT professor Claude Shannon used to determine the maximum capacity of a communication channel in the seminal 1948 paper that created the field of information theory.

In Sudan and his colleagues' codebook, a vast number of messages might have associated strings that begin with the same symbol. But fewer messages will have strings that share their first two symbols, fewer still strings that share their first three symbols, and so on. In any given instance of communication, the question is how many symbols of the string one device needs to send the other in order to pick out a single associated message.

The answer to that question depends on the probability that any given interpretation of a string of symbols makes sense in context. By way of analogy, if your co-worker has had only one visitor all day, asking her, "Who was that guy in your office?" probably suffices. If she's had a string of visitors, you may need to specify time of day and tie color.
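The prefix idea can be sketched in a few lines of Python. This is only an illustration, not the researchers' protocol: a SHA-256 hash stands in for the random codebook, a short list of plausible messages stands in for the probabilities, and the function names (code_string, shortest_prefix, decode) are hypothetical.

```python
import hashlib


def code_string(message: str, n_bits: int = 64) -> str:
    """Pseudo-random bit string for a message (stand-in for a codebook entry)."""
    digest = hashlib.sha256(message.encode("utf-8")).digest()
    bits = "".join(f"{byte:08b}" for byte in digest)
    return bits[:n_bits]


def shortest_prefix(message: str, plausible: list[str]) -> str:
    """Shortest prefix of the message's string that no other plausible message shares."""
    target = code_string(message)
    rivals = [code_string(m) for m in plausible if m != message]
    for length in range(len(target) + 1):
        prefix = target[:length]
        if not any(r.startswith(prefix) for r in rivals):
            return prefix
    raise ValueError("codebook strings collide; use more bits")


def decode(prefix: str, plausible: list[str]) -> str:
    """Receiver side: find the one plausible message whose string starts with the prefix."""
    matches = [m for m in plausible if code_string(m).startswith(prefix)]
    if len(matches) != 1:
        raise ValueError("prefix does not pick out exactly one message")
    return matches[0]


# Assumed contexts, echoing the office-visitor analogy.
quiet_day = ["the auditor"]
busy_day = ["the auditor", "the courier", "the new intern", "the landlord"]

short = shortest_prefix("the auditor", quiet_day)
longer = shortest_prefix("the auditor", busy_day)
print(len(short), len(longer))
```

With only one plausible message, nothing needs to be sent at all; with several, the sender must transmit enough symbols to rule out the rivals.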

Existing compression schemes do, in fact, exploit statistical regularities in data. But Sudan and his colleagues considered the case in which sender and receiver assign different probabilities to different interpretations. They were able to show that, so long as protocol designers can make reasonable assumptions about the ranges within which the probabilities might fall, good compression is still possible.

For instance, Sudan says, consider a telescope in deep-space orbit. The telescope's designers might assume that 90 percent of what it sees will be blackness, and they can use that assumption to compress the image data it sends back to Earth. With existing protocols, anyone attempting to interpret the telescope's transmissions would need to know the precise figure -- 90 percent -- that the compression scheme uses. But Sudan and his colleagues showed that the protocol could be designed to accommodate a range of assumptions -- from, say, 85 percent to 95 percent -- that might be just as reasonable as 90 percent.
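A standard information-theory calculation gives some intuition for why a modest mismatch need not be costly. The sketch below does not reproduce the researchers' protocol; it only compares the ideal cost per pixel (the entropy) with the cost of a code designed for an assumed 90 percent figure when the true fraction of black pixels is somewhat different. The function names and the specific test values are assumptions chosen for illustration.

```python
import math


def entropy_bits(p: float) -> float:
    """Ideal average bits per pixel when a fraction p of pixels is black."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def cross_entropy_bits(p_true: float, q_assumed: float) -> float:
    """Average bits per pixel when the code is designed for q_assumed
    but the true fraction of black pixels is p_true."""
    return -p_true * math.log2(q_assumed) - (1 - p_true) * math.log2(1 - q_assumed)


assumed = 0.90
for true_fraction in (0.85, 0.90, 0.95):
    ideal = entropy_bits(true_fraction)
    actual = cross_entropy_bits(true_fraction, assumed)
    # The overhead is the Kullback-Leibler divergence between truth and assumption.
    print(f"true={true_fraction:.2f} ideal={ideal:.4f} "
          f"actual={actual:.4f} overhead={actual - ideal:.4f}")
```

Across the 85-to-95-percent range, the penalty for assuming 90 percent stays at a few hundredths of a bit per pixel, far less than the one bit per pixel an uncompressed black-or-not image would need.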

Buggy codebook

In a paper being presented at the next ITCS, in January, Sudan and colleagues at Columbia University, Carnegie Mellon University, and Microsoft add even more uncertainty to their compression model. In the new paper, not only do sender and receiver have somewhat different probability estimates, but they also have slightly different codebooks. Again, the researchers were able to devise a protocol that would still provide good compression.

They also generalized their model to new contexts. For instance, Sudan says, in the era of cloud computing, data is constantly being duplicated on servers scattered across the Internet, and data-management systems need to ensure that the copies are kept up to date. One way to do that efficiently is by performing "checksums," or adding up a bunch of bits at corresponding locations in the original and the copy and making sure the results match.

That method, however, works only if the servers know in advance which bits to add up -- and if they store the files in such a way that data locations correspond perfectly. Sudan and his colleagues' protocol could provide a way for servers using different file-management schemes to generate consistency checks on the fly.
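The difference between an exact check and a tolerant one can be illustrated with a toy comparison. This is a minimal sketch under stated assumptions, not the protocol from the paper: the function names, the 100-bit samples, and the tolerance of 0.02 are all hypothetical.

```python
def parity(bits: list[int]) -> int:
    """Exact check: 1 if the number of 1's is odd, 0 if it is even."""
    return sum(bits) % 2


def fraction_of_ones(bits: list[int]) -> float:
    """Coarse statistic: the share of bits that are 1."""
    return sum(bits) / len(bits)


def coarse_match(a: list[int], b: list[int], tolerance: float = 0.02) -> bool:
    """Tolerant check: the two samples agree if their shares of 1's are close."""
    return abs(fraction_of_ones(a) - fraction_of_ones(b)) <= tolerance


# Two servers sample "the same" subset, but their views differ by one bit.
original = [1] * 90 + [0] * 10   # 90 percent 1's
copy = [1] * 89 + [0] * 11       # 89 percent 1's

print(parity(original) == parity(copy))   # the parity check disagrees
print(coarse_match(original, copy))       # the coarse check still agrees
```

A single differing bit flips the parity, so the exact check fails, while the coarse comparison treats 89 percent and 90 percent as consistent.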

"I shouldn't tell you if the number of 1's that I see in this subset is odd or even," Sudan says. "I should send you some coarse information saying 90 percent of the bits in this set are 1's. And you say, 'Well, I see 89 percent,' but that's close to 90 percent -- that's actually a good protocol. We prove this."
Written by Larry Hardesty, MIT News Office

Massachusetts Institute of Technology
