Rensselaer Polytechnic Institute professor wins HP Innovation Award

September 01, 2010

Troy, N.Y. - The universe around us can be expressed as numbers, and those numbers, arranged in patterns, paint a picture: a network of friends from the vastness of the Internet; travel patterns among residents of cold climates; or a common factor among victims of a disease. Now, the deluge of digital-age data makes possible more complex patterns and a more complete picture - the link between the friends, their travel, and the illness.

The work of Mohammed J. Zaki, a Rensselaer professor of computer science, will enable us to see links and patterns we did not know existed. In recognition of the importance of his work, Zaki has been selected for the 2010 HP Labs Innovation Research Program.

Zaki searches for so-called "graph" patterns - patterns represented by links between points - on an unprecedented level, designing algorithms and systems that connect data through multiple layers and links.

"The ability to connect all the dots from the disparate data sources is extremely important to gain critical insights into complex social, business, or scientific phenomena of interest," said Zaki.

Early systems for gathering and analyzing data were built on a "transactional" basis: they looked at one transaction at a time within a particular category (such as an e-mail to a friend among e-mail records, a flight to a particular destination among flight records, or an urgent visit to a hospital among hospital admissions records), independent of the links that exist between those different transactions. As the volume of data has grown, so has the need for a new model.

"If you think of the data of the world today, everything is related to everything else," Zaki said. "Existing statistical frameworks have to be extended to reflect the new interconnectedness of this world."

For example, Zaki said, researchers may have studied the structural properties of social or other interaction networks, but have largely ignored the content of individual nodes and entities.

"Other people have primarily looked at the topology of the networks; my goal is to simultaneously add the content," Zaki said.

According to HP, the prestigious Innovation Research Program is designed to provide colleges, universities, and research institutes around the world with opportunities to conduct breakthrough collaborative research with HP.

HP reviewed more than 375 proposals from 202 universities across 36 countries. Rensselaer is one of only 52 universities in the world to receive a 2010 Innovation Research award. The HP Labs Innovation Research Program is designed to encourage open collaboration between HP and the academic community on mutually beneficial, high-impact research. This year's proposals were solicited on a range of topics within the eight broad research themes at HP Labs: analytics, cloud, content transformation, digital commercial print, immersive interaction, information management, intelligent infrastructure, and sustainability.

"Our goal with the HP Labs Innovation Research Program is to inspire the brightest minds from around the world to conduct high-impact scientific research, addressing the most important challenges and opportunities facing society in the next decade," said Prith Banerjee, senior vice president of research at HP and director of HP Labs. "Rensselaer has demonstrated outstanding achievement and we look forward to collaborating with it in this dynamic area of research."

The award will allow Zaki, working with HP Labs, to tackle two specific problems: graph pattern mining and graph indexing.

Pattern mining allows data miners to establish whether a link exists between two or more things.

"We're looking for hidden patterns: Are entities connected? Are they connected in a particular configuration?" Zaki said.
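The two questions in that quote can be illustrated with a toy example. The sketch below is not Zaki's method; it is a minimal illustration, assuming a small undirected graph with made-up entity names, of checking whether two entities are connected (breadth-first search) and whether any three entities form one particular configuration, a triangle.

```python
from collections import deque
from itertools import combinations

# Hypothetical toy data: undirected links between made-up entities.
EDGES = [
    ("ann", "bob"), ("bob", "cara"), ("ann", "cara"),
    ("cara", "dev"), ("eve", "finn"),
]


def build_adjacency(edges):
    """Map each entity to the set of entities it is directly linked to."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def are_connected(adjacency, start, goal):
    """Breadth-first search: is there any chain of links from start to goal?"""
    if start not in adjacency or goal not in adjacency:
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbor in adjacency[node] - seen:
            seen.add(neighbor)
            queue.append(neighbor)
    return False


def find_triangles(adjacency):
    """Find every group of three entities that are all linked to one another."""
    triangles = set()
    for node, neighbors in adjacency.items():
        for a, b in combinations(sorted(neighbors), 2):
            if b in adjacency[a]:
                triangles.add(tuple(sorted((node, a, b))))
    return sorted(triangles)


ADJACENCY = build_adjacency(EDGES)
```

Here `are_connected(ADJACENCY, "ann", "dev")` answers the first question for one pair of entities, and `find_triangles(ADJACENCY)` answers the second for one simple configuration. Mining patterns across large, content-rich graphs is a far harder problem than this sketch suggests.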

Pattern mining enables advances in a broad variety of applications, such as the development of the semantic web, protein-protein interactions, social networks, and pattern discovery.

Graph indexing refers to systems that can store complex graph and network data.

"Once you have enhanced interlinked data sets, how do you store them? It's a way of physically storing data on a system on the back end for rapid search and mining," Zaki said.
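As a rough illustration of storing linked data so it can be searched quickly, the sketch below keeps two in-memory lookups: each node's neighbors (the topology) and the nodes carrying each label (the content). The class name, labels, and node names are hypothetical, and this is not the indexing system described in the article, only a minimal example of the general idea.

```python
from collections import defaultdict


class TinyGraphIndex:
    """Hypothetical in-memory index combining links with node content."""

    def __init__(self):
        self.neighbors = defaultdict(set)       # node -> directly linked nodes
        self.nodes_by_label = defaultdict(set)  # label -> nodes carrying it

    def add_node(self, node, labels):
        self.neighbors[node]  # make sure isolated nodes are recorded too
        for label in labels:
            self.nodes_by_label[label].add(node)

    def add_edge(self, a, b):
        self.neighbors[a].add(b)
        self.neighbors[b].add(a)

    def linked_pairs(self, label_a, label_b):
        """Return (a, b) pairs where a has label_a, b has label_b, and a links to b."""
        candidates_b = self.nodes_by_label.get(label_b, set())
        pairs = set()
        for a in self.nodes_by_label.get(label_a, set()):
            for b in self.neighbors[a] & candidates_b:
                pairs.add((a, b))
        return sorted(pairs)


# Made-up records: who traveled, who was a patient, and who is linked to whom.
index = TinyGraphIndex()
index.add_node("p1", ["traveler"])
index.add_node("p2", ["patient"])
index.add_node("p3", ["patient", "traveler"])
index.add_edge("p1", "p2")
index.add_edge("p2", "p3")
```

A query such as `index.linked_pairs("traveler", "patient")` only visits nodes that carry the first label, rather than scanning every link. A production system would also need on-disk storage and support for much richer patterns.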

Rensselaer Polytechnic Institute
