Method ranks impact of computer and information science funding agencies, institutions & individuals

December 13, 2004

The National Science Foundation tops all national and international agencies for funding research that makes the most impact in computer and information science, according to Penn State researchers in the School of Information Sciences and Technology (IST).

The researchers have developed a new method which can automatically extract and identify acknowledgments of funding agencies, institutions and individuals in papers available on the Internet and indexed on CiteSeer, the largest public digital library of and search engine for papers in computer and information science.

"One of the measures of efficacy is impact, and a measure of impact is citations and acknowledgments," said Dr. C. Lee Giles, David Reese Professor of Information Sciences and Technology, and one of the developers of CiteSeer. "This automated method allows us to see which agencies are funding influential research in computer and information science. We speculate that these measures could be used to evaluate the efficacy of funding agencies and programs both at the national and international level."

In addition, the method could potentially enable the evaluation of individual research programs within a funding agency, Giles said. The method is described in a paper, "Who Gets Acknowledged: Measuring Scientific Contributions through Automatic Acknowledgment Indexing," in the current issue of the Proceedings of the National Academy of Sciences. Giles' co-author is Isaac Councill, a doctoral student in IST.

Authors typically conclude their scientific papers with acknowledgments which document intellectual debt, financial support or important editorial, conceptual or scientific contributions. Acknowledgments differ from citations as they reflect an active participation in the research by the acknowledged party or parties.

While corporations, educational institutions and individuals don't request acknowledgments, funding agencies do, Giles said. NSF had the most acknowledgments--12,287--of the 15 most often acknowledged funding agencies, followed by the Defense Advanced Research Projects Agency (DARPA) with 4,712. At the bottom of the top 15 was the Science and Engineering Research Council, established by the government of India in 1974, with 489 acknowledgments.

"NSF funds established researchers as well as takes a chance on new researchers," Giles said. "In the context of scientific research, NSF's total number of acknowledgments suggests it is doing extremely well."

But the researchers created a second measure: the ratio of total citations (C) to number of acknowledgments (A). DARPA leads in this measure with a C/A of 17.12, suggesting that DARPA funds only established researchers who already have done high-impact work, Giles said.
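As a rough illustration of how such a C/A figure can be computed, the Python sketch below divides the citations received by papers that acknowledge an entity by the number of those acknowledgments. The function name, the input layout and the toy numbers are assumptions made for illustration only; the article does not describe the researchers' exact counting rules, and none of the figures below come from CiteSeer.

```python
from collections import defaultdict


def citations_per_acknowledgment(papers):
    """Return a C/A ratio for each acknowledged entity.

    `papers` is an iterable of (citation_count, acknowledged_entities) pairs.
    Assumption: a paper counts once per entity it acknowledges.
    """
    citations = defaultdict(int)
    acknowledgments = defaultdict(int)
    for citation_count, entities in papers:
        for entity in set(entities):  # ignore duplicate mentions in one paper
            citations[entity] += citation_count
            acknowledgments[entity] += 1
    return {e: citations[e] / acknowledgments[e] for e in acknowledgments}


# Made-up records: (citations received, entities acknowledged).
toy_papers = [
    (30, ["Agency A"]),
    (10, ["Agency A", "Agency B"]),
    (2, ["Agency B"]),
]
ratios = citations_per_acknowledgment(toy_papers)
print(ratios)
```

In this toy data, Agency A and Agency B are each acknowledged twice, but Agency A's papers draw far more citations, so its C/A is higher. That is the kind of distinction the second measure is meant to surface.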

Using that metric, NSF placed third after DARPA and the Office of Naval Research. A German agency, Deutsche Forschungsgemeinschaft, had the lowest C/A at 3.52 despite having the fourth-highest number of acknowledgments, according to the researchers. Surprisingly, India's funding agency also has a very high C/A ratio even though its number of acknowledgments is 1/30 of NSF's.

Using another metric--what agency is acknowledged by the 100 most cited papers in CiteSeer--NSF tops the list followed by DARPA. In addition to funding agencies, the researchers also used the method to look at corporations, educational institutions and individuals acknowledged in computer and information science papers. The bad news for corporations is that funding research doesn't ensure survival.

Several of the companies--including AT&T Bell Labs and Digital Equipment Corporation--acknowledged in the 10-year period studied either don't exist anymore or have been greatly diminished, Giles said.

The company most often acknowledged was IBM; the company with the highest C/A metric was Siemens Corporation.

The most acknowledged authors represent a very international cast, with many from the U.S. but others from Israel, Denmark and the United Kingdom. Of the 15 most acknowledged authors, three are from Carnegie Mellon University.

In developing the automated acknowledgment measuring method, the researchers relied upon the CiteSeer digital library with more than 425,000 computer and information science research papers from the 1990s to the present that were harvested from the Web and submitted by users. While at the NEC Research Institute, Giles was one of the developers of CiteSeer, now hosted at IST. Using machine-learning methods that identify and classify data, the researchers developed new algorithms to extract acknowledgments from 335,000 documents.

Testing with 1,800 manually labeled documents showed that the algorithms achieve 78 percent precision--the share of retrieved information that is relevant--and almost 90 percent recall--the share of all relevant information that is retrieved--according to the researchers.
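For readers unfamiliar with the two measures, the short Python sketch below computes precision and recall for one hypothetical document by comparing the entities an extractor returned against a hand-labeled set. The function name and the example sets are illustrative assumptions, not the researchers' evaluation code or data.

```python
def precision_recall(extracted, relevant):
    """Compare extracted items against a manually labeled set.

    precision = correct extractions / all extractions
    recall    = correct extractions / all labeled (relevant) items
    """
    extracted, relevant = set(extracted), set(relevant)
    correct = len(extracted & relevant)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical output of an extractor versus a hand-labeled answer key.
extracted = {"NSF", "DARPA", "IBM", "Table 2"}
relevant = {"NSF", "DARPA", "IBM", "ONR", "Siemens"}
precision, recall = precision_recall(extracted, relevant)
print(precision, recall)
```

Here three of the four extracted items are correct (precision 0.75), while three of the five labeled items were found (recall 0.6).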

"While we used computer and information science papers, this automated method can be applied to any area of academic documents," Giles said.

"And it also allows us to add new metrics such as accounting for funding amounts and getting some idea of the impact for funds spent."

The full acknowledgment analysis data will soon be publicly available on CiteSeer as part of the Next Generation CiteSeer project. The research was supported by the National Science Foundation and Microsoft.
