Who does what on Wikipedia?

March 11, 2010

The quality of entries in the world's largest open-access online encyclopedia depends on how authors collaborate, University of Arizona Professor Sudha Ram finds.

The patterns of collaboration between Wikipedia contributors have a direct effect on the data quality of an article, according to a new paper co-authored by a University of Arizona professor and graduate student.

Sudha Ram, a professor in the UA's Eller College of Management, co-authored the article with Jun Liu, a graduate student in the management information systems (MIS) department. Their work in this area received a "Best Paper Award" at the Workshop on Information Technology and Systems, held in conjunction with the International Conference on Information Systems, or ICIS.

"Most of the existing research on Wikipedia is at the aggregate level, looking at total number of edits for an article, for example, or how many unique contributors participated in its creation," said Ram, who is a McClelland Professor of MIS in the Eller College.

"What was missing was an explanation for why some articles are of high quality and others are not," she said. "We investigated the relationship between collaboration and data quality."

Wikipedia has an internal quality rating system for entries, with featured articles at the top, followed by A, B, and C-level entries. Ram and Liu randomly collected 400 articles at each quality level and applied a data provenance model they developed in an earlier paper.

"We used data mining techniques and identified various patterns of collaboration based on the provenance or, more specifically, who does what to Wikipedia articles," Ram said. "These collaboration patterns either help increase quality or are detrimental to data quality."

Ram and Liu identified seven specific roles that Wikipedia contributors play.

Starters, for example, create sentences but seldom engage in other actions. Content justifiers create sentences and justify them with resources and links. Copy editors contribute primarily through modifying existing sentences. Some users - the all-round contributors - perform many different functions.

"We then clustered the articles based on these roles and examined the collaboration patterns within each cluster to see what kind of quality resulted," Ram said. "We found that all-round contributors dominated the best-quality entries. In the entries with the lowest quality, starters and casual contributors dominated."

To generate the best-quality entries, she said, people in many different roles must collaborate. Ram and Liu suggest that the results of this study should spark the design of software tools that can help improve quality.

"A software tool could prompt contributors to justify their insertions by adding links," she said, "and down the line, other software tools could encourage specific role setting and collaboration patterns to improve overall quality."

The impetus behind the paper came from Ram's involvement in UA's $50 million iPlant Collaborative, which is funded by the National Science Foundation and aims to unite the international scientific community around solving plant biology's "grand challenge" questions. Ram's role as a faculty advisor is to develop a cyberinfrastructure to facilitate collaboration.

"We initially suggested wikis for this, but we faced a lot of resistance," she said. Scientists expressed concerns ranging from lack of experience using the wikis to lack of incentive.

"We wondered how we could make people collaborate," Ram said. "So we looked at the English version of Wikipedia. There are more than three million entries, and thousands of people contribute voluntarily on a daily basis."

The results of this research have helped guide recommendations to the iPlant collaborators.

"If we want scientists to be collaborative," Ram said, "we need to assign them to specific roles and motivate them to police themselves and justify their contributions."

University of Arizona
