Modern tools to unlock ancient texts

December 01, 2005

The CHLT project also unified several important digital library collections - such as Isaac Newton's manuscripts - and early modern scientific texts, as well as creating new digital library collections of Old Norse sagas. It's a vast achievement.

"It was a remarkably successful project between the National Science Foundation in the US and EU institutions. It generated results beyond expectations, and illustrated how essential it is to work together to create an integrated global infrastructure for scholarly research," says CHLT's European coordinator Dolores Iorizzo from the London e-Science Centre.

The team wanted to find the most effective ways to use technology to interpret digitised, historic manuscripts. CHLT responds to the challenges faced by teachers, students and scholars who are working with texts written in Ancient Greek, Mediaeval and Early-Modern Latin, and Old Norse.

The number of primary texts - arguably the most important resource for historians and linguists - is staggering. Hundreds of important texts and manuscripts, comprising millions of words, have been integrated into the CHLT open-access repository, which can also be viewed within the world's oldest and largest cultural heritage database, the Perseus Project at Tufts University, Boston.

CHLT created new text collections written in Early-Modern Latin and Old Norse. It integrated those new books and manuscripts with well-established digital texts, and it created a digital library environment that displays high-resolution images of pages from rare and fragile printed books and manuscripts. These images are presented alongside transcriptions, so that the originals can be compared with diplomatic and normalised versions of the material.

The project developed a host of powerful language analysis tools that help readers to understand texts written in these difficult languages. These include parsers, which automatically determine the grammatical identity of a word.

This is important because these ancient languages are highly inflected. The role of a word depends not on its position in the sentence but on its grammatical case, which indicates whether the word is the subject or the object of the sentence. Parsers analyse the underlying grammatical context to tease out the meaning. What's more, these parsers were integrated into a digital library reading environment that automatically generates hypertext links, so a user can click on a word, see its grammatical identity and look it up in a dictionary. CHLT also built a multilingual information retrieval tool that allows users to enter queries in English and search texts written in Greek and Latin.
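To make the idea concrete, here is a minimal sketch in Python of how a parser can recover subject and object from case endings rather than word order. It is an illustration only, not CHLT's parser: the tiny hand-written lexicon, the function names `parse_word` and `find_roles`, and the sample sentences are all assumptions made for this example. A real parser would also have to resolve ambiguous forms (for instance, "puella" can be read as more than one case), which this sketch ignores.

```python
# Hypothetical toy lexicon: inflected form -> (dictionary headword, part of speech, case).
# Ambiguous forms are deliberately given a single reading to keep the sketch short.
LEXICON = {
    "puella": ("puella", "noun", "nominative"),
    "puellam": ("puella", "noun", "accusative"),
    "agricola": ("agricola", "noun", "nominative"),
    "agricolam": ("agricola", "noun", "accusative"),
    "videt": ("video", "verb", None),
}


def parse_word(form):
    """Return the analysis of an inflected form, or None if it is unknown."""
    return LEXICON.get(form.lower())


def find_roles(sentence):
    """Assign subject, object and verb from grammatical case, ignoring word order."""
    roles = {}
    for form in sentence.split():
        analysis = parse_word(form)
        if analysis is None:
            continue  # unknown word: skipped in this sketch
        headword, part_of_speech, case = analysis
        if part_of_speech == "verb":
            roles["verb"] = headword
        elif case == "nominative":
            roles["subject"] = headword
        elif case == "accusative":
            roles["object"] = headword
    return roles


if __name__ == "__main__":
    first = find_roles("puella agricolam videt")
    second = find_roles("agricolam videt puella")
    # Both word orders mean "the girl sees the farmer".
    print(first == second)
```

The headword returned for each form is what a reading environment could use as the target of a dictionary link.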

Experienced scholars can use the parser to check an unfamiliar word, or a word used in an unusual context. Students and scholars without Greek, Latin or Old Norse can painstakingly translate ancient texts word by word. The tool will provide an enormous boost to the study of these ancient languages and cultures, while scholars from other fields will have access to texts even if they cannot read the language.

"We've lowered the barrier for access to primary texts, so now it's no longer the academic elite who have access and can read these historically important manuscripts," says Iorizzo. Users can even upload their own texts for parsing and analysis. Those texts will then be added to the library so the collection will grow organically over time.

The word profile tool integrates statistical data about how often a particular word is used in a set of collections and, through a single interface, links words to full citations of the texts in which they appear. Scholars are currently using it to write the first new Greek-English lexicon to be created in more than one hundred years.
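A word profile of this kind can be pictured as a table that records, for each word, how often it occurs and where. The Python sketch below is a simplified assumption of how such a table might be built, not a description of the CHLT tool: the function name `build_word_profile`, the placeholder citation labels and the invented sample passages are all hypothetical, and the sketch counts surface forms rather than grouping inflected forms under one headword as a real lexicographic tool presumably would.

```python
from collections import defaultdict


def build_word_profile(passages):
    """Map each word to its total count and the citations where it appears.

    `passages` is an iterable of (citation, text) pairs.
    """
    profile = defaultdict(lambda: {"count": 0, "citations": []})
    for citation, text in passages:
        for token in text.lower().split():
            entry = profile[token]
            entry["count"] += 1
            # Record each citation once, in order of first appearance.
            if citation not in entry["citations"]:
                entry["citations"].append(citation)
    return profile


if __name__ == "__main__":
    # Placeholder passages and labels, invented for illustration.
    sample = [
        ("Sample A 1", "rex urbem videt"),
        ("Sample B 1", "urbem rex amat"),
    ]
    profile = build_word_profile(sample)
    print(profile["urbem"]["count"], profile["urbem"]["citations"])
```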

CHLT also created tools that allow for the computational study of writing style. This includes tools to discover common subjects and objects of Greek and Latin verbs, the relative frequency of different grammatical forms, and the distribution of grammatical forms in texts.
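As a rough illustration of what "relative frequency of grammatical forms" means in practice, the following Python sketch takes a list of grammatical tags, assumed to have been produced already by a parser, and reports each tag's share of the total. The function name `relative_frequencies` and the sample tags are assumptions for this example, not part of CHLT.

```python
from collections import Counter


def relative_frequencies(tags):
    """Return each grammatical tag's share of all tagged words (0.0 to 1.0)."""
    counts = Counter(tags)
    total = sum(counts.values())
    # An empty input yields an empty dict, so no division by zero occurs.
    return {tag: n / total for tag, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical case tags for the nouns in a short passage.
    tags = ["nominative", "accusative", "accusative", "dative"]
    for tag, share in relative_frequencies(tags).items():
        print(f"{tag}: {share:.2f}")
```

Comparing such distributions between two texts or authors is one simple way to put a number on differences in writing style.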

The project has revolutionised historical research by introducing new digital library architectures and protocols for resource discovery and metadata sharing in affiliated digital libraries. It represents a major step towards unifying Europe's diverse digital collections.

CHLT supports Open Access and Berlin Declaration policies. It has negotiated a free open-access agreement with Cambridge University Press for an electronic edition of the Greek-English lexicon, to be published online simultaneously with the print edition. It has also explored ways in which these tools can be used and shared across cooperating digital libraries.

This is another big step toward creating a global infrastructure for cultural heritage. The CHLT consortium now hopes to develop these technologies in a Grid-distributed network capable of linking all of Europe's 100,000-plus 'memory institutions' - libraries, archives and museums, and large-scale digital repositories.

"At present, Europe's memory is preserved in compartmentalised silos of information within separate databases and websites," says Iorizzo. "What we would like to do is to provide an infrastructure that integrates, at a metadata and data level, the rich resources of European Cultural Heritage so that everything can be accessed, searched and preserved by anyone for generations to come."
Dolores Iorizzo
The London e-Science Centre and The Newton Project
Imperial College London
Tel: +44-207-3707786

CHLT project website
Newton Project
Perseus Project
The Stoa Consortium


IST Results
