Formula to detect an author's literary 'fingerprint'

December 10, 2009

Using literature written by Thomas Hardy, DH Lawrence and Herman Melville, physicists in Sweden have developed a formula to detect different authors' literary 'fingerprints'.

New research published today, Thursday 10 December, in New Journal of Physics (co-owned by the Institute of Physics and German Physical Society) describes a new concept, the meta book, developed by a group of physicists from the Department of Physics at Umeå University in Sweden. The concept uses the frequency with which authors use new words in their writing to find distinct patterns in their written styles.

For more than 75 years George Kingsley Zipf's maxim, based on a carefully selected compilation of American English called the Brown Corpus, has suggested a universal pattern for the frequency of new words used by authors. Zipf's law states that a word's frequency of occurrence is inversely proportional to its frequency rank: the second most common word appears roughly half as often as the most common one.
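
As a rough illustration of the rank-frequency relationship that Zipf's law describes, the minimal Python sketch below counts words in a text and lists them by rank. It is not taken from the researchers' paper; the function name `rank_frequency`, the simple tokenising rule and the toy sentence are all assumptions made for this example. For a long text that followed Zipf's law, the product of rank and count in the last printed column would stay roughly constant.

```python
from collections import Counter
import re


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first.

    Hypothetical helper for illustration only; words are lower-cased
    runs of letters and apostrophes.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


# Toy input; a real test would use a full novel.
sample = "the cat and the dog and the bird"
table = rank_frequency(sample)

for rank, word, count in table:
    # Under Zipf's law, rank * count is approximately constant.
    print(rank, word, count, rank * count)
```

A toy sentence is far too short to show the pattern; the sketch only shows how the ranking is built.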

New research suggests, however, that the truth behind word frequency is less universal than Zipf asserted and has more to do with the author's linguistic ability than with any over-arching linguistic rule.

The researchers first found that the occurrence of new words in the texts by Hardy, Lawrence and Melville did begin to drop off as each book got longer, despite new settings and plot twists.

Their evidence also shows, however, that the rate at which unique words drop off varies between authors and, most significantly, is consistent across the entire works of each of the three authors they analysed.

The statistical analysis was applied to entire novels, sections from novels, complete works and amalgamations of different works by the same author - in every case the texts carried that author's distinctive word-frequency 'fingerprint'.
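
The drop-off in new words can be pictured as a curve of distinct words against total words read so far. The Python sketch below is one simple way to compute such a curve and is not the researchers' own formula; the function name `unique_word_curve`, the `step` parameter and the toy input are assumptions made for this example. Comparing curves computed from different books by one author, and then from different authors, would be a crude way to look for the kind of consistency described above.

```python
import re


def unique_word_curve(text, step=1000):
    """Return (words_read, distinct_words) pairs sampled every `step` words.

    Hypothetical helper for illustration only. The final word is always
    sampled so the curve ends at the full length of the text.
    """
    words = re.findall(r"[a-z']+", text.lower())
    seen = set()
    curve = []
    for i, word in enumerate(words, start=1):
        seen.add(word)
        if i % step == 0 or i == len(words):
            curve.append((i, len(seen)))
    return curve


# Toy input: the number of distinct words grows more slowly than the text.
curve = unique_word_curve("a b a c a b d a", step=2)
print(curve)
```

With a real novel, a larger `step` such as the default of 1000 keeps the curve short enough to inspect or plot.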

Using the statistical patterns evident from their study, the researchers have pondered the idea of a meta book - a code for each author which could represent their entire work, completed or in the mental pipeline.

As the researchers write, "These findings lead us towards the meta book concept - the writing of a text can be described by a process where the author pulls a piece of text out of a large mother book (the meta book) and puts it down on paper. This meta book is an imaginary infinite book which gives a representation of the word frequency characteristics of everything that a certain author could ever think of writing."
The researchers' paper can be downloaded from Thursday 10 December free at http://www.iop.org/EJ/abstract/1367-2630/11/12/123015.

IOP Publishing
