How can you tell a passage is by Shakespeare, Hamilton or Madison?

1. Apparently the best way to distinguish a text by Hamilton from one by Madison is the use of filler words like “on” or “upon” – words that appear independently of the content. “Upon” appears about once per 1,000 words in Madison but 6 times per 1,000 in Hamilton. This discovery was made by Frederick Mosteller, the founder of Harvard’s Statistics Department.

2. By contrast, the cleverest tool used to authenticate a Shakespeare text is to count the number of words in the text that appear nowhere else in the Shakespeare canon. The more such words, paradoxically, the better the evidence that the text is genuine.
Shakespeare’s 884,640 words included 31,534 distinct words, with many occurring three or fewer times, so a genuine new text should be expected to contain some words he had never used before. Source: Michael Starbird, Meaning from Data.

3. Have you ever been struck by an author’s obsession with a word that appears an inordinate number of times? The most memorable instance for me was the use of “mild” by Herman Melville in Moby Dick. I wrote a paper about it in graduate school. “It’s a mild, mild wind and a mild-looking sky. It’s on such a day I struck my first whale…”
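
For anyone who wants to try these counts at home, here is a minimal Python sketch of the two measurements described above: how often a word turns up per 1,000 words, and which words in a passage never appear in a reference vocabulary. The function names (tokenize, rate_per_thousand, novel_words) are made up for illustration, and the tokenizer is deliberately crude. It is a toy under those assumptions, not the method the published studies used, which relied on far more careful statistics.

```python
import re


def tokenize(text):
    """Lowercase the text and pull out runs of letters and straight apostrophes.

    Crude on purpose: hyphenated words split in two, and curly apostrophes
    are treated as word breaks.
    """
    return re.findall(r"[a-z']+", text.lower())


def rate_per_thousand(text, word):
    """Occurrences of `word` per 1,000 words of `text` (hypothetical helper)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return 1000 * tokens.count(word.lower()) / len(tokens)


def novel_words(text, canon_vocabulary):
    """Distinct words in `text` that are absent from `canon_vocabulary`.

    `canon_vocabulary` is assumed to be a collection of lowercase words
    built from the author's known works.
    """
    return set(tokenize(text)) - set(canon_vocabulary)


sample = "It's a mild, mild wind and a mild-looking sky."
print(rate_per_thousand(sample, "mild"))   # 3 of 10 tokens -> 300.0
print(novel_words("the whale breached", {"the", "whale"}))
```

A single sentence gives a silly rate, of course. The comparison only starts to mean something over thousands of words, and a raw count of new words is a clue to weigh, not a verdict on authorship.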

YOUR TURN: What’s the neatest trick of textual analysis you’ve ever learned in a literature course?