A large part of a book’s sense of style comes from the words the author chooses to use. Was the evening sky scarlet, crimson or red? Did the children laugh with delight or chuckle with glee? Was the sailor depressed, despondent or just a little sad?
What then are the words that help to give Frankenstein its unique identity? And how do we find them?
Well, the how is easy enough. Google has been digitizing every book it can get its hands on and, as a public service, has released word count and n-gram data based on that massive collection. This data is helpfully split up by year, making it extremely easy to calculate what modern written English “should” sound like, where I’m defining “modern” as the time period between 1990 and 2005.
Yes, I am aware that’s over a decade out of date now, but analyzing every book in existence is apparently taking Google some time, so we’ll be working within the limits of the data they have provided.
With those records in hand it’s easy enough to go through the entirety of Frankenstein and count how often every word occurs. Dividing each count by the book’s total word count gives us a set of Frankenstein frequencies that we can compare against our modern word frequencies. This generates an interesting list of words that Mary Shelley used significantly more frequently than would be expected.
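The counting and comparison described above can be sketched in a few lines of Python. This is only a rough illustration, not the exact script behind the numbers in this post: the function names are made up for the example, the tokenizer is deliberately crude, and modern_freq is assumed to be a dictionary of relative frequencies already built from the 1990 to 2005 slice of the Google data.

```python
import re
from collections import Counter


def word_counts(text):
    """Count lowercase word tokens.

    Assumption: a word is any run of the letters a-z, so "don't"
    splits into "don" and "t". Good enough for a sketch.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def overuse_ratios(text, modern_freq):
    """Return {word: book frequency / modern frequency}.

    modern_freq is a hypothetical dict mapping each word to its
    relative frequency in the modern comparison set.
    """
    counts = word_counts(text)
    total = sum(counts.values())
    ratios = {}
    for word, n in counts.items():
        baseline = modern_freq.get(word)
        if baseline:  # skip words with no (or zero) modern frequency
            ratios[word] = (n / total) / baseline
    return ratios
```

Sorting the returned dictionary by value, largest first, would put the most overused words at the top of the list.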
For instance, Frankenstein included 66 instances of the word “miserable”. That means it shows up almost once for every thousand words in the book, which is 300 times more often than we would expect it to show up in an “average” piece of modern writing. (Whether such an average book actually exists is questionable, but the concept is still useful.)
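As a quick sanity check, the arithmetic behind those figures can be reproduced directly. The sketch below assumes a book length of roughly 72,000 words and treats the "300 times" figure as exact, even though both are rounded, so the implied modern frequency is only a ballpark number.

```python
# Figures quoted in the article; all are rounded, so the results are rough.
occurrences = 66        # uses of "miserable" in Frankenstein
book_words = 72_000     # approximate length of the book (assumed)
overuse_ratio = 300     # quoted "times more often than expected"

book_freq = occurrences / book_words
per_thousand = book_freq * 1_000
implied_modern_per_million = book_freq / overuse_ratio * 1_000_000

print(f"{per_thousand:.2f} per thousand words in Frankenstein")
print(f"{implied_modern_per_million:.1f} per million words implied for modern English")
```

That works out to about 0.92 occurrences per thousand words in the book, against an implied modern rate of roughly three per million.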
Not all statistics are as clear though. One interesting limiting factor here is that Frankenstein is a relatively short book, so even words that Mary Shelley only used once, like “bauble”, wind up with a frequency of 1 in 72,000. But is that frequency “real”? If the book were twice as long, would she have used it a second time? There’s really no way to tell.
So while it is definitely interesting to keep track of “rare” words that Frankenstein included only once, we cannot attach quite as much significance to them as we do to the unusual words that the author used often enough for us to calculate a more robust frequency.
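One simple way to act on that distinction is to separate the words by how many times they occur. The sketch below is an assumption about how this could be done rather than a record of what I actually ran: the function name and the cutoff of 5 occurrences are arbitrary choices for illustration.

```python
from collections import Counter


def split_by_support(counts, min_count=5):
    """Separate words with a usable sample from one-off curiosities.

    min_count is a hypothetical cutoff; words at or above it are
    treated as having a reasonably robust frequency.
    """
    robust = {w: n for w, n in counts.items() if n >= min_count}
    one_offs = sorted(w for w, n in counts.items() if n == 1)
    return robust, one_offs


# Counts taken from the figures quoted in this post.
sample = Counter({"miserable": 66, "fiend": 33, "irksome": 6, "bauble": 1})
robust, one_offs = split_by_support(sample)
print(robust)    # words used often enough to compare seriously
print(one_offs)  # words used exactly once
```

Words that fall between the two groups, used more than once but fewer times than the cutoff, are simply left out of both lists here.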
With all that being said, here are some of the more interesting results from the word frequency scan.
First up: Proper nouns. Stories obviously have characters and take place in locations, and the names of those people and places are going to show up a lot. So we see that “Frankenstein” and “Elizabeth” are both used hundreds of thousands of times more often than average and that the city of “Ingolstadt” gets over two million times more attention than it has in recent writing.
Of course none of that really means anything. It’s the nature of books to have protagonists and antagonists that are mentioned again and again within that work and seldom at all outside of it. But having to filter out those results to get to the more interesting words took some effort, so I thought I might as well mention them.
Slightly more meaningful are what I would call tone words: Words that help establish the unique Gothic Horror atmosphere of the story. This includes 33 instances of “fiend” and 27 instances of “wretch”. There were also 6 instances of the delightfully dark word “malignity”, giving it a frequency over a thousand times higher than in our modern comparison set.
Other rare words have less to do with the genre and perhaps more to do with the time period. Frankenstein doesn’t just work, he “toils” a full 7 times (650 times more often than a modern protagonist would!). Things aren’t annoying, but on six occasions they are “irksome” (that’s 520 times more irksome than expected). And things aren’t done passionately or energetically nearly as often as they are done “ardently”, which has 10 occurrences, placing it in the “used more than 500 times more frequently than expected” club.
And finally let us conclude with a list of interesting one-off rare words: perambulations, rankle, auguries, enkindled, adjuration, insatiate and exculpated.