In this blog post, I will explore the data of the novel “Wuthering Heights,” by Emily Brontë, using the online platforms Project Gutenberg and Voyant Tools. I apologise in advance for the blurry screenshots; my MacBook was very frustrated with me throughout this entire process.
So, initially, I wanted to make a prediction about the novel itself (which I’ve never read). I predicted that this novel is a gothic story and has very proper terms like “lady,” “sir,” and possibly “Captain” woven throughout its text. I predicted that it’s set in England in a city called “Wuthering Heights” and that it’s about a young woman who must choose a suitor. After making this prediction, I headed to Gutenberg and searched for the book I’ve wanted to read for such a long time.
I then selected the novel and clicked “Plain text” to get me to the next screen.
I copied all of that text and then pasted it into a Google Doc entitled “Wuthering Heights.”
I then went to Voyant Tools and pasted the novel once again, then hit the “Reveal” button to bring me to the next screen.
The system had now divided the text into 10 segments and broken it down by its words and phrases for me to see.
I clicked on phrases and was delighted to see that one of the most common phrases in the novel is “If you don’t let me in I’ll kill you.”
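For anyone curious how a phrase counter like this might work under the hood, here is a rough sketch in Python. It is not Voyant's actual method (I don't know how Voyant does it); it simply counts every run of `n` consecutive words and keeps the ones that repeat. The function names and the sample sentence are made up for illustration and are not from the novel.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())

def repeated_phrases(text, n=3, min_count=2):
    """Return (phrase, count) pairs for n-word sequences seen at least min_count times."""
    words = tokenize(text)
    # Slide a window of n words across the text and count each window.
    grams = Counter(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return [(phrase, count) for phrase, count in grams.most_common() if count >= min_count]

# Invented sample text (an assumption for the demo, not a quote from the book).
sample = "Let me in! Let me in, I said. Nobody would let me in."
print(repeated_phrases(sample, n=3))  # [('let me in', 3)]
```

To try it on the whole novel, you could read the plain-text file you saved from Gutenberg into a string and pass that in place of `sample`.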
My initial prediction about the novel seemed not to be as far off as I thought. Words like “master” and “mr” were very common in the novel, with the most common being “Catherine” and “Heathcliff,” who I would guess are the two main characters and possible love interests. That reading would put me closer to the idea of a lady picking a suitor, but with the common phrase being about “killing,” I might be off just a tad.
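The "most common words" idea is simple enough to approximate yourself. Below is a minimal sketch, assuming a tiny hand-picked stopword list (Voyant's real list is certainly longer and different) and an invented sample sentence. It counts every word that isn't a stopword and returns the top few.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; this is an assumption, not Voyant's list.
STOPWORDS = {"the", "and", "a", "to", "of", "i", "was", "in", "at", "for"}

def top_words(text, k=5):
    """Return the k most frequent non-stopword words as (word, count) pairs."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(k)

# Invented sample text (not a quote from the novel).
sample = "Heathcliff looked at Catherine. Catherine called for Heathcliff, and Mr. Heathcliff came."
print(top_words(sample, k=2))  # [('heathcliff', 3), ('catherine', 2)]
```

Lowercasing everything is why names show up as “linton” or “mr” rather than with capitals.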
From these screens, I went on to try some of the tools:
- The trends tool allowed me to see the most common words and when they popped up in each segment of the novel. The word or name “linton” had the most dynamic trend of all the words, peaking and falling twice dramatically in the novel.
- The topics tool allowed me to see what words were often connected with the most common words in the novel, like “Heathcliff” in this picture. “Heathcliff” was most often paired with “Mr.” and “Mrs.”
- The textualarc tool gave a representation of the keywords and their frequency in the novel, with red lines showing how often a word was used when I hovered over it with the mouse.
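The trends tool in particular is easy to imitate. Here is a hedged sketch of the idea as I understand it: split the text into equal-sized segments (10 by default, to match what I saw on screen) and work out what share of each segment's words is the word you care about. The function name and the sample text are my own inventions, and Voyant may well split and normalise differently.

```python
import re

def segment_trend(text, word, segments=10):
    """Return the relative frequency of `word` in each equal-sized segment of `text`.

    Very short texts may produce fewer than `segments` entries.
    """
    words = re.findall(r"[a-z]+", text.lower())
    size = max(1, -(-len(words) // segments))  # ceiling division
    trend = []
    for start in range(0, len(words), size):
        chunk = words[start:start + size]
        trend.append(chunk.count(word.lower()) / len(chunk))
    return trend

# Invented sample: the name appears at the start and end but not in the middle.
sample = "linton came " * 5 + "nobody came " * 5 + "linton left " * 5
print(segment_trend(sample, "linton", segments=3))  # [0.5, 0.0, 0.5]
```

A word that peaks and falls twice, like “linton” did for me, would show up here as two high values separated by lower ones.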
Some benefits of the tools: The tools gave me easy visual representations of the most important phrases and words of the novel. They made the data more manageable and more easily accessible than it would be in regular book form.
Some limitations of the tools: Some of the tools, like the textualarc, became overloaded with information, which made the visual representation messy and hard to comprehend.
In conclusion, I think I’m going to read “Wuthering Heights.” I mean, if there’s anything this data’s shown me it’s that there’s “killing” in this novel and a character named “Mr. Heathcliff,” and that’s cool enough for me. Also, it was really interesting looking at a novel from its data, before even reading it. The key words, trends, and phrases that the Voyant Tools system picked up from the novel showed a lot about its main subjects, which seem to be the characters “Catherine” and “Heathcliff,” and how they interact, as in “If you don’t let me in I’ll kill you.”
I also think these tools could be a great resource for any classroom or researcher. Teachers assigning book reports or research papers on novels or specific authors could provide these sites as key tools for students, and researchers could use the sites for easy data access (they wouldn’t have to scrounge through entire novels, just copy and paste them into the site instead).
-Sarah DeLena