Wednesday, April 23, 2014

Topic Modelling of The Beaver Coat and The Broken Jug

We have decided to take a different approach to our project.  Instead of using a database to organize information about all the plays we have read in this course, we have narrowed our scope and will use topic modeling to compare and contrast The Broken Jug and The Beaver Coat.  A topic model is “a type of statistical model for discovering the abstract topics that occur in a collection of documents” (Wikipedia).  Our goal is to learn how both texts present the themes of crime and justice by analyzing the most common words in each text and how they are used.

We will use a software application to find out which words appear most often in each play.  Once we have the two word lists, we will compare them to see how much they overlap.  Our hypothesis is that we will find more words associated with the court and justice in The Broken Jug and more words linked to lies and deception in The Beaver Coat.  Since the two texts have opposing endings, with justice served in one but not in the other, we expect the words to mirror this contrast.
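As a rough sketch of what this counting step might look like, here is a small Python example. It is only an illustration: Python stands in for whichever application ends up being used, the stop-word list is a tiny made-up one, and the sample sentence is invented rather than quoted from either play.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that"}

def top_words(text, n=10):
    """Return the n most frequent non-stop-words in text as (word, count) pairs."""
    # Lowercase the text and keep only runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)

# Invented sample sentence, not taken from either play.
sample = "The judge questions the witness, and the judge hides the truth."
print(top_words(sample, 3))
```

Running the same function on the full text of each play would give the two word lists to compare.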

We will not only look at the words and the number of times they appear in each text, but also at how those words are used and the context in which they appear.  This is where we will really see how the two texts present the themes of crime and justice.  The Broken Jug may have a very high number of words associated with justice, but what if they all occur in phrases about the corruption of justice? That would change the play's whole attitude towards justice.  In The Beaver Coat, it is never actually written that Mrs. Wolff stole the beaver coat.  Instead, the theft is implied and the reader has to piece together what happened.  This may affect both the words we find and the contexts in which they appear.
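One common way to inspect context is a "keyword in context" listing, which shows each occurrence of a word with a few neighboring words on either side. The following is a minimal sketch under the same assumptions as before: Python as a stand-in tool, a hypothetical function name, and an invented sample sentence.

```python
import re

def keyword_in_context(text, keyword, window=3):
    """Return each occurrence of keyword with up to `window` words on either side."""
    words = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, word in enumerate(words):
        if word.lower() == keyword.lower():
            left = words[max(0, i - window):i]
            right = words[i + 1:i + 1 + window]
            hits.append(" ".join(left + [word] + right))
    return hits

# Invented sample sentence, not taken from either play.
sample = "He swore that justice would be done, yet justice was bought and sold."
for line in keyword_in_context(sample, "justice", window=2):
    print(line)
```

Reading such listings side by side is what would reveal whether a word like "justice" tends to appear in approving or in cynical phrases.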

This project seems appropriate for this course because it will allow us to dig deeper into how the theme of justice is portrayed in the texts.  Both plays were written in Germany, with The Broken Jug first performed in 1808 and The Beaver Coat in 1893.  Since both plays originate from the same location and culture and were written less than a century apart, it is possible that the word lists will be similar; on the other hand, the time gap could still be responsible for some differences.

To display this information, we will create charts that show which words appeared the most in each text and charts that show what type of words are used alongside those which appear most. We might also make a Venn diagram, with words unique to one text on one side or the other and common words in the middle.  Hopefully, with these visual representations, it will be clear how each text approaches the theme of justice.
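The three regions of such a Venn diagram correspond to simple set operations on the two word lists. Here is a short sketch in Python; the function name is hypothetical and the word lists are invented placeholders, not results from the plays.

```python
def venn_regions(words_a, words_b):
    """Split two word lists into the three regions of a two-circle Venn diagram."""
    a, b = set(words_a), set(words_b)
    return {
        "only_a": a - b,   # words found only in the first list
        "both": a & b,     # words the two lists share
        "only_b": b - a,   # words found only in the second list
    }

# Placeholder lists for illustration only; these are not actual findings.
list_a = ["judge", "court", "truth", "law"]
list_b = ["coat", "truth", "law", "thief"]

regions = venn_regions(list_a, list_b)
for name, words in regions.items():
    print(name, sorted(words))
```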

Timmy Eisenbraun
Megan Bell

1 comment:

  1. I believe that you made the right decision to narrow your focus. I can perceive how trying to compare all of the works we went over could be very overwhelming.
