I’ve been pretty quiet in this space for a while now, because all my time has been consumed by finishing my degree project for MICA’s Data Analytics and Visualization program. But the end is in sight! What have I been up to?
Basically, tearing my hair out. But also, working with a lot of IMDb data, and learning cool things about Python, R, Tableau, Illustrator, Gephi, and horror film.
For the past three months or so, I’ve been pulling together a passion project of mine–seeking a way to prove Carol Clover’s statement in Men, Women, and Chainsaws that “[a] strong prima facie case could be made for horror’s being…the most self-reflexive of cinematic genres” (168). While there is a lot that I wanted to do, I ended up limiting the project very specifically to looking at proportions of films that are and are not horror, as coded by crowd-sourcing in IMDb, in relation to the use of words that are associated with acts of creative expression or making. While Clover looks at the prevalence of eyes in horror, I took a data-driven approach. I thought that by looking at the way horror film thematizes the act of making and creating, I could get at the question of whether horror is the most self-reflexive of genres.
How did you even get into this?
I grew up a tomboy, surrounded by brothers, stepbrothers, and brothers’ friends. I stole my mother’s library copy of Stephen King’s Christine when I was about 12, and I was introduced to Monty Python and the Holy Grail a few years later. The title for this project is drawn from King’s writing memoir, where he exhorts young writers to be willing to destroy their creations if they don’t serve the purpose (and a shout-out to Dan M. for this suggestion!). From Monty Python, my favorite scenes were those with The Black Knight and the Killer Bunny. Why? They were super gory to my young eyes, and also so obviously fake. Like when a boom mic drops down from the corner of a movie you’re watching. This led to a lifelong fascination with horror, especially self-conscious, campy horror. I think my favorite film is Peter Jackson’s 1992 Dead Alive, which, if you have seen it, will give you a really good sense of where I’m coming from.
As I grew up, I read scholars like Carol Clover, Noël Carroll, Robin Wood, Laura Mulvey, and Linda Williams, among others. I found in them a kind of validation of my interest in horror. I was really influenced by Tom Gunning’s work on early cinematic spectatorship and the “Cinema of Attractions.” The supposed panic surrounding one of the first moving pictures, Arrival of a Train at La Ciotat (1895), connected in my mind to the kind of language you find around a lot of horror film, epitomized by the tagline for Wes Craven’s debut, The Last House on the Left (1972): “To avoid fainting, keep repeating: it’s only a movie.”
I started to notice how many horror films were about not just the act of looking, but more specifically, the act of creating. Way back in the days of del.icio.us, a link-sharing application that started in (gasp!) 2003, I had started a list of horror films about art. That list sadly died with del.icio.us, but here are just a few–Color Me Blood Red, A Bucket of Blood, Mill of the Stone Women, Peeping Tom, The Theatre Bizarre, The Driller Killer, Videodrome, Shutter, Mystery of the Wax Museum, and so many more. These are just films that are directly about creativity. What about more nuanced takes, like the Frankenstein tradition, or dolls that take on life? Crazed directors creating an entertainment for the gods in The Cabin in the Woods? Portraits that come alive? The history of horror film is littered with the remains of Oscar Wilde. There are some really great contemporary films about art school–M.F.A. and Murder Party are my faves.
So, I took the opportunity that was the thesis project to dive more deeply into the subject. I knew I wanted to create a network analysis, but in order for that to happen, I needed data–and data that included text. IMDb makes available daily downloads of its datasets in limited form, but they don’t include the information I was looking for, like summaries and synopses. To get this, I had to figure out how to scrape the web, and at a pretty large scale. I was off to a good start with Beautiful Soup and a Python notebook, but it quickly became apparent that I needed something more robust. With the help of very, very, very patient friends, I ended up with a database that contained the text data about each feature film record in IMDb in the genres of horror, of course, but also drama, action, sci-fi, romance, mystery, and thriller. IMDb captures up to three genres associated with each film, so any given film might be horror, drama, and, say, crime.
This cross-section allowed me to do some comparative analysis between films that contained horror in the genre array and films that did not. I ended up with a dataset of over 300,000 films, which could be merged with the daily downloads to connect other information. I’ll talk more about the process of keyword development, but for the time being, analysis shows that films in IMDb with “horror” in the genre array are almost one-third more likely to be about subjects of making than films without “horror” in the genre array.
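For the curious, here is a minimal sketch of what that kind of comparison can look like in Python. It is not my actual pipeline: the keyword set, the six toy film records, and the function names are all made up for illustration, and the real keyword list is a topic for a later post.

```python
import re

# Hypothetical stand-in for the real keyword list, which is not shown here.
MAKING_WORDS = {"artist", "painter", "sculptor", "filmmaker", "director", "writer"}

# Toy records invented for illustration only; they are not project data.
films = [
    {"title": "A", "genres": ["Horror", "Drama"], "summary": "A sculptor hides bodies in his work."},
    {"title": "B", "genres": ["Horror"], "summary": "Campers are stalked in the woods."},
    {"title": "C", "genres": ["Drama", "Romance"], "summary": "A painter falls in love."},
    {"title": "D", "genres": ["Action", "Thriller"], "summary": "A cop chases a bomber."},
    {"title": "E", "genres": ["Mystery"], "summary": "A detective reopens a cold case."},
    {"title": "F", "genres": ["Sci-Fi", "Drama"], "summary": "A crew drifts toward Mars."},
]


def is_horror(film):
    """True when "Horror" appears anywhere in the film's genre array."""
    return "Horror" in film["genres"]


def mentions_making(film):
    """True when the summary contains at least one keyword about making."""
    words = set(re.findall(r"[a-z]+", film["summary"].lower()))
    return bool(words & MAKING_WORDS)


def making_share(group):
    """Proportion of films in the group whose summary mentions making."""
    return sum(mentions_making(f) for f in group) / len(group)


# Split the dataset by whether horror is in the genre array.
horror = [f for f in films if is_horror(f)]
not_horror = [f for f in films if not is_horror(f)]

horror_share = making_share(horror)
not_horror_share = making_share(not_horror)

# A ratio above 1 means the horror group mentions making more often.
relative = horror_share / not_horror_share

print(f"horror: {horror_share:.0%}, not horror: {not_horror_share:.0%}, ratio: {relative:.2f}")
```

The toy numbers mean nothing on their own. The point is the shape of the comparison: one flag for genre, one flag for keywords, and a ratio of the two proportions.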
To be more specific, about 6% of all horror films included art subjects, while about 4% of non-horror films did. While these numbers aren’t staggeringly high, the difference between them is significant.
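If “significant” is read in the statistical sense, one common way to check a gap between two proportions is a two-proportion z-test. The sketch below is an assumption about method, not a record of what I ran, and the counts (60 of 1,000 versus 40 of 1,000) are hypothetical placeholders chosen only to mirror the rounded 6% and 4% figures. The real group sizes are far larger and not equal.

```python
import math


def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    # Pooled proportion under the null hypothesis that both groups share one rate.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts for illustration only, not the project's real numbers.
z, p = two_proportion_z(60, 1000, 40, 1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With the same percentages and larger groups, the z statistic grows and the p-value shrinks, which is why a two-point gap that looks small can still hold up across a dataset of hundreds of thousands of films.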
More to come!