I’ve been kicking around an idea for a book project on horror films about art, broadly understood, for a while now. Maybe “art” isn’t quite the right word, but I’m interested in self-conscious horror films and horror films that are about the act of creating something in an aesthetic way. I’ve been thinking about this since I watched The Texas Chain Saw Massacre probably for the 6th or 7th time: the way the film mimics the editing and montage work of Hitchcock’s Psycho, drawing attention to the way it’s been put together; the way the Family creates aesthetic objects (out of human bones and skin) to furnish their home; the opening scene of the rotting corpse perched sculpturally atop a gravestone while the Kodak Land Camera sound in the audio track whinges and flails and stretches. For me, horror films have always been about the distanced act of looking–these are films, representations, that interrogate the act of seeing, of witnessing. Carol Clover’s watershed Men, Women, and Chainsaws has a chapter about this kind of thing–the way horror film is in some sense an amped-up version of film itself. It thematizes the act of looking.
Is this the right topic for my Data Analysis and Visualization capstone project?
I used to have a list on Del.icio.us (does anyone remember that site?) dedicated to films about art. Over time I’ve lost it, but here’s a partial list from recent (and not-so-recent) notes:
- The Texas Chain Saw Massacre
- Murder Party
- Color Me Blood Red
- House of Wax, Mystery of the Wax Museum (etc)
- The Picture of Dorian Gray
- A Bucket of Blood
- Driller Killer
- The Stendhal Syndrome
- Peeping Tom
- American Mary
- Behind the Mask: The Rise of Leslie Vernon
- The Last Horror Movie
- The Theatre Bizarre
- Pickman’s Muse (etc)
- Mr. Jones
- The Machine that Kills Bad People
- Un Chien Andalou
- Cabin in the Woods
- House with the Laughing Windows
- Sweet Home (1988 Japanese)
- Cellar Dweller
- Velvet Buzzsaw
- Human Centipede
- The Abstraction
I had a feeling there were a lot more, and if we went further afield, to highly self-conscious films and films about creators more broadly (even pregnancy horror? mad-science horror?), we could probably add to this list dramatically. What about adaptations of novels or short stories? To what extent can we think of the slasher sequel as meta-horror? Scream? The Blair Witch Project? What about the films we don’t see–the less popular ones, the obscure ones, the non-western ones, the shorts, and so on?
I trawled the Internet looking for film datasets: robust, comprehensive ones that include country of origin, budget, return, textual synopses or plot summaries, and so on. I like IMDb because it is crowd-sourced, which makes it accessible and says something about fan base, too. But the datasets available for download lack some key information, namely budget, country of origin, and textual summaries. IMDb Pro accounts don’t give you any more capacious download options, which was a bit of a bummer.
I guess this meant I needed to learn how to scrape data from the web? I spent about three weeks looking for models of scraping code, figured Python was the way to go, and set off. That process was (is) long and arduous, and now I’ve got a couple of friends helping me as well as a pretty powerful scraper designed for IMDb. In the meantime, I’ve been scraping data, seeing how it parses, realizing I can’t work with some part of the data for some reason, tweaking the code, running it again, discovering that IMDb had changed their CSS, re-doing the code, and on and on and on. I realized that grabbing large chunks of text from the web poses problems for my scraper, so I may need to scale back my plans for doing text-based network analysis on the synopses. Versions of the scraping code have been running on my very unhappy computer for the past 4 weeks! I’ve seriously thought about dropping this project topic, because there’s just so much I need to learn in order to fully explore it–or even to get to an EDA. But I’ll press on for a while longer. Emotional (and coding) support welcome.
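One possible way around the CSS churn, sketched here under assumptions rather than taken from my actual scraper: read structured metadata embedded in the page instead of relying on CSS selectors. The sketch assumes the page carries a schema.org `Movie` object inside a `<script type="application/ld+json">` tag; whether a given IMDb page does, and which fields it fills in, would need checking. `JsonLdExtractor`, `extract_film_record`, and `SAMPLE_HTML` are hypothetical names, and the sample page and every value in it are invented.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._chunks = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._chunks = []

    def handle_data(self, data):
        # Script text can arrive in several pieces, so buffer it.
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._chunks)))
            except json.JSONDecodeError:
                pass  # skip a malformed block instead of crashing a long run


def extract_film_record(html):
    """Return a flat dict for the first Movie object found, or None."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    for block in parser.blocks:
        if isinstance(block, dict) and block.get("@type") == "Movie":
            rating = block.get("aggregateRating") or {}
            return {
                "title": block.get("name"),
                "genres": block.get("genre", []),
                "summary": block.get("description"),
                "rating": rating.get("ratingValue"),
                "votes": rating.get("ratingCount"),
            }
    return None


# Invented sample page: the field names follow schema.org, the values are made up.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@type": "Movie", "name": "Example Film", "genre": ["Horror"],
 "description": "Placeholder summary text.",
 "aggregateRating": {"ratingValue": 6.5, "ratingCount": 1234}}
</script>
</head><body><div class="class-name-that-changes"></div></body></html>
"""

record = extract_film_record(SAMPLE_HTML)
print(record)
```

Because nothing here depends on class names, a redesign of the page styling would not break it, though a change to the embedded metadata still would.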
So, while that process continues, here are my initial questions:
- What percentage of horror films are thematically related to art, creativity, the act of making (or unmaking)?
- What is the timeline for horror + art films?
- Are there more or fewer in any particular era? Why might that be?
- What sub-genres/sub-topics are evident? Are there more or fewer of these in any given decade? Why/not?
- What keywords/topics are shared most and least?
- Overlap between films–for instance, films that are about a sculptor are also about x
- Count number of films with shared keywords–need a list!
- Sankey (for a subset)? Network diagram? Arc diagram? Sunburst chart?
- Are there any directors/writers/actors/etc who are more or less active in this genre/subgenre? Where do they intersect? → this may be for a later step.
- Ratings: are these films more highly rated, less highly rated, on par with others in the broad horror genre? Does gender make a difference?
- Average horror rating, average rating of subset
- Rating by gender? Info in IMDb Pro, but cannot scrape from that with any ease. Sadface.
- What IMDb genres do these films also fall into? Where is there genre overlap? Radar charts?
- Which films in these genres have the most viewer engagement?
- Number of votes
- Number of synopses/summaries submitted? Only pulling the first summary/storyline…sadface.
- International/mapping–where are these films being made most?
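For the shared-keyword questions in the list above, here is a minimal sketch of how the counting might work once each film has a set of keywords. The film names and keywords are invented placeholders, and `keyword_pair_counts` is a hypothetical helper, not something from my scraper.

```python
from collections import Counter
from itertools import combinations

# Invented placeholder data: film title -> set of keywords.
films = {
    "Film A": {"sculptor", "murder", "wax museum"},
    "Film B": {"painter", "murder", "obsession"},
    "Film C": {"sculptor", "obsession", "murder"},
}


def keyword_pair_counts(films_to_keywords):
    """Count how many films share each unordered pair of keywords."""
    pair_counts = Counter()
    for keywords in films_to_keywords.values():
        # Sorting makes ("murder", "sculptor") and ("sculptor", "murder") one key.
        for pair in combinations(sorted(keywords), 2):
            pair_counts[pair] += 1
    return pair_counts


pairs = keyword_pair_counts(films)
for (first, second), n_films in pairs.most_common(3):
    print(f"{first} + {second}: {n_films} film(s)")
```

Each counted pair is effectively a weighted edge between two keywords, which is the shape of data a network or arc diagram would need.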
While the scraper runs in the background, I’ve been able to do a little basic EDA in Python / pandas and even pull it into Tableau to get a clearer picture of some aspects of the data. Stay tuned!
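To give a flavor of what that basic EDA could look like, here is a rough pandas sketch for three of the questions: share of art-themed films, counts by decade, and average rating of the subset versus the rest. Everything in it is an assumption for illustration: the column names, the `ART_KEYWORDS` set, and the six invented rows stand in for whatever the scraped data ends up looking like.

```python
import pandas as pd

# Invented placeholder rows, not real films or real ratings.
films = pd.DataFrame({
    "title": ["Film A", "Film B", "Film C", "Film D", "Film E", "Film F"],
    "year": [1959, 1963, 1979, 1985, 2012, 2019],
    "rating": [6.0, 5.0, 7.0, 4.0, 6.0, 8.0],
    "keywords": [
        ["sculptor", "murder"],
        ["haunted house"],
        ["painter", "obsession"],
        ["slasher"],
        ["surgeon", "body modification"],
        ["art gallery", "painter"],
    ],
})

# Hypothetical keyword list that marks a film as "about art".
ART_KEYWORDS = {"sculptor", "painter", "art gallery", "photographer", "filmmaker"}

# Flag a film if any of its keywords appears in the art list.
films["is_art"] = films["keywords"].apply(lambda kws: bool(ART_KEYWORDS & set(kws)))
films["decade"] = (films["year"] // 10) * 10

# Percentage of films flagged as art-themed.
art_share_pct = films["is_art"].mean() * 100

# Art-themed count and total count per decade.
by_decade = films.groupby("decade")["is_art"].agg(["sum", "count"])

# Average rating of the art subset versus everything else.
art_mean = films.loc[films["is_art"], "rating"].mean()
other_mean = films.loc[~films["is_art"], "rating"].mean()

print(f"Art-themed share: {art_share_pct:.1f}%")
print(by_decade)
print(f"Mean rating, art subset: {art_mean:.2f}; others: {other_mean:.2f}")
```

A plain keyword match like this will miss films whose keywords describe the theme differently, so the keyword list itself is probably the part that needs the most care.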