Using NLP to analyze song lyrics
This project uses natural language processing to compare song lyrics from three iconic bands with distinctive songwriting styles. At one extreme, Rush’s lyricist, Neil Peart, wrote some of the most erudite and textually challenging lyrics of any rock band. AC/DC, although not considered lyrically weak, is included as a lyrically simpler contrast to Peart. Queen is the dark horse in this analysis. All four members of Queen wrote songs, both individually and as a group. Earlier songs, especially those written by Freddie Mercury, were extravagantly imaginative. How will Mercury’s creativity show up in the analysis?
This project is a fun, analytical approach to exploring something that is easier to feel than to explain. AC/DC songs are easy to enjoy, but also easy to tire of. Queen and Rush take longer to acquire a taste for, but inspire stronger feelings. This project is also a case study in the use of natural language processing (NLP) algorithms. The big two are present here: topic modeling and sentiment analysis. I also achieved interesting results with a complexity analysis.
Fork this project to start your own battle of the bands, or clone and submit pull requests if you have any improvements.
I narrated my analysis so that it doubles as a tutorial, and I include links to the resources I used. Each section contributes insights that later sections occasionally build on, so if you are running the code, run the sections in order.
Section 1 uses API and web scraping techniques to assemble a corpus of text for nearly 500 songs.
Section 2 is a brief overview of the corpus, including summary stats and trend charts.
Section 3 is the first NLP analysis. I assess text complexity using several measures. The quanteda package does almost all of the work, leaving the user with all of the fun.
Section 4 is a topic analysis. I use the stm package to identify topics, then perform a cluster analysis to find similar songs.
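To give a feel for what a text-complexity measure looks like before diving into Section 3, here is a minimal sketch. It is not the project's code: the project relies on R's quanteda package, while this sketch uses plain Python with only the standard library. The two measures shown (type-token ratio and mean word length) are assumptions chosen for illustration, since the overview above does not say which measures Section 3 uses. The function names and the sample strings are hypothetical, and the strings are made-up placeholders, not real lyrics.

```python
import re


def tokenize(text):
    """Lowercase the text and return its word tokens (letters, optional inner apostrophe)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def complexity(text):
    """Return simple complexity statistics for a piece of text.

    ttr: type-token ratio, i.e. distinct words divided by total words.
         Values near 1.0 mean little repetition; low values mean heavy repetition.
    mean_word_length: average number of characters per token.
    """
    tokens = tokenize(text)
    if not tokens:
        # Avoid dividing by zero on empty input.
        return {"tokens": 0, "types": 0, "ttr": 0.0, "mean_word_length": 0.0}
    types = set(tokens)
    return {
        "tokens": len(tokens),
        "types": len(types),
        "ttr": len(types) / len(tokens),
        "mean_word_length": sum(len(t) for t in tokens) / len(tokens),
    }


# Made-up placeholder text, not real lyrics.
repetitive = "rock rock rock all night rock rock rock all day"
varied = "ancient rivers carve their patient signatures through stone"

for label, text in [("repetitive", repetitive), ("varied", varied)]:
    print(label, complexity(text))
```

One caveat worth keeping in mind: the type-token ratio is sensitive to text length, so comparing songs of very different lengths with it can be misleading. This is one reason an analysis would typically combine several measures rather than rely on one.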