Predicting Song Popularity with a Neural Network



Abstract

  • My goal in this project is to create a model that can predict the Spotify popularity score of a song from the song’s structural attributes, which are provided by Spotify. I also wanted to see whether I could find trends, or insights into trends, by manipulating the model and comparing the outputs. I used a dataset posted to Kaggle by a user who pulled data on over 600,000 tracks through Spotify’s API. The plan is to experiment with different combinations of features and to alter the data’s format to try to produce the best results. For instance, I reduced the release date from a full date to just the release year, then dropped songs made before 2010, and saw the accuracy of my model increase. This is most likely because trends from over a decade ago have little in common with what is popular this decade. The model also uses attributes like ‘energy’ and ‘tempo’ as input data, and the structure of music has changed dramatically, so it did not make sense to me to use data from the last century. The model I will be experimenting with is a sequential model that uses sparse categorical cross-entropy as its loss function. I will have to research a method to compensate for the severe imbalance in my target, which is the popularity score Spotify has calculated: only 130 songs have a popularity score between 81 and 100, while 343,715 songs scored between 0 and 60. I will use a confusion matrix to examine the model’s results so that I can check whether it is simply guessing a low popularity score for every song instead of using the attribute values. That will probably be my greatest challenge.
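
The preprocessing and evaluation steps described above could be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the project's actual code. The three popularity bins (the 61 to 80 middle bin in particular), the toy track rows, and the helper names `release_year`, `popularity_class`, `balanced_class_weights`, and `confusion_matrix` are all assumptions made for the example. Inverse-frequency class weighting is shown as one possible way to offset the imbalance; the abstract does not say which method will be used.

```python
from collections import Counter

# Assumed bin edges: the abstract only mentions the 0-60 and 81-100 ranges,
# so the 61-80 middle bin is a guess made for illustration.
BINS = [(0, 60), (61, 80), (81, 100)]


def release_year(release_date: str) -> int:
    """Reduce a date string such as '2015-06-01' or '2015' to its year."""
    return int(release_date[:4])


def popularity_class(score: int) -> int:
    """Map a 0-100 popularity score to the index of its bin."""
    for idx, (low, high) in enumerate(BINS):
        if low <= score <= high:
            return idx
    raise ValueError(f"popularity score out of range: {score}")


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


# Toy rows standing in for the Kaggle data (values are made up).
tracks = [
    {"release_date": "1998-03-02", "popularity": 45},
    {"release_date": "2012", "popularity": 30},
    {"release_date": "2015-06-01", "popularity": 55},
    {"release_date": "2019-11-22", "popularity": 72},
    {"release_date": "2020-01-10", "popularity": 90},
    {"release_date": "2011-07-04", "popularity": 12},
]

# Keep only songs released in 2010 or later, then bin the target.
recent = [t for t in tracks if release_year(t["release_date"]) >= 2010]
labels = [popularity_class(t["popularity"]) for t in recent]
weights = balanced_class_weights(labels)

# A degenerate "model" that always predicts the low-popularity class.
always_low = [0] * len(labels)
cm = confusion_matrix(labels, always_low, len(BINS))

print(weights)
for row in cm:
    print(row)
```

In the matrix for the always-low predictor, every count falls in the first column. That is the pattern the confusion matrix is meant to expose. A weights dictionary of this shape is the kind of mapping that Keras accepts through the `class_weight` argument of `Model.fit`, although whether that is the right remedy here would need to be tested.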

Video Presentation

Slides