Music Information Retrieval

Music is ubiquitous in today's world, and almost everyone enjoys listening to it. With the rise of streaming platforms, the amount of music available has increased substantially. While users seemingly benefit from this abundance, it has also become harder for them to explore new music and find songs they like. Personalized access to music libraries and music recommender systems aim to help users discover and retrieve music they enjoy.

To this end, the field of Music Information Retrieval (MIR) strives to make music accessible to all by advancing retrieval applications such as music recommender systems, content-based search, the generation of personalized playlists, or user interfaces that allow users to visually explore music collections. This includes gathering machine-readable musical data, extracting meaningful features, developing data representations based on these features, and devising methodologies to process and understand that data. Retrieval approaches specifically leverage these representations for indexing music and providing search and retrieval services.

In our research, we develop methods for analyzing user music consumption behavior, investigate deep learning-based feature extraction methods for music content analysis, predict the potential success and popularity of songs, and distill sets of features that capture user music preferences for retrieval tasks.

 

Public Datasets

In our research and publications, we have curated and used a variety of datasets. We are happy to share the following datasets:

  • The #nowplaying dataset leverages Twitter to provide a diverse and constantly updated description of the music listening behavior of users. Twitter users frequently post which music they are currently listening to. From such tweets, we extract track and artist information and further metadata. You can find the dataset on Zenodo: https://doi.org/10.5281/zenodo.2594482 (CC BY 4.0). Please cite this paper when using the dataset.
  • The #nowplaying-RS dataset provides context and content features of listening events. It contains 11.6 million music listening events of 139K users and 346K tracks collected from Twitter. The dataset comes with a rich set of item content features and user context features, as well as timestamps of the listening events. Moreover, some of the user context features imply the cultural origin of the users, and others, such as hashtags, give clues to the emotional state of a user underlying a listening event. You can find the dataset on Zenodo: https://doi.org/10.5281/zenodo.2594537 (CC BY 4.0). Please cite this paper when using the dataset.
  • The Spotify playlists dataset is based on the subset of users in the #nowplaying dataset who publish their #nowplaying tweets via Spotify. In principle, the dataset holds users, their playlists, and the tracks contained in these playlists. You can find the dataset on Zenodo: https://doi.org/10.5281/zenodo.2594556 (CC BY 4.0). Please cite this paper when using the dataset.
  • The Hit Song Prediction dataset features high- and low-level audio descriptors of the songs contained in the Million Song Dataset (extracted via Essentia) for content-based hit song prediction tasks. You can find the dataset on Zenodo: https://doi.org/10.5281/zenodo.3258042 (CC BY 4.0). Please cite this paper when using the dataset.
  • The HSP-S and HSP-L datasets are based on data from AcousticBrainz, Billboard Hot 100, the Million Song Dataset, and last.fm. Both datasets contain audio features and Mel-spectrograms as well as streaming listener and play counts. The larger HSP-L dataset contains 73,482 songs, whereas the smaller HSP-S dataset contains 7,736 songs and additionally features Billboard Hot 100 chart measures. You can find the datasets on Zenodo: https://doi.org/10.5281/zenodo.5383858 (CC BY 4.0). Please cite this paper when using the datasets.
Photo by henry perks on Unsplash. 

Team

Publications

2019

Hsiao-Tzu Hung, Yu-Hua Chen, Maximilian Mayerl, Michael Vötter, Eva Zangerle and Yi-Hsuan Yang: MediaEval 2019 Emotion and Theme Recognition task: A VQ-VAE Based Approach. In Working Notes Proceedings of the MediaEval 2019 Workshop. ceur-ws.org, 2019.

2018

Martin Pichl and Eva Zangerle: Latent Feature Combination for Multi-Context Music Recommendation. In 2018 International Conference on Content-Based Multimedia Indexing (CBMI), pages 1-6. 2018.

Eva Zangerle and Martin Pichl: Content-based User Models: Modeling the Many Faces of Musical Preference. In Proceedings of the 19th International Society for Music Information Retrieval Conference 2018 (ISMIR 2018), pages 709-716. 2018.

Asmita Poddar, Eva Zangerle and Yi-Hsuan Yang: #nowplaying-RS: A New Benchmark Dataset for Building Context-Aware Music Recommender Systems. In Proceedings of the 15th Sound & Music Computing Conference. 2018.

Eva Zangerle, Martin Pichl and Markus Schedl: Culture-Aware Music Recommendation. In Proceedings of the 26th Conference on User Modeling, Adaptation and Personalization (UMAP 2018), pages 357-358. ACM, 2018.

Benjamin Murauer and Günther Specht: Detecting Music Genre Using Extreme Gradient Boosting. In Companion of The Web Conference 2018, pages 1923-1927. International World Wide Web Conferences Steering Committee, 2018.

Eva Zangerle, Michael Tschuggnall, Stefan Wurzinger and Günther Specht: ALF-200k: Towards Extensive Multimodal Analyses of Music Tracks and Playlists. In Advances in Information Retrieval - 39th European Conference on IR Research (ECIR 2018), pages 584-590. Springer, 2018.

Christian Esswein, Markus Schedl and Eva Zangerle: geMsearch: Personalized Explorative Music Search. In Joint Proceedings of the ACM IUI 2018 Workshops co-located with the 23rd ACM Conference on Intelligent User Interfaces (ACM IUI 2018). ceur-ws.org, 2018.

Martin Pichl: Multi-Context-Aware Recommender Systems: A Study on Music Recommendation. PhD thesis, University of Innsbruck, Department of Computer Science, 2018.

2017

Eva Zangerle, Michael Tschuggnall, Stefan Wurzinger and Günther Specht: Analyzing Coherent Characteristics in Music Playlists. In Proceedings of the 4th Digital Humanities Austria Conference (dha 2017), Innsbruck, Austria 2017.

Martin Pichl, Eva Zangerle, Günther Specht and Markus Schedl: Mining Culture-Specific Music Listening Behavior from Social Media Data. In Proceedings of the IEEE International Symposium on Multimedia (ISM 2017), Taichung, Taiwan, December 11-13, 2017, pages 208-215. IEEE Computer Society, 2017.

Benjamin Murauer, Maximilian Mayerl, Michael Tschuggnall, Eva Zangerle, Martin Pichl and Günther Specht: Hierarchical Multilabel Classification and Voting for Genre Classification. In Working Notes Proceedings of the MediaEval 2017 Workshop. ceur-ws.org, 2017.

Martin Pichl, Eva Zangerle and Günther Specht: Improving Context-Aware Music Recommender Systems: Beyond the Pre-filtering Approach. In Proceedings of the 2017 ACM International Conference on Multimedia Retrieval (ICMR 2017), pages 201-208. ACM, 2017.

Martin Pichl, Eva Zangerle and Günther Specht: Understanding User-curated Playlists on Spotify: A Machine Learning Approach. In International Journal of Multimedia Data Engineering and Management (IJMDEM), vol. 8, no. 4. 2017.

2016

Martin Pichl, Eva Zangerle and Günther Specht: Understanding Playlist Creation on Music Streaming Platforms. In Proceedings of the IEEE Symposium on Multimedia (ISM), pages 475-480. IEEE, 2016.

Eva Zangerle, Martin Pichl, Benedikt Hupfauf and Günther Specht: Can Microblogs Predict Music Charts? An Analysis of the Relationship Between #Nowplaying Tweets and Music Charts. In Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR 2016), New York City, United States, August 7-11, 2016, pages 365-371.

2015

Martin Pichl, Eva Zangerle and Günther Specht: Towards a Context-Aware Music Recommendation Approach: What is Hidden in the Playlist Name? In Proceedings of the 15th IEEE International Conference on Data Mining Workshops (ICDM 2015), pages 1360-1365. IEEE, 2015.

Martin Pichl, Eva Zangerle and Günther Specht: #nowplaying on #Spotify: Leveraging Spotify Information on Twitter for Artist Recommendations. In Current Trends in Web Engineering, 15th International Conference, ICWE 2015 Workshops (Revised Selected Papers), pages 163-174. Springer, 2015.

2014

Martin Pichl, Eva Zangerle and Günther Specht: Combining Spotify and Twitter Data for Generating a Recent and Public Dataset for Music Recommendation. In Proceedings of the 26th Workshop Grundlagen von Datenbanken (GvDB 2014), Ritten, Italy, vol. 1313, pages 35-40. CEUR-WS.org, Oct. 2014.

Eva Zangerle, Martin Pichl, Wolfgang Gassler and Günther Specht: #nowplaying Music Dataset: Extracting Listening Behavior from Twitter. In Proceedings of the 1st ACM International Workshop on Internet-Scale Multimedia Management (WISMM '14), pages 21-26. ACM, June 2014.