RecommenderSystemsColab

Improving Deezer’s Music Recommendation Engine

This repository presents my work on improving Deezer’s music recommendation engine for the DSG17 Online Phase. The project was carried out at HSLU (Hochschule Luzern) by Andri Gerber. It delivers a complete, modular pipeline, from data exploration to advanced modeling, for predicting whether a user will listen to a track for more than 30 seconds.

For an in-depth discussion of the methodology and results, see the accompanying Gerber_Andri_Report.pdf.


Overview

The project is divided into three core notebooks:

  1. EDA.ipynb
    • Purpose: Conduct rigorous exploratory data analysis to assess data quality, detect outliers, and derive key temporal and behavioral features.
    • Outcome: A detailed data dictionary and insights that drive subsequent feature engineering.
  2. Preprocess_Feature_store.ipynb
    • Purpose: Transform raw data into a unified feature store using chunk-wise loading, timestamp conversion, outlier filtering, and engineered features (e.g., user behavior, time bins).
    • Outcome: Consistent, high-quality processed datasets for both training and testing.
  3. Specific_Preprocess_and_Modelling.ipynb
    • Purpose: Build and evaluate recommender models ranging from baseline methods (content-based, collaborative filtering, matrix factorization, PMF, RBM) to advanced approaches (Factorization Machines, NCF/NeuMF, GraphSAGE).
    • Outcome: Models optimized primarily for ROC AUC, with additional metrics (precision, recall, F1) to ensure robust performance.
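The preprocessing steps listed for Preprocess_Feature_store.ipynb (chunk-wise loading, timestamp conversion, outlier filtering, time bins, and a user-behavior feature) can be sketched roughly as follows. This is a minimal illustration with pandas, not the notebook's actual code: the column names (user_id, media_id, ts_listen, media_duration, is_listened), the duration thresholds, the bin edges, and the helper names preprocess_chunk and build_feature_store are all assumptions made for the example.

```python
import io

import pandas as pd

# Tiny in-memory stand-in for the raw data (hypothetical schema and values).
RAW_CSV = io.StringIO(
    "user_id,media_id,ts_listen,media_duration,is_listened\n"
    "1,10,1480579200,210,1\n"
    "1,11,1480597200,5,0\n"
    "2,10,1480622400,180,1\n"
    "2,12,1480557600,7200,0\n"
    "3,13,1480597200,240,1\n"
    "1,14,1480622400,200,0\n"
)


def preprocess_chunk(chunk, min_duration=30, max_duration=3600):
    """Clean one chunk: convert timestamps, drop duration outliers, add time bins."""
    chunk = chunk.copy()
    # Unix seconds -> pandas datetime (assumes the raw timestamp is in seconds).
    chunk["ts_listen"] = pd.to_datetime(chunk["ts_listen"], unit="s")
    # Outlier filter: keep tracks whose duration lies in an assumed plausible range.
    keep = chunk["media_duration"].between(min_duration, max_duration)
    chunk = chunk.loc[keep].copy()
    # Time-of-day bins derived from the listening hour (assumed bin edges).
    chunk["time_bin"] = pd.cut(
        chunk["ts_listen"].dt.hour,
        bins=[0, 6, 12, 18, 24],
        right=False,
        labels=["night", "morning", "afternoon", "evening"],
    )
    return chunk


def build_feature_store(source, chunksize=2):
    """Read the raw file in chunks, clean each chunk, then add a user-level feature."""
    parts = [preprocess_chunk(c) for c in pd.read_csv(source, chunksize=chunksize)]
    features = pd.concat(parts, ignore_index=True)
    # Simple user-behavior feature: number of retained listens per user.
    features["user_listen_count"] = (
        features.groupby("user_id")["media_id"].transform("count")
    )
    return features


features = build_feature_store(RAW_CSV)
print(features[["user_id", "media_id", "time_bin", "user_listen_count"]])
```

Applying the same function to the training and test files is one way to keep the feature set uniform across both; a realistic chunksize would be much larger than the value used here for demonstration.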

Key Insights


How to Run

  1. EDA.ipynb:
    • Open the notebook (in Colab or JupyterLab), load your raw data, and perform initial exploration.
  2. Preprocess_Feature_store.ipynb:
    • Run the training and test pipelines to generate processed datasets with uniform features.
  3. Specific_Preprocess_and_Modelling.ipynb:
    • Load the processed data, select and tune models, and generate final predictions for evaluation or deployment.
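For the evaluation part of step 3, the metrics named in the overview (ROC AUC, precision, recall, F1) can be illustrated with a small dependency-free sketch. The function names roc_auc and precision_recall_f1, the 0.5 decision threshold, and the toy labels and scores are assumptions for this example and do not come from the notebooks. In practice, scikit-learn's sklearn.metrics module (roc_auc_score, precision_score, recall_score, f1_score) is the more usual choice, especially on large datasets, since the pairwise loop below scales with the number of positive-negative pairs.

```python
def roc_auc(y_true, y_score):
    """ROC AUC as the probability that a random positive is scored above
    a random negative, with ties counted as one half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall_f1(y_true, y_score, threshold=0.5):
    """Precision, recall, and F1 after thresholding the predicted scores."""
    y_pred = [int(s >= threshold) for s in y_score]
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example: 1 = track listened for more than 30 seconds, 0 = skipped.
y_true = [1, 0, 1, 1, 0, 0]
y_score = [0.9, 0.4, 0.65, 0.3, 0.2, 0.7]

auc = roc_auc(y_true, y_score)
precision, recall, f1 = precision_recall_f1(y_true, y_score)
print(f"AUC={auc:.3f} precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

ROC AUC is threshold-free and depends only on how the scores rank listened tracks against skipped ones, whereas precision, recall, and F1 change with the chosen threshold, which is why they are useful as complementary checks.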

Final Thoughts

This approach blends robust feature engineering with state-of-the-art modeling techniques to create a competitive recommendation engine. For further details, please contact Andri Gerber (see note 1 below).

Enjoy exploring and extending this pipeline!


  1. Email: andri.gerber@stud.hslu.ch. Department of Business, Lucerne University of Applied Sciences and Arts (HSLU), Lucerne, Switzerland.