Difference between revisions of "MIR workshop 2011 day5 lab"


Revision as of 16:21, 30 June 2011

MIR Workshop Lab for Music Recommendation

Douglas Eck, Google


Overview

This lab covers the construction of parts of a music recommender. Focus is placed on building a similarity matrix from data and querying that matrix based on cosine distance. Fast programmers should be able to accomplish considerably more.
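
For reference, one common definition of the cosine measure between two feature vectors is given here. The symbols $x$ and $y$ (the tag or acoustic feature vectors of two tracks) are notation assumed for this note, and taking "distance" to mean one minus the similarity is a widespread convention rather than something the lab code is known to use:

```latex
\[
  \operatorname{sim}(x, y) = \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert}
  = \frac{\sum_i x_i y_i}{\sqrt{\sum_i x_i^2}\,\sqrt{\sum_i y_i^2}},
  \qquad
  d(x, y) = 1 - \operatorname{sim}(x, y).
\]
```

Under this convention, the top hits for a query track are the tracks with the smallest $d$, or equivalently the largest similarity.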


  • The basics (some Python code available to help).
    • Calculate acoustic features on the CAL500 dataset (students should have already done this).
    • Read in user tag annotations from the same dataset, provided by UCSD.
    • Build similarity matrix based on word vectors derived from these annotations.
    • Query similarity matrix with a track to get top hits based on cosine distance.
    • Build second similarity matrix using acoustic features.
    • Query this similarity matrix with a track to get top hits based on cosine distance.
  • Extra (I didn't write code for this, but can help students find examples).
    • Query the EchoNest for additional acoustic features and compare to yours.
    • Use the CAL500 user annotations as ground truth and evaluate your audio features (ROC curve or some precision measure).
    • Compare a 2D visualization of acoustic features versus UCSD user annotations.
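
The basic steps above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the lab code: the function names (`similarity_matrix`, `top_hits`, `precision_at_k`) and the toy vectors are hypothetical, and the real CAL500 annotations and acoustic features would have to be loaded in place of the toy data. The last function shows one possible precision-style check for the "Extra" item, treating the tag-based neighbours as ground truth for the acoustic ones.

```python
import math


def normalize_rows(vectors):
    """Scale each row to unit Euclidean length; all-zero rows are left as-is."""
    normalized = []
    for row in vectors:
        norm = math.sqrt(sum(x * x for x in row))
        normalized.append([x / norm for x in row] if norm > 0 else list(row))
    return normalized


def similarity_matrix(vectors):
    """Return the N x N matrix of cosine similarities between the rows."""
    unit = normalize_rows(vectors)
    return [[sum(a * b for a, b in zip(u, v)) for v in unit] for u in unit]


def top_hits(sim, query_index, k=5):
    """Return the k track indices most similar to the query, excluding itself."""
    candidates = [j for j in range(len(sim)) if j != query_index]
    # Highest cosine similarity corresponds to lowest cosine distance.
    candidates.sort(key=lambda j: sim[query_index][j], reverse=True)
    return candidates[:k]


def precision_at_k(reference_sim, test_sim, query_index, k=5):
    """Fraction of the test matrix's top-k hits that are also in the
    reference matrix's top-k hits (an assumed, simple precision measure)."""
    relevant = set(top_hits(reference_sim, query_index, k))
    retrieved = top_hits(test_sim, query_index, k)
    if not retrieved:
        return 0.0
    return sum(1 for j in retrieved if j in relevant) / len(retrieved)


# Hypothetical toy data: 4 tracks x 3 tags, and 4 tracks x 2 acoustic features.
tag_vectors = [
    [1.0, 1.0, 0.0],
    [1.0, 0.8, 0.1],
    [0.0, 0.1, 1.0],
    [0.0, 0.0, 1.0],
]
audio_vectors = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.1, 0.9],
]

tag_sim = similarity_matrix(tag_vectors)
audio_sim = similarity_matrix(audio_vectors)

tag_hits = top_hits(tag_sim, 0, k=2)
audio_hits = top_hits(audio_sim, 0, k=2)
score = precision_at_k(tag_sim, audio_sim, 0, k=2)
print(tag_hits, audio_hits, score)
```

For a dataset the size of CAL500, the same computation is usually written with an array library instead of nested lists; the list version is used here only to keep the sketch free of dependencies.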


Lab code is found in /usr/ccrma/courses/mir20110/cal500_new. A previous version was uploaded to /usr/ccrma/courses/mir2011/cal500, but it used filenames from my University of Montreal lab. I have since renamed the audio files to match those of the UCSD CAL500 dataset.