Revision as of 00:19, 1 July 2011

MIR Workshop 2011 Day 5 Lab

@@ Line 1: / Line 1: @@
+<h1>MIR Workshop 2011 Day 5 Lab</h1>
+<h2>Douglas Eck, Google</h2>
+<h2>Overview</h2>
+This lab covers building parts of a music recommender. The focus is on building a similarity matrix from data and querying that matrix by cosine distance.
+Fast programmers should be able to accomplish considerably more.
+* The basics (some Python code available to help).
+** Calculate acoustic features on the CAL500 dataset (students should already have done this).
+** Read in the user tag annotations for the same dataset, provided by UCSD.
+** Build similarity matrix based on word vectors derived from these annotations.
+** Query similarity matrix with a track to get top hits based on cosine distance.
+** Build second similarity matrix using acoustic features.
+** Query this similarity matrix with a track to get top hits based on cosine distance.
+* Extra (I didn't write code for this, but can help students find examples).
+** Query the EchoNest for additional acoustic features and compare to yours.
+** Use the CAL500 user annotations as ground truth and evaluate your audio features (ROC curve or some precision measure).
+** Compare a 2D visualization of acoustic features versus UCSD user annotations.
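The similarity-matrix steps in the basics list can be sketched roughly as follows. This is a minimal illustration, not the lab code: the function names `cosine_similarity_matrix` and `top_hits` and the toy tag matrix are assumptions made up for the example, and the real CAL500 annotations would have to be loaded into a tracks-by-tags array first. Ranking by highest cosine similarity is equivalent to ranking by lowest cosine distance, since distance is 1 minus similarity.

```python
import numpy as np

def cosine_similarity_matrix(features):
    """Return the (n_tracks, n_tracks) cosine similarity between row vectors."""
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # keep all-zero rows as zeros instead of dividing by zero
    unit = features / norms
    return unit @ unit.T

def top_hits(sim, query_index, k=5):
    """Return the indices of the k tracks most similar to the query, excluding itself."""
    scores = sim[query_index].copy()
    scores[query_index] = -np.inf  # never return the query track as its own hit
    order = np.argsort(-scores, kind="stable")
    return order[:k].tolist()

# Hypothetical toy data: 4 tracks x 4 tags (1 = tag applied to the track).
tags = np.array([
    [1, 1, 0, 0],  # track 0
    [1, 1, 1, 0],  # track 1
    [0, 0, 1, 1],  # track 2
    [0, 0, 0, 1],  # track 3
])

sim = cosine_similarity_matrix(tags)
hits = top_hits(sim, query_index=0, k=2)
print(hits)
```

The same two functions should work unchanged for the acoustic-feature matrix, provided each row is one track's feature vector; scaling the feature columns first is probably advisable so that no single feature dominates the cosine.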
 Lab code is found in /usr/ccrma/courses/mir20110/cal500_new.
 A previous version was uploaded at /usr/ccrma/courses/mir2011/cal500, but it used filenames from my University of Montreal lab. I have renamed the audio files to match those of the UCSD CAL500 dataset.
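For the evaluation item in the Extra list, one simple precision measure is precision at k: of the top k tracks returned by the audio-feature query, what fraction are also considered similar according to the user annotations? The sketch below assumes the ground-truth set and the ranked list have already been computed. The name `precision_at_k` and the track indices are hypothetical, chosen only for illustration.

```python
def precision_at_k(relevant, ranked, k):
    """Fraction of the first k ranked items that appear in the relevant set."""
    top = list(ranked)[:k]
    if not top:
        return 0.0
    relevant = set(relevant)
    return sum(1 for item in top if item in relevant) / len(top)

# Hypothetical example for one query track:
# tracks judged similar from the tag annotations (ground truth)...
relevant_tracks = {1, 4, 7}
# ...and the ranking returned by the acoustic-feature similarity matrix.
audio_ranking = [4, 2, 1, 9, 7]

p_at_3 = precision_at_k(relevant_tracks, audio_ranking, k=3)
print(p_at_3)
```

Averaging this value over all query tracks gives a single score for a feature set, which makes it easier to compare alternative acoustic features.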