Difference between revisions of "MIR workshop 2011 day5 lab"


Revision as of 16:19, 30 June 2011

MIR Workshop 2011 Day 5 Lab

Douglas Eck, Google


Overview

This lab covers the construction of parts of a music recommender. Focus is placed on building a similarity matrix from data and querying that matrix based on cosine distance. Fast programmers should be able to accomplish considerably more.
  • The basics (some Python code available to help).
    • Calculate acoustic features on the CAL500 dataset (students should have already done this).
    • Read in the user tag annotations for the same dataset, provided by UCSD.
    • Build similarity matrix based on word vectors derived from these annotations.
    • Query similarity matrix with a track to get top hits based on cosine distance.
    • Build second similarity matrix using acoustic features.
    • Query this similarity matrix with track to get top hits based on cosine distance.
  • Extra (I didn't write code for this, but can help students find examples).
    • Query the EchoNest for additional acoustic features and compare to yours.
    • Use the CAL500 user annotations as ground truth and evaluate your audio features (ROC curve or some precision measure).
    • Compare a 2D visualization of acoustic features versus UCSD user annotations.
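The basic steps above (build a similarity matrix from per-track vectors, then query it with a track) can be sketched roughly as follows. This is a minimal illustration and not the lab code: the function names, the track names, and the toy tag vectors are all hypothetical, and it uses plain Python lists where the real lab would more likely use an array library. Ranking by highest cosine similarity is equivalent to ranking by lowest cosine distance, taking distance as 1 minus similarity.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_similarity_matrix(vectors):
    """Return an n x n list of lists holding pairwise cosine similarities."""
    n = len(vectors)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            s = cosine_similarity(vectors[i], vectors[j])
            sim[i][j] = s  # the matrix is symmetric,
            sim[j][i] = s  # so fill both halves at once
    return sim


def top_hits(sim, query_index, k=3):
    """Indices of the k tracks most similar to the query, excluding the query itself."""
    scores = [(s, j) for j, s in enumerate(sim[query_index]) if j != query_index]
    scores.sort(key=lambda pair: pair[0], reverse=True)
    return [j for _, j in scores[:k]]


# Hypothetical toy data: one row per track, one column per tag weight.
track_names = ["track_a", "track_b", "track_c", "track_d"]
tag_vectors = [
    [1.0, 0.0, 1.0],
    [1.0, 0.0, 0.8],
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.5],
]

sim = build_similarity_matrix(tag_vectors)
hits = top_hits(sim, query_index=0, k=2)
print([track_names[j] for j in hits])
```

The same two functions apply unchanged to acoustic feature vectors: swap in one feature vector per track in place of the tag vectors to get the second similarity matrix.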
Lab code is found in /usr/ccrma/courses/mir20110/cal500_new. A previous version was uploaded to /usr/ccrma/courses/mir2011/cal500, but it used filenames from my University of Montreal lab. I have since renamed the audio files to match those of the UCSD CAL500 dataset.
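For the extra exercise on evaluating audio features against the user annotations, one simple precision measure is precision at k. The sketch below assumes you already have, for a given query track, a ranking of the other tracks by acoustic similarity and a set of tracks treated as relevant according to the tag annotations. How that relevant set is defined (for example, the nearest neighbours in tag space) is left open here, and the names and data are hypothetical.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the first k ranked items that appear in the relevant set."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked[:k]
    hits = sum(1 for item in top if item in relevant)
    return hits / k


# Hypothetical example for one query track:
# track indices ordered by acoustic similarity, most similar first.
audio_ranking = [3, 1, 4, 2]
# Tracks considered relevant according to the tag annotations.
tag_relevant = {1, 2}

p_at_1 = precision_at_k(audio_ranking, tag_relevant, 1)
p_at_2 = precision_at_k(audio_ranking, tag_relevant, 2)
p_at_4 = precision_at_k(audio_ranking, tag_relevant, 4)
print(p_at_1, p_at_2, p_at_4)
```

Averaging this value over all query tracks gives a single number for comparing one set of acoustic features against another.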