Difference between revisions of "MIR workshop 2011 day5 lab"
From CCRMA Wiki
Revision as of 16:19, 30 June 2011
MIR Workshop 2011 Day 5 Lab
Douglas Eck, Google
Overview
This lab covers the construction of parts of a music recommender. The focus is on building a similarity matrix from data and querying that matrix based on cosine distance.
Fast programmers should be able to accomplish considerably more.
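As a rough illustration of the core idea (this is not the lab code itself), a similarity matrix can be built by normalizing each track's feature vector to unit length and taking dot products; the closest tracks by cosine distance are then the ones with the highest cosine similarity. The function names and the toy tag matrix below are invented for the example and assume one row per track.

```python
import numpy as np

def cosine_similarity_matrix(features):
    """Return an (n_tracks, n_tracks) matrix of cosine similarities between rows."""
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # keep all-zero rows as zeros instead of dividing by zero
    unit = features / norms
    return unit @ unit.T

def top_hits(similarity, query_index, k=5):
    """Indices of the k tracks most similar to the query, excluding the query itself."""
    order = np.argsort(-similarity[query_index], kind="stable")
    order = order[order != query_index]
    return order[:k].tolist()

# Toy data invented for illustration: 4 tracks x 3 tags (1 = tag applied).
tags = np.array([
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 1],
])
sim = cosine_similarity_matrix(tags)
hits = top_hits(sim, query_index=0, k=2)
print(hits)
```

The same two functions apply unchanged to a matrix of acoustic features; in that case it is usually worth standardizing each feature column first so that no single feature dominates the dot product.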
- The basics (some Python code is available to help).
  - Calculate acoustic features on the CAL500 dataset (students should already have done this).
  - Read in the user tag annotations for the same dataset, provided by UCSD.
  - Build a similarity matrix based on word vectors derived from these annotations.
  - Query the similarity matrix with a track to get the top hits based on cosine distance.
  - Build a second similarity matrix using acoustic features.
  - Query this similarity matrix with a track to get the top hits based on cosine distance.
- Extra (I didn't write code for this, but I can help students find examples).
  - Query the EchoNest for additional acoustic features and compare them to yours.
  - Use the CAL500 user annotations as ground truth and evaluate your audio features (ROC curve or some precision measure).
  - Compare a 2D visualization of acoustic features versus UCSD user annotations.
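For the evaluation item in the extras, one possible precision measure is precision at k: for each query track, take the k nearest tracks under the acoustic similarity and count how many are "relevant" according to the annotations. The sketch below assumes, purely for illustration, that two tracks are relevant to each other when they share at least one tag; the function names and toy arrays are hypothetical and not part of the lab code.

```python
import numpy as np

def cosine_similarity_matrix(features):
    """Cosine similarity between all pairs of rows."""
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty rows
    unit = features / norms
    return unit @ unit.T

def precision_at_k(similarity, relevant, k):
    """Mean fraction of relevant tracks among each query's k nearest neighbours."""
    n_tracks = similarity.shape[0]
    scores = []
    for query in range(n_tracks):
        order = np.argsort(-similarity[query], kind="stable")
        order = order[order != query][:k]  # drop the query itself
        scores.append(relevant[query, order].mean())
    return float(np.mean(scores))

# Toy data invented for illustration: 4 tracks, 2 tags, 2 acoustic features.
tags = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
acoustic = np.array([[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]])

# Assumed ground truth: tracks are relevant if they share at least one tag.
relevant = (tags @ tags.T) > 0

acoustic_sim = cosine_similarity_matrix(acoustic)
p_at_1 = precision_at_k(acoustic_sim, relevant, k=1)
p_at_3 = precision_at_k(acoustic_sim, relevant, k=3)
print(p_at_1, p_at_3)
```

Other definitions of relevance (for example, thresholding the tag-based cosine similarity) fit the same function, since it only needs a boolean track-by-track matrix.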
Lab code is found in /usr/ccrma/courses/mir20110/cal500_new.
A previous version was uploaded at /usr/ccrma/courses/mir2011/cal500, but it used filenames from my University of Montreal lab. I renamed the audio files to match those of the UCSD CAL500 dataset.