MIR workshop 2008 notes
Revision as of 09:44, 12 July 2010
This page is intended to supplement the lecture material found in the class - providing extra tutorials, support, references for further reading, or demonstration code snippets for those interested in a given topic. Please contribute to this growing list of resources. Do you have a great explanation of how a technique works? Found a great Java applet that illustrates a concept? Discovered a great survey of the field for a particular area? Please add it for the benefit of future students. Thanks!
I encourage you to ADD links and sections - but please do not REMOVE headings or items from the page.
Contents
- 1 Timing and Segmentation
- 2 Feature Extraction
- 3 Analysis / Decision Making
- 4 Model / Data Preparation Techniques
- 5 Evaluation Methodology
- 6 Real-world applications
- 7 Getting Involved in the MIR Community
- 8 Research Databases / Collections of Ground truth data and copyright-cleared music
- 9 MIR Software and Toolboxes
- 10 MIR Topic Areas
Timing and Segmentation
Onset Detection
- Papers:
- Code:
Beat Extraction
- Papers:
- Code:
Tempo Extraction
- Papers:
Feature Extraction
Low Level Features
- Temporal features (Zero Crossing, Temporal centroid, Log Attack time, Attack slope), Spectral features (Centroid, Flux, RMS, Rolloff, Flatness, Kurtosis, Brightness), Spectral bands, Log spectrogram
- Chroma bins
- MFCC
- MPEG-7
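As a demonstration snippet for two of the low-level features listed above, here is a minimal sketch of the zero-crossing rate and the spectral centroid of a single frame. It assumes NumPy is available; the function names and the 440 Hz test tone are illustrative choices, not taken from the lecture material, and toolboxes may define these features with slightly different normalisation.

```python
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(frame)
    return float(np.mean(signs[1:] != signs[:-1]))

def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency (in Hz) of one frame."""
    magnitudes = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    total = magnitudes.sum()
    if total == 0:
        return 0.0  # silent frame: centroid is undefined, return 0 by convention
    return float((freqs * magnitudes).sum() / total)

# Hypothetical test signal: one second of a 440 Hz sine sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)

zcr = zero_crossing_rate(tone)
centroid = spectral_centroid(tone, sample_rate)
print(f"zero-crossing rate: {zcr:.3f}, spectral centroid: {centroid:.1f} Hz")
```

A pure tone crosses zero twice per period, so the zero-crossing rate should sit near 2 * 440 / 8000, and the centroid should sit near the tone's frequency. Real audio would normally be split into short overlapping frames first.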
Higher-level features
- Key Estimation
- Chord Estimation
- Genre (genre, artist ID, similarity)
- "Fingerprints"
Visualizing and Sonifying Feature data
Analysis / Decision Making
Classification
- Heuristic Analysis
- Distance measures (Euclidean, Manhattan, etc.)
- k-NN
- SVM / One-class SVM
- Resources:
- The interactive Matlab SVM demo that I demonstrated in Lecture 5 comes from here
- A nice SVM Java applet to demo the concepts
- Andrew Moore's SVM PowerPoint Lecture
- User community of SVM enthusiasts
- A practical guide to SVM classification
- SVM Practical (How to get good results without cheating)
- One-class SVM posting
- Code:
- Resources:
Clustering and probability density models
- Density distance measures (centroid distance, EMD, KL-divergence, etc)
- k-Means
- Clustering Demo
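For readers who want to see k-Means outside the applet, here is a minimal sketch of the standard Lloyd iteration (assign each point to its nearest centroid, then move each centroid to the mean of its points). It assumes NumPy; the function name, parameters, and the six-point toy data set are illustrative assumptions, not part of the course code.

```python
import numpy as np

def k_means(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids with k distinct data points.
    centroids = points[rng.choice(len(points), size=k, replace=False)]

    def assign(centers):
        # Euclidean distance from every point to every centroid.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centroids)
        # Update step; an empty cluster keeps its previous centroid.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, assign(centroids)

# Hypothetical toy data: two well-separated groups of three 2-D points.
points = np.array([[0, 0], [0, 1], [1, 0],
                   [10, 10], [10, 11], [11, 10]], dtype=float)
centroids, labels = k_means(points, k=2)
print(centroids)
print(labels)
```

The result depends on the random initialisation in general, so practical implementations usually run several restarts and keep the solution with the lowest total within-cluster distance.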
Clustering
- GMM
HMM
- High-level introduction to HMM
- “A tutorial on hidden Markov models and selected applications in speech recognition,” Lawrence Rabiner, Proc. IEEE, 77(2), Feb 1989.
- A self-directed introduction / lab for HMMs
- Matlab Introduction to HMM functions
Nested classifier / Anchor-space / template-based systems
- ?
Model / Data Preparation Techniques
Data Preparation
PCA / LDA
Scaling data
Model organization
- concept, design, data set construction and organization
Evaluation Methodology
Feature selection
Cross Validation
Information Retrieval metrics (precision, recall, F-Measure)
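As a small worked example of the metrics named above, the sketch below computes precision, recall, and the balanced F-measure for one query from a set of retrieved items and a set of relevant (ground-truth) items. The function name and the track IDs are made up for illustration.

```python
def precision_recall_f(retrieved, relevant):
    """Return (precision, recall, F-measure) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant items that were actually returned
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Hypothetical query: 4 tracks returned, 5 tracks truly relevant, 3 in common.
retrieved = ["t1", "t2", "t3", "t9"]
relevant = ["t1", "t2", "t3", "t4", "t5"]
p, r, f = precision_recall_f(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f} F={f:.3f}")
```

Here precision is 3/4 and recall is 3/5; the F-measure is their harmonic mean, which penalises a system that does well on only one of the two.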
Real-world applications
Audio Segmentation
Automatic Audio Segmentation: Segment Boundary and Structure Detection in Popular Music
Audio Fingerprinting
- P. Cano, E. Batlle, T. Kalker, and J. Haitsma, “A review of algorithms for audio fingerprinting,” in IEEE International Workshop on Multimedia Signal Processing (MMSP), pp. 169 – 173, December 2002. 4 pages.
- "On the comparison of audio fingerprints for extracting quality parameters of compressed audio"
- Finding Structure in Audio for Music Information Retrieval
- "Computer Vision for Music Identification" Y. Ke, D. Hoiem, and R. Sukthankar
The Last.fm fingerprinter uses this approach; the code can be checked out from: svn://svn.audioscrobbler.net/recommendation/MusicID/lastfm_fplib
Drum Transcription
Audio Similarity
Music Recommendation / Playlisting
Getting Involved in the MIR Community
Research Databases / Collections of Ground truth data and copyright-cleared music
General MIR Datasets
Download links for the ISMIR 2004 genre classification contest training set:
- http://ismir2004.ismir.net/genre_contest/index.htm
- http://www.iua.upf.es/mtg/ismir2004/contest/Training_Tracks1.tar.gz
- http://www.iua.upf.es/mtg/ismir2004/contest/Training_Tracks2.tar.gz
Tags:
More:
- OLPC Sound Sample Archive (8.5 GB) [1]
- RWC Music Database (n DVDs) [available in Stanford Music library]
- RWC - Sound Instruments Table of Contents
- http://staff.aist.go.jp/m.goto/RWC-MDB/rwc-mdb-i.html
- Univ. of Iowa Music Instrument Samples
From Georg Holzmann: LIST OF PUBLICLY AVAILABLE MIR DATASETS
Downloadable Datasets:
- University of Iowa musical instruments samples: http://theremin.music.uiowa.edu/MIS.html (instrument samples recorded by the University of Iowa)
- ISMIR2004 Audio Description Contest Dataset: http://ismir2004.ismir.net/ISMIR_Contest.html (datasets for Genre Classification/Artist Identification, Melody Extraction, Tempo Induction, Rhythm Classification)
- Graham's Melody Extraction Dataset: http://www.ee.columbia.edu/~graham/mirex_melody/ and http://labrosa.ee.columbia.edu/projects/melody/ (audio files with corresponding pitch data)
- MIREX06 Audio Tempo Extraction and Beat Tracking Datasets: http://www.music-ir.org/mirex/2006/index.php/Audio_Tempo_Extraction#Practice_Data
- QBSH: A Corpus for Designing QBSH (Query by Singing/Humming) Systems: http://neural.cs.nthu.edu.tw/jang2/dataSet/childSong4public/QBSH-corpus/
- Uni Dortmund Music Audio Benchmark Data Set: http://www-ai.cs.uni-dortmund.de/audio.html (songs from different genres and with tags, from garageband.com)
- Latin Music Database: http://www.ppgia.pucpr.br/~silla/lmd/ (3,160 music pieces in MP3 format classified in 10 different musical genres; only features online)
Orderable Datasets:
- RWC Music Database: http://staff.aist.go.jp/m.goto/RWC-MDB/ (many CDs; datasets for Pop Music & Royalty-Free Music, Classical Music, Jazz Music, Music Genre, Musical Instrument Sound)
- Additional: AIST RWC Annotations: http://staff.aist.go.jp/m.goto/RWC-MDB/AIST-Annotation/ (additional annotations to the RWC database: beat, melody, ...)
- McGill University Master Samples: http://www.music.mcgill.ca/resources/mums/html/ (3 DVDs with instrument samples)
- USPOP2002 Pop Music data set: http://labrosa.ee.columbia.edu/projects/musicsim/uspop2002.html (3 DVDs; MFCC features from 706 albums and 8764 tracks (400 artists) with style tags)
- ENST-Drums: http://perso.telecom-paristech.fr/~gillet/ENST-drums/ (an extensive audio-visual database for drum signals processing)
Free Online Music:
- magnatune.com Creative Commons music: http://magnatune.com/info/press/coverage/ccblog
- http://www.garageband.com/ (public domain recordings)
- http://epitonic.com/ ("high quality free and legal mp3 music")
- http://www.jamendo.com/ (Creative Commons licensed music)
- http://musicbrainz.org/ (get music metadata)
- http://www.freesound.org/ (collaborative database of Creative Commons licensed sounds, not focused on songs)
Webservices:
- Networked Environment for Music Analysis: http://nema.lis.uiuc.edu/ (a webservices system for submitting code and running it against virtual collections; full use in 2010)
- MIREX DIY Framework: http://www.music-ir.org/mirexdiy/ and http://www.dlib.org/dlib/december06/downie/12downie.html (useable ?)
MIR Software and Toolboxes
Incomplete but growing list (courtesy of Joern Loviscach):
- MARSYAS
- jAudio
- ChucK
- Sonic Visualiser / Sonic Annotator
- CLAM
- Music-to-Knowledge (M2K)
- MIRtoolbox
- MA toolbox
- Psysound
- Praat
- IPEM
- EchoNest
- libxtract
- MuBu
- Soundspotter
- timbreID
- openSMILE
- MPEG-7 XM
- MPEG-7 Audio Encoder
- MPEG-7 Audio Analyzer
- Sphinx 4 (Java-based open-source speech recognizer): http://cmusphinx.sourceforge.net/sphinx4/#capabilities
MIR Topic Areas
From Simon Dixon, Music-IR list, Dec 2008.
MIR Systems
- Content-based Querying
- Classification (genre/style/mood)
- Recommendation / playlist generation
- Fingerprinting / DRM
- Score following / Audio alignment
- Transcription / Annotation
- Tempo induction / Beat tracking
- Summarisation
- Streaming
- Text/web mining
- Optical music recognition
- Database systems / indexing / query languages
Human issues
- user interfaces, user models
- emotion, aesthetics
- perception, cognition
- social issues
- legal and ethical issues
- business issues
- methodological and philosophical issues
Data and metadata
- audio
- MIDI
- score
- text/web
- KR schemes, standards and protocols
- libraries and collections
- test sets and evaluation
Musical knowledge
- Melody and motives
- Harmony, chords and tonality
- Rhythm, beat, tempo and form
- Timbre, instrumentation and voice
- Genre, style and mood
- Performance
- Composition
- Ethnomusicology