MIR workshop 2008 notes
Revision as of 09:40, 12 July 2010
This page is intended to supplement the lecture material found in the class - providing extra tutorials, support, references for further reading, or demonstration code snippets for those interested in a given topic. Please contribute to this growing list of resources. Do you have a great explanation of how a technique works? Found a great Java applet that illustrates a concept? Discovered a great survey of the field for a particular area? Please add it for the benefit of future students. Thanks!
I encourage you to ADD links and sections - but please do not REMOVE headings or items from the page.
Contents
- 1 Timing and Segmentation
- 2 Feature Extraction
- 3 Analysis / Decision Making
- 4 Model / Data Preparation Techniques
- 5 Evaluation Methodology
- 6 Real-world applications
- 7 Getting Involved in the MIR Community
- 8 Research Databases / Collections of Ground truth data and copyright-cleared music
- 9 MIR Software and Toolboxes
- 10 MIR Topic Areas
Timing and Segmentation
Onset Detection
- Papers:
- Code:
Beat Extraction
- Papers:
- Code:
Tempo Extraction
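- Papers:
- Tempo and beat analysis of acoustic musical signals: http://www.iro.umontreal.ca/~pift6080/documents/papers/scheirer_jasa.pdf
- An Audio-based Real-time Beat Tracking System for Music With or Without Drum-sounds: http://staff.aist.go.jp/m.goto/PAPER/JNMR2001goto.pdf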
Feature Extraction
Low Level Features
Temporal features (Zero Crossing, Temporal centroid, Log Attack time, Attack slope), Spectral features (Centroid, Flux, RMS, Rolloff, Flatness, Kurtosis, Brightness), Spectral bands, Log spectrogram
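As a rough illustration of two of the low-level features named above, the sketch below computes a zero-crossing rate and a spectral centroid for a single frame. It is a minimal example under stated assumptions, not the method used in the lectures: the function names, the naive DFT, the 8 kHz sample rate, and the 1 kHz test tone are all chosen here for illustration only.

```python
import cmath
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency (Hz) of one frame, via a naive DFT."""
    n = len(frame)
    half = n // 2 + 1  # keep only the non-negative frequency bins
    mags = []
    for k in range(half):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(frame)
        )
        mags.append(abs(acc))
    total = sum(mags)
    if total == 0:
        return 0.0  # silent frame: centroid is undefined, report 0 by convention
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


# Hypothetical input: a 1 kHz sine sampled at 8 kHz, 64 samples long.
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(64)]
zcr = zero_crossing_rate(frame)
centroid = spectral_centroid(frame, SAMPLE_RATE)
print(f"ZCR = {zcr:.3f}, spectral centroid = {centroid:.1f} Hz")
```

For a pure tone the centroid should sit close to the tone's frequency, and the zero-crossing rate close to twice the frequency divided by the sample rate. In practice an FFT library would replace the naive DFT, which is only acceptable here because the frame is short.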
Chroma bins
MFCC
Auditory Toolbox (code and docs)
MPEG-7
Higher-level features
Key Estimation
Chord Estimation
Genre (genre, artist ID, similarity)
"Fingerprints"
Visualizing and Sonifying Feature data
Matt Hoffman's feature sonification work
Analysis / Decision Making
Classification
Heuristic Analysis
Distance measures (Euclidean, Manhattan, etc.)
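The two distance measures named in this heading can be sketched in a few lines. This is an illustrative example only; the function names and the sample vectors are assumptions made here, not taken from the lecture material.

```python
import math


def euclidean(a, b):
    """Straight-line (L2) distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """City-block (L1) distance: sum of absolute per-dimension differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical two-dimensional feature vectors.
u = (0.0, 0.0)
v = (3.0, 4.0)
d_euclidean = euclidean(u, v)
d_manhattan = manhattan(u, v)
print(d_euclidean, d_manhattan)
```

Both measures are sensitive to the scale of each feature dimension, so features with large numeric ranges will dominate unless the data is scaled first.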
k-NN
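A minimal k-NN classifier can be sketched as below: find the k training vectors closest to the query and take a majority vote over their labels. The function name, the choice of Euclidean distance, and the toy "speech"/"music" training data are all assumptions for illustration (requires Python 3.8 or later for `math.dist`).

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training set with made-up values.
train = [
    ((0.10, 0.20), "speech"),
    ((0.20, 0.10), "speech"),
    ((0.15, 0.15), "speech"),
    ((0.90, 0.80), "music"),
    ((0.80, 0.90), "music"),
    ((0.85, 0.85), "music"),
]
label_a = knn_predict(train, (0.2, 0.2))
label_b = knn_predict(train, (0.8, 0.8))
print(label_a, label_b)
```

An odd k avoids ties in a two-class problem; with more classes, a tie-breaking rule would have to be chosen explicitly.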
SVM / One-class SVM
Resources
- The interactive Matlab SVM demo that I demonstrated in Lecture 5 comes from here
- A nice SVM Java applet to demo the concepts
- Andrew Moore's SVM Powerpoint Lecture
- User community of SVM enthusiasts
- A practical guide to SVM classification
- SVM Practical (How to get good results without cheating)
- One-class SVM posting
Code
Clustering and probability density models
Density distance measures (centroid distance, EMD, KL-divergence, etc)
k-Means
Clustering
GMM
- Simple review of probability, with an introduction to Bayes' rule
- Good description of conditional probabilities
- EM explained
- Expectation-Maximization Java Applet
- Lab featuring real-world GMM examples for singing detection
- Dan Ellis' Speech and Audio Processing Lectures
HMM
- High-level introduction to HMM
- “A tutorial on hidden Markov models and selected applications in speech recognition,” Lawrence Rabiner, Proc. IEEE, 77(2), Feb 1989.
- A self-directed introduction / lab for HMMs
- Matlab Introduction to HMM functions
Nested classifier / Anchor-space / template-based systems
Model / Data Preparation Techniques
Data Preparation
PCA / LDA
Scaling data
Model organization
- concept, design, data set construction and organization
Evaluation Methodology
Feature selection
Cross Validation
Information Retrieval metrics (precision, recall, F-Measure)
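The three metrics named in this heading can be computed from a set of retrieved items and a set of relevant (ground-truth) items. The sketch below is illustrative only; the function name and the example item IDs are assumptions, and the F-measure shown is the balanced F1 variant.

```python
def precision_recall_f(retrieved, relevant):
    """Return (precision, recall, F1) for one query.

    precision = hits / number retrieved
    recall    = hits / number relevant
    F1        = harmonic mean of precision and recall
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if hits == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical query result: 4 items returned, 3 items truly relevant.
retrieved = ["a", "b", "c", "d"]
relevant = ["a", "c", "e"]
precision, recall, f_measure = precision_recall_f(retrieved, relevant)
print(f"P = {precision:.3f}, R = {recall:.3f}, F = {f_measure:.3f}")
```

Here two of the four retrieved items are relevant, and two of the three relevant items were found, so precision and recall differ; the F-measure summarizes both in one number.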
Real-world applications
Audio Segmentation
Automatic Audio Segmentation: Segment Boundary and Structure Detection in Popular Music
Audio Fingerprinting
- P. Cano, E. Batlle, T. Kalker, and J. Haitsma, “A review of algorithms for audio fingerprinting,” in IEEE International Workshop on Multimedia Signal Processing (MMSP), pp. 169 – 173, December 2002. 4 pages.
- "On the comparison of audio fingerprints for extracting quality parameters of compressed audio"
- Finding Structure in Audio for Music Information Retrieval
- "Computer Vision for Music Identification" Y. Ke, D. Hoiem, and R. Sukthankar
The Last.fm fingerprinter uses this approach; the code can be checked out from: svn://svn.audioscrobbler.net/recommendation/MusicID/lastfm_fplib
Drum Transcription
Audio Similarity
Music Recommendation / Playlisting
Getting Involved in the MIR Community
Research Databases / Collections of Ground truth data and copyright-cleared music
General MIR Datasets
Download links for the ISMIR 2004 genre classification contest training set:
- http://ismir2004.ismir.net/genre_contest/index.htm
- http://www.iua.upf.es/mtg/ismir2004/contest/Training_Tracks1.tar.gz
- http://www.iua.upf.es/mtg/ismir2004/contest/Training_Tracks2.tar.gz
Tags:
More:
- OLPC Sound Sample Archive (8.5 GB) [1]
- RWC Music Database (n DVDs) [available in Stanford Music library]
- RWC - Sound Instruments Table of Contents
- http://staff.aist.go.jp/m.goto/RWC-MDB/rwc-mdb-i.html
- Univ. of Iowa Musical Instrument Samples
From Georg Holzmann: LIST OF PUBLICLY AVAILABLE MIR DATASETS
Downloadable Datasets:
- University of Iowa musical instruments samples: http://theremin.music.uiowa.edu/MIS.html
  Instrument samples recorded by the University of Iowa
- ISMIR2004 Audio Description Contest Dataset: http://ismir2004.ismir.net/ISMIR_Contest.html
  Datasets for:
  - Genre Classification/Artist Identification
  - Melody Extraction
  - Tempo Induction
  - Rhythm Classification
- Graham's Melody Extraction Dataset: http://www.ee.columbia.edu/~graham/mirex_melody/ http://labrosa.ee.columbia.edu/projects/melody/
  Audio files with corresponding pitch data
- MIREX06 Audio Tempo Extraction and Beat Tracking Datasets: http://www.music-ir.org/mirex/2006/index.php/Audio_Tempo_Extraction#Practice_Data
- QBSH: A Corpus for Designing QBSH (Query by Singing/Humming) Systems: http://neural.cs.nthu.edu.tw/jang2/dataSet/childSong4public/QBSH-corpus/
- Uni Dortmund Music Audio Benchmark Data Set: http://www-ai.cs.uni-dortmund.de/audio.html
  Songs from different genres and with tags (from garageband.com)
- Latin Music Database: http://www.ppgia.pucpr.br/~silla/lmd/
  3,160 music pieces in MP3 format classified in 10 different musical genres (only features online)
Orderable Datasets:
- RWC Music Database: http://staff.aist.go.jp/m.goto/RWC-MDB/ (many CDs)
  Datasets for:
  - Pop Music & Royalty-Free Music
  - Classical Music
  - Jazz Music
  - Music Genre
  - Musical Instrument Sound
  Additional: AIST RWC Annotations http://staff.aist.go.jp/m.goto/RWC-MDB/AIST-Annotation/
  Additional annotations to the RWC database (beat, melody, ...)
- McGill University Master Samples: http://www.music.mcgill.ca/resources/mums/html/
  3 DVDs with instrument samples
- USPOP2002 Pop Music data set: http://labrosa.ee.columbia.edu/projects/musicsim/uspop2002.html (3 DVDs)
  MFCC features from 706 albums and 8764 tracks (400 artists) with style tags
- ENST-Drums: http://perso.telecom-paristech.fr/~gillet/ENST-drums/
  An extensive audio-visual database for drum signals processing
Free Online Music:
- magnatune.com creative commons music: http://magnatune.com/info/press/coverage/ccblog
- http://www.garageband.com/ Public domain recordings
- http://epitonic.com/ "high quality free and legal mp3 music"
- http://www.jamendo.com/ Creative commons licensed music
- http://musicbrainz.org/ Get music metadata
- http://www.freesound.org/ Collaborative database of Creative Commons licensed sounds (not focused on songs)
Webservices:
- Networked Environment for Music Analysis: http://nema.lis.uiuc.edu/
  A webservices system for submitting code, running it against virtual collections (full use in 2010)
- MIREX DIY Framework: http://www.music-ir.org/mirexdiy/ http://www.dlib.org/dlib/december06/downie/12downie.html (useable ?)
MIR Software and Toolboxes
Incomplete but growing list (courtesy of Joern Loviscach):
- MARSYAS
- jAudio
- ChucK
- Sonic Visualiser / Sonic Annotator
- CLAM
- Music-to-Knowledge (M2K)
- MIRtoolbox
- MA toolbox
- Psysound
- Praat
- IPEM
- EchoNest
- libxtract
- MuBu
- Soundspotter
- timbreID
- openSMILE
- MPEG-7 XM
- MPEG-7 Audio Encoder
- MPEG-7 Audio Analyzer
- Sphinx 4 - Java-based open-source speech recognizer: http://cmusphinx.sourceforge.net/sphinx4/#capabilities
MIR Topic Areas
From Simon Dixon, Music-IR list, Dec 2008.
MIR Systems
- Content-based Querying
- Classification (genre/style/mood)
- Recommendation / playlist generation
- Fingerprinting / DRM
- Score following / Audio alignment
- Transcription / Annotation
- Tempo induction / Beat tracking
- Summarisation
- Streaming
- Text/web mining
- Optical music recognition
- Database systems / indexing / query languages
Human issues
- user interfaces, user models
- emotion, aesthetics
- perception, cognition
- social issues
- legal and ethical issues
- business issues
- methodological and philosophical issues
Data and metadata
- audio
- MIDI
- score
- text/web
- KR schemes, standards and protocols
- libraries and collections
- test sets and evaluation
Musical knowledge
- Melody and motives
- Harmony, chords and tonality
- Rhythm, beat, tempo and form
- Timbre, instrumentation and voice
- Genre, style and mood
- Performance
- Composition
- Ethnomusicology