Difference between revisions of "Sound Explorer"

From CCRMA Wiki

Revision as of 18:25, 10 December 2009

Sound Explorer

Idea / Premise

Sound Explorer is an environment for exploring and shaping sounds in real time.


Getting It

Sound Explorer is available as a Windows Visual Studio project. You can get it here.

Note: I've tried to include as many of the required libraries as possible in this zip file. However, you might still need Microsoft's DirectX SDK.


Motivation

The program was created for educational purposes. When learning about new audio and DSP concepts, it's sometimes hard to gain an intuition for how the various theoretical ideas actually sound. It therefore seems useful to have a tool that lets a user shape waveforms and both see and hear the results. Also, from my experience using sndpeek and assignment 3 of this class, there's a strong motivation to make the tool real-time, since that helps the user connect the type of manipulation with the change to the sound.


Product Description

Sound Explorer allows the user to interactively shape waveforms. The results are displayed using a waterfall plot, and the audio is played back in real time.

The main way the user shapes the sound is through envelopes. Envelopes can be applied in the time domain or the frequency domain. In the frequency domain there is a choice between shaping the spectral envelope or the phase envelope. There is also a way to choose which subset (in time or frequency) to apply the envelope to.
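
To make the idea of applying an envelope to a subset concrete, here is a minimal sketch in C++. It is not taken from the Sound Explorer source: the function name applyEnvelope, its parameters, and the choice of linear interpolation between envelope points are all assumptions made for illustration. The drawn envelope is treated as a short list of gain values that is stretched to cover the selected range; samples outside the range are left unchanged.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the Sound Explorer source): multiply the
// values in [begin, end) by an envelope that is stretched, using linear
// interpolation, to cover exactly that range. Values outside the range
// are left untouched.
void applyEnvelope(std::vector<float>& values,
                   const std::vector<float>& envelope,
                   std::size_t begin, std::size_t end) {
    assert(begin <= end && end <= values.size());
    if (envelope.empty() || begin == end) return;

    const std::size_t n = end - begin;
    const std::size_t last = envelope.size() - 1;
    // Envelope points advanced per value; 0 when the range has one value.
    const float scale =
        n > 1 ? static_cast<float>(last) / static_cast<float>(n - 1) : 0.0f;

    for (std::size_t i = 0; i < n; ++i) {
        const float pos = static_cast<float>(i) * scale;
        const std::size_t lo = std::min(static_cast<std::size_t>(pos), last);
        const std::size_t hi = std::min(lo + 1, last);
        const float frac = pos - static_cast<float>(lo);
        // Linear interpolation between the two nearest envelope points.
        const float gain = envelope[lo] * (1.0f - frac) + envelope[hi] * frac;
        values[begin + i] *= gain;
    }
}
```

In time mode the values would be audio samples and the range a span of time. The same helper could, under the same assumptions, be applied to the magnitudes of FFT bins for a spectral envelope restricted to a band of frequencies.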


Interface Design

The interface is made out of three parts: graphical display, keyboard input, and mouse input.
  • Graphical display:
    The screen is made of these GUI elements:
      • Waterfall display: a waterfall plot of the audio currently being played.
      • Envelope window: this is where the user manipulates the waveform. The envelope may be in time, frequency, or phase mode. The user draws the envelope with the mouse. Drawing must proceed from left to right.
        The envelope mode (time, frequency, or phase) can be changed using a right-click menu or the keyboard.
      • Range window: in this part of the window, the user highlights which portion of the audio to apply the envelope to. The range's mode changes automatically to match the envelope mode. For example, when the envelope is in frequency mode, the range is in frequency mode as well, and the range specifies which subset of frequencies to apply the spectral envelope to.
        The purpose of the range element is to allow the user to 'zoom in' on the parts of the audio they are interested in and manipulate those specific parts.
      • Buttons: there are currently two buttons:
          • 'Load File' button: loads a waveform from a WAV file.
          • 'Random' button: randomizes the envelope. Applied to the current envelope mode.
  • Mouse input: used to draw envelopes, select ranges, and push buttons.
  • Keyboard input: used to control the envelope mode: 'f' for frequency, 'p' for phase, and 't' for time.
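
The keyboard mapping described above can be sketched as a small function. This is an illustrative sketch only: the EnvelopeMode enum and the modeForKey function are hypothetical names, not identifiers from the Sound Explorer source. Keys that are not bound leave the current mode unchanged.

```cpp
#include <cassert>

// Hypothetical type: the three envelope modes named in the text.
enum class EnvelopeMode { Time, Frequency, Phase };

// Hypothetical helper: map a key press to an envelope mode.
// 't' selects time, 'f' selects frequency, 'p' selects phase;
// any other key keeps the current mode.
EnvelopeMode modeForKey(unsigned char key, EnvelopeMode current) {
    switch (key) {
        case 't': return EnvelopeMode::Time;
        case 'f': return EnvelopeMode::Frequency;
        case 'p': return EnvelopeMode::Phase;
        default:  return current;
    }
}
```

A keyboard callback would call modeForKey with the pressed key and store the result. Since the range's mode follows the envelope mode, the range window could read the same stored value rather than keep a mode of its own.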


Software Design

  • The project uses polymorphism to simplify the management of the various parts of the GUI. There is a base class called 'UI_Element' from which all the classes that form the GUI are descended. This allows the main display function to be very simple. That function's core looks like this:
 // Draw *everything* ----------------
 for(int i=0; i<ELEM_LAST; i++) {
   if(elements[i]->isActive()) {
     elements[i]->draw();
   }
 }
Once the project was restructured with this hierarchy, adding new GUI elements became easier.
  • The other important class is the AudioEngine. This is the piece responsible for all audio processing, for example: FFTs, overlap-add, application of envelopes to the audio, and output to RtAudio. Containing the audio processing in a separate class ensures that the GUI elements need to know nothing about audio samples.
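
For readers who want to see the shape of such a hierarchy, here is a minimal sketch. Only the name UI_Element and the calls isActive() and draw() come from the description above; the setActive() method, the NamedElement subclass, the drawAll function, and the use of a log in place of real rendering are assumptions made so the example is self-contained.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Base class for everything shown on screen. The draw loop only needs
// this interface, not the concrete type of each element.
class UI_Element {
public:
    virtual ~UI_Element() = default;
    virtual void draw() = 0;
    bool isActive() const { return active_; }
    void setActive(bool active) { active_ = active; }  // hypothetical setter
private:
    bool active_ = true;
};

// Hypothetical stand-in for a real element such as the waterfall display:
// instead of rendering, it records its name in a shared log.
class NamedElement : public UI_Element {
public:
    NamedElement(std::string name, std::vector<std::string>& log)
        : name_(std::move(name)), log_(log) {}
    void draw() override { log_.push_back(name_); }
private:
    std::string name_;
    std::vector<std::string>& log_;
};

// Same idea as the display loop above: draw every active element.
void drawAll(const std::vector<std::unique_ptr<UI_Element>>& elements) {
    for (const auto& element : elements) {
        if (element->isActive()) {
            element->draw();
        }
    }
}
```

Adding a new kind of GUI element then means writing one more subclass with its own draw() and appending an instance to the container; the loop itself does not change.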


Author

Roy Fejgin


Milestones

Planned:

  • 11/16/09:
      • Waterfall window; "Time domain edit" window; audio rendering.
  • 11/23/09:
      • Apply-to window
  • 12/07/09:
      • Frequency domain processing:
      • "Frequency-domain edit" window
      • More sophisticated DSP: overlap add
      • Add harmonic series

Actual:

  • 11/16/09:
      • Waterfall window; audio rendering; file loading; time domain envelope manipulation (without GUI).
  • 11/30/09:
      • Envelope drawing (GUI); overlap-add resynthesis; frequency and phase envelope.
  • 12/09/09:
      • Major code redesign
      • Range selection; buttons: randomize envelope, load file.


Ideas for the Future

The real-time visualization via the waterfall display and the real-time audio would be a useful basis for demonstrating many additional audio concepts, some of which were intended to be part of this project but which I did not get to. The first enhancements I would make are:

  • interactive synthesis: add harmonic series or noise. Control non-harmonicity level of the harmonic series.
  • more envelope-range combinations: e.g., apply a time-domain envelope to a frequency range.
  • 'finalize' button: apply the current envelopes to the waveforms so the user can generate new ones on top of them.


Credits

The program uses:

  • FFT routines from ChucK.
Authors: Ge Wang and Perry R. Cook.
  • Code from Tapestrea for the 'open file' dialog.
Authors: Ananya Misra, Perry R. Cook, and Ge Wang.
  • STK for reading the WAV file.
Authors: Perry R. Cook and Gary P. Scavone.
  • RtAudio for real-time audio.
Author: Gary P. Scavone.