Difference between revisions of "Sound Explorer"

From CCRMA Wiki
 
(17 intermediate revisions by the same user not shown)
Line 3: Line 3:
 
== Idea / Premise ==
 
Sound Explorer is an environment for exploring and shaping sounds in real time.
 
 +
 +
 +
== Getting It ==
 +
Sound Explorer is available as a Windows Visual Studio project. You can get it [http://ccrma.stanford.edu/~rfejgin/256a/finalproj/snd-explorer.zip here].
 +
 +
Note: I've tried to include as many of the required libraries as possible in this zip file. However you might still need Microsoft's [http://www.microsoft.com/downloads/details.aspx?FamilyID=b66e14b8-8505-4b17-bf80-edb2df5abad4&displaylang=en#dx DirectX SDK].
 +
 +
== Screenshot ==
 +
[[File:Snd-exp.jpg|800px]]
 +
  
 
== Motivation ==
 
As I looked at the waterfall plots generated by sndpeek and previous assignments in this class, it was fascinating to see so much information about the audio being played in one glance. Correlating the audio with the visual display gave me insight into time and frequency domain properties of the audio. The purpose of the project is to allow the user to further explore properties of audio by allowing them to manipulate waveforms and listen to the results in real time.
+
The program was created for educational purposes. When learning about new audio and DSP concepts, it's sometimes hard to gain an intuition about how the various theoretical ideas actually sound. It seems therefore that it would be useful to have a tool that lets a user shape waveforms and see and hear the results. Also, from my experience using sndpeek and assignment 3 of this class, there's a strong motivation to make the tool real-time since that helps the user in making the connection between the type of manipulation and the change to the sound.
  
== Product Description ==
 
Sound Explorer will allow the user to interactively construct and shape sets of waveforms. The results will be displayed on the waterfall display and the audio played back in real time. The ways in which the waveform can be shaped are:
 
* Frequency domain:
 
:* Generate a harmonic series starting at a given frequency
 
::* Control the amount of non-harmonicity (i.e. how much the partials deviate from multiples of the base frequency).
 
:* Generate white noise
 
:* Draw and apply spectral envelope
 
* Time domain:
 
:* Draw and apply a time-domain envelope
 
  
In addition, for most of the above shaping methods, there will be a way to control which part of the waveform to apply them to. For example, it will be possible to apply a time-domain envelope to a subset of the spectrum.
+
== Project Description ==
 +
Sound Explorer allows the user to interactively shape waveforms. The results are displayed using a waterfall plot and the audio is played back in real time.
  
 +
The main way the user shapes the sound is through envelopes. Envelopes can be applied in the time domain or frequency domain. In the frequency domain there is a choice between shaping the spectral envelope or the phase envelope. There is also a way to choose which subset (in time or frequency) to apply the envelope to.
  
== Design ==
+
 
* Interface:
+
== Interface Design ==
:The interface is made out of three elements: graphical display, keyboard input, and mouse input.
+
:The interface is made out of three parts: graphical display, keyboard input, and mouse input.
:* Graphical display
+
:* Graphical display:
:: The screen is divided into three parts
+
:: The screen is made of these GUI elements:
 
::* Waterfall display: a waterfall plot of the audio currently being played
 
::* Edit window: this is where the user manipulates the waveform. At any time, this window is either in "additive mode" or "envelope mode". Each of those modes can be in the frequency or time domain.
+
::* Envelope window: this is where the user manipulates the waveform. The envelope may be in time, frequency or phase mode. The user draws the envelope with the mouse. Drawing must be from the left to the right.
::* Apply-to window: in this window, the user highlights which portion of the audio to apply the edit to. The window can be in the time or frequency domain.
+
:::The envelope mode (time, frequency or phase) can be changed using a right-click menu, or using the keyboard.
:* Mouse input: is used to draw envelopes and select ranges (in the apply-to window).
+
::* Range window: in this part of the window, the user highlights which portion of the audio to apply the envelope to. The range's mode changes automatically to match the envelope mode. For example, when the envelope is in frequency mode, the range is in frequency mode as well, and the range specifies which subset of frequencies to apply the spectral envelope to.
:* Keyboard input: used to control modes and various parameters.
+
:::The purpose of the range element is to allow the user to 'zoom in' on parts of the audio they are interested in and manipulate those specific parts.
 +
::* Buttons: there are currently two buttons:
 +
:::* 'Load File' button: loads a waveform from a WAV file.
 +
:::* 'Random' button: randomizes the envelope. Applied to the current envelope mode.
 +
:* Mouse input: is used to draw envelopes, select ranges, and push buttons.
 +
:* Keyboard input: used to control the envelope mode: 'f' for frequency, 'p' for phase, and 't' for time.
  
* Software
 
:* The program will use OpenGL for graphics, RtAudio for audio, and FFT routines from ChucK.
 
:* I will attempt to construct the program using the model / view / controller design pattern. The model, for example, will contain the current (and next) set of waveforms, the current envelope values, and the range and domain(s) to which the envelope(s) is applied.
 
* Real Time interaction
 
**  The end goal is to have the user's interactions reflected in audio and graphics immediately. Initially, however, there may be two steps involved: 1) edit the waveform 2) apply the changes and hear/see them.
 
  
== Testing ==
+
== Software Design ==
The software will be tested by letting a user try it out and evaluate the:
+
:* The project uses polymorphism to simplify the management of the various parts of the GUI. There is a base class called 'UI_Element' from which all the classes that form the GUI are descended. This allows the main display function to be very simple. That function's core looks like this:
* flexibility / expressiveness
+
  // Draw *everything* ----------------
* sound quality
+
  for(int i=0; i<ELEM_LAST; i++) {
* sound-annoyingness level
+
    if(elements[i]->isActive()) {
 +
      elements[i]->draw();
 +
    }
 +
  }
 +
:: Once the project was restructured with this hierarchy, adding new GUI elements became easier.
  
== Team ==
+
:* The other important class is the AudioEngine. This is the piece responsible for all audio processing, for example: FFTs, overlap-add, application of envelopes to the audio, and output to RtAudio. Containing the audio processing in a separate class ensures that the GUI elements need to know nothing about audio samples.
 +
 
 +
 
 +
== Author ==
 
Roy Fejgin
 
 +
  
 
== Milestones ==
 
 +
''' Planned:'''
 
* 11/16/09:
 
 
:* Waterfall window; "Time domain edit" window; audio rendering.
 
Line 56: Line 68:
 
::* More sophisticated DSP: overlap add
 
 
::* Add harmonic series
 
 +
 +
''' Actual:'''
 +
* 11/16/09:
 +
:* Waterfall window; audio rendering; file loading; time domain envelope manipulation (without GUI).
 +
* 11/30/09:
 +
:* Envelope drawing (GUI); overlap-add resynthesis; frequency and phase envelope.
 +
* 12/09/09:
 +
:* Major code redesign
 +
:* Range selection; buttons: randomize envelope, load file.
 +
 +
 +
== Ideas for the Future ==
 +
The real-time visualization via the waterfall display and the real-time audio would be a useful basis for demonstrating many additional audio concepts, some of which were intended to be part of this project but which I did not get to. The first enhancements I would make are:
 +
:* interactive synthesis: add harmonic series or noise. Control non-harmonicity level of the harmonic series.
 +
:* more envelope-range combinations: e.g. apply a time-domain envelope to a '''frequency''' range.
 +
:* 'finalize' button: apply the current envelopes to the waveforms so the user can generate new ones on top of them.
 +
 +
 +
== Credits ==
 +
The program uses:
 +
:* FFT routines from ChucK.
 +
:::Authors: Ge Wang and Perry R. Cook.
 +
:* Code from Tapestrea for the 'open file' dialog.
 +
:::Authors: Ananya Misra, Perry R. Cook, and Ge Wang.
 +
:* STK for reading the WAV file.
 +
:::Authors: Perry R. Cook and Gary P. Scavone.
 +
:* RtAudio
 +
:::Author: Gary P. Scavone.

Latest revision as of 00:58, 5 April 2010

Sound Explorer

Idea / Premise

Sound Explorer is an environment for exploring and shaping sounds in real time.


Getting It

Sound Explorer is available as a Windows Visual Studio project. You can get it here.

Note: I've tried to include as many of the required libraries as possible in this zip file. However you might still need Microsoft's DirectX SDK.

Screenshot

Snd-exp.jpg


Motivation

The program was created for educational purposes. When learning about new audio and DSP concepts, it's sometimes hard to gain an intuition about how the various theoretical ideas actually sound. It seems therefore that it would be useful to have a tool that lets a user shape waveforms and see and hear the results. Also, from my experience using sndpeek and assignment 3 of this class, there's a strong motivation to make the tool real-time since that helps the user in making the connection between the type of manipulation and the change to the sound.


Project Description

Sound Explorer allows the user to interactively shape waveforms. The results are displayed using a waterfall plot and the audio is played back in real time.

The main way the user shapes the sound is through envelopes. Envelopes can be applied in the time domain or frequency domain. In the frequency domain there is a choice between shaping the spectral envelope or the phase envelope. There is also a way to choose which subset (in time or frequency) to apply the envelope to.
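
As a rough illustration of applying an envelope to a subset of the spectrum, here is a minimal sketch. It is not the project's actual code: the names `Envelope`, `envelopeAt`, and `applySpectralEnvelope` are hypothetical, and it assumes the drawn envelope is stored as evenly spaced gain values that are linearly interpolated across the selected range of FFT bins.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical representation: non-negative gain values at evenly spaced
// points across the selected range.
using Envelope = std::vector<float>;

// Linearly interpolate the envelope at position t in [0, 1].
float envelopeAt(const Envelope& env, float t) {
    if (env.empty()) return 1.0f;            // nothing drawn: unity gain
    if (env.size() == 1 || t <= 0.0f) return env.front();
    if (t >= 1.0f) return env.back();
    float pos = t * static_cast<float>(env.size() - 1);
    std::size_t i = static_cast<std::size_t>(pos);
    if (i >= env.size() - 1) return env.back();  // guard against rounding
    float frac = pos - static_cast<float>(i);
    return env[i] * (1.0f - frac) + env[i + 1] * frac;
}

// Scale the magnitude of bins [first, last) by the envelope. Multiplying a
// complex bin by a non-negative real gain leaves its phase unchanged.
void applySpectralEnvelope(std::vector<std::complex<float>>& bins,
                           std::size_t first, std::size_t last,
                           const Envelope& env) {
    if (last > bins.size()) last = bins.size();
    if (first >= last) return;
    std::size_t span = last - first;
    for (std::size_t k = first; k < last; ++k) {
        float t = span > 1
            ? static_cast<float>(k - first) / static_cast<float>(span - 1)
            : 0.0f;
        bins[k] *= envelopeAt(env, t);
    }
}
```

Bins outside the selected range are left untouched, which is what lets the range act as a 'zoom' on one part of the spectrum. A time-domain envelope could be applied the same way to a range of samples instead of bins.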


Interface Design

The interface is made up of three parts: graphical display, keyboard input, and mouse input.
  • Graphical display: the screen is made up of these GUI elements:
      • Waterfall display: a waterfall plot of the audio currently being played.
      • Envelope window: this is where the user manipulates the waveform. The envelope may be in time, frequency, or phase mode. The user draws the envelope with the mouse, and drawing must go from left to right. The envelope mode can be changed using a right-click menu or the keyboard.
      • Range window: in this part of the window, the user highlights which portion of the audio to apply the envelope to. The range's mode changes automatically to match the envelope mode. For example, when the envelope is in frequency mode, the range is in frequency mode as well, and the range specifies which subset of frequencies to apply the spectral envelope to. The purpose of the range element is to allow the user to 'zoom in' on parts of the audio they are interested in and manipulate those specific parts.
      • Buttons: there are currently two buttons:
          • 'Load File' button: loads a waveform from a WAV file.
          • 'Random' button: randomizes the envelope for the current envelope mode.
  • Mouse input: used to draw envelopes, select ranges, and push buttons.
  • Keyboard input: used to control the envelope mode: 'f' for frequency, 'p' for phase, and 't' for time.
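
The keyboard mapping and the way the range follows the envelope mode could be sketched roughly as follows. The key letters come from the description above; the type and function names (`EnvelopeMode`, `RangeDomain`, `modeForKey`, `rangeDomainFor`) are hypothetical, and treating phase mode as selecting a frequency range is an assumption rather than something stated above.

```cpp
// Hypothetical mode types; only the key letters are taken from the text.
enum class EnvelopeMode { Time, Frequency, Phase };
enum class RangeDomain { Time, Frequency };

// Map a key press to an envelope mode. Unknown keys leave the mode unchanged.
EnvelopeMode modeForKey(unsigned char key, EnvelopeMode current) {
    switch (key) {
        case 'f': return EnvelopeMode::Frequency;
        case 'p': return EnvelopeMode::Phase;
        case 't': return EnvelopeMode::Time;
        default:  return current;
    }
}

// The range follows the envelope mode. Assumption: a phase envelope, like a
// spectral envelope, is applied over a range of frequencies.
RangeDomain rangeDomainFor(EnvelopeMode mode) {
    return mode == EnvelopeMode::Time ? RangeDomain::Time
                                      : RangeDomain::Frequency;
}
```

Deriving the range domain from the envelope mode, rather than storing it separately, means the two cannot get out of sync.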


Software Design

  • The project uses polymorphism to simplify the management of the various parts of the GUI. There is a base class called 'UI_Element' from which all the classes that form the GUI are descended. This allows the main display function to be very simple. That function's core looks like this:
 // Draw *everything* ----------------
 for(int i=0; i<ELEM_LAST; i++) {
   if(elements[i]->isActive()) {
     elements[i]->draw();
   }
 }
Once the project was restructured with this hierarchy, adding new GUI elements became easier.
  • The other important class is the AudioEngine. This is the piece responsible for all audio processing, for example: FFTs, overlap-add, application of envelopes to the audio, and output to RtAudio. Containing the audio processing in a separate class ensures that the GUI elements need to know nothing about audio samples.
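
The GUI hierarchy described above might look roughly like the sketch below. Only the names `UI_Element`, `isActive()`, and `draw()` come from the text; `setActive`, `LoggingElement`, and `drawAll` are hypothetical, and the derived class records its name instead of issuing OpenGL calls so that the example stays self-contained.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Base class for everything the display loop draws.
class UI_Element {
public:
    virtual ~UI_Element() = default;
    virtual void draw() = 0;
    bool isActive() const { return active_; }
    void setActive(bool active) { active_ = active; }  // hypothetical setter
private:
    bool active_ = true;
};

// Hypothetical stand-in for a real element such as the waterfall display:
// it appends its name to a log rather than drawing with OpenGL.
class LoggingElement : public UI_Element {
public:
    LoggingElement(std::string name, std::vector<std::string>& log)
        : name_(std::move(name)), log_(log) {}
    void draw() override { log_.push_back(name_); }
private:
    std::string name_;
    std::vector<std::string>& log_;
};

// Same shape as the display loop quoted above: draw every active element
// without knowing its concrete type.
void drawAll(const std::vector<std::unique_ptr<UI_Element>>& elements) {
    for (const auto& element : elements) {
        if (element->isActive()) {
            element->draw();
        }
    }
}
```

With this structure, adding a new GUI element only requires deriving a class and adding it to the container; the display loop itself does not change.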


Author

Roy Fejgin


Milestones

Planned:

  • 11/16/09:
      • Waterfall window; "Time domain edit" window; audio rendering.
  • 11/23/09:
      • Apply-to window
  • 12/07/09:
      • Frequency domain processing:
          • "Frequency-domain edit" window
          • More sophisticated DSP: overlap add
          • Add harmonic series

Actual:

  • 11/16/09:
      • Waterfall window; audio rendering; file loading; time domain envelope manipulation (without GUI).
  • 11/30/09:
      • Envelope drawing (GUI); overlap-add resynthesis; frequency and phase envelope.
  • 12/09/09:
      • Major code redesign
      • Range selection; buttons: randomize envelope, load file.


Ideas for the Future

The real-time visualization via the waterfall display and the real-time audio would be a useful basis for demonstrating many additional audio concepts, some of which were intended to be part of this project but which I did not get to. The first enhancements I would make are:

  • interactive synthesis: add harmonic series or noise. Control non-harmonicity level of the harmonic series.
  • more envelope-range combinations: e.g. apply a time-domain envelope to a frequency range.
  • 'finalize' button: apply the current envelopes to the waveforms so the user can generate new ones on top of them.
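
For the first idea, one possible way to control non-harmonicity is sketched below. This feature was not implemented, so everything here is an assumption: the function names are hypothetical, the partials follow the stiff-string formula f_k = k · f0 · sqrt(1 + B·k²) (with B = 0 giving an exact harmonic series), and the 1/k amplitude rolloff is arbitrary.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Frequency of partial k (k = 1 is the fundamental). B >= 0 is the assumed
// inharmonicity coefficient; B = 0 yields exact multiples of f0.
double partialFrequency(double f0, int k, double B) {
    return k * f0 * std::sqrt(1.0 + B * k * k);
}

// Additive synthesis of the partials, each with amplitude 1/k.
std::vector<float> synthesize(double f0, int numPartials, double B,
                              double sampleRate, std::size_t numSamples) {
    const double kTwoPi = 6.283185307179586;
    std::vector<float> out(numSamples, 0.0f);
    for (int k = 1; k <= numPartials; ++k) {
        double f = partialFrequency(f0, k, B);
        if (f >= sampleRate / 2.0) break;  // stop at Nyquist to avoid aliasing
        for (std::size_t n = 0; n < numSamples; ++n) {
            out[n] += static_cast<float>(
                std::sin(kTwoPi * f * n / sampleRate) / k);
        }
    }
    return out;
}
```

Sweeping B upward from zero while watching the waterfall display would show the upper partials drifting sharp of the harmonic grid.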


Credits

The program uses:

  • FFT routines from ChucK.
      Authors: Ge Wang and Perry R. Cook.
  • Code from Tapestrea for the 'open file' dialog.
      Authors: Ananya Misra, Perry R. Cook, and Ge Wang.
  • STK for reading the WAV file.
      Authors: Perry R. Cook and Gary P. Scavone.
  • RtAudio
      Author: Gary P. Scavone.