
@charlieroberts
Last active September 19, 2017 03:56
lecture notes on the fundamentals of the web audio API and analysis for IGM 330.02, Fall 2017

Web Audio API

Basics

  • There were originally two competing proposals, one from Mozilla and one from Chrome
    • Mozilla's approach: do DSP in JavaScript (via per-sample callbacks). Chrome's approach: do DSP using pre-built C++ nodes scripted from JavaScript.

      • Chrome won! In the Web Audio API, DSP is mainly done using built-in C++ nodes that we control via JS
    • Nodes are assembled into a graph. Nodes are also called unit generators, a carryover from the original music programming languages

    • You can still do JavaScript DSP using the ScriptProcessor node; see http://charlie-roberts.com/gibberish & http://charlie-roberts.com/genish

The AudioContext (and an oscillator)

  • First, we need to grab an AudioContext object.

    • With canvas, the context is responsible for all drawing operations
    • With audio, the context is primarily responsible for creating new audio nodes. In this way it serves as a factory object.
    • The audio context also has a destination property that represents the digital-to-analog converter on the computer... any node connected to it will send sound to the speakers / headphones.
  • Try it out:

var ctx = new AudioContext()
var sin = ctx.createOscillator()
sin.connect( ctx.destination )

// now tell our sin oscillator to start running
// the 0 argument means start now!
sin.start( 0 )

// we can change the frequency....
sin.frequency.value = 220

// we can also tell the oscillator to gradually ramp
// to a new frequency. Time is measured in seconds since
// the audio context was first created; to get a time relative
// to "now" we add to the ctx.currentTime property

sin.frequency.linearRampToValueAtTime( 1760, ctx.currentTime + 30 )

// to stop...
// sin.stop()

Analysis (slightly simplified for purposes of this class)

  • The sine oscillator we just heard is the fundamental unit of all audio synthesis
  • Joseph Fourier proved that any wave (not just sound) can be represented by sums of sine waves with different frequencies and amplitudes (well, different phases too... but don't worry about that).
  • So how can we determine which frequencies are present in a wave?
  • We measure how strongly the wave correlates with sine waves of different frequencies; in other words, we take the dot product of sine waves at different frequencies with the audio we're analyzing. This is simplified; there are lots of tricks under the hood that make the Fast Fourier Transform (the algorithm that usually performs this analysis) efficient.
  • The WebAudio API creates readings divided into frequency bins
    • Sampling rate = 44100 samples per second... nyquist frequency = 22050 Hz.
    • The fftSize determines how many samples are used to generate each FFT result.
    • The frequencyBinCount is a read-only property that is always equal to fftSize / 2. This gives us the number of different values we can use for our frequency visualization.
      • for example, an fftSize of 1024 yields 512 bins... every time we run our FFT the results consist of 512 different values representing the strengths of different frequency bins in our signal. (Note that the default fftSize is 2048, for 1024 bins.)
    • with 512 bins: 22050 / 512 ≈ 43 Hz per bin
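The dot-product idea above can be sketched in plain JavaScript (no Web Audio API involved; this is just the underlying math, with made-up probe frequencies for illustration):

```javascript
// Minimal illustration of the Fourier "dot product" idea in plain JS.
// We correlate a test signal with probe sinusoids at a few frequencies;
// the frequency actually present in the signal gives the largest response.
// (Real FFTs do this far more efficiently, but the core idea is the same.)

const sampleRate = 44100
const N = 4410 // analyze 0.1 seconds of audio

// test signal: a pure 440 Hz sine wave
const signal = new Float32Array( N )
for( let i = 0; i < N; i++ ) {
  signal[ i ] = Math.sin( 2 * Math.PI * 440 * i / sampleRate )
}

// strength of one frequency = magnitude of the dot product with
// both a sine and a cosine probe (so the answer doesn't depend on phase)
function strength( freq ) {
  let sinDot = 0, cosDot = 0
  for( let i = 0; i < N; i++ ) {
    const phase = 2 * Math.PI * freq * i / sampleRate
    sinDot += signal[ i ] * Math.sin( phase )
    cosDot += signal[ i ] * Math.cos( phase )
  }
  return Math.sqrt( sinDot * sinDot + cosDot * cosDot )
}

for( const freq of [ 110, 220, 440, 880, 1760 ] ) {
  console.log( freq + ' Hz: ' + strength( freq ).toFixed( 1 ) )
}
// the 440 Hz probe dominates; the others come out near zero
```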

Perception of frequency (pitch)

  • Frequency is not perceived linearly!!! We perceive it logarithmically (same with amplitude!)
  • If pitch refers to our perception of frequency, then a frequency change of 60 to 70 Hz represents a much larger change in pitch than a change of 2000 to 2010 Hz.
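We can put numbers on this using the standard semitone formula, 12 * log2( f2 / f1 ); a quick plain-JS check (not part of the Web Audio API):

```javascript
// Perceived pitch interval between two frequencies, in semitones:
// 12 * log2( f2 / f1 ). A doubling (one octave) is always 12 semitones,
// no matter where it happens on the frequency axis.
function semitones( f1, f2 ) {
  return 12 * Math.log2( f2 / f1 )
}

console.log( semitones( 60, 70 ).toFixed( 2 ) )     // ~2.67 semitones
console.log( semitones( 2000, 2010 ).toFixed( 2 ) ) // ~0.09 semitones
console.log( semitones( 220, 440 ).toFixed( 2 ) )   // 12.00 (one octave)
```

So the 10 Hz change starting at 60 Hz is roughly thirty times the pitch change of the 10 Hz change starting at 2000 Hz.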

Read the frequency of our sine oscillator

// go ahead and reuse our previous code
// except let's make a much longer, wider frequency sweep
sin.frequency.value = 0
sin.frequency.linearRampToValueAtTime( 880 * 4, ctx.currentTime + 30 )

var analyser = ctx.createAnalyser() // note the British spelling!

// set FFT size
analyser.fftSize = 64
console.log( analyser.frequencyBinCount ) // > 32

// connect our sin oscillator to our analyser node
sin.connect( analyser )

// create a typed JS array to hold analysis results
var results = new Uint8Array( analyser.frequencyBinCount )

// every second, get our results and print them
var loop = setInterval( function() {
  analyser.getByteFrequencyData( results )
  console.log( results.toString() )
}, 1000 )
  • So how about those results?
    • Note that the frequency registers a maximum strength across multiple bins
      • this is due to how the FFT works internally
    • We can get better precision by increasing the window size
      • however, this comes at the expense of temporal resolution!
      • classic tradeoff of FFTs: you can have frequency resolution, or temporal resolution, but not both :(
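The tradeoff above can be put in numbers: both the width of each bin and the length of audio consumed per analysis follow directly from fftSize and the sampling rate. A plain-JS sketch (the fftSize values are just examples):

```javascript
// Time/frequency tradeoff in numbers: a bigger FFT window means
// narrower bins (better frequency resolution) but a longer stretch
// of audio per analysis (worse time resolution).
const sampleRate = 44100

function fftStats( fftSize ) {
  return {
    binWidthHz: sampleRate / fftSize,      // frequency resolution
    windowMs: 1000 * fftSize / sampleRate  // audio consumed per analysis
  }
}

for( const fftSize of [ 256, 1024, 4096 ] ) {
  const s = fftStats( fftSize )
  console.log(
    `fftSize ${fftSize}: ~${s.binWidthHz.toFixed(1)} Hz per bin, ` +
    `~${s.windowMs.toFixed(1)} ms of audio per analysis`
  )
}
```

Note that fftStats( 1024 ).binWidthHz reproduces the ~43 Hz per bin figure from the analysis section above (44100 / 1024 = 22050 / 512).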

Homework: create a simple visualization of the FFT

  • use a square or saw oscillator instead of a sine oscillator
  • while sine waves only contain a single frequency (ideally), square and saw waves contain many overtone frequencies.
    • overtones are whole-number multiples of a base frequency:
      • for example, 220 Hz has overtones of 440, 660, 880, 1100, etc.
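A tiny plain-JS sketch of the overtone series for a 220 Hz fundamental:

```javascript
// Overtones are whole-number multiples of the fundamental frequency.
// A saw wave contains all of them; a square wave only the odd multiples
// of the fundamental (660, 1100, ...).
const fundamental = 220
const overtones = []
for( let n = 2; n <= 6; n++ ) overtones.push( fundamental * n )
console.log( overtones.join( ', ' ) ) // 440, 660, 880, 1100, 1320
```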