Let us make an audio visualizer using TypeScript

Bharat Kalluri / 2020-12-08

Making an audio visualizer has always been a dream of mine, ever since I saw those Winamp visualizations. Recently I decided to give it another shot. It turns out it is not as hard as it looks: the Web Audio API takes care of a lot of the hard parts for us.

What we will be building by the end

Go ahead and pick an mp3 or an ogg file!

The Web Audio API

What we are building is called a frequency bar graph. On every frame, we get the current strength of each frequency band as an integer and use those integers to plot the bars on the graph. The values are obtained by running a Fast Fourier Transform (FFT) on the song's signal.

This is where it gets complicated. The Fourier transform is one of the most beautiful ideas in mathematics, but I do not have enough intuition about it to explain it just yet. Maybe one day when it clicks, I will do a clear writeup on it. For now, the gist is that it takes a complex signal and breaks it into simpler signals, from which we can extract frequencies. (This is my understanding, and I might not be a hundred percent accurate here.)

How do we get these frequencies? The Web Audio API has a node called the analyzer (AnalyserNode in the API). From the analyzer, we can get the byte frequency data. We need to set the size of the FFT (fftSize), which must be a power of 2 between 32 and 32768. We then get back half that many values (the analyzer's frequencyBinCount), each an integer between 0 and 255.
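As a quick sanity check on that constraint, here is a minimal sketch of a helper that validates an FFT size and reports how many values the analyzer would hand back. The name `binCountFor` is made up for this post and is not part of the Web Audio API; it only mirrors the rule that the value count (`frequencyBinCount`) is half of `fftSize`.

```typescript
// Hypothetical helper, not part of the Web Audio API.
// Returns how many frequency values an analyser with this fftSize reports.
function binCountFor(fftSize: number): number {
  const isInteger = Math.floor(fftSize) === fftSize;
  // A power of two has exactly one bit set, so n & (n - 1) is 0.
  const isPowerOfTwo = isInteger && fftSize > 0 && (fftSize & (fftSize - 1)) === 0;
  if (!isPowerOfTwo || fftSize < 32 || fftSize > 32768) {
    throw new RangeError(`fftSize must be a power of 2 between 32 and 32768, got ${fftSize}`);
  }
  // frequencyBinCount is always half of fftSize.
  return fftSize / 2;
}

console.log(binCountFor(128)); // 64
```

So an FFT size of 128 gives 64 bars to draw, and a value such as 100 is rejected.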

One more concept to understand is that WebAudio works in terms of nodes. These individual nodes form a graph and this graph is executed. For example, the Web Audio API has a node called the Gain Node. If we set the gain value to 1, the volume is left unchanged; 0 means silence, and values above 1 amplify the signal. A media element source can be created and a gain node can be attached to it. The gain node is then connected to the destination (which is usually the speaker). So when the program is run, the audio flows through all the nodes before reaching the destination.
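To make the gain idea concrete, here is a small sketch of what a gain stage does to the samples flowing through it. This is plain arithmetic on an array, not the Web Audio API itself, and `applyGain` is a name made up for illustration.

```typescript
// Toy model of a gain stage: every sample is multiplied by the gain value.
// Illustrative only; in the browser a GainNode does this for you.
function applyGain(samples: Float32Array, gain: number): Float32Array {
  return samples.map((sample) => sample * gain);
}

const input = new Float32Array([0.5, -1, 0.25]);
console.log(applyGain(input, 1)); // same values as the input
console.log(applyGain(input, 0)); // silence: every sample becomes zero
console.log(applyGain(input, 0.5)); // half the amplitude
```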

The analyzer is also a node. One advantage of this approach is that the output of one node can be connected to several other nodes. One media element can route to a destination and an analyzer at the same time. This is an important idea to digest.

node visualization


The actual visualization is not that hard. All the values will be plotted as rectangles on a canvas with a color. That's it. One function to be aware of here is requestAnimationFrame. This method tells the browser that we wish to perform an animation and asks it to call a specified function before the next repaint. The callback rate typically matches the refresh rate of the display.
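The rescheduling pattern behind an animation loop can be sketched without a browser by passing the scheduler in as a parameter. In a page you would pass `requestAnimationFrame`; the `Scheduler` type, `runFrames` and the fake scheduler below are hypothetical names for this sketch only.

```typescript
// A scheduler has the same basic shape as requestAnimationFrame:
// it takes a callback and arranges for it to be called later.
type Scheduler = (callback: (timestamp: number) => void) => number;

// Calls onFrame once per scheduled frame until onFrame returns false.
function runFrames(schedule: Scheduler, onFrame: (timestamp: number) => boolean): void {
  const tick = (timestamp: number): void => {
    const keepGoing = onFrame(timestamp);
    // Each frame asks for the next one.
    if (keepGoing) schedule(tick);
  };
  schedule(tick);
}

// Stand-in scheduler (an assumption for this demo): it queues callbacks so we
// can run "frames" by hand with fake timestamps.
const queue: Array<(timestamp: number) => void> = [];
const fakeSchedule: Scheduler = (callback) => queue.push(callback);

let framesDrawn = 0;
runFrames(fakeSchedule, () => {
  framesDrawn += 1;
  return framesDrawn < 3; // stop after three frames
});

// Pump the queue as the browser would, roughly 16.7 ms apart at 60 Hz.
let time = 0;
while (queue.length > 0) {
  const next = queue.shift()!;
  next(time);
  time += 1000 / 60;
}
console.log(framesDrawn); // 3
```

A visualizer normally never stops, so its frame callback would simply keep rescheduling itself.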

Let's code!

Now that the theory is out of the way, let us start by creating an HTML template of what we want.

<input type="file" id="audioPicker" accept="audio/*" />
<audio id="audio">No support for audio</audio>
<canvas id="canvas"></canvas>

A simple input to pick the audio file, an audio tag to actually play the audio, and a canvas element to draw the frequency bar graph. Let's jump into the TypeScript piece now.

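// Older Safari exposed the constructor as webkitAudioContext, which TypeScript's DOM typings do not declare
const AudioContextClass: typeof AudioContext =
  window.AudioContext || (window as any).webkitAudioContext;
const audioContext = new AudioContextClass();
const audioElement = document.getElementById('audio') as HTMLAudioElement;
const audioPickerElement = document.getElementById('audioPicker') as HTMLInputElement;
const canvasElement = document.getElementById('canvas') as HTMLCanvasElement;
// Make the canvas fill the window
canvasElement.width = window.innerWidth;
canvasElement.height = window.innerHeight;
const canvasCtx = canvasElement.getContext('2d');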

Some basic declarations. We get the AudioContext first; this interface represents the audio processing graph built from different nodes. Then we use getElementById to get audioElement, audioPickerElement and canvasElement. We will be mutating them soon. Then we set the canvas height and width, and finally get the canvas context, with which we can draw on the canvas.

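audioPickerElement.onchange = function (event) {
  // Read the picked file from the input that fired the event
  const files = (event.target as HTMLInputElement).files;
  if (!files || files.length === 0) return;
  audioElement.src = URL.createObjectURL(files[0]);
  audioElement.load();
  audioElement.play();
};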

Starting simple: on file change, the first file is retrieved and set as the source of the audio element. Then the audioElement is loaded and played. At this point, you should be able to pick a song and hear the audio playing. Now let us involve the WebAudio piece.

// ...from here on, the code you see is part of the onchange function of audioPicker in the above example
// Browsers may keep an AudioContext suspended until a user gesture; resuming is harmless if it is already running
audioContext.resume();
const track = audioContext.createMediaElementSource(audioElement);
track.connect(audioContext.destination);
// Analyzer node
const analyser = audioContext.createAnalyser();
analyser.fftSize = 128;
track.connect(analyser);

This is the graph construction. The base element of the graph is the node returned by createMediaElementSource, which is the actual audio source. That is connected to the destination (audio out/speaker) on one end and to an analyzer on the other. This is what the illustration was showing earlier. We skipped the gain node for simplicity. An analyzer node can provide realtime frequency and time domain analysis. We will be needing the frequency data from this later on. The FFT size is set to 128, which gives us 64 values (half the FFT size) to plot. Too many values means the bars will be much thinner.

// Creating the array to store the frequency data
const bufferLength = analyser.frequencyBinCount;
const dataArray = new Uint8Array(bufferLength);

From here, a Uint8Array of length frequencyBinCount (half the fftSize) needs to be created to store the frequency data that will keep flowing in.
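Each slot in that array stands for a band of frequencies, spread linearly from 0 up to half the sample rate. As a rough sketch, slot `i` corresponds to about `i * sampleRate / fftSize` Hz. The helper name and the 44100 Hz sample rate below are assumptions for illustration; in the browser the real rate is available as `audioContext.sampleRate`.

```typescript
// Hypothetical helper: approximate frequency (in Hz) represented by slot `index`
// of the analyser's data array.
function binFrequencyHz(index: number, sampleRate: number, fftSize: number): number {
  return (index * sampleRate) / fftSize;
}

const sampleRate = 44100; // assumed; read audioContext.sampleRate in a real page
const fftSize = 128;
console.log(binFrequencyHz(0, sampleRate, fftSize)); // 0
console.log(binFrequencyHz(1, sampleRate, fftSize)); // 344.53125
console.log(binFrequencyHz(fftSize / 2, sampleRate, fftSize)); // 22050, half the sample rate
```

With only 64 slots, each bar covers a fairly wide band of roughly 345 Hz under these assumptions.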

// Some useful constants
const WIDTH = canvasElement.width;
const HEIGHT = canvasElement.height;
const barWidth = (WIDTH / bufferLength) * 2.5;
let barHeight;
let x = 0;
// Colors used for plotting
const MATTE_BLACK = '#1A202C';
const WHITE = '#FFFFFF';
// The function which will get called on each repaint
function draw() {
  requestAnimationFrame(draw);
  if (canvasCtx !== null) {
    x = 0;
    analyser.getByteFrequencyData(dataArray);
    canvasCtx.fillStyle = WHITE;
    canvasCtx.fillRect(0, 0, WIDTH, HEIGHT);
    for (let i = 0; i < bufferLength; i++) {
      barHeight = dataArray[i];
      canvasCtx.fillStyle = MATTE_BLACK;
      canvasCtx.fillRect(x, 0, barWidth, barHeight);
      x += barWidth + 3;
    }
  }
}
draw();

Some constants such as width, height, barWidth (the bar width is multiplied by 2.5 just to make the bars look bigger; try without the 2.5 as well to see why we do this) and some colors are defined for convenience. Now to the important part. The draw function is the function that actually paints the canvas. On each invocation, we first call requestAnimationFrame with the same function as input, so that draw gets called again before the next repaint, around 60 times per second on a 60 hertz display. Remember that these calls are asynchronous: requestAnimationFrame only schedules the next call and returns immediately.

Inside the function, we start with x = 0, which together with y = 0 is the top-left corner (0,0) of the canvas. Then we use the getByteFrequencyData function of the analyzer to populate the frequency data into the dataArray we declared earlier. I suggest you have a look at the data array to get a taste of what is actually being populated (caution: if a song is loaded and you have a console.log in the draw function, an array of 64 values will be logged on every frame, more than 30 times per second; this will really slow down the browser or even crash it).
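One way to peek at the data without flooding the console is to log at most once per second. Below is a minimal sketch of such a throttle; `makeThrottle` is a made-up helper, and the clock is passed in as a parameter so the behaviour can be checked without waiting on real time.

```typescript
// Hypothetical helper: returns a function that answers "may I log now?"
// and says yes at most once per intervalMs.
function makeThrottle(intervalMs: number, now: () => number = Date.now): () => boolean {
  let last = -Infinity;
  return () => {
    const current = now();
    if (current - last < intervalMs) return false;
    last = current;
    return true;
  };
}

// Fake clock (an assumption for this demo) so the result does not depend on real time.
let fakeTime = 0;
const shouldLog = makeThrottle(1000, () => fakeTime);

const decisions: boolean[] = [];
for (const t of [0, 16, 500, 1000, 1016]) {
  fakeTime = t;
  decisions.push(shouldLog());
}
console.log(decisions); // [ true, false, false, true, false ]
```

Inside the draw function you could then guard the log with something like `if (shouldLog()) console.log(dataArray);`, using a throttle created once outside the function with the default clock.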

Now that the data is in an array, set the canvas background to white. Then for each element in the array, plot a rectangle. fillRect takes (x, y, width, height), so the first rectangle is (0, 0, barWidth, barHeight). Fill this rectangle with MATTE_BLACK and increment x by barWidth + 3. So assuming the bar width is 50px, the second rectangle will have the coordinates (53, 0, barWidth, barHeight). This keeps going for each element in the array. Since y is always 0, the bars hang down from the top edge of the canvas. That is how one frame is drawn.
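The layout arithmetic can be pulled out into a small pure function, which makes it easy to check the coordinates by hand. This is a sketch using the same example numbers as above (50px bars, 3px gap); `layoutBars` and the `Bar` type are hypothetical names, not part of the canvas API.

```typescript
// One rectangle, in the same order of values that fillRect expects.
interface Bar {
  x: number;
  y: number;
  width: number;
  height: number;
}

function layoutBars(data: Uint8Array, barWidth: number, gap: number): Bar[] {
  const bars: Bar[] = [];
  let x = 0;
  data.forEach((value) => {
    // The bar's height is the raw byte value (0 to 255) for this slot.
    bars.push({ x, y: 0, width: barWidth, height: value });
    x += barWidth + gap;
  });
  return bars;
}

const bars = layoutBars(new Uint8Array([10, 200, 255]), 50, 3);
console.log(bars[1]); // { x: 53, y: 0, width: 50, height: 200 }
console.log(bars[2].x); // 106
```

Drawing then becomes a loop over the returned rectangles with fillRect. Setting y to the canvas height minus the bar height, instead of 0, would make the bars grow up from the bottom edge.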

This repeats more than 30 times in one second (depending on the refresh rate of the display), which makes it feel like a smooth, continuously moving graph. 🤯

We just scratched the surface here; there are a lot of awesome visualizations out there. This is one of my favorites.

This article was inspired by MDN's article about visualizations with the Web Audio API.
