Buildtide
Author: Hussain Mir Ali

I am interested in web and mobile technologies.

If you have any questions or feedback, message me at devtips@Buildtide.com.

Siri visualization in Browser

Sound recording and playback are rarely visually appealing features in web and mobile applications. But sound is a pressure wave, and it can be visualized easily in 2D on an x-y plane, where amplitude and period determine the shape of the wave. Since the introduction of Siri on iOS devices, sound visualization has become an eye-catching piece of UI. In this blog post, a single web page is implemented to record sound. The sound is visualized using the siriwavejs library to make the recording process more presentable.




Fig 1.0 iOS 9 style



Fig 2.0 Default style



Technical Background



Fig 3.0 Web Audio Block Diagram

The block diagram in Fig 3.0 shows how the audio is processed from the source node to the destination node. The 'source' in this case is the microphone of the user's computing device.
The 'Analyser' represents the AnalyserNode, which is used to get the frequency data. The 'Processor' represents the ScriptProcessorNode, whose 'onaudioprocess' event handler fires whenever a new buffer of audio is available; the amplitude data is also collected in that handler. The 'destination' is the computer's output device (speakers/headphones). In this case, however, the 'Processor' node outputs a buffer of all zeros, because nothing is written to the output buffer, so nothing is played back.
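
As a rough illustration of the diagram, the nodes could be wired together as in the sketch below. The connection order (source, then analyser, then processor, then destination) is an assumption read off the block diagram, and `wireGraph` and the `...Like` interfaces are hypothetical names introduced only so the sketch type-checks and can be exercised outside a browser. The FFT size and buffer size are the values quoted in this post.

```typescript
// Minimal structural types standing in for the Web Audio interfaces,
// so the sketch does not depend on DOM typings.
interface NodeLike {
  connect(destination: NodeLike): unknown;
}
interface AnalyserLike extends NodeLike {
  fftSize: number;
}
interface ProcessorLike extends NodeLike {
  onaudioprocess: ((event: unknown) => void) | null;
}
interface ContextLike<S> {
  destination: NodeLike;
  createMediaStreamSource(stream: S): NodeLike;
  createAnalyser(): AnalyserLike;
  createScriptProcessor(
    bufferSize: number,
    inputChannels: number,
    outputChannels: number
  ): ProcessorLike;
}

// Assumed wiring: source -> analyser -> processor -> destination.
function wireGraph<S>(
  ctx: ContextLike<S>,
  stream: S,
  onAudio: (event: unknown) => void
) {
  const source = ctx.createMediaStreamSource(stream); // microphone input
  const analyser = ctx.createAnalyser();
  analyser.fftSize = 4096; // FFT size used in this post
  const processor = ctx.createScriptProcessor(1024, 1, 1); // 1 in, 1 out
  processor.onaudioprocess = onAudio; // called for every audio buffer

  source.connect(analyser);
  analyser.connect(processor);
  processor.connect(ctx.destination);
  return { source, analyser, processor };
}
```

In a browser, `ctx` would typically be a `new AudioContext()` and `stream` the result of `navigator.mediaDevices.getUserMedia({ audio: true })`. The demo's actual source may wire things differently.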

AnalyserNode:

The frequency data from the AnalyserNode is computed using an FFT (Fast Fourier Transform). The FFT size is set to 4096. The sampling rate is a property of the AudioContext and depends on the audio hardware; 48000 Hz is assumed here.

Sampling Rate: 48000 Hz
Frequency band: (Sampling Rate)/2 = 24000 Hz
FFT size: 4096
#spectral data points: (FFT size)/ 2 = 2048
Spectral data point resolution: (Frequency band)/(#spectral data points) = 24000/2048 = 11.71875 Hz


From the calculations it is evident that the AnalyserNode returns 2048 data points for each time step, and that the frequency of consecutive data points increases in steps of 11.71875 Hz. Each spectral data point represents the magnitude in dB of one such frequency band.
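
The arithmetic above can be checked with a few lines of code. This is only a sketch: `spectrumInfo` is a hypothetical helper, not part of the Web Audio API or of the demo.

```typescript
// Derive the spectral layout from the sampling rate and the FFT size.
function spectrumInfo(sampleRate: number, fftSize: number) {
  const frequencyBand = sampleRate / 2; // highest representable frequency
  const binCount = fftSize / 2; // number of spectral data points
  const resolution = frequencyBand / binCount; // Hz covered by each point
  return { frequencyBand, binCount, resolution };
}

const info = spectrumInfo(48000, 4096);
console.log(info.frequencyBand, info.binCount, info.resolution);
```

With a sampling rate of 44100 Hz, which some devices use, the same function gives a slightly finer resolution, so the numbers in this post should be read as specific to 48000 Hz.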

In the code, the speed for siriwave is set using the following equation:
speed = ((1 + spectral data index) * spectral data point resolution)/(Frequency band)
      = ((1 + spectral data index) * 11.71875)/24000

'spectral data index' is the index of the spectral data point with the highest magnitude in dB.
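
A minimal sketch of this speed calculation follows. The names `peakIndex` and `waveSpeed` are hypothetical, and the input is assumed to be an array of magnitudes in dB, such as one filled by `analyser.getFloatFrequencyData(...)`; the demo's own code may be organized differently.

```typescript
// Index of the spectral data point with the highest magnitude.
function peakIndex(magnitudesDb: ArrayLike<number>): number {
  let best = 0;
  for (let i = 1; i < magnitudesDb.length; i++) {
    if (magnitudesDb[i] > magnitudesDb[best]) best = i;
  }
  return best;
}

// speed = ((1 + index) * resolution) / frequencyBand
function waveSpeed(
  magnitudesDb: ArrayLike<number>,
  sampleRate: number,
  fftSize: number
): number {
  const frequencyBand = sampleRate / 2;
  const resolution = frequencyBand / (fftSize / 2);
  return ((1 + peakIndex(magnitudesDb)) * resolution) / frequencyBand;
}
```

Because the peak index runs from 0 to 2047, the resulting speed is always greater than 0 and at most 1, with higher dominant frequencies giving a faster wave.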

ScriptProcessorNode:

The buffer size used for the ScriptProcessorNode is 1024, with 1 input channel and 1 output channel. The PCM data is retrieved from this node. Each buffer holds 1024 time steps, and the amplitude at each time step is a value between -1 and 1.

In the code the amplitude for siriwave is set using the following equation:
amplitude = (max amplitude)*10
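
The amplitude calculation can be sketched as below. Here 'max amplitude' is assumed to mean the largest absolute sample value in the buffer, which the post does not state explicitly, and `waveAmplitude` is a hypothetical name.

```typescript
// Largest absolute PCM sample in the buffer, scaled by a gain of 10.
function waveAmplitude(samples: ArrayLike<number>, gain: number = 10): number {
  let max = 0;
  for (let i = 0; i < samples.length; i++) {
    const value = Math.abs(samples[i]); // assumption: use absolute value
    if (value > max) max = value;
  }
  return max * gain;
}
```

Inside an 'onaudioprocess' handler, the samples would typically come from `event.inputBuffer.getChannelData(0)`, which returns the 1024 PCM values for the single input channel.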


Live Demo Link:

https://siriwavejs.herokuapp.com/

If the link does not open, copy it and paste it directly into the browser.

Demo source code:

https://github.com/husenxce/siriwaveDemo/tree/master

siriwavejs library link:

https://github.com/kopiro/siriwavejs