Spectrograms are widely used in music, sonar, radar, and other fields. A spectrogram is a visual representation of how the frequency content and intensity of a sound change over time. In Python, a spectrogram can be created easily with librosa, a library for music and audio analysis that provides the building blocks needed to create music information retrieval systems. The commands below install librosa. We also install numpy, which is used for the numerical calculations, and matplotlib, which librosa.display relies on for plotting.
pip install librosa
pip install numpy
pip install matplotlib
The code below displays an audio file as a spectrogram. We use an .au file here, but other audio formats work as well.
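import librosa.display  # plotting helpers; importing the submodule also imports librosa
import numpy as np
y, sr = librosa.load('pop.00007.au')  # audio signal and sample rate
D = np.abs(librosa.stft(y))  # magnitude of the short-time Fourier transform
db = librosa.amplitude_to_db(D, ref=np.max)  # decibels relative to the loudest value
librosa.display.specshow(db, sr=sr, y_axis='hz', x_axis='time')  # draw the spectrogram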
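As a side note on what the librosa.stft, np.abs, and librosa.amplitude_to_db steps in the code above compute, here is a minimal NumPy-only sketch for a single frame. The 440 Hz test tone and the helper amplitude_to_db_sketch are made up for illustration. The helper mirrors librosa's documented defaults (amin=1e-5, top_db=80) but is not librosa's actual implementation.

```python
import numpy as np

sr = 22050      # sample rate in Hz; librosa.load resamples to this by default
n_fft = 2048    # frame length; the default n_fft of librosa.stft

# Hypothetical test signal: a single frame of a 440 Hz sine wave.
t = np.arange(n_fft) / sr
frame = np.sin(2 * np.pi * 440.0 * t)

# One column of an STFT is the DFT of one windowed frame.
# np.hanning is a symmetric Hann window; librosa uses a periodic one,
# so the exact numbers differ slightly from librosa.stft.
spectrum = np.fft.rfft(frame * np.hanning(n_fft))  # n_fft // 2 + 1 complex bins

magnitude = np.abs(spectrum)   # amplitude of each frequency bin
phase = np.angle(spectrum)     # phase of each frequency bin, in radians


def amplitude_to_db_sketch(mag, amin=1e-5, top_db=80.0):
    """Illustrative stand-in for librosa.amplitude_to_db(mag, ref=np.max)."""
    mag = np.maximum(amin, np.asarray(mag, dtype=float))  # avoid log10(0)
    db = 20.0 * np.log10(mag / mag.max())                  # 0 dB at the loudest bin
    return np.maximum(db, db.max() - top_db)               # limit the dynamic range


db = amplitude_to_db_sketch(magnitude)
peak_bin = int(np.argmax(magnitude))
peak_hz = peak_bin * sr / n_fft   # centre frequency of the loudest bin

print(f"bins: {spectrum.shape[0]}, loudest bin: {peak_bin} ({peak_hz:.1f} Hz)")
print(f"dB range: {db.min():.1f} to {db.max():.1f}")
```

The loudest bin should be the one closest to 440 Hz, and because the reference is the maximum, every decibel value is zero or negative. A full spectrogram repeats this for many overlapping frames and stacks the results as columns.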
In the spectrogram code above, we first import the libraries. librosa.load then reads the audio file and returns the audio signal (y) and the sample rate (sr). librosa.stft(y) computes the short-time Fourier transform and returns a 2-D array of complex numbers, where each column is the Discrete Fourier Transform (DFT) of one short, windowed frame of the signal. These complex numbers carry both the amplitude and the phase of each frequency component; np.abs keeps only the amplitude. librosa.amplitude_to_db converts the amplitude to decibels, and with ref=np.max the values are measured relative to the loudest point in the spectrogram. Finally, librosa.display.specshow plots the decibel values, using the sample rate to label the axes correctly. The y_axis and x_axis parameters accept several values, listed below.
Frequency types:
- ‘linear’, ‘fft’, ‘hz’ : frequency range is determined by the FFT window and sampling rate.
- ‘log’ : the spectrum is displayed on a log scale.
- ‘fft_note’: the spectrum is displayed on a log scale with pitches marked.
- ‘fft_svara’: the spectrum is displayed on a log scale with svara marked.
- ‘mel’ : frequencies are determined by the mel scale.
- ‘cqt_hz’ : frequencies are determined by the CQT scale.
- ‘cqt_note’ : pitches are determined by the CQT scale.
- ‘cqt_svara’ : like cqt_note but using Hindustani or Carnatic svara
Time types:
- ‘time’ : markers are shown as milliseconds, seconds, minutes, or hours. Values are plotted in units of seconds.
- ‘s’ : markers are shown as seconds.
- ‘ms’ : markers are shown as milliseconds.
- ‘lag’ : like time, but past the halfway point counts as negative values.
- ‘lag_s’ : same as lag, but in seconds.
- ‘lag_ms’ : same as lag, but in milliseconds.
The audio visualization code, written in Python, is available at the link below.
https://github.com/Garudabyte/audio-visualization
We highly recommend running the program in Jupyter Notebook. You can install Jupyter Notebook by following the instructions in the link below.
https://garudabyte.com/how-to-open-ipynb-file/
You can also find other audio visualizations in the links below.
https://garudabyte.com/visualize-audio-using-python-into-waveplot/
https://garudabyte.com/visualize-audio-using-python-into-chromagram/