Difference between revisions of "Spectrograms"
Martinwguy (talk | contribs) (→Software) |
Revision as of 00:19, 4 February 2016
Spectrograms are used in the WikiDelia to visualise the sonic content of Delia's pieces of music.
In each spectrogram, time runs from left to right, low frequencies are at the bottom and high ones at the top, and the brightness at each point in the graph represents the energy in the sound at one frequency at a particular moment (or, rather, in one frequency band around a particular moment).
As well as helping us understand the internal structure of Delia's pieces and her instruments and effects, these also help us recreate conventional scores from her sound files, for example:
Logarithmic frequency axis
The output of most FFT-based spectrogram programs has a linear frequency axis, usually from 0Hz to 22050Hz for a CD-quality piece. The top half of such a graph represents only the top octave of the sound, the largely inaudible 11025-22050Hz band, while all the musical detail is crushed into the bottom rows of pixels.
Even if you zoom in on the interesting part of the spectrogram, the top half of the graph always represents the top octave of the visible frequency range.
What we would like is for each octave to be given the same height in the graph.
The spectrograms used in the WikiDelia are not the usual kind. Their vertical scale is logarithmic, which gives the same number of pixel rows per semitone.
Not only does this give music a graphic representation similar to conventional score notation for notes and rhythms, it also gives the different notes of an instrument a characteristic graphical footprint that has the same height for higher and lower notes.
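The difference between the two axes can be put into numbers with a minimal sketch. The function names and the 864-row image height are illustrative assumptions, not part of the WikiDelia software; the 27.5Hz base pitch and 8 rows per semitone are the usual values quoted further down this page.

```python
import math

def linear_rows_per_octave(height, f_max, octave_low):
    """Rows that the octave [octave_low, 2 * octave_low] occupies
    on a linear axis running from 0 Hz to f_max over `height` rows."""
    return height * octave_low / f_max

def log_row_for_frequency(freq, f_min, rows_per_semitone):
    """Row (counted up from the bottom) of `freq` on a logarithmic axis
    whose bottom row is f_min."""
    return 12 * rows_per_semitone * math.log2(freq / f_min)

HEIGHT = 864      # illustrative image height in pixel rows (assumption)
F_MAX = 22050.0   # top of a linear axis for CD-quality audio

# Linear axis: the number of rows per octave doubles with every octave.
for low in (27.5, 440.0, 11025.0):
    rows = linear_rows_per_octave(HEIGHT, F_MAX, low)
    print(f"linear axis, octave from {low} Hz: {rows:.1f} rows")

# Logarithmic axis: every octave spans the same number of rows.
for low in (27.5, 440.0, 7040.0):
    span = (log_row_for_frequency(2 * low, 27.5, 8)
            - log_row_for_frequency(low, 27.5, 8))
    print(f"log axis, octave from {low} Hz: {span:.1f} rows")
```

On the linear axis the lowest octave gets about one row while the top octave gets half the image; on the logarithmic axis each octave gets 12 semitones times 8 rows.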
Usage in the WikiDelia
The spectrogram of a piece goes in three places:
- on the piece's page, in a Spectrogram section usually placed just above Availability so that the Listen button is near
- on the Audio page, for spectrograms of complete pieces
- in delia-derbyshire.net/spectrograms
For example the piece Air has File:Air.ogg and File:Air - Spectrogram.jpg, used by the MediaWiki macros {{Spectrogram|Air - Spectrogram}} and {{Spectrogallery|Air}}
Get spectrograms of your music!
I am happy to run the log spectrum analyser on your music. You can specify:
- lowest pitch (usually 27.5Hz)
- number of octaves (usually 9, to 14080Hz)
- number of pixels per semitone on the frequency axis (usually 8)
- number of pixel columns per second on the time axis (usually 50)
Optionally the software can superimpose single-pixel black and white lines at the frequencies of the piano keys and three-pixel-wide white lines at the positions of the manuscript stave lines, see the example on the right.
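As a rough guide to what these settings produce, the following sketch works out the image size and the row on which a piano-key line would fall. It is not the mkjpg code: the function names are hypothetical, and it assumes that the bottom row sits exactly at the lowest pitch and that key 0 is the piano's lowest A at 27.5Hz.

```python
import math

def image_size(duration_s, octaves=9, px_per_semitone=8, cols_per_s=50):
    """Return (width, height) in pixels for a piece of the given length."""
    height = octaves * 12 * px_per_semitone
    width = math.ceil(duration_s * cols_per_s)
    return width, height

def top_frequency(f_min=27.5, octaves=9):
    """Frequency at the top edge of the image."""
    return f_min * 2 ** octaves

def piano_key_frequency(key, a0=27.5):
    """Equal-tempered frequency of piano key `key`, counting from 0 at A0."""
    return a0 * 2 ** (key / 12)

def row_for_frequency(freq, f_min=27.5, px_per_semitone=8):
    """Nearest pixel row (counted up from the bottom) for `freq`."""
    return round(12 * px_per_semitone * math.log2(freq / f_min))

width, height = image_size(60.0)
print("one minute of audio:", width, "x", height, "pixels")
print("top of the image:", top_frequency(), "Hz")
print("row of the A at 440Hz:", row_for_frequency(piano_key_frequency(48)))
```

With the usual settings the image is 864 pixels high regardless of the length of the piece, and its width grows by 50 pixels for every second of sound.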
If this interests you, please make a small donation and email delia.derbyshire.net@gmail.com, attaching the sound file you would like turned into a picture.
Software
The WikiDelia's spectrum analyser, "mkjpg", was written specifically for it. It uses a modified version of sndfile-spectrogram to prepare a linear spectrogram, which an ImageMagick script then distorts to give it a logarithmic frequency axis.
The program "Sox" can also be used to produce the linear spectrogram, but you need this modified version to remove the limits on output image size, to normalise the output's brightness, and to make it 250 times faster and not need 16GB of RAM.
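The idea of distorting a linear spectrogram onto a logarithmic axis can be illustrated in a few lines. This is only a sketch of the principle, using plain linear interpolation on one pixel column; mkjpg does the job with an ImageMagick script, whose details are not reproduced here, and the function name and argument layout are assumptions.

```python
def warp_column_to_log(column, sample_rate, f_min, octaves, rows_per_semitone):
    """Resample one linear-frequency column onto a log-frequency axis.

    column[0] is the value at 0 Hz and column[-1] the value at the
    Nyquist frequency (sample_rate / 2), with even spacing in between.
    The result has index 0 at f_min and rises by one semitone every
    rows_per_semitone entries.
    """
    n_out = octaves * 12 * rows_per_semitone
    hz_per_bin = (sample_rate / 2) / (len(column) - 1)
    out = []
    for row in range(n_out):
        freq = f_min * 2 ** (row / (12 * rows_per_semitone))
        pos = freq / hz_per_bin                 # fractional index into column
        lo = min(int(pos), len(column) - 2)     # clamp at the Nyquist end
        frac = min(pos - lo, 1.0)
        out.append(column[lo] * (1 - frac) + column[lo + 1] * frac)
    return out

# Test column whose value equals its own frequency in Hz (10 Hz per bin),
# so each output row should simply read back the frequency it represents.
ramp = [i * 10.0 for i in range(2206)]
warped = warp_column_to_log(ramp, 44100, 27.5, 9, 8)
print(len(warped), warped[0], warped[96])
```

In this example the lowest octave, 27.5Hz to 55Hz, is stretched over 96 output rows but is fed by only about three input bins, so whatever frequency resolution the bottom of the picture has must already be present in the linear spectrogram.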
An alternative technique would be to perform a Constant-Q transform directly instead of distorting a linear spectrogram. Candidates are:
- Judith Brown's brute force algorithm, "logft" from 1988-91.
- Brown and Puckette's efficient algorithm,[1][2][3] using a precomputed FFT temporal kernel (a what?)
- An optimized version of the above, "constant-q-cpp", doing octave decimation of the signal to save compute time.[4][5]
However, my results with the implementations I have found have so far been disappointing: crisper at the top, but with temporal detail completely lost in the lower frequency range.
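For orientation, a constant-Q bin can be evaluated directly from its textbook definition, as in the sketch below. This is not the code of logft or constant-q-cpp; the function names, the Hamming window and the choice of one bin per pixel row are assumptions made for illustration.

```python
import cmath
import math

def quality_factor(bins_per_octave):
    """Q, the ratio of centre frequency to bandwidth, identical for every bin."""
    return 1.0 / (2 ** (1.0 / bins_per_octave) - 1.0)

def window_length(sample_rate, freq, bins_per_octave):
    """Number of samples analysed for the bin centred on `freq`."""
    return math.ceil(quality_factor(bins_per_octave) * sample_rate / freq)

def constant_q_bin(samples, sample_rate, freq, bins_per_octave):
    """Magnitude of one constant-Q bin, evaluated from the start of `samples`."""
    q = quality_factor(bins_per_octave)
    n = window_length(sample_rate, freq, bins_per_octave)
    if n > len(samples):
        raise ValueError("not enough samples for this bin")
    acc = 0j
    for i in range(n):
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / n)  # Hamming (assumed)
        acc += window * samples[i] * cmath.exp(-2j * math.pi * q * i / n)
    return abs(acc) / n

# A 440 Hz sine responds strongly in the 440 Hz bin and weakly an octave up.
rate = 8000
tone = [math.sin(2 * math.pi * 440 * i / rate) for i in range(1000)]
print(constant_q_bin(tone, rate, 440.0, 12))
print(constant_q_bin(tone, rate, 880.0, 12))

# Seconds of audio analysed for the lowest bin, with one bin per pixel row.
print(window_length(44100, 27.5, 96) / 44100)
```

Because the window length is inversely proportional to the centre frequency, the lowest bins analyse long stretches of audio. Under the assumptions above (96 bins per octave at 44100Hz) the 27.5Hz bin spans about five seconds, which may be one reason for the smeared timing at the bottom of the picture.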
Graphical programs that can directly display log-frequency-axis spectrograms are:
- the free audio editor Audacity (http://www.audacityteam.org), though the output is blockier than ours
- the free audio file viewer "sonic-visualiser" (http://www.sonicvisualiser.org), which also has a Constant-Q spectrogram VAMP plugin
References
1. An efficient algorithm for the calculation of a constant Q transform, by Brown and Puckette.
2. The Constant Q Transform, an implementation in Matlab by Benjamin Blankertz.
3. An earlier implementation in more C-like C++ in a pitch detection plugin for Supercollider, licensed under GPL.
4. Constant-Q Transform Toolbox for Music Processing: an optimization in MATLAB of Brown and Puckette's efficient Constant-Q algorithm.
5. C++ Constant-Q at soundsoftware.co.uk, a C++ implementation of the above with a permissive license.