Monday, 13 April 2015

BioAcoustica: Does MP3 work for wildlife sound?

MP3 has become the dominant format for sharing digital music, while many researchers in bioacoustics still use uncompressed waveforms (WAV files). MP3 has the advantage of significantly smaller file size, which becomes a concern when creating potentially very large collections of sound, as we are doing with BioAcoustica. Storing and serving MP3 files would be much less resource-intensive than storing and serving WAV files.
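To give a rough sense of the size difference, the sketch below compares uncompressed PCM audio with a constant-bitrate MP3. The 44.1kHz sample rate, 16-bit depth, stereo channels and 60-second duration are illustrative assumptions, not properties of any particular BioAcoustica recording; the 320kbit/s bitrate matches the conversion discussed later in this post. The function names are hypothetical.

```python
def pcm_bytes(duration_s, sample_rate_hz=44_100, bit_depth=16, channels=2):
    """Size of uncompressed PCM audio data (ignores the small WAV header)."""
    return duration_s * sample_rate_hz * bit_depth * channels // 8


def cbr_mp3_bytes(duration_s, bitrate_kbps=320):
    """Approximate size of a constant-bitrate MP3 (ignores tags and padding)."""
    return duration_s * bitrate_kbps * 1000 // 8


# Assumed example: one minute of 16-bit stereo audio at 44.1 kHz.
wav_size = pcm_bytes(60)
mp3_size = cbr_mp3_bytes(60)
print(f"WAV: {wav_size / 1e6:.1f} MB, MP3: {mp3_size / 1e6:.1f} MB, "
      f"ratio: {wav_size / mp3_size:.2f}x")
```

Under these assumptions even the highest common MP3 bitrate is a little over four times smaller than the WAV, and lower bitrates or higher-resolution source recordings would widen the gap.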

All other things being equal, therefore, MP3 has an advantage over WAV. The disadvantage of MP3 is that it is a lossy compression format: some of the original data is lost when encoding a sound file as MP3. The general acceptance of the MP3 format for high-fidelity audio is due, in large part, to the MP3 standard discarding data that the ears and/or brain cannot detect or process. Frequencies above the threshold of human hearing (approximately 20kHz in young humans) can be discarded without a human listener noticing. The MP3 format also discards information that the brain does not perceive, using various methods from the field of psychoacoustics.

The examples below are comparisons of a WAV recording of the cicada Platypleura haglundi from this recording on BioAcoustica, and a high quality (320kbit/s) MP3 conversion of the same file.

Frequency Response
The frequency vs amplitude charts for the WAV (left) and MP3 (right) files show that, as well as removing any frequencies above 20kHz, the conversion to MP3 lowers the relative amplitude of the low-frequency sounds. On playback it is hard to tell the difference between the two files, showing that the MP3 format is capable of accurately reproducing how humans hear sound while discarding a lot of information.
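A comparison like this can be reduced to numbers by measuring the amplitude of individual frequencies before and after conversion. The sketch below is a minimal illustration using the Goertzel algorithm from the Python standard library alone; it is not the method used to produce the charts. The two signals are synthetic stand-ins (an "original" with a 21kHz component, and a "decoded" version with that component removed and the 500Hz component halved), and all names and values are hypothetical. Real use would require reading samples from the WAV file and from a decoded copy of the MP3.

```python
import math


def tone_amplitude(samples, sample_rate_hz, freq_hz):
    """Estimate the amplitude of one frequency component (Goertzel algorithm)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate_hz)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    power = s1 * s1 + s2 * s2 - coeff * s1 * s2
    # For a sine of amplitude A over N samples, sqrt(power) is about A * N / 2.
    return 2.0 * math.sqrt(max(power, 0.0)) / len(samples)


def level_change_db(reference, test, floor=1e-9):
    """Change in level from reference to test, in decibels (floored to avoid log(0))."""
    return 20.0 * math.log10(max(test, floor) / max(reference, floor))


def synth(tones, sample_rate_hz=48_000, n=4_800):
    """Sum of sine tones given as (freq_hz, amplitude) pairs."""
    return [
        sum(a * math.sin(2.0 * math.pi * f * i / sample_rate_hz) for f, a in tones)
        for i in range(n)
    ]


RATE = 48_000
# Synthetic stand-ins for illustration only, not real recordings.
original = synth([(500, 0.5), (5_000, 1.0), (21_000, 0.5)])
decoded = synth([(500, 0.25), (5_000, 1.0)])

for freq in (500, 5_000, 21_000):
    before = tone_amplitude(original, RATE, freq)
    after = tone_amplitude(decoded, RATE, freq)
    print(f"{freq:>6} Hz: {level_change_db(before, after):8.1f} dB")
```

The test frequencies are chosen to fall exactly on analysis bins for this signal length, which keeps the estimates clean; with real recordings a windowed FFT over many frames would be the more usual tool.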

The spectrograms of the WAV (top) and MP3 (bottom) files demonstrate the effect that discarding higher frequencies has on visually identifiable features in the call. The WAV file shows clearly repeating sounds in the 17-21kHz range, while this information is lost in the MP3.
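The loss of a repeating high-frequency feature can also be checked without a spectrogram, by tracking the energy in the 17-21kHz band frame by frame and counting how often it switches on. The following is a rough sketch under stated assumptions: the probe frequencies, frame length and threshold are arbitrary illustrative choices, the signals are synthetic (a steady 5kHz tone with or without 19kHz bursts), and the function names are hypothetical.

```python
import math


def band_energy(frame, sample_rate_hz, freqs_hz):
    """Sum of Goertzel power at a few probe frequencies inside a band."""
    total = 0.0
    for f in freqs_hz:
        coeff = 2.0 * math.cos(2.0 * math.pi * f / sample_rate_hz)
        s1 = s2 = 0.0
        for x in frame:
            s0 = x + coeff * s1 - s2
            s2, s1 = s1, s0
        total += s1 * s1 + s2 * s2 - coeff * s1 * s2
    return total


def count_pulses(samples, sample_rate_hz, freqs_hz, frame_len=480, threshold=1.0):
    """Count rising edges of band energy across consecutive frames."""
    pulses, was_on = 0, False
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        on = band_energy(frame, sample_rate_hz, freqs_hz) > threshold
        if on and not was_on:
            pulses += 1
        was_on = on
    return pulses


RATE, FRAME = 48_000, 480        # 10 ms frames at an assumed 48 kHz sample rate
PROBES = (18_000, 19_000, 20_000)  # assumed probe frequencies within 17-21 kHz


def make_signal(with_bursts, n=4_800):
    """Synthetic stand-in: 5 kHz tone, optionally with 19 kHz bursts every other frame."""
    out = []
    for i in range(n):
        x = math.sin(2.0 * math.pi * 5_000 * i / RATE)
        if with_bursts and (i // FRAME) % 2 == 0:
            x += 0.5 * math.sin(2.0 * math.pi * 19_000 * i / RATE)
        out.append(x)
    return out


print("with high band:", count_pulses(make_signal(True), RATE, PROBES))
print("high band removed:", count_pulses(make_signal(False), RATE, PROBES))
```

The threshold is an absolute value tied to this synthetic signal's scaling, so it would need tuning (or replacing with a relative measure) before being applied to real recordings.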

While high-quality MP3 files will prove adequate for those wishing to learn to identify acoustic species by ear, for other methods MP3 encoding can discard useful information. This is particularly true where components of the sound are above 20kHz, and in applications where the relative amplitudes of different frequencies may prove useful in identification.