Audio De-Thumping Using Huang's Empirical Mode Decomposition

Paulo A. A. Esquef and Guilherme S. Welter

Coordination of Systems and Control
National Laboratory for Scientific Computing - MCTI
Petrópolis, Brazil

Companion Webpage with Audio Examples of the Paper Published in the Proceedings of the 14th International Conference on Digital Audio Effects, 2011

1. Abstract

In the context of audio restoration, sound transfer of broken disks usually produces audio signals corrupted with long pulses of low-frequency content, also called thumps. This paper presents a method for audio de-thumping based on Huang's Empirical Mode Decomposition (EMD), provided the pulse locations are known beforehand. The EMD is used as a means to obtain pulse estimates, which are then subtracted from the degraded signals. Despite its simplicity, the method is shown to cope well with the challenging problem of superimposed pulses. Performance assessment against selected competing solutions reveals that the proposed solution tends to produce superior de-thumping results.
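
The idea stated in the abstract (decompose the degraded segment with the EMD, form a pulse estimate from the slow components, and subtract it) can be illustrated with the minimal Python sketch below. It is not the paper's implementation. It assumes piecewise-linear envelopes instead of the cubic splines normally used in EMD, a fixed number of sifting iterations, and a pulse estimate made of everything except the first n_keep IMFs. All function names and parameters (local_extrema, linear_envelope, sift_imf, emd, dethump, n_keep) are hypothetical.

```python
import math

def local_extrema(x):
    """Indices of interior local maxima and minima of the sequence x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima

def linear_envelope(x, idx):
    """Piecewise-linear envelope through x at idx, with end values held flat.
    Assumption: a simplification of the usual cubic-spline envelope."""
    knots = [(0, x[idx[0]])] + [(i, x[i]) for i in idx] + [(len(x) - 1, x[idx[-1]])]
    env, k = [], 0
    for n in range(len(x)):
        while n > knots[k + 1][0]:
            k += 1
        (n0, v0), (n1, v1) = knots[k], knots[k + 1]
        env.append(v0 + (v1 - v0) * (n - n0) / (n1 - n0))
    return env

def sift_imf(x, n_sift=10):
    """Extract one IMF by repeatedly removing the mean of the two envelopes."""
    h = list(x)
    for _ in range(n_sift):
        maxima, minima = local_extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            break
        upper = linear_envelope(h, maxima)
        lower = linear_envelope(h, minima)
        h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
    return h

def emd(x, max_imfs=8):
    """Return (imfs, residue) such that sum(imfs) + residue equals x."""
    imfs, residue = [], list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if len(maxima) < 2 or len(minima) < 2:
            break
        imf = sift_imf(residue)
        imfs.append(imf)
        residue = [r - m for r, m in zip(residue, imf)]
    return imfs, residue

def dethump(x, start, end, n_keep=1):
    """Subtract a pulse estimate from x[start:end]; the location is assumed known."""
    segment = x[start:end]
    imfs, _ = emd(segment)
    # Pulse estimate: the segment minus its n_keep fastest IMFs,
    # i.e. the slower IMFs plus the residue.
    pulse = [s - sum(imf[n] for imf in imfs[:n_keep]) for n, s in enumerate(segment)]
    restored = list(x)
    for n, p in enumerate(pulse):
        restored[start + n] -= p
    return restored, pulse

# Synthetic example: a fast tone corrupted by a slow decaying pulse at samples 100..499.
clean = [0.3 * math.sin(2 * math.pi * n / 8) for n in range(600)]
thump = [0.0] * 600
for n in range(100, 500):
    t = n - 100
    thump[n] = math.exp(-t / 120) * math.sin(2 * math.pi * t / 200)
corrupted = [c + p for c, p in zip(clean, thump)]
restored, pulse = dethump(corrupted, 100, 500)
```

Samples outside the given pulse region are left untouched, which reflects the requirement that the pulse locations be known beforehand. The choice of n_keep decides how much of the signal is treated as pulse, so in practice it would need to be calibrated per pulse.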

PDF version of the paper here.

2. Graphical User Interface for the Calibration of Processing Parameters (in Matlab)

Click here to download the ZIP file.

Click on the image below to download a demonstration video (.wmv) on the use of the GUI.

3. Reference Test Signals

a. pop: a 14-second-long excerpt of Finnish pop music with male and female singing;
b. jazz: an 8-second-long excerpt of jazz quartet music with drums, bass, guitar and sax;
c. classic: a 13-second-long excerpt of orchestral music with a continuously sustained bass chord, slowly varying string passage and percussion;
d. ethnic: an 11-second-long excerpt of Brazilian music featuring male singing, folk fiddle, and prominent percussion;
e. drums: an 11-second-long solo of jazz drums;
f. bass: a 13-second-long excerpt of acoustic bass with sparse notes;
g. singing: a 20-second-long excerpt of solo a cappella pop singing.

Click on the signal names above to download the corresponding .WAV files.

4. Corrupted and Restored Signals

PAQM of the corrupted signal and of each restored version
Signal	Corrupted	EMD	CEEMD	TPSW	ARS
pop	0.2012	0.0243*	0.0283	0.0362	0.0463
jazz	0.3175	0.0121*	0.0311	0.0276	0.0607
classic	0.3207	0.0105*	0.0233	0.0173	0.0334
ethnic	0.1716	0.0315	0.0299*	0.0510	0.0481
drums	0.4465	0.0289	0.0969	0.0486	0.0143*
bass	0.3975	0.0151*	0.0304	0.0312	0.1884
singing	0.6163	0.0916	0.2819	0.1105	0.0665*

PAQM = Perceptual Audio Quality Measure.
ARS = AR separation de-thumping method with processing setup defined in [1].
TPSW = two-pass split-window (TPSW) based audio de-thumping with processing setup defined in [1].
The smaller the PAQM (the closer to zero), the more perceptually similar to the reference the processed signal is expected to be.
For each reference signal, the smallest value among the restored versions is marked with an asterisk (*).
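
As a quick check of the table, the following Python sketch recomputes which restored version attains the smallest PAQM for each reference signal. The values are copied from the table; the variable names (methods, paqm, best, wins) are illustrative only.

```python
# PAQM of the restored versions, in the column order EMD, CEEMD, TPSW, ARS.
methods = ["EMD", "CEEMD", "TPSW", "ARS"]
paqm = {
    "pop":     [0.0243, 0.0283, 0.0362, 0.0463],
    "jazz":    [0.0121, 0.0311, 0.0276, 0.0607],
    "classic": [0.0105, 0.0233, 0.0173, 0.0334],
    "ethnic":  [0.0315, 0.0299, 0.0510, 0.0481],
    "drums":   [0.0289, 0.0969, 0.0486, 0.0143],
    "bass":    [0.0151, 0.0304, 0.0312, 0.1884],
    "singing": [0.0916, 0.2819, 0.1105, 0.0665],
}

# Method with the smallest PAQM for each signal.
best = {signal: methods[values.index(min(values))] for signal, values in paqm.items()}

# Number of signals for which each method gives the smallest PAQM.
wins = {m: sum(1 for b in best.values() if b == m) for m in methods}

for signal, method in best.items():
    print(f"{signal}: {method}")
print(wins)
```

With these figures, EMD gives the smallest PAQM for four of the seven signals, ARS for two, and CEEMD for one.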

[1] P. A. A. Esquef, L. W. P. Biscainho, and V. Välimäki, “An efficient algorithm for the restoration of audio signals corrupted with low-frequency pulses,” J. Audio Eng. Soc., vol. 51, no. 6, pp. 502–517, June 2003.

Author: Paulo Esquef.
Contact: pesquef@lncc.br
Last Modified: 03.08.2012