Acoustic Characterisation of Environments (ACE) Challenge
Researchers: James Eaton, Alastair H. Moore, Nikolay D. Gaubitch, Patrick Naylor
Several established parameters and metrics are used to characterise the acoustics of a room. The most important are the Direct-to-Reverberant Ratio (DRR), the Reverberation Time (T60) and the reflection coefficient. Room characteristics expressed through such parameters can be used to predict the quality and intelligibility of speech signals in that room. Several recent methods in speech enhancement and speech recognition outperform their predecessors but require knowledge of one or more fundamental acoustic parameters, such as T60. Traditionally, these parameters have been estimated from carefully measured Acoustic Impulse Responses (AIRs). In most applications, however, measuring the AIR is impractical or impossible. Consequently, there is increasing research activity in estimating such parameters directly from speech and audio signals.
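As a rough illustration of the traditional, AIR-based route, the sketch below estimates T60 and DRR from a measured AIR. It uses Schroeder backward integration with a straight-line fit to the decay curve, and treats a short window around the largest peak as the direct path. The function names, the -5 dB to -25 dB fit range, and the 2.5 ms half-window are assumptions chosen for illustration; this is not the procedure used to produce the ACE ground truth values.

```python
import numpy as np


def energy_decay_curve_db(air):
    """Schroeder energy decay curve in dB (backward-integrated squared AIR)."""
    air = np.asarray(air, dtype=float)
    energy = np.cumsum(air[::-1] ** 2)[::-1]
    # Guard against log10(0) when the AIR ends in digital silence.
    energy = np.maximum(energy, np.finfo(float).tiny)
    return 10.0 * np.log10(energy / energy[0])


def t60_from_air(air, fs, start_db=-5.0, end_db=-25.0):
    """Fit a line to the decay curve between start_db and end_db
    (assumed fit range) and extrapolate the slope to a 60 dB decay."""
    edc = energy_decay_curve_db(air)
    idx = np.flatnonzero((edc <= start_db) & (edc >= end_db))
    if idx.size < 2:
        raise ValueError("not enough decay range in the AIR for the fit")
    slope, _ = np.polyfit(idx / fs, edc[idx], 1)  # dB per second, negative
    return -60.0 / slope


def drr_from_air(air, fs, half_window_ms=2.5):
    """DRR in dB: energy in a window around the largest peak (taken as the
    direct path) over the energy after that window. The half-window length
    is an assumed value."""
    air = np.asarray(air, dtype=float)
    peak = int(np.argmax(np.abs(air)))
    half = int(round(half_window_ms * 1e-3 * fs))
    lo, hi = max(peak - half, 0), peak + half + 1
    direct = np.sum(air[lo:hi] ** 2)
    reverberant = np.sum(air[hi:] ** 2)
    return 10.0 * np.log10(direct / reverberant)
```

Both functions need the AIR itself, which is exactly what is unavailable in most applications; blind methods have to infer the same quantities from the reverberant speech alone.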
ACE Challenge
Overview
The ACE Challenge was part of the programme of Challenges organised by the IEEE Audio and Acoustic Signal Processing Technical Committee.
The aim of this challenge was to evaluate state-of-the-art algorithms for blind acoustic parameter estimation from speech and to promote this emerging area of research. Participants evaluated their algorithms for T60 and DRR estimation against the ‘ground truth’ values provided with the datasets, and were expected to present their results in a paper describing the method used.
- Data: A dataset designed specifically for the challenge tasks was provided, built from anechoic speech convolved with AIRs measured in real rooms, with additive noise recorded under the same conditions. It included speech from male and female talkers in rooms of different sizes and under different noise conditions, for a single microphone and for microphone arrays with two (laptop), three (mobile), five (cruciform), eight (linear), and thirty-two (spherical) microphones.
- Task 1: Single-microphone fullband T60 and DRR estimation
- Task 2: Multi-microphone fullband T60 and DRR estimation
- Task 3: Single-microphone T60 and DRR estimation in 1/3-octave ISO subbands
- Task 4: Multi-microphone T60 and DRR estimation in 1/3-octave ISO subbands
- Evaluation: The evaluation metrics were based on ground truth values determined using established techniques, and were reported across a range of dimensions in addition to T60 and DRR, such as SNR, talker, and utterance length.
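To make the idea of scoring estimates against ground truth concrete, the sketch below computes three commonly used summary statistics for a set of estimates: bias, mean squared error, and the Pearson correlation coefficient. The function name and the choice of statistics are assumptions for illustration, and the exact challenge scoring may differ in detail.

```python
import numpy as np


def estimation_metrics(estimates, ground_truth):
    """Summarise estimation error against ground truth values.

    Both inputs are sequences of equal length, for example T60 in seconds
    or DRR in dB for a set of test utterances.
    """
    est = np.asarray(estimates, dtype=float)
    ref = np.asarray(ground_truth, dtype=float)
    err = est - ref
    return {
        "bias": float(np.mean(err)),        # mean signed error
        "mse": float(np.mean(err ** 2)),    # mean squared error
        "pearson_r": float(np.corrcoef(est, ref)[0, 1]),
    }
```

Computing the same statistics separately for each SNR, talker, or utterance length shows how an algorithm's accuracy varies along those dimensions.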
ACE Challenge Results
We received over 100 results submissions from 9 teams across the world. Many thanks to all the participants. An ACE Corpus conference paper was presented at WASPAA 2015 in Lecture Session 5. The paper submissions by participants were presented as posters in Poster Session 3.
The papers presented are listed below:
In Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, 2015:
- J. Eaton, N. D. Gaubitch, A. H. Moore, and P. A. Naylor, The ACE challenge – corpus description and performance evaluation
- F. Lim, M. R. P. Thomas, P. A. Naylor and I. J. Tashev, Acoustic blur kernel with sliding window for blind estimation of reverberation time
- T. de M. Prego, A. A. de Lima, R. Zambrano-López, and S. L. Netto, Blind estimators for reverberation time and direct-to-reverberant energy ratio using subband speech decomposition
In Proc. ACE Challenge Workshop, a satellite event of WASPAA, New Paltz, NY, USA, 2015:
- H. Chen, P. N. Samarasinghe, T. D. Abhayapala, and W. Zhang, Estimation of the direct-to-reverberant energy ratio using a spherical microphone array
- J. Eaton and P. A. Naylor, Reverberation time estimation on the ACE corpus using the SDD method
- J. Eaton and P. A. Naylor, Direct-to-reverberant ratio estimation on the ACE corpus using a two-channel beamformer
- Y. Hioka and K. Niwa, PSD estimation in beamspace for estimating direct-to-reverberant ratio from a reverberant speech signal
- H. W. Löllmann, A. Brendel, P. Vary and W. Kellermann, Single-channel maximum-likelihood T60 estimation exploiting subband information
- P. P. Parada, D. Sharma, T. van Waterschoot, and P. A. Naylor, Evaluating the non-intrusive room acoustics algorithm with the ACE challenge
- M. Senoussaoui, J. F. Santos, and T. H. Falk, SRMR variants for improved blind room acoustics characterization
- F. Xiong, S. Goetze, and B. T. Meyer, Joint estimation of reverberation time and direct-to-reverberation ratio from speech using auditory inspired features
Analysis of the results is provided in an IEEE/ACM Transactions on Audio, Speech, and Language Processing journal paper:
- J. Eaton, N. D. Gaubitch, A. H. Moore, and P. A. Naylor, "Estimation of room acoustic parameters: The ACE Challenge," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 10, pp. 1681-1693, Oct. 2016.
The analysis is supported by a technical report containing full details of the challenge results and additional information on the room configurations:
- J. Eaton, N. D. Gaubitch, A. H. Moore, and P. A. Naylor, "ACE Challenge results technical report," Imperial College London, 2016.
ACE Corpus
The ACE corpus is freely available under the Creative Commons Attribution-NoDerivatives 4.0 International License.
The corpus and the ACE Challenge are described in the following journal paper:
- J. Eaton, N. D. Gaubitch, A. H. Moore, and P. A. Naylor, "Estimation of room acoustic parameters: The ACE Challenge," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 10, pp. 1681-1693, Oct. 2016.
Please cite this whenever you use any part of the corpus.
The corpus comprises the following components:
Documentation and software
- Corpus instructions including software operating instructions
- Software to generate new datasets from the corpus materials (Matlab)
- T60 and DRR measurements in fullband and ISO 266 preferred frequency bands
- Room dimensions and approximate positions of microphones and sources
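For readers unfamiliar with the ISO 266 preferred frequencies, the sketch below generates exact 1/3-octave band centre frequencies and band edges using the base-ten relation f = 1000 · 10^(n/10) Hz. The function name and the default band index range (100 Hz to 10 kHz) are assumptions for illustration; the bands actually covered by the corpus measurements are given in the corpus documentation, not here.

```python
import numpy as np


def third_octave_bands(n_min=-10, n_max=10):
    """Exact base-ten 1/3-octave bands for integer band indices n_min..n_max.

    Index 0 is the 1 kHz band. The default range is an assumed example
    spanning 100 Hz to 10 kHz.
    """
    n = np.arange(n_min, n_max + 1)
    centre = 1000.0 * 10.0 ** (n / 10.0)
    lower = centre * 10.0 ** (-1.0 / 20.0)  # lower band edge
    upper = centre * 10.0 ** (1.0 / 20.0)   # upper band edge
    return centre, lower, upper
```

The nominal preferred values quoted in practice are rounded versions of these exact frequencies; for instance, the band with exact centre near 199.5 Hz is labelled 200 Hz.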
Anechoic speech
The anechoic speech comprises a Development (Dev) set (4 male talkers, 2 utterances each) and an Evaluation (Eval) set (5 male and 5 female talkers, 5 utterances each), recorded in the anechoic chamber at TU Delft at fs=48 kHz in 16-bit format. Plain text (.txt) transcriptions of each .wav file are included.
RIRs and noise by microphone configuration
Each archive below contains the fs=48 kHz, 16-bit Room Impulse Response (RIR) .wav files and the ambient, fan and babble noise .wav files for each room and microphone position for that microphone configuration. All were recorded in 7 different rooms in the Dept. of Electrical and Electronic Engineering at Imperial College London.
- Single-channel (based on cruciform channel 1) 417 MB
- 2-channel laptop 1.05 GB
- 3-channel mobile 1.59 GB
- 5-channel cruciform 2.84 GB
- 8-channel linear 4.24 GB
- 32-channel spherical 14.2 GB
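A minimal single-channel sketch of how an RIR and a noise recording from these archives might be combined with anechoic speech to form a noisy reverberant signal. It assumes the three signals are already loaded as one-dimensional floating-point arrays at the same sampling rate, and it defines SNR as the ratio of mean powers over the whole reverberant signal. The function name and this SNR definition are assumptions and may differ from the Matlab software supplied with the corpus.

```python
import numpy as np


def make_noisy_reverberant(speech, rir, noise, snr_db):
    """Convolve anechoic speech with an RIR and add noise at a target SNR.

    SNR is defined here (an assumption) as mean reverberant speech power
    over mean scaled-noise power, measured across the full signal.
    """
    speech = np.asarray(speech, dtype=float)
    rir = np.asarray(rir, dtype=float)
    noise = np.asarray(noise, dtype=float)

    reverberant = np.convolve(speech, rir)
    if len(noise) < len(reverberant):
        raise ValueError("noise recording is shorter than the reverberant speech")
    noise = noise[: len(reverberant)]

    speech_power = np.mean(reverberant ** 2)
    noise_power = np.mean(noise ** 2)
    # Scale the noise so that speech_power / (gain**2 * noise_power)
    # equals 10 ** (snr_db / 10).
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return reverberant + gain * noise
```

For multi-channel configurations the same steps would be applied per channel, with a single noise gain shared across channels so that the inter-channel noise relationships are preserved. For long signals, an FFT-based convolution is considerably faster than direct convolution.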
The ACE Corpus speech, RIRs, and noises are available by registering here.
If you have already registered click here to obtain the corpus.
Update
An additional set of RIRs has been added to the corpus. It was recorded in Office 2 using the same equipment, with 7 DPA4060 microphones placed in various positions in the room, including the ceiling, wall, floor, table, under the table and in a bookcase. The T60 and DRR measurements and room dimensions are included, along with 7-channel ambient and fan noise recordings. The file size is 335 MB.
ACE Corpus by Imperial College London and University of Delft is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.
Contact us
Address
Speech and Audio Processing Lab
CSP Group, EEE Department
Imperial College London
Exhibition Road, London, SW7 2AZ, United Kingdom