University of Surrey Institute of Advanced Studies

Spatial audio and sensory evaluation techniques

Workshop Report

The first international workshop on 'Spatial audio and sensory evaluation techniques' was held at the University of Surrey on 6-7 April 2006, organised by Francis Rumsey and Slawomir Zielinski of the Institute of Sound Recording, in association with the Audio Engineering Society's British Section. There were 35 participants, including five PhD students from Surrey and a further five from other institutions. Delegates were drawn from a wide range of countries, including the USA and Canada (4) and Continental Europe (11).

As a result of the workshop, delegates identified the major issues to be addressed in forthcoming work on the sensory evaluation of spatial audio: in particular, the need for broader agreement about attribute definitions and reference stimuli, the importance of separating hedonic and descriptive judgments, the need to take account of listening context, and the importance of valid physical metrics for spatial audio quality evaluation. One of the most important academic outcomes of the seminar was the development of a greater degree of interdisciplinary understanding between those 'hard' scientists who might have wished that human variables could be minimised or eliminated, and those who felt such factors were really the key issues of future research. Some delegates made an overall plea for a pragmatic approach to future research that avoided too much obsession with the finer points of methodology, which could be summarised in the words "Let's get on and do it".

Programme committee: Søren Bech (B&O, Denmark), Jan Berg (Luleå University of Technology, Sweden), Durand Begault (NASA, USA), Gilbert Soulodre (CRC, Canada), Thomas Sporer (Fraunhofer Institute, Germany), Francis Rumsey (UniS, UK), Slawomir Zielinski (UniS, UK)

Day 1

Spatial audio and sensory evaluation techniques - context, history and aims

Francis Rumsey opened the workshop with a review of the current state of the art in this field, explaining the background to spatial sound quality evaluation and its roots in concert hall and loudspeaker studies. He cited experiments in which it was shown that spatial quality could contribute as much as 30% of overall audio quality evaluations, making it an important factor. In particular he opened up questions about the reference point for evaluation of spatial sound quality, in the light of the fact that most listeners do not have a 'concert hall original' available for comparison. Perhaps it is important to find out what makes spatial audio sound pleasing and what factors consumers care about.

International standards for sound quality evaluation

Thomas Sporer provided an overview of the international standards situation in sound quality evaluation, describing standards that relate to subjective evaluation as well as those that aim at 'objective' evaluation of perceived quality based on physical metrics. He showed that while some standards included options for evaluating spatial attributes, these were very rarely used, and that normally only a mean opinion score was obtained. Similarly, objective measures, even though they could evaluate stereo signals, did not take spatial quality changes into account.

Can Reproduced Sound be Evaluated using Measures Designed for Concert Halls?

Gilbert Soulodre went on to describe the background to spatial audio quality evaluation in concert hall studies. He showed how good and bad halls had been distinguished on the basis of a range of metrics, including some that related to spatial quality. Apparent source width and listener envelopment had been found to represent two key dimensions of spatial quality, and these were related to early and late parts of the reflected sound. In concert hall design such issues could make or break the hall's success and were integral to the design of the building. Careers could be at stake. In the case of concert halls, the audience for the acoustician's results consists of musicians, concertgoers, conductors, and so forth, whereas in audio subjective tests the audience is usually other audio engineers.

What are the requirements of a listening panel for evaluating spatial audio quality?

Nick Zacharov considered the various ways of choosing listeners to evaluate spatial audio quality using descriptive analysis. He outlined the principles of descriptive analysis, showing how it could be used to map consumers' hedonic responses onto expert ratings of descriptive attributes of stimuli. Based on ISO definitions used in other fields such as food science, he proceeded to try to define the difference between assessors with different degrees of expertise, and delegates entered into a spirited discussion of what these different categories might mean in the case of audio. He summarised the characteristics of assessor objectivity in terms of repeatability, agreement and discrimination, and looked at various ways of evaluating assessor performance.

How do we determine the attribute scales and questions that we should ask of subjects when evaluating spatial audio quality?

Jan Berg reviewed a range of different methods, many borrowed from other fields, that could be used to determine appropriate attribute scales and response formats for listener data. He discussed the issue of spatial audio scene complexity and the importance of defining questions carefully, as well as considering the differences between relative and absolute responses, and graphical and verbal response formats. Construct masking was an issue that could arise easily under some circumstances and it was important to be aware of its possibility. Some evidence from recent experiments in different centres suggested that the chosen method of language development might not be too crucial as similar results had been obtained using a range of methods.

Day 2

On some biases encountered in modern listening tests

Slawek Zielinski provided a strong set of reasons why hedonic responses would always be strongly biased and as such should be avoided in audio quality evaluation. He highlighted between- and within-subject inconsistency, discrepancies between words and actions, and situational, mood and context biases. He recommended that if one was interested in what listeners liked, then it would probably be more informative and ecologically valid to observe how they behave when left alone than to ask them questions under controlled conditions. Delegates were stimulated into a lively discussion of these issues, many wishing to defend the need for consumer response evaluation and feeling that one had to attempt to account for these factors. The fact that it was difficult did not mean one should avoid such experiments, although the author's points were indeed highly important.

Preference versus reference: listeners as participants in sound reproduction

Durand Begault tackled the difficult topic raised by Zielinski, considering the essential differences between experiments based on listener preference and those based on comparison to some form of reference signal. He pointed out that to date 'fidelity' had been a basic tenet of audio reproduction that assumed comparison to some form of reference version of a recording. An alternative model was to consider the listener as a participant in the reproduction process, giving rise to a more malleable concept of 'ideal reproduction'. Listeners adapt well to multiple frames of reference, and cognitive associations with specific environmental contexts may influence expectations. Taking these things into account in future may lead to a more ecologically valid approach to quality evaluation.

The sentimental quality of domestic sound recordings

David Frohlich had been studying consumer responses to sound recordings accompanying still photographs and had found that such short sound clips evoked strong cognitive and affective associations for the people that had taken them. Such types of responses could influence listener evaluations of sound recordings in unexpected ways and would contribute to the idiosyncratic features described earlier by Zielinski. Perhaps it was important to study such phenomena deliberately rather than attempt to avoid or eliminate them.

Contextual effects in sensory evaluation of spatial audio: Integral factor or nuisance?

William Martens brought these discussions together in a paper considering contextual effects in audio quality evaluation. Whereas it could be argued that these should be minimised in spatial audio experiments, there was an alternative position that could regard them as worthy of study in their own right. The suppositions of the experimenter and the various biases of the subject were key factors dictating the outcomes of experiments. The presuppositions of science are often taken for its findings. He showed the results of a recent experiment on multichannel microphone techniques that highlighted some of these factors.

Evaluating complex scenes and signals

Søren Bech chaired a closing discussion that attempted to address the question of evaluating complex scenes and signals by considering a particular problem in testing automotive audio systems. He provided panellists with a scenario involving the evaluation of the spatial attributes of a car audio system using typical audio programme material, which gave rise to lively discussions about the relative merits of different approaches. Simulations might be favoured in certain circumstances owing to practical and safety issues, but it was difficult when using simulations to take into account all the important contextual effects that would be present when driving.

The papers, presentations, and audio recordings of the proceedings may be found here.

Francis Rumsey
2 May 2006