A model of attention-driven scene analysis

Malcolm Slaney*, Trevor Agus, Shih-Chii Liu, Merve Kaya, Mounya Elhilali

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

Parsing complex acoustic scenes involves an intricate interplay between bottom-up, stimulus-driven salient elements in the scene and top-down, goal-directed mechanisms that shift our attention to particular parts of the scene. Here, we present a framework for exploring the interaction between these two processes in a simulated cocktail-party setting. The model shows improved digit recognition in a multi-talker environment when given the goal of tracking the source uttering the highest digit value. This work highlights the relevance of both data-driven and goal-driven processes in tackling real multi-talker, multi-source sound analysis.
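
The abstract does not spell out the model's mechanics, but the interaction it describes can be illustrated with a minimal sketch: a bottom-up salience cue combined with a top-down, goal-directed gain to decide which talker to attend. Everything below is an assumption for illustration only (the function names, the energy-based salience stand-in, and the multiplicative combination), not the authors' implementation.

```python
# Hypothetical sketch of bottom-up salience x top-down gain attention
# selection in a multi-talker mixture. Not the paper's actual model.

import numpy as np

def bottom_up_salience(spectrograms):
    """Stimulus-driven cue: each source's energy envelope, normalized
    across sources (a crude stand-in for an auditory saliency map)."""
    energy = np.array([np.sum(s ** 2, axis=0) for s in spectrograms])
    return energy / (energy.sum(axis=0, keepdims=True) + 1e-12)

def top_down_gain(recognized_digits):
    """Goal-directed cue: boost the source whose last recognized digit
    had the highest value (the tracking goal named in the abstract)."""
    gains = np.ones(len(recognized_digits))
    gains[int(np.argmax(recognized_digits))] += 1.0
    return gains / gains.sum()

def attend(spectrograms, recognized_digits):
    """Combine the two cues multiplicatively; return, per time frame,
    the index of the source that wins the attentional competition."""
    salience = bottom_up_salience(spectrograms)        # (sources, frames)
    gains = top_down_gain(recognized_digits)[:, None]  # (sources, 1)
    return np.argmax(salience * gains, axis=0)

# Example: two talkers (random spectrogram stand-ins); talker 1's last
# recognized digit (7) outranks talker 0's (3), so it gets extra gain.
rng = np.random.default_rng(0)
specs = [rng.random((64, 100)) for _ in range(2)]
print(attend(specs, recognized_digits=[3, 7])[:10])
```

The multiplicative combination is one common way to let a goal bias, rather than override, stimulus-driven salience: a sufficiently salient event from an unattended talker can still capture the winner-take-all selection.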

Original language: English
Title of host publication: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings
Pages: 145-148
Number of pages: 4
DOIs: https://doi.org/10.1109/ICASSP15465.2012
Publication status: Published - 23 Oct 2012
Externally published: Yes
Event: IEEE International Conference on Acoustics, Speech, and Signal Processing 2012 - Kyoto, Japan
Duration: 25 Mar 2012 - 30 Mar 2012

Conference

Conference: IEEE International Conference on Acoustics, Speech, and Signal Processing 2012
Abbreviated title: ICASSP'2012
Country/Territory: Japan
City: Kyoto
Period: 25/03/2012 - 30/03/2012

Keywords

  • Attention
  • Auditory Scene Analysis
  • Cognition
  • Digit Recognition
  • Saliency

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
