Target Detection and Tracking With Heterogeneous Sensors

Huiyu Zhou, Murtaza Taj, Andrea Cavallaro

Research output: Contribution to journal › Article

68 Citations (Scopus)
1 Downloads (Pure)

Abstract

We present a multimodal detection and tracking algorithm for sensors composed of a camera mounted between two microphones. Target localization relies on color-based change detection in the video modality and on time difference of arrival (TDOA) estimation between the two microphones in the audio modality. The TDOA is computed by multiband generalized cross correlation (GCC) analysis. The estimated directions of arrival are then postprocessed using a Riccati Kalman filter. Finally, the visual and audio estimates are integrated, at the likelihood level, into a particle filter (PF) that uses a zero-order motion model and a weighted probabilistic data association (WPDA) scheme. We demonstrate that Kalman filtering (KF) improves the accuracy of the audio source localization and that WPDA enhances the tracking performance of sensor fusion in reverberant scenarios. The combination of multiband GCC, KF, and WPDA within the particle filtering framework improves the performance of the algorithm in noisy scenarios. We also show how the proposed audiovisual tracker summarizes the observed scene by generating metadata that can be transmitted to other network nodes in place of the raw images, which enables very low bit rate communication. The generated metadata can also be used to detect and monitor events of interest.
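The audio step described in the abstract (TDOA between two microphones, then a direction of arrival) can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a plain single-band GCC with PHAT weighting instead of the multiband GCC analysis, omits the Kalman postprocessing, and assumes a far-field source. The function names, the 0.5 m microphone spacing, the 16 kHz sampling rate, and the speed of sound value are all assumptions chosen for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def gcc_phat_tdoa(sig_a, sig_b, fs, max_tau=None):
    """Estimate the delay (seconds) of sig_b relative to sig_a.

    Single-band GCC with PHAT weighting (a simplification of the
    multiband analysis named in the abstract).
    """
    n = len(sig_a) + len(sig_b)              # zero-pad to avoid circular wrap-around
    spec_a = np.fft.rfft(sig_a, n=n)
    spec_b = np.fft.rfft(sig_b, n=n)
    cross = spec_b * np.conj(spec_a)         # cross-power spectrum
    cross /= np.abs(cross) + 1e-12           # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)            # generalized cross correlation
    max_shift = n // 2
    if max_tau is not None:
        # Restrict the search to physically possible lags (at least one sample).
        max_shift = max(1, min(int(fs * max_tau), max_shift))
    # Reorder so that lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs


def tdoa_to_doa(tau, mic_distance, c=SPEED_OF_SOUND):
    """Far-field direction of arrival (radians from broadside) for one microphone pair."""
    return float(np.arcsin(np.clip(c * tau / mic_distance, -1.0, 1.0)))


# Synthetic check: white noise reaching microphone B five samples after microphone A.
fs = 16000                  # Hz, assumed sampling rate
mic_distance = 0.5          # m, hypothetical spacing (not taken from the paper)
rng = np.random.default_rng(0)
source = rng.standard_normal(4096)
delay = 5
mic_a = source
mic_b = np.concatenate((np.zeros(delay), source[:-delay]))

tau = gcc_phat_tdoa(mic_a, mic_b, fs, max_tau=mic_distance / SPEED_OF_SOUND)
print("estimated delay in samples:", round(tau * fs))
print("estimated DOA in degrees:", np.degrees(tdoa_to_doa(tau, mic_distance)))
```

Limiting the lag search to `mic_distance / SPEED_OF_SOUND` discards correlation peaks that no physical source position could produce, which is one simple way to reduce the influence of reverberation before any filtering stage.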
Original language: English
Pages (from-to): 503-513
Number of pages: 11
Journal: IEEE Journal of Selected Topics in Signal Processing
Volume: 2
Issue number: 4
DOI: 10.1109/JSTSP.2008.2001429
Publication status: Published - Aug 2008

Fingerprint

Microphones
Metadata
Target tracking
Direction of arrival
Sensors
Kalman filters
Fusion reactions
Cameras
Color
Communication
Time difference of arrival

Cite this

Zhou, Huiyu ; Taj, Murtaza ; Cavallaro, Andrea. / Target Detection and Tracking With Heterogeneous Sensors. In: IEEE Journal of Selected Topics in Signal Processing. 2008 ; Vol. 2, No. 4. pp. 503-513.
@article{160e9c8433724c6e8e96a37045265788,
title = "Target Detection and Tracking With Heterogeneous Sensors",
author = "Huiyu Zhou and Murtaza Taj and Andrea Cavallaro",
year = "2008",
month = "8",
doi = "10.1109/JSTSP.2008.2001429",
language = "English",
volume = "2",
pages = "503--513",
journal = "IEEE Journal of Selected Topics in Signal Processing",
issn = "1932-4553",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "4",

}

TY - JOUR

T1 - Target Detection and Tracking With Heterogeneous Sensors

AU - Zhou, Huiyu

AU - Taj, Murtaza

AU - Cavallaro, Andrea

PY - 2008/8

Y1 - 2008/8

UR - http://www.scopus.com/inward/record.url?scp=54049156157&partnerID=8YFLogxK

U2 - 10.1109/JSTSP.2008.2001429

DO - 10.1109/JSTSP.2008.2001429

M3 - Article

VL - 2

SP - 503

EP - 513

JO - IEEE Journal of Selected Topics in Signal Processing

JF - IEEE Journal of Selected Topics in Signal Processing

SN - 1932-4553

IS - 4

ER -