TY - JOUR
T1 - Development of a machine learning detector for North Atlantic humpback whale song
AU - Kather, Vincent
AU - Seipel, Fabian
AU - Berges, Benoit
AU - Davis, Genevieve
AU - Gibson, Catherine
AU - Harvey, Matt
AU - Henry, Lea-Anne
AU - Stevenson, Andrew
AU - Risch, Denise
PY - 2024
Y1 - 2024/3/1
AB - The study of humpback whale song using passive acoustic monitoring devices requires bioacousticians to manually review hours of audio recordings to annotate the signals. To vastly reduce the time of manual annotation through automation, a machine learning model was developed. Convolutional neural networks have made major advances in the previous decade, leading to a wide range of applications, including the detection of frequency modulated vocalizations by cetaceans. A large dataset of over 60 000 audio segments of 4 s length is collected from the North Atlantic and used to fine-tune an existing model for humpback whale song detection in the North Pacific (see Allen, Harvey, Harrell, Jansen, Merkens, Wall, Cattiau, and Oleson (2021). Front. Mar. Sci. 8, 607321). Furthermore, different data augmentation techniques (time-shift, noise augmentation, and masking) are used to artificially increase the variability within the training set. Retraining and augmentation yield F-score values of 0.88 on context window basis and 0.89 on hourly basis with false positive rates of 0.05 on context window basis and 0.01 on hourly basis. If necessary, usage and retraining of the existing model is made convenient by a framework (AcoDet, acoustic detector) built during this project. Combining the tools provided by this framework could save researchers hours of manual annotation time and, thus, accelerate their research.
KW - Humpback Whale
KW - Sound Spectrography
KW - Animals
KW - Time Factors
KW - Acoustics
KW - Seasons
KW - Vocalization, Animal
U2 - 10.1121/10.0025275
DO - 10.1121/10.0025275
M3 - Article
C2 - 38477612
SN - 1520-8524
VL - 155
SP - 2050
EP - 2064
JO - The Journal of the Acoustical Society of America
JF - The Journal of the Acoustical Society of America
IS - 3
ER -