IARIA Congress 2024, The 2024 IARIA Annual Congress on Frontiers in Science, Technology, Services, and Applications


Audio vs. Visual Approach to Monitor the Critically Endangered Species Atlapetes blancae: Developing Deep Learning Models with Limited Data

Authors:
Julian D. Santamaria P
Jhony H. Giraldo
Angelica Diaz-Pulido
Claudia Isaza

Keywords: Atlapetes blancae identification; Computer vision; Bioacoustics; Passive monitoring.

Abstract:
Applying artificial intelligence algorithms to passive animal monitoring is a cost-effective approach. This kind of data analysis permits detailed and efficient tracking of species, as exemplified by the case of the endemic Antioquia brushfinch (Atlapetes blancae). Atlapetes blancae is native to the high-elevation plateau of Santa Rosa de Osos in Antioquia, Colombia. The species is currently listed as critically endangered by the International Union for Conservation of Nature (IUCN), with a population estimated at approximately 108 individuals. Sound recorders and camera traps are important tools for long-term monitoring because they provide extensive data records. However, analyzing this data is labor-intensive and requires experts to process the large volume of information manually. Additionally, identifying acoustic patterns of Atlapetes blancae with artificial intelligence algorithms is difficult because of the lack of labeled data and the complexity of the vocalizations. This study introduces a novel methodology for real-environment audio analysis that addresses the challenge of unlabeled recordings using a semi-automatic approach. We leverage the Learning Algorithm for Multivariate Data Analysis (LAMDA) and the KiwiNet convolutional network architecture for audio recognition. Additionally, we analyze the videos using Multi-Layer Robust Principal Component Analysis (Multi-layer RPCA) to obtain cropped images from the video, which are then classified using a ResNet-18 architecture. Finally, we compare both models to identify their strengths and limitations. In a collection of 7,147 audio recordings and 17,159 videos, only 11 audio recordings and 48 videos contain Atlapetes blancae. Our approach achieves average F-measure scores of 0.823 and 0.562 for audio and video analysis, respectively. Notably, in this case, the audio model is more robust than the video model.
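As an illustration of why an F-measure is a natural metric for data this imbalanced, the following minimal Python sketch compares accuracy with the F-measure when only 11 of 7,147 recordings are positive. Only those two numbers come from the abstract. The function names and the detector's confusion counts are hypothetical and chosen for illustration; they are not results from the paper, and the abstract does not state how the reported scores were averaged.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from confusion counts; returns 0.0 when it is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all recordings that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


# From the abstract: 7,147 audio recordings, 11 of which contain the species.
total, positives = 7147, 11

# Baseline that never reports the species: every positive is missed.
never_acc = accuracy(tp=0, fp=0, fn=positives, tn=total - positives)
never_f1 = f_measure(tp=0, fp=0, fn=positives)

# HYPOTHETICAL detector (illustrative counts, not from the paper):
# finds 9 of the 11 positives and raises 3 false alarms.
det_acc = accuracy(tp=9, fp=3, fn=2, tn=total - positives - 3)
det_f1 = f_measure(tp=9, fp=3, fn=2)

print(f"never-detect: accuracy={never_acc:.4f}, F1={never_f1:.3f}")
print(f"hypothetical: accuracy={det_acc:.4f}, F1={det_f1:.3f}")
```

Under these assumptions the two classifiers have nearly identical accuracy, because true negatives dominate the count, while the F-measure separates them clearly since it ignores true negatives.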

Pages: 72 to 80

Copyright: Copyright (c) IARIA, 2024

Publication date: June 30, 2024

Published in: conference

ISBN: 978-1-68558-180-0

Location: Porto, Portugal

Dates: from June 30, 2024 to July 4, 2024