J Acoust Soc Am. 2025 Jan 1;157(1):328-339. doi: 10.1121/10.0034865.
ABSTRACT
Odontocetes can dynamically change their echolocation clicks to detect targets efficiently, and learning their clicking strategy can facilitate the design of man-made detection signals. In this study, we developed deep convolutional generative adversarial networks guided by an acoustic feature vector (AF-DCGANs) to synthesize narrowband clicks of the finless porpoise (Neophocaena phocaenoides sunameri) and broadband clicks of the bottlenose dolphin (Tursiops truncatus). The average short-time objective intelligibility (STOI), spectral correlation coefficient (Spe-CORR), waveform correlation coefficient (Wave-CORR), and dynamic time warping distance (DTW-Distance) of the synthetic clicks were 0.975, 0.968, 0.877, and 0.992, respectively. AF-DCGAN outperformed the minimum phase signal reconstruction (MPSR) method and vector quantized variational autoencoders (VQ-VAE) by 5.9% and 3.7% in STOI, 5.2% and 3.5% in Spe-CORR, and 5.8% and 2.8% in Wave-CORR, respectively. In addition, AF-DCGAN reduced DTW-Distances by 29.9% and 9.4% compared to MPSR and VQ-VAE, respectively. Results showed that AF-DCGAN was robust in synthesizing both narrowband and broadband clicks and can produce a substantial number of high-fidelity odontocete clicks with flexibility in modulating parameters. Employing AF-DCGAN to synthesize odontocete-like clicks could advance the development of a click database, offering promising applications in the research of biomimetic target detection and recognition.
PMID:39821639 | DOI:10.1121/10.0034865
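The abstract names three similarity measures (Wave-CORR, Spe-CORR, DTW-Distance) without defining them. The sketch below shows one plausible way to compute such measures with NumPy, assuming Pearson correlation on the waveform, Pearson correlation on the magnitude spectrum, and textbook dynamic time warping with an absolute-difference cost. These definitions, the function names, and the toy Gaussian-windowed click are illustrative assumptions, not the authors' implementation; in particular, the paper's DTW-Distance may be normalized differently, and STOI is omitted because it needs a dedicated implementation.

```python
import numpy as np


def wave_corr(x, y):
    """Pearson correlation between two equal-length waveforms (assumed definition)."""
    return float(np.corrcoef(x, y)[0, 1])


def spe_corr(x, y):
    """Pearson correlation between the magnitude spectra of two waveforms (assumed definition)."""
    mag_x = np.abs(np.fft.rfft(x))
    mag_y = np.abs(np.fft.rfft(y))
    return float(np.corrcoef(mag_x, mag_y)[0, 1])


def dtw_distance(x, y):
    """Textbook O(n*m) dynamic time warping with |x_i - y_j| as the local cost."""
    n, m = len(x), len(y)
    acc = np.full((n + 1, m + 1), np.inf)  # accumulated-cost matrix
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return float(acc[n, m])


def toy_click(fc, fs=500_000.0, n=256, width=20e-6):
    """Hypothetical stand-in for a click: a Gaussian-windowed cosine at fc Hz."""
    t = (np.arange(n) - n / 2) / fs
    return np.exp(-0.5 * (t / width) ** 2) * np.cos(2 * np.pi * fc * t)


# Toy data only: a "reference" click and a noisy copy standing in for a synthetic one.
reference = toy_click(fc=125_000.0)
rng = np.random.default_rng(0)
synthetic = reference + 0.05 * rng.standard_normal(reference.size)

print("Wave-CORR:", wave_corr(reference, synthetic))
print("Spe-CORR:", spe_corr(reference, synthetic))
print("DTW distance:", dtw_distance(reference, synthetic))
```

Under these assumed definitions, higher correlation values and lower DTW distances indicate a synthetic click that is closer to its reference.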