AS-FCRNet: a lightweight multi-frame acoustic–seismic fusion network for high-precision ground moving target recognition on UGS

Zheyu Liu College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China
Kunsheng Xing College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China
Wei Wang College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China
Nan Wang College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China

Article ID: 3645

Keywords: fusion of acoustic and seismic features, multi-frame temporal modeling, lightweight neural network, deep learning, long short-term memory (LSTM)

Abstract

High-dimensional data make acoustic and seismic signal classification computationally expensive and complicate temporal modeling. To address both problems, this paper proposes a multi-stage dimensionality reduction and classification framework that integrates Mel-frequency spectrum feature extraction, a convolutional neural network (CNN), and a long short-term memory (LSTM) network. Through progressive feature compression and acoustic-seismic feature fusion, the method significantly reduces computational complexity while maintaining competitive classification accuracy. First, Mel-frequency spectrum features are extracted from the dual-channel input (acoustic and seismic), yielding perceptually motivated features aligned with human auditory characteristics. Next, a lightweight CNN extracts further features from the log-Mel energy representations. In the fusion stage, three strategies for combining the acoustic and seismic signals (information-level, feature-level, and decision-level fusion) are compared to identify the best one; the fused features are then compressed into a short vector for subsequent temporal modeling. Finally, a sequence of compact feature vectors from consecutive frames (e.g., four-frame segments) is fed into an LSTM network to capture temporal dependencies, and classification is based on the output of the last time step. Experimental results show that the proposed approach balances inference efficiency and model performance, achieving accurate and reliable classification at low computational cost.
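The data flow described in the abstract (per-frame fusion, compression to a short vector, an LSTM over four frames, and classification from the last time step) can be illustrated with a minimal sketch. It is not the authors' implementation: the dimensions (`FEAT`, `FUSED`, `HIDDEN`, `CLASSES`), the random untrained weights, the omission of bias terms, and the choice of feature-level fusion by concatenation are all assumptions made for illustration. The abstract does not state which fusion strategy performed best, and the per-frame input vectors here stand in for CNN outputs that are not modeled.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rows, cols, rng, scale=0.1):
    """Random untrained weights, for illustration only."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def fuse_feature_level(acoustic_vec, seismic_vec, W_proj):
    """Concatenate the two per-frame feature vectors, then compress them
    into a short vector with a linear projection and tanh."""
    fused = acoustic_vec + seismic_vec  # list concatenation
    return [math.tanh(z) for z in matvec(W_proj, fused)]

def lstm_step(x, h, c, W, hidden):
    """One LSTM step without biases. W maps [x; h] to the stacked gate
    pre-activations in the order input, forget, candidate, output."""
    z = matvec(W, x + h)
    i = [sigmoid(v) for v in z[0:hidden]]
    f = [sigmoid(v) for v in z[hidden:2 * hidden]]
    g = [math.tanh(v) for v in z[2 * hidden:3 * hidden]]
    o = [sigmoid(v) for v in z[3 * hidden:4 * hidden]]
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def classify_sequence(frames, params):
    """Run the LSTM over all frames and classify from the last hidden state."""
    hidden = params["hidden"]
    h = [0.0] * hidden
    c = [0.0] * hidden
    for acoustic_vec, seismic_vec in frames:
        x = fuse_feature_level(acoustic_vec, seismic_vec, params["W_proj"])
        h, c = lstm_step(x, h, c, params["W_lstm"], hidden)
    return softmax(matvec(params["W_out"], h))

# Hypothetical sizes: 16 features per channel, 8-dim fused vector,
# 8 hidden units, 3 target classes, 4 consecutive frames.
FEAT, FUSED, HIDDEN, CLASSES, FRAMES = 16, 8, 8, 3, 4
rng = random.Random(0)
params = {
    "hidden": HIDDEN,
    "W_proj": rand_matrix(FUSED, 2 * FEAT, rng),
    "W_lstm": rand_matrix(4 * HIDDEN, FUSED + HIDDEN, rng),
    "W_out": rand_matrix(CLASSES, HIDDEN, rng),
}
# Stand-in per-frame features for the acoustic and seismic channels.
frames = [
    ([rng.random() for _ in range(FEAT)], [rng.random() for _ in range(FEAT)])
    for _ in range(FRAMES)
]
probs = classify_sequence(frames, params)
print(len(frames), "frames ->", len(probs), "class probabilities")
```

Because each frame is compressed before it reaches the recurrent layer, the LSTM input size depends only on the fused vector length (`FUSED` in this sketch), not on the dimensionality of the original spectrogram. This is one way to read the "progressive feature compression" the abstract refers to.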

Published

2025-08-29

How to Cite

Liu, Z., Xing, K., Wang, W., & Wang, N. (2025). AS-FCRNet: a lightweight multi-frame acoustic–seismic fusion network for high-precision ground moving target recognition on UGS. Sound & Vibration, 59(4). https://doi.org/10.59400/sv3645

Issue

Vol. 59 No. 4 (2025)

Section

Article

This work is licensed under a Creative Commons Attribution 4.0 International License.


