Small Sample Coherent DOA Estimation Method Based on S2S Neural Network Meta Reinforcement Learning

Existing neural-network-based Direction of Arrival (DOA) methods require a large number of samples to achieve signal-scene adaptation and accurate angle estimation. In a coherent signal environment, even more training data are required. In thi...

Bibliographic Details
Main authors:	Wu, Zihan; Wang, Jun
Format:	Online Article Text
Language:	English
Published:	MDPI 2023
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9918895/ https://www.ncbi.nlm.nih.gov/pubmed/36772585 http://dx.doi.org/10.3390/s23031546

collection	PubMed
description	Existing neural-network-based direction of arrival (DOA) methods require a large number of samples to achieve signal-scene adaptation and accurate angle estimation, and a coherent signal environment requires even more training data. In this paper, coherent DOA estimation is converted into estimating the angle interval of the incident signal, and accurate coherent DOA estimation under small-sample conditions is realized with meta-reinforcement learning (MRL). The MRL method models angle interval estimation for coherent signals as a Markov decision process. In the inner loop, a sequence-to-sequence (S2S) neural network represents the angle interval feature sequence of the incident signal DOA. By exploiting the contextual relevance of the spatial spectrum sequence, the S2S network learns a policy for the existence of angle intervals from small samples. Following the optimal policy, the output sequence is determined step by step to give the angle interval of the incident signal. Finally, the DOA is obtained by a one-dimensional spectral peak search within the estimated angle interval. Experiments show that, when a new signal environment appears, the S2S-based MRL algorithm converges quickly to the optimal state by updating only the S2S network parameters with gradient steps on a small sample set.
id	pubmed-9918895
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
spelling	pubmed-9918895 2023-02-12. Wu, Zihan; Wang, Jun. Sensors (Basel). Article. MDPI 2023-01-31. /pmc/articles/PMC9918895/ /pubmed/36772585 http://dx.doi.org/10.3390/s23031546 Text en. © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
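The last step named in the abstract, a one-dimensional spectral peak search restricted to an estimated angle interval, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `peak_in_interval` and `toy_spectrum`, the Gaussian-bump spectrum, the 0.5 degree grid, and the example intervals are all assumptions made for illustration. In the paper the intervals would come from the S2S policy and the spectrum from the array data.

```python
import math

def peak_in_interval(angles_deg, spectrum, interval):
    """Return the grid angle with the largest spectrum value inside interval.

    angles_deg: scan grid in degrees
    spectrum:   spatial spectrum values on that grid
    interval:   (low, high) angle interval in degrees, inclusive
    """
    low, high = interval
    # Keep only grid points that fall inside the estimated interval.
    candidates = [
        (p, a) for a, p in zip(angles_deg, spectrum) if low <= a <= high
    ]
    if not candidates:
        raise ValueError("angle interval contains no grid points")
    _, best_angle = max(candidates)  # largest spectrum value wins
    return best_angle

def toy_spectrum(angles_deg, sources_deg, width_deg=2.0):
    """Hypothetical smooth spectrum: a sum of Gaussian bumps at the sources."""
    return [
        sum(math.exp(-((a - s) / width_deg) ** 2) for s in sources_deg)
        for a in angles_deg
    ]

# Assumed scan grid: -90 to 90 degrees in 0.5 degree steps.
angles = [x * 0.5 for x in range(-180, 181)]
spectrum = toy_spectrum(angles, sources_deg=[-20.0, 35.0])

# Assumed angle intervals, standing in for the output of the learned policy.
intervals = [(-30.0, -10.0), (30.0, 40.0)]
estimates = [peak_in_interval(angles, spectrum, iv) for iv in intervals]
print(estimates)
```

Restricting the search to each interval means only a small part of the grid is scanned per source, and peaks outside the selected intervals cannot be picked up by mistake.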
