Abstract
In recent years, deep learning techniques for motor imagery (MI) electroencephalography (EEG) decoding have shown great potential for advancing brain–computer interfaces (BCI). However, non-linearities and subject differences in EEG signals negatively impact the robustness, applicability, and generalization of MI-EEG decoding, hindering BCI development. This paper proposes a novel network based on Bridge structures and Attention mechanisms (BANet). Inspired by transformer architectures, BANet incorporates multiple innovative Bridge blocks to extract temporal features of EEG signals from both local and global perspectives. These blocks model complex temporal relationships, enabling BANet to leverage information from diverse angles and enhance decoding performance. Additionally, attention mechanisms and inception architecture are employed to focus on and extract EEG features across multiple dimensions and scales, further optimizing the utilization of temporal-spatial features. In various validation experiments, BANet demonstrates superior performance compared with recent representative deep learning methods, confirming its effectiveness in extracting local and global temporal and spatial features and its potential for developing robust BCI applications.
Introduction
Brain–Computer Interface (BCI) technology is a multidisciplinary field that merges artificial intelligence (AI) and neuroscience. It monitors brain activity through sensors and AI, facilitating direct interaction between the human brain and external devices. As an innovative neural computing technology, BCI is implemented in applications such as robotic arms [1], Virtual Reality (VR) systems [2], and wheelchairs [3].
Motor imagery (MI) is a widely adopted paradigm in BCI systems, utilizing electroencephalogram (EEG) signals generated when users mentally imagine body movements. This technique enhances BCI applications by enabling control without interference from the external environment [4]. During the capture of MI EEG signals, the cerebral cortex produces two rhythmic signals: the mu rhythm (8–13 Hz) and the beta rhythm (13–32 Hz). Additionally, the associated brain regions display Event Related Desynchronization (ERD) and Event Related Synchronization (ERS) phenomena. These phenomena are critical for understanding the neural mechanisms underlying MI-based BCI systems [5]. Machine learning plays a crucial role in the development of BCI systems, particularly in classification tasks. Several advanced algorithms, such as Artificial Neural Networks (ANNs) [6], Support Vector Machines (SVMs) [7], and Bayesian classifiers [8], have been employed. However, traditional machine learning approaches often struggle with EEG decoding because of their limited ability to extract features effectively. For instance, Common Spatial Patterns (CSP) can extract spatial features but are susceptible to noise and artifacts that may obscure critical information, often neglecting the temporal aspects of EEG signals.
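The mu (8–13 Hz) and beta (13–32 Hz) rhythms described above are typically isolated by band-pass filtering before ERD/ERS analysis. The following is a minimal, dependency-free numpy sketch (not from the paper) using a crude FFT brick-wall filter on a synthetic channel; real pipelines would use proper IIR/FIR filters, and the 250 Hz sampling rate is the one used by the BCI Competition IV-2a recordings.

```python
import numpy as np

def bandpass_fft(x, fs, low, high):
    """Zero out FFT bins outside [low, high] Hz and invert.
    A crude brick-wall bandpass, kept dependency-free for illustration."""
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.shape[-1], d=1.0 / fs)
    X[(freqs < low) | (freqs > high)] = 0.0
    return np.fft.irfft(X, n=x.shape[-1])

fs = 250  # sampling rate of BCI Competition IV-2a (Hz)
t = np.arange(0, 2.0, 1.0 / fs)
rng = np.random.default_rng(0)
# Synthetic channel: 10 Hz mu component + weaker 20 Hz beta component + noise
eeg = (np.sin(2 * np.pi * 10 * t)
       + 0.5 * np.sin(2 * np.pi * 20 * t)
       + 0.1 * rng.standard_normal(t.size))

mu = bandpass_fft(eeg, fs, 8, 13)     # mu rhythm (8-13 Hz)
beta = bandpass_fft(eeg, fs, 13, 32)  # beta rhythm (13-32 Hz)

# Band power is the quantity tracked for ERD (power drop) / ERS (power rise)
mu_power = np.mean(mu ** 2)
beta_power = np.mean(beta ** 2)
```

In this toy signal the mu component dominates, so its band power comes out larger; on real MI trials these band powers are compared against a pre-cue baseline to detect ERD/ERS.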
With the advancement of deep learning, its applications are expanding across various fields [9]. By harnessing its strengths in processing high-dimensional data and modeling non-linear relationships, researchers are progressively utilizing deep learning methods to interpret complex MI-EEG signals [4]. Different neural network architectures are employed to enhance feature extraction and improve decoding accuracy. For example, Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, and Autoencoders (AEs) are effective models in this context. Notably, Poonam et al. [10] combined Sparse Nonnegative Matrix Factorization with CNNs, achieving better performance compared to using CNNs alone. Wang et al. [11] proposed a network named IFNet, which enhances the mutual learning of spatial features between low-frequency and high-frequency bands. Liu et al. [12] introduced a weight-sharing network that integrates CNN and LSTM, reducing computational load and speeding up network training. Additionally, Phadikar et al. [13] proposed a motor imagery EEG decoding framework that maps EEG signals into AEs weight vectors, extracts multiple discriminative features, and employs SVM for multiclass classification.
EEG signals are typically analyzed for motor imagery (MI) task classification by extracting combinations of spatial, spectral, and temporal features. Among these, the integration of spatial and temporal features has attracted considerable interest from researchers. Liu et al. [14] proposed an adaptive 3D CNN that assigned significant weights to spatial channels related to motor information and temporal features, effectively reducing the impact of artifacts with EEG temporal-spatial features. Temporal Convolutional Networks (TCNs) have shown great potential for decoding temporal features [15]. Ingolfsson et al. [16] combined EEGNet and TCN architectures, achieving improved extraction of temporal-spatial features and classification performance. Zhao et al. [17] proposed a model that integrates CNNs with a novel attention mechanism within the Transformer architecture to improve the modeling of temporal-spatial feature dependencies in MI-EEG signals.
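The TCNs cited above [15], [16] are built from causal dilated 1-D convolutions: the output at time t may depend only on present and past samples, and dilation grows the receptive field exponentially with depth. A minimal numpy sketch of that core operation (illustrative only, not the paper's implementation):

```python
import numpy as np

def causal_dilated_conv(x, w, dilation):
    """1-D causal convolution with dilation: the output at time t sees
    only x[t], x[t-d], x[t-2d], ... so no future samples leak in."""
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])  # left-pad to keep causality
    return np.array([
        sum(w[j] * xp[i + pad - j * dilation] for j in range(k))
        for i in range(len(x))
    ])

x = np.arange(6, dtype=float)  # toy "EEG" sequence 0..5
y = causal_dilated_conv(x, w=[1.0, 1.0], dilation=2)
# y[t] = x[t] + x[t-2]  ->  [0, 1, 2, 4, 6, 8]
```

Stacking such layers with dilations 1, 2, 4, ... lets a TCN cover long EEG windows with few parameters, which is why it pairs well with the compact spatial filtering of EEGNet-style front ends.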
Furthermore, attention mechanisms are increasingly being adopted in MI-EEG decoding. The attention mechanism allows networks to discern which parts of the data are significant and which are not, facilitating a form of perceptual and attentional behavior similar to that of humans. Liu et al. [18] introduced a network that combined temporal-spatial attention blocks to enhance subject-specific MI-EEG classification. Sun et al. [19] used squeeze-and-excitation (SE) attention blocks with CNNs to improve the robustness of image feature extraction. Amin et al. [20] introduced an attention-inception CNN combined with LSTM, resulting in a model with fewer parameters and lower processing time for MI-EEG decoding. Similarly, some researchers have explored the combination of attention mechanisms and temporal convolutional networks. Zhang et al. [21] proposed AMSTCNet, which incorporates SE and Efficient Channel Attention (ECA) blocks. This architecture is designed to capture deep temporal information from the signal and dynamically fuse features at various scales. Altaheri et al. [22] utilized multi-head self-attention (MSA) in conjunction with a TCN to highlight the most important information in EEG time series signals. Additionally, Liang et al. [23] proposed a model consisting of an inception block, MSA, a TCN, and layer fusion to improve the interpretability of the EEG decoding framework.
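The ECA mechanism referenced above follows a squeeze-then-gate pattern: global average pooling collapses each channel to a scalar, a small 1-D convolution models local cross-channel interaction, and a sigmoid produces per-channel weights. A numpy sketch of that idea on an EEG-shaped array (the kernel values here are arbitrary stand-ins for learned weights):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def eca_attention(x, kernel):
    """ECA-style channel attention on x of shape (channels, time):
    squeeze via global average pooling, model local cross-channel
    interaction with a 1-D conv, then rescale each channel."""
    squeeze = x.mean(axis=1)                      # (C,) one scalar per channel
    k = len(kernel)
    s = np.pad(squeeze, k // 2, mode="edge")      # same-length 1-D conv
    logits = np.array([np.dot(kernel, s[i:i + k])
                       for i in range(len(squeeze))])
    weights = sigmoid(logits)                     # per-channel gate in (0, 1)
    return x * weights[:, None], weights

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 100))                 # 4 EEG channels, 100 samples
out, w = eca_attention(x, kernel=np.array([0.2, 0.6, 0.2]))
```

Because the gate acts per channel, informative electrodes are amplified and redundant ones suppressed without any fully connected bottleneck, which keeps the parameter count low.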
In this paper, aiming to enhance the model’s capability for extracting temporal-spatial features, thereby improving robustness, applicability, and generalization while advancing real-world applications of BCIs, a novel method named BANet is proposed. This method utilizes a bridge structure and multiple attention mechanisms for decoding MI-EEG signals. By integrating the Convolutional ECA block, Bridge block, and Inception-based temporal convolutional network (TCN) block, BANet effectively captures both local and global temporal and spatial features in the signal, thereby enhancing the representation of temporal-spatial features. This approach overcomes the limitations of CNNs and LSTMs in capturing global information and comprehensively understanding complex features, while also addressing the inadequate local feature extraction typically observed in Transformer structures. Experimental results indicate that the proposed network achieves outstanding decoding performance and enhances the model’s real-time deployment capabilities, highlighting its practical application potential in BCI technology. To facilitate researchers in improving the network proposed in this paper, our network settings can be found in our GitHub repository: https://github.com/woaixueximax/BANet.
The main contributions and innovations of this research are summarized as follows:
- 1)
An efficient end-to-end BANet is proposed for MI-EEG classification, achieving improved robustness, applicability, and generalization by simultaneously addressing local and global temporal features, and spatial features. The model is evaluated on the BCI Competition IV-2a (BCI-2a) dataset (four classification tasks) and the 2b dataset (two classification tasks), and is compared with EEGNet (2018) [24], EEGNeX (2024) [25], EEG-TCNet (2020) [16], TransNet (2024) [26], Conformer (2023) [27], ATCNet (2023) [22], EISATC-Fusion (2024) [23].
- 2)
To model the complex temporal relationships of EEG features and leverage information in diverse perspectives, a novel Bridge Block is proposed that effectively and comprehensively extracts both global and local temporal features.
- 3)
To enhance the model’s focus on critical features while minimizing redundant information, the ECA and MSA mechanisms have been integrated into the proposed network. This integrated approach prioritizes important spatial and temporal features by assigning greater weight to highly correlated MI-EEG features, facilitating more efficient decoding.
- 4)
Three experiments are designed to evaluate the robustness, applicability, and generalization of the proposed model. Additionally, the t-SNE and Grad-CAM methods are employed to visually assess the decoding process and the model’s focus on critical temporal-spatial features, thereby validating its effectiveness in extracting both local and global temporal and spatial features.
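The MSA mechanism named in contribution 3 reduces, per head, to scaled dot-product self-attention: every time step of the feature sequence attends to every other step, which is how global temporal dependencies are captured. A single-head numpy sketch (random projection matrices stand in for learned ones):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerically stable softmax
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention over a (T, d)
    feature sequence: each time step attends to all T steps."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = softmax(q @ k.T / np.sqrt(k.shape[-1]))  # (T, T) attention map
    return scores @ v, scores

rng = np.random.default_rng(1)
T, d = 8, 4                                  # 8 time steps, 4-dim features
x = rng.standard_normal((T, d))
wq, wk, wv = (rng.standard_normal((d, d)) for _ in range(3))
out, attn = self_attention(x, wq, wk, wv)
```

Multi-head attention simply runs several such maps in parallel on projected subspaces and concatenates the results; the (T, T) attention map is also what visualization methods inspect to see which time segments the model weights most.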
The remainder of this paper is structured as follows: Section 2 introduces the proposed model and data preprocessing methods; Section 3 details the experimental setup and analyzes the results; and Section 4 provides a summary of this work.
Section snippets
Dataset description and preprocessing
To evaluate the decoding performance of BANet, the BCI Competition IV datasets 2a [28] and 2b [29] are used as network inputs; the two datasets differ in their channel counts and classification tasks.
Experimental details and performance metrics
Three types of experiments are conducted (within-subject, short-train, and cross-subject) to evaluate the performance of the proposed network. The within-subject experiment is designed to evaluate the model’s decoding performance. In this experiment, the model is trained using the first session of the BCI-2a dataset and the first three sessions of the BCI-2b dataset, with testing performed on the last session of BCI-2a and the last two sessions of BCI-2b. This experiment allows for the
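The standard performance metrics on these benchmarks are classification accuracy and Cohen's kappa, which corrects accuracy for chance agreement. A small self-contained sketch of the kappa computation (the toy labels below are illustrative, not results from the paper):

```python
import numpy as np

def cohens_kappa(y_true, y_pred, n_classes):
    """Cohen's kappa = (p_o - p_e) / (1 - p_e): observed accuracy p_o
    corrected by the chance agreement p_e of the two label marginals."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    p_o = np.mean(y_true == y_pred)                    # observed accuracy
    p_e = sum(np.mean(y_true == c) * np.mean(y_pred == c)
              for c in range(n_classes))               # chance agreement
    return (p_o - p_e) / (1.0 - p_e)

# Toy 4-class example in the spirit of BCI-2a
# (classes: left hand, right hand, feet, tongue)
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 0, 1, 2, 2, 2, 3, 1]
kappa = cohens_kappa(y_true, y_pred, n_classes=4)  # 0.75 accuracy -> kappa 2/3
```

For a balanced four-class task, kappa = 0 corresponds to the 25% chance level and kappa = 1 to perfect decoding, which makes it more comparable across the 4-class BCI-2a and 2-class BCI-2b settings than raw accuracy.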
Conclusion
This paper presents BANet, a multi-attention network with an integrated bridge block designed for decoding MI-EEG signals. The architecture of BANet consists of three designed blocks and two attention modules, which are capable of emphasizing significant temporal and spatial features. The bridge block facilitates a comprehensive extraction of features, enabling the capture of both local and global EEG signal features. Experiments have been designed to evaluate BANet’s performance across
CRediT authorship contribution statement
Xuejian Wu: Writing – review & editing, Writing – original draft, Visualization, Resources, Methodology, Data curation, Conceptualization. Yaqi Chu: Visualization, Supervision, Methodology, Funding acquisition. Yang Luo: Validation, Supervision, Investigation. Yiwen Zhao: Supervision, Investigation, Formal analysis. Xingang Zhao: Supervision, Resources, Methodology.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Xuejian Wu received his B.E. degree in Mechanical Engineering from Wuhan University of Technology in 2023. He is currently pursuing a Ph.D. degree at the State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences. Additionally, he is affiliated with the University of Chinese Academy of Sciences. His research focuses on biomedical signal processing, Brain–Computer Interface (BCI), pattern recognition, and deep learning.

Yaqi Chu received the M.E. degree from Shenyang Ligong University, Shenyang, China, in 2015, and the Ph.D. degree in mechatronic engineering from the University of Chinese Academy of Sciences, Beijing, China, in 2021.
He is currently an assistant professor with the Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, China. His research interests include EEG/EMG signal decoding, brain–computer interface, human–machine interaction, and AI-driven robot control.

Yang Luo received the M.E. and Ph.D. degrees in mechatronic engineering from Harbin Institute of Technology, in 2014 and 2019, respectively. He is currently an associate professor with the Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, China. His research interests include human–machine interaction, robot systems, and control.

Yiwen Zhao received the B.Sc. degree in control science and engineering and the M.Sc. degree in mechanical and electrical engineering from Harbin Institute of Technology, in 1995 and 1997, respectively, and the Ph.D. degree in mechanical and electrical engineering from Shenyang Institute of Automation, Chinese Academy of Sciences, in 2000. Since 2000, he has been with the State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, where he is currently a Professor. His research interests include medical robots, autonomous mobile robots, and intelligent system control.

Xingang Zhao received the B.E. and M.E. degrees in mechanics from Jilin University, in 2000 and 2004, respectively, and the Ph.D. degree in pattern recognition and intelligent systems from Shenyang Institute of Automation, Chinese Academy of Sciences, in 2008. From 2015 to 2016, he was a Visiting Scientist at the Rehabilitation Institute of Chicago, Chicago, USA. He is currently a Professor at Shenyang Institute of Automation, Chinese Academy of Sciences. His research interests include medical robots, rehabilitation robots, robot control, and pattern recognition.
This research was mainly funded by the National Natural Science Foundation of China under Grants 62203430, 62273335, U22A2067, and 92048203, the National Key Research and Development Program under Grants 2024YFB4709804 and 2022YFB4703200, the Independent Project of the State Key Laboratory of Robotics under Grant 2023-Z05, and in part by the Natural Science Foundation of Liaoning Province under Grant 2023JH26/10200017. (Corresponding authors: Yaqi Chu and Xingang Zhao.)
© 2026 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
