Introduction
Women’s health is a critical pillar of global public health, encompassing physical, mental, and reproductive well-being across all stages of life. Owing to a combination of biological, social, and cultural factors, women are disproportionately affected by certain health conditions [1]. According to the World Health Organization (WHO), over 35 million new cancer cases are projected by 2050, a 77% increase from the estimated 20 million cases in 2022 [2]. Among these, gynecological cancers (GCs) significantly impact women’s health and quality of life and pose a growing burden on healthcare systems worldwide. GCs encompass a range of malignancies, including cervical, ovarian, uterine (especially endometrial), and vaginal cancers, which collectively rank among the most prevalent and life-threatening conditions affecting women. Beyond their direct physical effects, these cancers also affect mental health and may predispose patients to second primary malignancies. Studies show that women diagnosed with one GC have an elevated risk of developing another, with a mean interval of approximately 60 months between the primary and secondary diagnoses [3].
Globally, the age-standardized rates (ASR) per 100,000 women are estimated at 13.3 for cervical cancer, 6.0 for endometrial cancer, and 3.4 for ovarian cancer [4]. Cervical cancer ranks as the second most common cancer among women, following breast cancer, with uterine and ovarian cancers in third and fourth place, respectively. These statistics underscore the critical need for early and accurate diagnosis. Despite its importance, the diagnostic process for GCs faces several challenges. It largely depends on the manual interpretation of cytological and histopathological images by medical specialists [5]. Given the complex anatomical structures involved, such manual processes are prone to misdiagnosis, potentially leading to treatment delays or life-threatening consequences. These challenges are particularly acute in resource-limited settings, where access to expert care is often lacking. Therefore, there is a pressing need to develop automated, accurate, and interpretable diagnostic systems to support timely and effective clinical decision-making.
In response, researchers have turned to artificial intelligence and deep learning (DL) methods, which have achieved great success in medicine, particularly in medical imaging tasks such as lesion classification [6], [7], [8], [9], the analysis of complex organs [10], and brain tumor detection [11], [12]. The ability of convolutional neural networks (CNNs) to discover and exploit intricate patterns, demonstrated in these examples and many others, makes this approach promising. However, despite their power and their potential application in detecting GCs, these networks still face challenges that hinder their effectiveness. One such challenge is poor image clarity and sharpness, which causes significant recognition failures, especially in tissue and cancer imagery. Another arises when datasets contain far more cancerous cells than normal cells, which biases the model and undermines the reliability of its predictions [13], [14], [15]. Furthermore, as CNNs and other deep models have spread into virtually every field, including medicine, there is growing concern about their “black-box” nature: models take inputs and produce outputs without an understandable rationale [16]. There is therefore a need to understand how a model arrives at its classifications, since deploying opaque models in medicine risks incorrect, potentially life-threatening decisions made without a reliable basis.
In 2016, the Defense Advanced Research Projects Agency (DARPA) observed that dramatic success in machine learning had led to a torrent of AI applications, and that continued advances promised autonomous systems that would perceive, learn, decide, and act on their own. However, the effectiveness of such systems is limited by the machine’s current inability to explain its decisions and actions to human users. DARPA therefore launched the Explainable AI (XAI) program, which aims to produce more explainable models while maintaining a high level of learning performance (prediction accuracy), enabling human users to understand, appropriately trust, and effectively manage the emerging generation of artificially intelligent partners [17]. Nevertheless, XAI has not been widely applied to disease and tumor classification, especially for GCs, leaving a gap between scientific research and practical, trustworthy medical applications [18], [19], [20]. To bridge this gap, this study is guided by the following research questions:
- Q1: Is it possible to build a single infrastructure model capable of handling different types of data with the same accuracy?
- Q2: How effective is the proposed preprocessing and augmentation pipeline in addressing noise and class imbalance in datasets?
- Q3: Can combining the strengths of CNNs improve the accuracy of GC image classification compared to individual architectures?
- Q4: To what extent can integrating XAI methods provide clinically meaningful explanations for machine predictions?
- Q5: Can the high-accuracy models built be relied upon in clinical applications?
To answer these questions, this study proposes the RIRXEnsemble model, a novel ensemble of deep CNNs designed for the early and accurate classification of GC images. The ensemble integrates three powerful architectures: ResNet50V2, InceptionResNetV2, and Xception, each independently pre-trained to ensure diversity in feature extraction and reduce model bias. These models were selected for their complementary strengths: ResNet50V2 enhances gradient flow and classification performance through residual learning with pre-activation; InceptionResNetV2 combines multi-scale feature learning with the efficiency of residual connections, improving depth and representation; and Xception leverages depthwise separable convolutions to achieve high accuracy with fewer parameters.
The outputs of these sub-models are fused at the feature level, enabling the ensemble to generate more robust and generalized predictions across both histopathological and cytological image types. To further enhance performance, the model incorporates a preprocessing pipeline that reduces image noise and highlights diagnostically relevant structures, addressing issues common in low-quality medical images. In addition, data augmentation techniques were applied to balance class distributions, mitigate model bias, and improve generalizability. Recognizing the importance of transparency in clinical settings, the model integrates XAI techniques such as Grad-CAM and LIME. These provide interpretable visual explanations of the model’s predictions, helping clinicians understand and trust the system’s decisions. Finally, a web-based application was developed to allow real-time classification of medical images. This platform enables users to upload images, receive prediction results, and view visual interpretations, making the system practical and accessible in low-resource or remote healthcare settings.
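Since the implementation itself is not shown in this excerpt, the feature-level fusion step can be sketched as follows. The `fuse_features` and `softmax_head` helpers and the random inputs are illustrative assumptions, not the authors' actual code; the per-backbone dimensions (2048 for ResNet50V2 and Xception, 1536 for InceptionResNetV2) are the standard pooled feature sizes of those architectures.

```python
import numpy as np

# Standard pooled-feature dimensions of the three backbones.
DIMS = {"resnet50v2": 2048, "inceptionresnetv2": 1536, "xception": 2048}

def fuse_features(feature_maps):
    """Feature-level fusion: concatenate the pooled feature vectors
    produced by each independently trained backbone."""
    return np.concatenate(
        [feature_maps[name] for name in sorted(feature_maps)], axis=-1)

def softmax_head(fused, weights, bias):
    """A minimal dense + softmax classification head over the fused vector."""
    logits = fused @ weights + bias
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# One simulated image: a random feature vector per backbone.
rng = np.random.default_rng(0)
feats = {name: rng.normal(size=(1, d)) for name, d in DIMS.items()}
fused = fuse_features(feats)          # shape (1, 2048 + 1536 + 2048)

n_classes = 4
probs = softmax_head(fused,
                     rng.normal(size=(fused.shape[-1], n_classes)) * 0.01,
                     np.zeros(n_classes))
```

Concatenation keeps each backbone's representation intact, so the shared head can weigh complementary evidence from all three; the fused vector here has 5632 dimensions.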
The main contributions of this study are as follows:
- Proposing the RIRXEnsemble model that combines ResNet50V2, InceptionResNetV2, and Xception to improve classification accuracy and robustness.
- Designing a comprehensive preprocessing and augmentation pipeline to enhance image quality and address class imbalance.
- Training sub-models independently to extract diverse features, which are then fused to form a highly accurate ensemble.
- Demonstrating the model’s effectiveness on both cytological (CIVa) and histopathological (Herlev) datasets.
- Proving the model’s generalizability by testing it on the Mendeley LBC dataset.
- Incorporating XAI techniques (Grad-CAM and LIME) to provide clear, interpretable visual explanations of predictions.
- Developing an interactive web platform for uploading, classifying, and interpreting gynecological cancer cell images in real time.
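As a hedged illustration of the class-balancing contribution above, the sketch below oversamples minority classes with label-preserving augmented copies. The specific transforms (flips and 90-degree rotations) are assumptions for illustration; the paper's exact augmentation set is described in its methodology section.

```python
import numpy as np

def augment(img, rng):
    """Apply one randomly chosen label-preserving transform
    (flip or 90-degree rotation) to an H x W x C image array."""
    op = int(rng.integers(4))
    if op == 0:
        return np.flip(img, axis=0)   # vertical flip
    if op == 1:
        return np.flip(img, axis=1)   # horizontal flip
    return np.rot90(img, k=op)        # 180 or 270 degree rotation

def oversample_to_balance(images_by_class, rng):
    """Oversample minority classes with augmented copies until every
    class has as many images as the largest class."""
    target = max(len(v) for v in images_by_class.values())
    balanced = {}
    for label, imgs in images_by_class.items():
        out = list(imgs)
        while len(out) < target:
            out.append(augment(imgs[int(rng.integers(len(imgs)))], rng))
        balanced[label] = out
    return balanced

rng = np.random.default_rng(42)
data = {
    "normal":   [rng.random((32, 32, 3)) for _ in range(10)],
    "abnormal": [rng.random((32, 32, 3)) for _ in range(4)],
}
balanced = oversample_to_balance(data, rng)
```

Balancing the training set this way prevents the majority class from dominating the loss, one of the bias sources noted in the introduction.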
The remainder of this paper is structured as follows: Section 2 presents a comprehensive overview of previous studies on GC diagnosis using DL and XAI methods. Section 3 describes the proposed methodology, including the dataset description, data preprocessing techniques, data augmentation methods, the architecture of the RIRXEnsemble model, training procedures, XAI techniques, and web application development. Section 4 presents the experimental results, detailed performance, and evaluation metrics, followed by a comprehensive discussion and validation to confirm the reliability and significance of the findings; it also presents the results of integrating the XAI techniques and the final web application. Section 5 summarizes the key conclusions and findings of the study and outlines future work.
Literature review
Due to continuous developments worldwide, the emergence of modern technologies, and the spread of cancer and other diseases, the field of medicine has seen growing interest in the use of technology and artificial intelligence techniques. DL has the potential to reshape healthcare, save lives, and improve health and well-being [21]. CNNs have been used by many researchers to detect GCs across different types of datasets, including histopathological, cytological, and MRI, due to the
Methodology
This section describes the methodology adopted in our study, which includes all stages involved in building and evaluating the proposed framework for GCs diagnosis. The proposed framework follows a systematic workflow that begins with acquiring the datasets of GCs from patient image collections, as described in Section 3.1. Next, image pre-processing techniques, detailed in Section 3.2, are applied to remove noise from the images and enhance image clarity, followed by the enhancement process to
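The excerpt cuts off before the pre-processing details, but a minimal noise-reduction step of the kind described, here a 3x3 median filter written in plain NumPy purely for illustration (the paper's actual pipeline is given in its Section 3.2), could look like:

```python
import numpy as np

def median_filter3(img):
    """3x3 median filter on a 2-D grayscale image (edge-padded),
    a common way to suppress salt-and-pepper noise before training."""
    padded = np.pad(img, 1, mode="edge")
    # Stack the nine 3x3-neighborhood shifts and take the pixelwise median.
    stack = np.stack([padded[r:r + img.shape[0], c:c + img.shape[1]]
                      for r in range(3) for c in range(3)])
    return np.median(stack, axis=0)

# A flat image corrupted by one impulse-noise pixel is fully restored.
img = np.full((5, 5), 100.0)
img[2, 2] = 255.0           # salt noise
clean = median_filter3(img)
```

A median filter removes isolated impulse noise without blurring edges as much as a mean filter would, which matters for preserving cell boundaries in cytological images.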
Results and discussion
This section provides a comprehensive evaluation and discussion of the performance of the individual models and the proposed RIRXEnsemble model across datasets. Section 4.1 describes the settings used to train the proposed models, Section 4.2 details the metrics used in the evaluation process, and Section 4.3 explains the training procedures for each dataset. Section 4.4 presents an ablation study conducted on the preprocessing pipeline to verify the effectiveness of the
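The standard metrics such an evaluation reports (accuracy, precision, recall, F1-score) can all be derived from the confusion matrix. The sketch below uses made-up counts for a two-class case, not the paper's actual results:

```python
import numpy as np

def metrics_from_confusion(cm):
    """Per-class precision/recall/F1 and overall accuracy from a
    confusion matrix cm[true_class, predicted_class]."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    precision = tp / cm.sum(axis=0)   # correct / predicted-as-class
    recall = tp / cm.sum(axis=1)      # correct / actually-in-class
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    return accuracy, precision, recall, f1

# Illustrative counts: 90 TN, 10 FP, 5 FN, 95 TP.
cm = [[90, 10],
      [5, 95]]
acc, prec, rec, f1 = metrics_from_confusion(cm)
```

For these counts, accuracy is 185/200 = 0.925 and the positive-class F1 is 2*95/(2*95 + 10 + 5) = 190/205. Reporting per-class recall alongside accuracy matters in imbalanced medical datasets, where accuracy alone can mask poor sensitivity on the minority (cancerous) class.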
Conclusions and future work
In this study, we proposed the RIRXEnsemble model, an interpretable ensemble DL framework for the histopathological and cytological classification of GC diseases, aiming to enhance diagnostic performance while maintaining the transparency and trustworthiness required for medical decisions. The model combines the strengths of multiple deep CNN architectures, whose independently extracted features are integrated through feature-level fusion. In addition, a series of successive preprocessing operations was
Compliance with ethical standards
This article does not contain studies with human participants or animals carried out by any of the authors.
Funding
Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R746), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
CRediT authorship contribution statement
Marwa M. Emam: Writing – review & editing, Visualization, Software, Methodology, Formal analysis, Conceptualization. Doaa S. Ibrahim: Writing – review & editing, Validation, Software, Resources, Data curation. Nagwan Abdel Samee: Writing – review & editing, Validation, Formal analysis. Essam H. Houssein: Writing – review & editing, Supervision, Methodology, Formal analysis.
Declaration of competing interest
No conflict of interest exists.
Acknowledgements
The authors extend their appreciation to the Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R746), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
