Introduction

Drones, or Unmanned Aerial Vehicles (UAVs), are now widely used across multiple sectors, including military operations, logistics, agriculture, surveillance and disaster management [1]. Because UAVs can take aerial photos and monitor large areas in real time, security and surveillance applications have drawn a lot of attention [2]. Recent advancements in artificial intelligence (AI), particularly deep learning (DL), have enabled the integration of intelligent computer vision systems into drones, thereby greatly enhancing their autonomous decision-making capabilities [3]. One such critical capability is face detection and recognition, which allows UAVs to identify and track individuals from aerial footage [4]. This technology uses deep learning architectures to accurately detect human faces and match them against a database, even under varying conditions of altitude, angle and lighting [5].

Despite the significant potential of deep learning-based face recognition systems on UAVs, several technical and operational challenges still need to be addressed [6]. To begin with, factors such as low resolution at high altitudes, motion blur, occlusions, and varying illumination cause substantial fluctuations in image quality, making aerial face recognition inherently more challenging [7]. Secondly, real-time processing constraints on drones with limited onboard computational resources create major challenges, particularly for computationally intensive deep learning models [8]. Moreover, variations in face orientation and background clutter further complicate accurate detection and identification [9]. There are also privacy concerns, ethical issues and regulatory barriers associated with using facial recognition technologies in public spaces, particularly from airborne platforms [10]. Ensuring robustness, efficiency and fairness in these systems requires solving both hardware limitations and algorithmic complexities [11].
A rising need for autonomous and scalable surveillance solutions is met by the incorporation of DL-based face detection and recognition algorithms into drones [12,13]. In large-scale public events, border security, law enforcement activities and search and rescue efforts, drones equipped with face recognition can help rapidly identify persons of interest, locate missing individuals and enhance situational awareness without requiring extensive human involvement [14,15]. Moreover, in crisis zones or inaccessible areas, UAVs serve as the only viable means of aerial monitoring, providing a critical edge in time-sensitive scenarios [16,17]. The fusion of UAV mobility with deep learning vision systems enables automated, real-time and intelligent decision-making from the sky [[18], [19], [20]]. Thus, developing efficient, accurate and lightweight face recognition algorithms tailored for drone platforms is essential for advancing next-generation surveillance and reconnaissance technologies [[21], [22], [23], [24], [25]].

Several studies in the literature have addressed face detection and recognition on drones using deep learning; recent works are reviewed below.

Rostami et al. [26] presented a DL-driven framework for face detection and recognition on drones. Their method makes use of a state-of-the-art face detection and identification model designed to improve recognition performance, especially when query images are taken from long distances or at high altitudes, where few facial details are available. The objective was to employ deep neural networks for these tasks and achieve above-average recognition accuracy. The suggested framework was evaluated experimentally on the Drone Face dataset against state-of-the-art models, demonstrating competitive performance across all detection and recognition protocols. Overall, drones equipped with DL-based face detection and recognition were able to identify individuals from aerial viewpoints in real time with high accuracy and efficiency.

Almalki et al. [27] presented the coupling of multi-function drones with AI to supply wireless services in the fight against the coronavirus epidemic. The suggested drone-eye system uses a CNN with a Modified Artificial Neural Network (MANN) to recognise whether people in the community are wearing face masks, and is equipped with thermal imaging cameras and an AI framework. The device performs simple diagnostic functions, such as detecting elevated body temperature, to lessen the likelihood of illness spreading through close contact. The advantages of their strategy include quick and flexible deployment for tasks such as crowd surveillance, thermal scanning, and contactless delivery, which improve containment and response times.

Madasamy et al. [28] introduced OSDDY, a deep-YOLO embedded system-driven object surveillance detection method using a tiny drone, built on the deep You Only Look Once (deep YOLO V3) technique for detecting numerous objects. During the training and testing stages, their method examines the entire frame. Object detection is an interdisciplinary field within computer vision, and a reliable object detection relay is essential for vehicle detection, pose estimation, and surveillance. The suggested method was trained on a large number of small-drone images captured in a variety of settings, including an open field and a maritime area with a complicated background. Advantages of OSDDY include real-time object surveillance, enhanced detection accuracy, and flexibility in deployment thanks to its embedded-system integration with small drones and its use of the deep YOLO algorithm.

Diez-Tomillo et al. [29] introduced UWS-YOLO, a machine learning technique based on convolutional neural networks (CNNs) designed to meet the rigorous requirements of UAV deployment. The main advantages of UWS-YOLO are its outstanding speed, precision and capacity to manage intricate UAV operations. For real-time facial recognition in UAV applications, the algorithm offers a portable and well-balanced solution.

Gowroju et al. [30] suggested LBPNet, a CNN-based machine learning model built on Local Binary Patterns (LBP) that can identify fake face images. Since the suggested system uses the LBP technique for feature extraction, LBPNet is compared with NLBPNet. Furthermore, forged-feature learning is made possible by the suggested paired learning technique, which enables the detection module to recognise fake images produced by a new GAN even if that GAN was not used during training. The speed and efficiency of the detection process can be further increased by utilising a drone-based system to take high-resolution images and videos from various angles.
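As background, the classical LBP operator referred to above can be sketched as follows. This is a generic illustration of LBP feature extraction over 3x3 neighbourhoods, not the authors' LBPNet code:

```python
import numpy as np

def lbp_code(patch):
    """Compute the 8-bit LBP code for a 3x3 patch.

    Each of the 8 neighbours is compared with the centre pixel;
    neighbours >= centre contribute a 1-bit, read clockwise from
    the top-left corner.
    """
    centre = patch[1, 1]
    # Clockwise neighbour order starting at the top-left pixel.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2),
               (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r, c] >= centre:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """Slide a 3x3 window over the image and histogram the LBP codes.

    The 256-bin histogram is the texture descriptor typically fed to
    a downstream classifier.
    """
    h, w = image.shape
    hist = np.zeros(256, dtype=int)
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            hist[lbp_code(image[r - 1:r + 2, c - 1:c + 2])] += 1
    return hist
```

On a perfectly flat patch every neighbour equals the centre, so all eight bits are set and the code is 255; real implementations often use rotation-invariant or uniform-pattern variants of this basic scheme.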

Khan et al. [31] suggested an architecture modelled after multi-task cascaded convolutional neural networks (MTCNN), hence the name MTCNN++. In the framework, the layer density changes as the number of neurons rises. MTCNN has three internal networks: P-Net, R-Net, and O-Net. The performance of the enhanced Net-Layer MTCNN (MTCNN++) was found to be comparable to or better than that of the MTCNN library. Moreover, a 20 % dropout was applied to improve the face recognition framework in terms of both face clarity and face count. Because the preprocessing is done dynamically, MTCNN++ performs better than previous versions.
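The three-stage cascade that MTCNN-style detectors use can be illustrated with a minimal sketch: each stage rescores the surviving candidate boxes and discards low-confidence ones. The thresholds and scorer stubs below are illustrative assumptions, not the MTCNN++ implementation:

```python
# Illustrative confidence thresholds for the three cascade stages;
# the actual values used by MTCNN/MTCNN++ are not reproduced here.
STAGE_THRESHOLDS = {"p_net": 0.6, "r_net": 0.7, "o_net": 0.8}

def run_stage(candidates, scorer, threshold):
    """Score each candidate box and keep those above the stage threshold."""
    return [box for box in candidates if scorer(box) >= threshold]

def cascade_detect(candidates, p_net, r_net, o_net):
    """Pass candidate windows through P-Net -> R-Net -> O-Net.

    Each stage sees only the boxes that survived the previous one,
    which is what makes the cascade cheap: the expensive O-Net runs
    on a small fraction of the original proposals.
    """
    boxes = run_stage(candidates, p_net, STAGE_THRESHOLDS["p_net"])
    boxes = run_stage(boxes, r_net, STAGE_THRESHOLDS["r_net"])
    boxes = run_stage(boxes, o_net, STAGE_THRESHOLDS["o_net"])
    return boxes
```

In a real detector each stage is a small CNN that also refines box coordinates and (in O-Net) regresses facial landmarks; here the scorers are plain callables to show only the filtering structure.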

Yu et al. [32] presented YOLO-FaceV2, a cutting-edge real-time face detector based on the YOLOv5 architecture. To enlarge the receptive field and gather multi-scale pixel data for accurate small-face detection, the method introduces a Receptive Field Enhancement (RFE) module. The Separated and Enhancement Attention Module (SEAM) is an attention mechanism that targets occlusion-affected regions to address occlusion issues. Furthermore, a Slide Weight Function (SWF) was proposed to reduce the imbalance between difficult and simple samples.
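A slide-style sample-weighting function of the kind described can be sketched as below. The piecewise form and the 0.1 margin are illustrative assumptions; the exact function in YOLO-FaceV2 may differ:

```python
import math

def slide_weight(iou, mu):
    """Illustrative slide-style sample weighting.

    `mu` is a threshold (e.g. the mean IoU over samples) separating
    easy from hard examples; samples near that boundary receive
    exponentially boosted weights so the detector focuses on them.
    Sketch only - not the exact YOLO-FaceV2 formula.
    """
    if iou <= mu - 0.1:
        # Clearly hard samples: plain unit weight.
        return 1.0
    if iou < mu:
        # Boundary region just below the threshold: boosted weight.
        return math.exp(1.0 - mu)
    # Easy samples: weight decays as the IoU grows.
    return math.exp(1.0 - iou)
```

The key property is that a sample near the easy/hard boundary (`iou` just below `mu`) outweighs a comfortably easy one, which counteracts the dominance of easy samples in the loss.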

Despite the growing advancements in DL-based face detection and recognition methods for drone applications, they suffer from notable drawbacks. Many existing approaches exhibit reduced performance under occlusion, low-resolution imagery, or high-altitude conditions, which limits their robustness in real-world drone environments. Some rely heavily on specific datasets, reducing generalizability across diverse conditions. Others lack adaptability to evolving threats like GAN-generated fake faces or perform inadequately when encountering unseen variations. Additionally, methods such as CNN-based or YOLO variants, while effective in certain scenarios, lack the fine-grained spatial attention or dynamic feature enhancement necessary for consistent small-face detection, particularly in crowded or cluttered scenes. These disadvantages have inspired this effort.

The novelty of the proposed FDRD-LAPINN-CLO approach lies in the development of an optimized Loss-Attentional Physics-Informed Neural Network that uniquely combines physics-based learning with attention-driven feature refinement for drone-driven face detection and recognition. Unlike conventional methods, it integrates the MCQKF for robust image pre-processing and the Synchro-Transient-Extracting Transform (STET) for precise extraction of diverse facial shapes. The model’s performance is further enhanced using Clouded Leopard Optimization (CLO) to fine-tune network weights, enabling high accuracy in gender classification even under noisy, low-resolution and dynamic aerial conditions.

This paper’s significant contributions are as follows:

  • A novel framework combining Loss-Attentional Physics-Informed Neural Network (LAPINN) with Clouded Leopard Optimization (CLO) for enhanced drone-based face detection and recognition.

  • The MCQKF is introduced for noise-resilient image normalization and resizing, improving input quality under varying drone conditions.

  • Synchro-Transient-Extracting Transform (STET) is utilized to capture distinct facial shape features (e.g., oval, square, heart) for refined face representation.

  • Clouded Leopard Optimization is utilized to fine-tune LAPINN weight parameters, significantly improving detection and recognition accuracy.

  • The significant improvements in performance over current approaches demonstrate the usefulness and potential influence of the proposed strategy in several real-world scenarios, such as security, search and rescue, and surveillance.
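The staged pipeline summarised in the contributions (MCQKF pre-processing, STET feature extraction, LAPINN scoring tuned by CLO) can be sketched as a skeleton. Every function body below is a placeholder stand-in for illustration; the paper's actual operators are not reproduced here:

```python
import numpy as np

def mcqkf_preprocess(frame, size=(128, 128)):
    """Placeholder for MCQKF denoising/normalisation: here just
    min-max normalisation and nearest-neighbour resizing."""
    f = (frame - frame.min()) / (np.ptp(frame) + 1e-8)
    rows = np.linspace(0, f.shape[0] - 1, size[0]).astype(int)
    cols = np.linspace(0, f.shape[1] - 1, size[1]).astype(int)
    return f[np.ix_(rows, cols)]

def stet_features(image):
    """Placeholder for STET shape-feature extraction: a flat vector."""
    return image.ravel()

def lapinn_predict(features, weights):
    """Placeholder LAPINN head: a linear score squashed to [0, 1]."""
    return 1.0 / (1.0 + np.exp(-features @ weights))

def detect_and_recognise(frame, weights):
    """End-to-end sketch of the staged pipeline: pre-process the raw
    frame, extract shape features, then score with the network whose
    weights CLO would tune offline."""
    img = mcqkf_preprocess(frame)
    feats = stet_features(img)
    return lapinn_predict(feats, weights)
```

In the proposed method the `weights` passed to the network head are the parameters that Clouded Leopard Optimization searches over; in this skeleton they are simply a vector argument.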

The remainder of this paper is organised as follows: the proposed methodology is described in Section 2, the findings and discussions are presented in Section 3, and the paper is concluded in Section 4.


Proposed methodology

In the proposed methodology, an Optimized Loss-Attentional Physics-Informed Neural Network-based framework for drone-mounted face detection and recognition (FDRD-LAPINN-CLO) is introduced. This system is designed to enhance facial detection and classification performance in aerial environments. The integration of facial recognition on drones strengthens security, supports effective surveillance, and enables rapid emergency response by allowing drones to quickly identify individuals facilitating

Results and discussion

The outcomes of the FDRD-LAPINN-CLO technique are discussed. The FDRD-LAPINN-CLO method is implemented in Python. Accuracy, precision, recall, F1 score, detection rate, recognition rate, and AUC are among the metrics used to evaluate the proposed FDRD-LAPINN-CLO approach’s performance. The results obtained are compared with those of other methods, such as FDRD-DL-CNN, CMD-AIFCP-MANN and ES-OSDDY-DCNN. Table 1 illustrates the output results of the FDRD-LAPINN-CLO methodology.
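The headline metrics listed above follow directly from confusion-matrix counts; a minimal helper (generic, not tied to the authors' evaluation code) is:

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 score from
    confusion-matrix counts (true/false positives and negatives)."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

AUC, by contrast, is computed from the ranking of continuous scores rather than from a single thresholded confusion matrix, so it needs per-sample scores rather than the four counts above.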

Conclusion

This study introduced an optimized face detection and recognition framework, FDRD-LAPINN-CLO, tailored for UAV-based surveillance and monitoring systems. By combining the MCQKF for image pre-processing, STET for robust facial feature extraction, and the LAPINN optimized with CLO, the proposed approach substantially improves face recognition performance. Experimental results demonstrated superior accuracy, precision, recall, and detection rates compared to existing state-of-the-art techniques.

Ethical approval and consent to participate

None of the authors of this article have conducted any research involving human subjects.

Human and animal ethics

Not applicable

Consent for publication

Not applicable

Availability of supporting data

Data sharing is not applicable to this article, as no new data were created or analyzed in this study.

Materials and methods

Not applicable

Results and discussions

Not applicable

Funding

This research did not receive any specific funding from governmental, commercial, or nonprofit funding agencies.

CRediT authorship contribution statement

P Visu: Supervision. A Mohan: Writing – original draft. T Hannah Rose Esther: Supervision.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

Not applicable

© 2026 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.

