Enhancing human-dog interaction through deep learning and explainable AI – Scientific Reports
- Article
- Open access
- Michał Kopczyński &
- Michał Czubenko
Scientific Reports (2026)
We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.
Abstract
This article addresses the emerging field of dog emotion recognition using deep learning and transfer learning techniques. A distinctive aspect of our study was the creation of a dataset from scratch, based on the latest research on dogs’ emotion recognition. Using this dataset, we fine-tuned established architectures through transfer learning to identify dogs’ emotions. To make the model’s predictions transparent and interpretable, we incorporated an eXplainable Artificial Intelligence (XAI) approach based on Gradient-weighted Class Activation Mapping (Grad-CAM), which provides visual explanations by highlighting the image regions most important for the emotion classification. We evaluated ten image classification models, including convolutional neural network architectures, transformer-based models (ConvNeXt), and a classical machine learning approach based on Support Vector Machines (SVM) using DINO v2 features. The highest performance was achieved by the YOLO v11 (You Only Look Once) architecture, with both accuracy and F1-score reaching 0.84. The DINO v2-SVM model ranked second, with an accuracy of 0.75 and an F1-score of 0.76, while the ensemble model combining MobileNet, EfficientNet, and ResNet50 (Residual Network with 50 layers) placed third, with both metrics equal to 0.75. These results underscore the potential of deep learning and transfer learning for understanding canine emotions, enhancing human-dog interaction, and improving animal welfare.
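As a rough illustration of how Grad-CAM arrives at the highlighted regions mentioned in the abstract, the sketch below implements only the standard combination step: each feature map of a chosen convolutional layer is weighted by the average gradient of the class score with respect to that map, the weighted maps are summed, and negative values are clipped. This is not the authors' implementation. The function name `grad_cam` and the toy numbers are hypothetical, and in practice the activations and gradients would be obtained from a deep learning framework rather than written by hand.

```python
from typing import List

Matrix = List[List[float]]


def grad_cam(activations: List[Matrix], gradients: List[Matrix]) -> Matrix:
    """Combine feature maps into a Grad-CAM heatmap (hypothetical helper).

    activations[k][i][j]: k-th feature map of the chosen convolutional layer.
    gradients[k][i][j]: gradient of the class score w.r.t. that activation.
    """
    height, width = len(activations[0]), len(activations[0][0])
    heatmap = [[0.0] * width for _ in range(height)]
    for fmap, grad in zip(activations, gradients):
        # Channel weight: global average of the gradients over all positions.
        weight = sum(sum(row) for row in grad) / (height * width)
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += weight * fmap[i][j]
    # ReLU keeps only the regions that push the class score up.
    heatmap = [[max(0.0, v) for v in row] for row in heatmap]
    # Scale to [0, 1] for display; an all-zero map is left unchanged.
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[v / peak for v in row] for row in heatmap]
    return heatmap


# Toy input (invented values): two 2x2 feature maps and their gradients.
demo_activations = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 2.0]]]
demo_gradients = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
demo_heatmap = grad_cam(demo_activations, demo_gradients)
```

In the toy input, the first channel has a positive average gradient and the second a negative one, so only the top-left position, where the first channel is active, survives the clipping. In a real pipeline the resulting heatmap would be upsampled to the image size and overlaid on the photograph.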
Acknowledgements
Thanks to the development team: Tymoteusz Byrwa, Jakub Kłopotek-Główczewski, Maksymilian Terebus, Krzysztof Dymanowski.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Kopczyński, M., Czubenko, M. Enhancing human-dog interaction through deep learning and explainable AI. Sci Rep (2026). https://doi.org/10.1038/s41598-026-51009-9
DOI: https://doi.org/10.1038/s41598-026-51009-9
