Enhancing human-dog interaction through deep learning and explainable AI – Scientific Reports

Article
Open access
Published: 01 July 2026

Michał Kopczyński¹ &
Michał Czubenko¹

Scientific Reports (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Abstract

This article explores the emerging field of dog emotion recognition using deep learning and transfer learning techniques. A distinctive aspect of our study was the creation of a robust dataset from scratch, based on the latest research on dogs’ emotion recognition. Using this dataset, we trained models with a transfer learning approach, fine-tuning established architectures to identify dogs’ emotions. To make the model’s predictions transparent and interpretable, we incorporated an eXplainable Artificial Intelligence (XAI) approach based on Gradient-weighted Class Activation Mapping (Grad-CAM). This technique provides visual explanations of the model’s predictions by highlighting the image regions that are most important for the emotion classification. We evaluated ten image classification models, including convolutional neural network architectures, transformer-based models (ConvNeXt), and a classical machine learning approach based on Support Vector Machines (SVM) using DINO v2 features. The highest performance was achieved by the YOLO v11 (You Only Look Once) architecture, with both accuracy and F1-score reaching 0.84. Notably, the DINO v2-SVM model ranked second, attaining an accuracy of 0.75 and an F1-score of 0.76, while the ensemble model combining MobileNet, EfficientNet, and ResNet50 (Residual Network with 50 layers) ranked third, with both metrics equal to 0.75. These results underscore the potential of deep learning and transfer learning for understanding canine emotions, enhancing human-dog interaction, and improving animal welfare.
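
To illustrate how Grad-CAM identifies the critical regions of an image, the sketch below follows the standard formulation of the technique: each feature map of a convolutional layer is weighted by the spatial average of the gradient of the class score with respect to that map, the weighted maps are summed, and a ReLU keeps only the regions that support the class. This is a minimal sketch, not the authors' implementation. The function name `grad_cam` and the toy activation and gradient values are hypothetical, and in practice both inputs would be taken from a deep learning framework's automatic differentiation rather than written by hand.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap from one convolutional layer.

    activations: list of K feature maps, each an H x W list of floats.
    gradients:   gradients of the class score w.r.t. those maps (same shape).
    Returns an H x W heatmap scaled to [0, 1] (all zeros if nothing is positive).
    """
    # Channel weights: global average of the gradients over each feature map.
    weights = [
        sum(sum(row) for row in grad_map) / (len(grad_map) * len(grad_map[0]))
        for grad_map in gradients
    ]

    height, width = len(activations[0]), len(activations[0][0])
    heatmap = [[0.0] * width for _ in range(height)]

    # Weighted sum of the feature maps.
    for weight, feature_map in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += weight * feature_map[i][j]

    # ReLU: keep only locations with a positive influence on the class score.
    heatmap = [[max(0.0, value) for value in row] for row in heatmap]

    # Scale to [0, 1] so the map can be overlaid on the input image.
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[value / peak for value in row] for row in heatmap]
    return heatmap


# Hypothetical toy input: two 2 x 2 feature maps and their gradients.
toy_activations = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
toy_gradients = [
    [[1.0, 1.0], [1.0, 1.0]],      # this channel supports the class
    [[-1.0, -1.0], [-1.0, -1.0]],  # this channel argues against it
]
demo_heatmap = grad_cam(toy_activations, toy_gradients)
# demo_heatmap == [[0.5, 0.0], [0.0, 1.0]]
```

In the toy input, the second channel has negative gradients, so the locations where only that channel is active are suppressed by the ReLU, and the heatmap peaks where the first channel is strongest. For real images the heatmap is typically upsampled to the input resolution before being overlaid.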

Acknowledgements

We thank the development team: Tymoteusz Byrwa, Jakub Kłopotek-Główczewski, Maksymilian Terebus, and Krzysztof Dymanowski.

Author information

Authors and Affiliations

Department of Decision Systems and Robotics, Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology, Narutowicza 11/12, 80-233, Gdańsk, Pomeranian, Poland

Michał Kopczyński & Michał Czubenko

Corresponding author

Correspondence to Michał Czubenko.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Kopczyński, M., Czubenko, M. Enhancing human-dog interaction through deep learning and explainable AI. Sci Rep (2026). https://doi.org/10.1038/s41598-026-51009-9

Received: 18 July 2025
Accepted: 24 April 2026
Published: 01 July 2026
DOI: https://doi.org/10.1038/s41598-026-51009-9
