Generative deep learning for foundational video translation in ultrasound – Scientific Reports
Data availability
Due to the sensitive nature of patient data, these data cannot be made publicly available at this time. Code will be made available upon publication. The corresponding author is the point of contact for data and code requests.
Acknowledgements
We thank Kami Gill, Claudia Guitierrez, Roshana Goodar, Shari Kennedy, Megan McLaughlin, Jane Glover, Vanessa Flores, Evette Iweke, Carol Leung, Caitlyn Brenner and other clinicians and sonographers who wish to remain anonymous, who served as expert evaluators of imaging. We thank Lennart Elbe for insightful suggestions on the manuscript.
Funding
This work was supported by the National Heart, Lung, and Blood Institute, the Chen Institute, and the Chan Zuckerberg Biohub, all to R.A.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Tomic, N., Bhatnagar, R., Jain, S. et al. Generative deep learning for foundational video translation in ultrasound. Sci Rep (2026). https://doi.org/10.1038/s41598-026-47777-z
