
Generative deep learning for foundational video translation in ultrasound – Scientific Reports

Data availability

Due to the sensitive nature of patient data, these data cannot be made publicly available at this time. Code will be made available upon publication. The corresponding author is the point of contact.

Acknowledgements

We thank Kami Gill, Claudia Guitierrez, Roshana Goodar, Shari Kennedy, Megan McLaughlin, Jane Glover, Vanessa Flores, Evette Iweke, Carol Leung, Caitlyn Brenner and other clinicians and sonographers who wish to remain anonymous, who served as expert evaluators of imaging. We thank Lennart Elbe for insightful suggestions on the manuscript.

Funding

This work was supported by the National Heart, Lung, and Blood Institute, the Chen Institute, and the Chan Zuckerberg Biohub, all to R.A.

Author information

Authors and Affiliations

  1. Department of Medicine, Division of Cardiology, Bakar Computational Health Sciences Institute, University of California, San Francisco, 521 Parnassus Avenue, San Francisco, CA, 94143, USA

    Nikolina Tomic, Roshni Bhatnagar, Sarthak Jain, Connor Lau, Tien-Yu Liu, Laura Gambini & Rima Arnaout

  2. UCSF-UC Berkeley Joint Program in Computational Precision Health, Department of Radiology and Pediatrics Center for Intelligent Imaging, University of California, San Francisco, San Francisco, CA, 94143, USA

    Rima Arnaout

Authors

  1. Nikolina Tomic
  2. Roshni Bhatnagar
  3. Sarthak Jain
  4. Connor Lau
  5. Tien-Yu Liu
  6. Laura Gambini
  7. Rima Arnaout

Contributions

R.A. conceived of the study. R.A., N.T., and S.J. designed experiments with input from T.L. N.T. performed experiments and analyzed data with help from R.B., L.G., S.J., T.L., and C.L. N.T. and R.A. wrote the manuscript with input from all authors.

Corresponding author

Correspondence to Rima Arnaout.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

About this article

Cite this article

Tomic, N., Bhatnagar, R., Jain, S. et al. Generative deep learning for foundational video translation in ultrasound. Sci Rep (2026). https://doi.org/10.1038/s41598-026-47777-z
