Generative deep learning for foundational video translation in ultrasound – Scientific Reports
Data availability
Due to the sensitive nature of patient data, these data cannot be made publicly available at this time. Code will be made available upon publication. The corresponding author is the point of contact for data and code requests.
Acknowledgements
We thank Kami Gill, Claudia Guitierrez, Roshana Goodar, Shari Kennedy, Megan McLaughlin, Jane Glover, Vanessa Flores, Evette Iweke, Carol Leung, Caitlyn Brenner and other clinicians and sonographers who wish to remain anonymous, who served as expert evaluators of imaging. We thank Lennart Elbe for insightful suggestions on the manuscript.
Funding
This work was supported by the National Heart, Lung, and Blood Institute, the Chen Institute, and the Chan Zuckerberg Biohub, all to R.A.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Tomic, N., Bhatnagar, R., Jain, S. et al. Generative deep learning for foundational video translation in ultrasound. Sci Rep (2026). https://doi.org/10.1038/s41598-026-47777-z
