Open-source machine learning systems face increasing security threats – Tech Edition
Recent research has uncovered significant security vulnerabilities in open-source machine learning (ML) frameworks, putting sensitive data and operations at risk. As ML adoption grows across industries, so does the urgency of addressing these threats. The vulnerabilities, identified in a report by JFrog, reveal gaps in ML security compared to more established systems like DevOps and web servers.
Critical vulnerabilities in ML frameworks
Open-source ML projects have seen a rise in security flaws, with JFrog reporting 22 vulnerabilities across 15 ML tools in recent months. The two primary concerns are flaws in server-side components and privilege escalation risks within ML environments. These vulnerabilities could allow attackers to access sensitive files, gain unauthorised privileges, and compromise the entire ML workflow.
One significant flaw involves Weave, a Weights & Biases (W&B) tool that tracks and visualises ML model metrics. The WANDB Weave Directory Traversal vulnerability (CVE-2024-7340) allows attackers to exploit improper input validation in file paths. By doing so, they can access sensitive files, including admin API keys, enabling privilege escalation and potentially compromising ML pipelines.
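As an illustration of the kind of path validation that guards against directory traversal in general, here is a minimal Python sketch. It is not Weave's code or its actual fix; the function name `resolve_inside` and the base directory are hypothetical, and `Path.is_relative_to` assumes Python 3.9 or later.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Return the resolved path if it stays inside base_dir, else raise ValueError.

    Hypothetical helper for illustration only.
    """
    base = Path(base_dir).resolve()
    # Resolving collapses ".." segments and follows symlinks before the check.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate
```

A request for `metrics/run1.json` resolves normally, while `../../etc/passwd` or an absolute path such as `/etc/passwd` raises `ValueError` before any file is opened.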
Another affected tool is ZenML, which manages MLOps pipelines. A critical flaw in ZenML Cloud’s access control lets attackers with minimal access privileges escalate permissions. This could expose confidential data like secrets and model files, allowing attackers to manipulate pipelines, tamper with model data, or disrupt production environments dependent on these pipelines.
Risks of privilege escalation and data breaches
Other vulnerabilities highlight the risks of privilege escalation in ML systems. The Deep Lake Command Injection (CVE-2024-6507) found in the Deep Lake database is particularly severe. This database, designed for AI applications, suffers from improper command sanitisation, allowing attackers to execute arbitrary commands. The result is remote code execution, which could compromise the database and any connected applications.
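Command injection of this general kind is usually avoided by validating input against an allowlist and passing arguments as a list, so that no shell ever parses them. The sketch below shows that pattern in Python under stated assumptions: `hypothetical-cli`, the dataset name format, and both function names are invented for illustration and do not reflect Deep Lake's code or its fix.

```python
import re
import subprocess

# Allowlist for a hypothetical dataset identifier such as "owner/name".
# The first character must be alphanumeric, so values cannot look like flags.
_SAFE_DATASET = re.compile(r"[A-Za-z0-9][A-Za-z0-9._/-]*")


def build_download_argv(dataset: str) -> list[str]:
    """Build an argument list for a hypothetical external download tool."""
    if not _SAFE_DATASET.fullmatch(dataset):
        raise ValueError(f"rejected dataset name: {dataset!r}")
    # "--" marks the end of options; the dataset is one argv element.
    return ["hypothetical-cli", "download", "--", dataset]


def download(dataset: str) -> None:
    """Run the tool without a shell, so metacharacters are never interpreted."""
    subprocess.run(build_download_argv(dataset), check=True)
```

With this approach an input such as `x; rm -rf ~` is rejected outright, and even a value that passed validation would reach the tool as a single argument, not as shell syntax.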
Vanna.AI, a tool that generates SQL queries from natural language, also has a serious vulnerability. The Vanna.AI Prompt Injection (CVE-2024-5565) flaw lets attackers inject malicious code into SQL prompts, which can result in remote code execution. This poses risks like manipulated visualisations, SQL injections, or data theft.
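One general way to limit the damage from model-generated SQL is to treat it as untrusted and run it over a read-only connection. The sketch below uses Python's standard `sqlite3` module to show the idea; the function name `run_untrusted_query` is hypothetical, it is not Vanna.AI's code, and it only addresses tampering through SQL, not the remote code execution described above.

```python
import sqlite3


def run_untrusted_query(db_path: str, generated_sql: str) -> list[tuple]:
    """Execute model-generated SQL against a read-only SQLite connection.

    Hypothetical helper for illustration only.
    """
    # mode=ro opens the database file read-only, so any write attempt
    # fails at the database layer regardless of what the SQL says.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        return conn.execute(generated_sql).fetchall()
    finally:
        conn.close()
```

A `SELECT` returns rows as usual, while a statement such as `DROP TABLE` raises `sqlite3.OperationalError`. Read-only access does not stop data theft through queries, so it complements, and does not replace, access controls on what the connection can read.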
Mage.AI, an MLOps platform for managing data pipelines, is vulnerable to unauthorised shell access, file leaks, and path traversal issues. These flaws enable attackers to control pipelines, expose configurations, and execute malicious commands, risking privilege escalation and data integrity breaches.
The path forward
JFrog’s findings highlight a critical gap in MLOps security. Many organisations fail to integrate AI/ML security with broader cybersecurity strategies, leaving blind spots. Attackers can exploit these vulnerabilities to embed malicious code in models, steal data, or manipulate outputs, creating widespread disruptions.
As ML and AI continue transforming industries, securing their frameworks, datasets, and models is essential. Robust security practices must be prioritised to protect the innovations that drive this growing field.