What Is LLM OWASP Top 10?
The OWASP Top 10 for Large Language Models (LLMs) is a set of security risks specifically identified for LLM applications. As LLMs like OpenAI’s GPT and Anthropic Claude become more integrated into products and services, they introduce unique security challenges that traditional application security frameworks don't fully address.
This specialized Top 10 list highlights the most critical vulnerabilities for LLMs, offering insights into their associated risks and how they differ from conventional web or API security threats. The list is for developers, data scientists, and security experts who work on or with LLM applications. It serves as a guide to understanding, mitigating, and preventing these risks by providing practical examples and solutions.
OWASP Top 10 Threats for LLMs and How to Prevent Them
LLM01: Prompt Injection
Prompt injection attacks manipulate the input of an LLM to influence its output in unintended ways. By crafting specially designed inputs, attackers can make the LLM generate unauthorized commands, reveal sensitive information, or bypass security controls. In more sophisticated attacks, indirect prompt injections embed harmful instructions in external data sources that the LLM processes.
For example, an attacker could manipulate a model to access restricted systems or perform actions without user consent, compromising security. These attacks are particularly dangerous because the LLM often treats all inputs as trustworthy, leading to the execution of malicious commands.
Prevention:
- Enforce least privilege for LLM access to backend systems by limiting the model's access to sensitive functions.
- Implement input-output segregation using formats like ChatML to differentiate between system and user prompts.
- Include human review for high-risk operations or outputs involving sensitive data.
- Regularly monitor LLM interactions to detect and block suspicious behavior patterns early.
LLM02: Insecure Output Handling
Insecure output handling occurs when the output generated by an LLM is used without proper validation or sanitization. This can lead to vulnerabilities such as cross-site scripting (XSS), server-side request forgery (SSRF), or remote code execution if the model’s output interacts with other systems, such as web browsers or backends.
Since LLMs can generate code, scripts, or structured outputs based on user input, they may unintentionally include malicious elements that can compromise system security when executed.
Prevention:
- Apply strict output validation before passing LLM-generated data to downstream systems.
- Use output encoding techniques (e.g., sanitizing HTML, JavaScript) to prevent XSS or SSRF attacks.
- Implement access control layers that restrict what LLM-generated data can be executed or shared with external systems.
- Incorporate input sanitization in LLM queries to minimize the risk of indirect injections affecting the output.
LLM03: Training Data Poisoning
Training data poisoning refers to the manipulation of the datasets used to train or fine-tune LLMs. An attacker could inject malicious data or biases into these datasets, resulting in the model producing incorrect, harmful, or biased outputs.
Poisoned data can degrade model performance, compromise its ethical behavior, or lead to security risks, such as recommending faulty code or producing inaccurate legal advice. Poisoning is especially dangerous because it can be difficult to detect once the model is in production, affecting all future interactions.
Prevention:
- Use trusted and vetted data sources for training and fine-tuning to ensure integrity and avoid tampered data.
- Implement data anomaly detection techniques during training to identify and remove adversarial data points.
- Regularly audit training datasets and fine-tuning processes for signs of poisoning or bias.
- Use sandboxing techniques to isolate training processes and minimize exposure to untrusted data inputs.
LLM04: Model Denial of Service
Denial of service attacks on LLMs exploit their resource-intensive nature by overwhelming them with large or complex inputs, causing system slowdowns or making the model unresponsive.
This can degrade the quality of service for legitimate users or drive up operational costs due to excessive resource usage. Attackers may send repeated or recursive inputs that exceed the LLM’s context window, forcing the system to continually process more data than it can handle.
Prevention:
- Implement rate-limiting to restrict the number of queries per user or IP address within a set time frame.
- Set input size limits that restrict the length of inputs based on the LLM’s context window to prevent resource exhaustion.
- Continuously monitor resource utilization and set alerts for abnormal spikes in usage.
- Use query throttling to slow down complex operations that could lead to DoS conditions.
Tips from the expert:
In my experience, here are tips that can help you better address security challenges when working with LLMs as per the OWASP Top 10 for LLMs:
- Establish strong red-teaming practices: Regularly conduct red-teaming exercises to simulate real-world attack scenarios targeting LLMs. This helps in identifying weaknesses, especially with prompt injections and data leakage, before malicious actors do.
- Adopt adversarial testing techniques: Use adversarial testing against LLMs, such as generating intentionally crafted queries designed to manipulate or exploit the model’s behavior. This helps identify how the model could be manipulated and improves defenses against sophisticated prompt injections.
- Incorporate privacy-preserving computation: Use techniques like differential privacy or federated learning when training LLMs to ensure that sensitive data cannot be memorized or reproduced by the model, further minimizing risks of data leakage or training data poisoning.
- Leverage context-aware validation for inputs and outputs: Implement dynamic validation mechanisms that adjust based on context and content. For instance, outputs related to sensitive areas (like healthcare or financial information) should undergo stricter validation than general responses.
- Create fall-back mechanisms for handling ambiguous requests: When LLMs encounter ambiguous or potentially harmful inputs, have them trigger pre-defined fail-safe workflows. These workflows could escalate to human operators or trigger enhanced logging for further review.
LLM05: Supply Chain Vulnerabilities
LLM supply chain vulnerabilities arise when the models, datasets, or third-party plugins used in an LLM system are compromised. Since LLMs often rely on pre-trained models, third-party data, or external plugins, any vulnerability in these components can introduce significant risks.
An attacker could poison a dataset, exploit outdated software dependencies, or leverage insecure third-party plugins to gain unauthorized access or introduce backdoors into the LLM system.
Prevention:
- Conduct thorough vetting of third-party models, datasets, and plugins to ensure they meet security standards.
- Maintain a software bill of materials (SBOM) to track dependencies and ensure all components are up-to-date and secure.
- Implement regular security audits for third-party integrations to detect vulnerabilities early.
- Use signed models and datasets to ensure their integrity and authenticity.
LLM06: Sensitive Information Disclosure
LLM systems can be granted excessive permissions, allowing them to access or modify data beyond their intended scope. If a model is allowed to read, modify, or delete sensitive data or interact with critical systems, an attacker could exploit this access to cause data breaches or escalate privileges within the system.
Prevention:
- Enforce the principle of least privilege by granting LLMs only the minimum necessary permissions for their tasks.
- Regularly audit permission levels to ensure no unnecessary access is granted to the LLM or its plugins.
- Isolate sensitive functions using role-based access control (RBAC) to limit what the LLM can interact with.
- Use fine-grained API controls to ensure LLMs cannot access backend systems without proper authentication.
LLM07: Insecure Plugin Design
Insecure plugin design is a critical issue for LLM-based systems that rely on third-party integrations. Plugins with inadequate input validation, insufficient access controls, or broad permissions can introduce severe vulnerabilities like data leaks, privilege escalation, or remote code execution.
Prevention:
- Apply input validation and sanitization for all plugin inputs, ensuring that only properly formatted and expected data is accepted.
- Ensure least privilege access for plugins, restricting their capabilities to only what is necessary for their function.
- Conduct regular security testing on plugins using SAST and DAST tools to detect and mitigate vulnerabilities early.
- Implement strong authentication and authorization mechanisms, such as OAuth, to control access to plugin functionality.
LLM08: Excessive Agency
Excessive agency refers to when LLM systems are granted too much autonomy in decision-making or system interactions. This can lead to unintended consequences, such as performing unauthorized actions, sending unintended messages, or altering critical data.
For example, an LLM-based assistant with excessive permissions could modify or delete important documents or send unauthorized communications based on a faulty command.
Prevention:
- Restrict LLM autonomy and scope of action by limiting the tasks it can perform without human oversight.
- Implement a human-in-the-loop process for critical actions, requiring user confirmation before high-impact actions are taken.
- Design LLM tools with fine-grained functionality to prevent excessive functionality, such as limiting a plugin to reading emails without the ability to send or delete them.
- Monitor audit logs to track LLM actions and ensure that any inappropriate or unintended actions are flagged and reviewed.
LLM09: Overreliance
Overreliance on LLMs can lead to serious consequences when the system produces incorrect or misleading information. LLMs may generate false, inappropriate, or biased outputs, which can be especially harmful in high-stakes scenarios like healthcare, finance, or legal advice.
Prevention:
- Regularly validate LLM outputs through cross-referencing with trusted external sources or by implementing self-checking mechanisms.
- Incorporate disclaimers in applications to inform users about the potential inaccuracies of LLM-generated content.
- Use domain-specific fine-tuning to reduce hallucinations and increase the reliability of the model in specialized contexts.
- Ensure human review for all high-stakes outputs and decisions.
LLM10: Model Theft
Model theft occurs when the intellectual property of a proprietary large language model is exposed to unauthorized individuals or organizations. LLMs can inadvertently reveal private data, such as proprietary algorithms, based on their training data or user inputs.
Shadow models, created from stolen LLM models, are often used to launch attacks and gain unauthorized access to further sensitive data. It also allows adversaries to manipulate the model’s performance and damage its integrity.
Prevention:
- Use strong access controls that prevent unauthorized entities from accessing or manipulating the LLM.
- Implement rate limiting of API calls to prevent sensitive information from being exfiltrated.
- Apply strict input/output filters to identify and block potentially sensitive data before it is processed or returned by the model.
- Establish a centralized registry or inventory for all machine learning models used in a production environment, helping maintain compliance and inform risk mitigation measures.
Related content: Read our guide to LLM security tools (coming soon)
Application Security Testing for LLM APIs with Pynt
Pynt focuses on API security, the main attack vector in modern applications. Pynt’s solution aligns with application security best practices by offering automated API discovery and testing, which are critical for identifying vulnerabilities early in the development cycle. It emphasizes continuous monitoring and rigorous testing across all stages, from development to production, ensuring comprehensive API security. Pynt's approach integrates seamlessly with CI/CD pipelines, supporting the 'shift-left' methodology. This ensures that API security is not just an afterthought but a fundamental aspect of the development process, enhancing overall application security.
Learn with Pynt about prioritizing API security in your AST strategy to protect against threats and vulnerabilities, including LLM’s emerging attack vectors.