What Is an AI Pentest?

As organizations increasingly adopt powerful technologies such as Artificial Intelligence (AI) and Large Language Models (LLMs), the need for careful implementation becomes paramount.

These technologies, such as ChatGPT, Copilot, and Gemini, can significantly boost productivity, but if not managed effectively, they can pose serious challenges to data privacy and security.

To alleviate the burden on the security team, it's crucial to establish controls and safeguards that restrict user interaction with these Enterprise LLMs. And just like any other technology, one of the most effective ways to validate the robustness of your configurations is through an AI penetration test.

In an AI penetration test (AI Pentest), the skilled pentester plays a crucial role in uncovering vulnerabilities in the LLM's behavior that could lead to a security incident. This involves testing the LLM to ascertain:

If data loss prevention (DLP) can be bypassed
Data segmentation can be circumvented
The AI acceptable use policy can be evaded
AI filters can be circumvented.

By conducting the AI penetration testing, the AI implementation team gains valuable insights into potential misuse of an AI model and the necessary controls to prevent such misuse.

Bypassing Data Loss Prevention (DLP)

The first phase of an AI pentest involves attempting to bypass Data Loss Prevention. During this phase, the AI pentester uses techniques to test if sensitive data can be extracted or leaked to the model. This phase is crucial for ensuring data privacy and compliance with various regulations, underscoring the need for AI pentesting.

Since each organization classifies data differently, the AI pentester begins by examining the target organization's data classification policy to determine what type of data is considered classified or highly sensitive within the organization's environment. Once this data is specified, the AI pentester attempts to upload and query classified data.

A common oversight that organizations make is not extending their data loss prevention policies to their AI models. Both Microsoft Copilot and Google Gemini offer the capability to apply the same data loss prevention policies used for email and cloud storage to their AI models. Failure to do so could leave sensitive data vulnerable, making this step a critical part of the AI pentest process.

Accessing Unauthorized Data

The second phase of the AI pentest involves preventing unauthorized access to private data across the organization. One of the key features of an Enterprise LLM like Copilot or Gemini is that they have access to the organization's cloud storage accounts (OneDrive and Google Drive, respectively).

This allows the model to give more accurate information based on the users' documents. However, it also means that with the right configuration, AI pentesting can prevent a malicious user from using these AI models to access data from other users' OneDrive or Google Drive storage, providing reassurance of its implementation.

During this phase of the AI pentest, the AI pentester uses the AI model to identify all the documents available to the pentester. The AI pentester then uses the AI model to identify any sensitive data stored in those documents, such as passwords and other confidential information.

One of the biggest mistakes organizations make that allows AI pentesters to access restricted documents is the use of shared links in their cloud storage accounts. Both OneDrive and Google Drive allow users to share documents using a link. While users may assume that only people they have directly sent the link will be able to access their document, this is not the case. Any user who has a copy of the link will be able to access the document, regardless of whether the owner sent the link to them or not.

Using the Enterprise LLM, the AI pentester can obtain a copy of the shared links, allowing them to access documents to which they were never intended to have access. To mitigate this, organizations should turn off link sharing in their cloud storage provider (instructions on how to do this can be found here, for Microsoft and Google).

Violating Acceptable Use Policies

Each organization should create an AI acceptable use policy that outlines what uses of AI are permitted and what uses are prohibited.

This policy serves as a guideline for users to ensure they understand the rules governing their interaction with AI. This policy should encompass regulatory requirements, such as prohibiting the use of AI in certain HR decisions, as well as general usage policies.

During an AI pentest, the AI pentester tries to get the model to perform actions prohibited by the acceptable use policy. For example, suppose an organization does not allow AI to make hiring decisions. In that case, the pentester will provide the AI with some resumes and try to get it to make a hiring decision.

Organizations should ensure that AI content filters are enabled to prevent the model from violating the acceptable use policy. These settings can be found in both Copilot and Gemini.

Furthermore, employees should be trained on an organization’s AI use policy and encouraged to ask questions about what they can and cannot use AI for.

ERMProtect and AI Strategy in Your Organization

Any organization on the journey of implementing AI in their environment must test that the controls in place are sufficient to protect their data and users through an AI pentest. ERMProtect's team of experts has been conducting pentests of all kinds for over 27 years and now offers AI pentesting. Our AI pentest methodology adheres to industry best practices, such as the NIST AI Risk Management Framework, to ensure your organization is prepared for the future of AI.

For more information, please contact Collin Connors at [email protected] or Judy Miller at [email protected] or call 305-447-6750.

About the Author

Collin is a Senior Cybersecurity Consultant at ERMProtect. He leads AI Consulting at ERMProtect, assisting clients with AI Risk Management, Governance & Implementation Strategy. He is a published author on using AI to detect malware and speaks regularly at national conferences on using AI risks and implementation strategies. He has developed two proprietary AI tools for ERMProtect, an AI model that classifies executable files and an AI model used to detect phishing emails. Collin earned a PhD in Computer Science at the University of Miami researching AI and blockchain. In addition to specializing in AI solutions, he has performed penetration testing, risk assessments, training, and compliance reviews in his six years at ERMProtect.

Each Friday, we provide readers with the top 8 breaking news articles related to privacy, IT security, data breaches and compliance. Quick reads, important insights.

What Is an AI Pentest?

By Collin Connors, PhD

Bypassing Data Loss Prevention (DLP)

Accessing Unauthorized Data

Violating Acceptable Use Policies

ERMProtect and AI Strategy in Your Organization

Subscribe to Our Weekly Newsletter

Intelligence and Insights

Are You Prepared for an AI-Powered Cyber Attack?

AI Privacy Risks

The Risk of the AI Notetaker