The Dangers of DIY AI and How to Secure LLMs


Chatbot sells brand-new 2024 Chevy Tahoe for $1

We begin today’s story with an anecdote from Watsonville, California. The Watsonville GM dealership wanted to serve its customers better by deploying a ChatGPT-powered chatbot. The goal was simple: let customers interact with the chatbot to find relevant information to aid their vehicle purchase.

While this is a noble objective, this story doesn’t have a happy ending. The do-it-yourself (DIY) artificial intelligence (AI) chatbot soon faced a curious visitor: a software engineer and California resident shopping around for his next car. His eyes widened in excitement when he saw “powered by ChatGPT” in the chatbot on the site. The engineer wondered if he could “jailbreak” the chatbot, so he asked it to solve a fluid dynamics problem in Python, something that has nothing to do with buying or selling Chevy cars. The ever-so-helpful DIY AI spat out Python code. The engineer shared his jailbreak find on social media, and people flooded in to see how far they could take it. Before long, the chatbot had agreed to sell a user a brand-new 2024 Chevy Tahoe for $1. While the incident at Watsonville might seem minor, it underscores a growing concern in the era of easily accessible AI technologies.

No Guardrails to Protect the Dealer

The age of Large Language Models (LLMs) and chatbots has just begun. With it comes a deluge of “DIY AI,” fraught with data privacy, data security, and data governance pitfalls. There is a cyber gold rush of early-adoption enthusiasts scrambling to utilize LLMs and Generative AI (GenAI), often racing against time to be the first in their industry to do so. While the risks associated with DIY AI are significant, it’s important to recognize the immense potential this technology holds. The ease of access, simplicity of use, and affordability of powerful GenAI, such as ChatGPT, have broken AI out of the clutches of enterprises and large firms. DIY AI democratizes access to advanced technological capabilities, empowering small businesses and individual innovators alike.

We have officially entered the head-spinning, wild-west days of DIY AI. New use cases are coming online at a dizzying pace. In a recent McKinsey survey, 79% of respondents said they have had at least some exposure to GenAI, either for work or outside of work, and 22% said they regularly use GenAI in their own work. The use cases range across personalized assistants, customer support virtual agents, media and content creation, and more. It has become evident that venturing into DIY AI without security guardrails invites trouble and unintended consequences, as it did for the Chevy dealer.
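To make the idea of a guardrail concrete, here is a minimal sketch of the kind of topical pre-check the dealership’s chatbot was missing. The `call_llm` wrapper, the keyword list, and the refusal message are all hypothetical placeholders, not any particular vendor’s API; a production system would use a far more robust intent classifier.

```python
# Minimal sketch of a topical guardrail (hypothetical names throughout).
# Requests clearly outside the chatbot's business purpose are refused
# before they ever reach the model.

ALLOWED_TOPICS = ("vehicle", "car", "truck", "suv", "tahoe",
                  "financing", "test drive", "trade-in", "service")

REFUSAL = "I can only help with questions about our vehicles and services."

def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around whichever chat-completion API the app uses."""
    return "(model response)"

def guarded_chat(user_prompt: str) -> str:
    # Crude keyword check standing in for a real intent classifier.
    if not any(topic in user_prompt.lower() for topic in ALLOWED_TOPICS):
        return REFUSAL
    return call_llm(user_prompt)

# "Solve this fluid dynamics problem in Python" -> refused: nothing in the
# prompt relates to buying, financing, or servicing a vehicle.
print(guarded_chat("Solve this fluid dynamics problem in Python"))
```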

Adventures in GenAI

Like our $1 car story, similar stories are piling up with each day of unguarded, unsecured GenAI adventures. Luckily, in the above case, the chatbot did not have the ability to make a binding offer. However, not every business will get so lucky. What is scary is that many businesses are automating applications, end-user interactions, and data flows. In the process, they are also connecting LLMs to their confidential production systems, data warehouses, and databases. These often contain prized data such as Personally Identifiable Information (PII), Cardholder Data (CHD), Protected Health Information (PHI), Intellectual Property (IP), proprietary information, trade secrets, and more.

LLMs are often trained or fine-tuned on private, sensitive, and confidential business datasets. These datasets are the literal “crown jewels” of the business, laid bare for the LLM to learn from and draw on when compiling responses to user questions. Any data “learned” by the LLM is thereby exposed to inadvertent leakage or to exfiltration by a malicious attacker using clever prompt engineering, jailbreaking, injections, and similar techniques. It doesn’t stop there. GenAI applications frequently employ Retrieval-Augmented Generation (RAG) to enrich user queries with context, and this process introduces additional risk: unauthorized data can be pulled from the VectorDB and real-time data sources and surfaced to users who should never see it. These scenarios highlight the critical need for stringent security measures in GenAI systems to protect against unauthorized access and misuse of sensitive information.
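As one illustration of how such a leak can be stopped at the retrieval step, here is a minimal sketch of access-controlled RAG. The in-memory store, the group names, and the keyword-based “retrieval” are hypothetical stand-ins (a real system would run a similarity search against a VectorDB); the point is that results are filtered against the caller’s entitlements before they are stitched into the model’s context.

```python
# Minimal sketch of access-controlled retrieval for a RAG pipeline
# (all names and data are illustrative, not a real product API).

from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    allowed_groups: frozenset  # ACL metadata stored alongside the embedding

STORE = [
    Chunk("2024 Tahoe trims, MSRP, and public option packages ...", frozenset({"public"})),
    Chunk("2024 Tahoe dealer invoice cost and margin targets ...", frozenset({"finance"})),
    Chunk("Customer PII: names, phone numbers, credit applications ...", frozenset({"sales_leads"})),
]

def retrieve(query: str, top_k: int = 5) -> list[Chunk]:
    # Stand-in for a similarity search against the VectorDB.
    terms = query.lower().split()
    return [c for c in STORE if any(t in c.text.lower() for t in terms)][:top_k]

def build_context(query: str, user_groups: set) -> str:
    # Filter retrieved chunks against the caller's entitlements *before*
    # they reach the prompt, so the LLM never sees data the user is not
    # authorized to receive (and therefore cannot echo it back).
    permitted = [c for c in retrieve(query) if c.allowed_groups & user_groups]
    return "\n\n".join(c.text for c in permitted)

# A public shopper asking about the Tahoe gets the public chunk only; the
# invoice-cost chunk matches the query but is dropped by the ACL filter.
print(build_context("tahoe pricing", {"public"}))
```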

An attacker no longer needs to go through the cyber kill chain: breaking into the network, moving laterally, finding the target system, deploying a rootkit or backdoor, and finally extracting the data. All they have to do now is ask for it. Attackers no longer have to sneak in. They can walk right through the front door, so to speak, and ask the GenAI, often in plain, simple English (or any other natural language), for the most closely guarded secrets, and it very likely will work. The risks of insecure, DIY GenAI are simply too great and too numerous, spanning technical, business, compliance, trust, and ethical implications.

Privacera Builds Key Guardrails

It’s time for the C-suite and senior leaders to get ahead of these risks while still leveraging the groundbreaking efficiencies and benefits GenAI brings. The good news is that Privacera helps you do just that. With Privacera, you can confidently and securely move ahead with your GenAI journey and leave the security and governance to us, which is what we do best. Privacera now brings its expertise to GenAI security in the form of Privacera AI Governance (PAIG). PAIG natively builds and enforces trust policies and compliance standards as an added layer to LLM and AI applications without impacting performance.

When it comes to the brass tacks of securing GenAI and LLM models, the north star of data governance remains the same: zero trust, access control, active monitoring, and consistent, reliable policy enforcement with full visibility and auditing. PAIG brings all of this and much more, incorporating the best of Privacera’s SaaS and Self-Managed offerings into GenAI. PAIG manages business risks and eliminates data leaks by acting as a screening firewall, ensuring all data is served in accordance with configured access control policies enforced in real time during every interaction.
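The “screening firewall” pattern can be pictured as a thin enforcement layer that sits between the application and the model. The sketch below is a generic illustration of that pattern, not PAIG’s actual API: every prompt is checked against an access policy for the requesting user, every response passes through a redaction hook, and both sides of the exchange are written to an audit sink.

```python
# Generic sketch of a screening layer around LLM calls (hypothetical names;
# not the PAIG API). Policy checks, redaction, and auditing happen in line
# with every interaction.

import json
import time

POLICY = {  # which user groups may ask about which resource categories
    "support": {"vehicle_info"},
    "finance": {"vehicle_info", "pricing_internal"},
}

def classify_request(prompt: str) -> str:
    # Stand-in for a real classifier mapping a prompt to a resource category.
    return "pricing_internal" if "invoice" in prompt.lower() else "vehicle_info"

def redact(text: str) -> str:
    # Placeholder for real-time masking of sensitive values in model output.
    return text

def call_llm(prompt: str) -> str:
    return "(model response)"  # wrapper around whichever model API is in use

def audit(record: dict) -> None:
    print(json.dumps(record))  # stand-in for a real audit log sink

def screened_chat(user: str, groups: set, prompt: str) -> str:
    category = classify_request(prompt)
    allowed = any(category in POLICY.get(g, set()) for g in groups)
    answer = redact(call_llm(prompt)) if allowed else "Request denied by policy."
    audit({"ts": time.time(), "user": user, "category": category,
           "allowed": allowed, "prompt": prompt})
    return answer

# A support agent asking for internal invoice pricing is denied and audited.
print(screened_chat("alice", {"support"}, "What is the dealer invoice on a Tahoe?"))
```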

Key PAIG capabilities include the following:

  • Secure embedding and training data
    • Protect models and VectorDB from exposure to sensitive training or tuning data
  • Secure model inputs and outputs
    • Detect and defeat prompt injections
    • Enforce user-based access controls
    • Mask or redact sensitive data, such as PII, CHD, and PHI, in real time (see the masking sketch after this list)
  • Comprehensive observability
    • Built-in dashboards and user-query analytics
  • Monitoring
    • Ability to easily integrate with existing security monitoring and management tools
  • Auditability
    • Enabling governance and compliance
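As a concrete illustration of the real-time masking capability noted above, here is a minimal, regex-based sketch. The patterns cover only a few obvious formats (US SSNs, 16-digit card numbers, email addresses) and are illustrative only; production-grade detection relies on much richer classifiers and policies.

```python
# Minimal sketch of real-time masking of sensitive values in model output
# (illustrative patterns only; not a complete PII/CHD/PHI detector).

import re

PATTERNS = {
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CARD":  re.compile(r"\b(?:\d[ -]?){15}\d\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def mask(text: str) -> str:
    # Replace each detected value with a tagged placeholder before the
    # response is returned to the user.
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"<<{label}>>", text)
    return text

print(mask("Reach Jane at jane.doe@example.com, card 4111 1111 1111 1111."))
# -> "Reach Jane at <<EMAIL>>, card <<CARD>>."
```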

Parting Words

The age of LLMs and GenAI has enabled AI democratization to the point that building AI-powered applications has become commonplace. These applications and use cases are often built on top of core AI APIs, such as ChatGPT or Llama, which are available off the shelf for further customization with business-specific data through processes such as RAG. This is what we call DIY AI. Building and deploying DIY AI in production systems with proprietary, sensitive data introduces significant data security and privacy risks. PAIG provides guardrails for DIY AI applications, adding a data governance layer to the AI application that constantly screens for sensitive data that might be inadvertently leaked or maliciously extracted.

PAIG enables businesses to leave the data privacy, security, and governance to Privacera, and march on their AI innovation journey with full confidence. Make sure your AI adoption doesn’t land you in the news like the $1 Chevy story. Keep in mind, it’s not a matter of if, but when the next major DIY AI security breach will occur.
See first-hand what PAIG can do for your enterprise. Request your PAIG demo today.
