Why Isn’t There a Delete or Undo Button in LLMs?

If you think protecting private data was hard with databases, LLMs make it even harder.

Illustration by Nikhil Kumar

“Where’s the delete and undo button in LLM?” asked Anshu Sharma, co-founder and CEO of Skyflow, who introduced the concept of LLM vaults in a recent interaction with AIM.

“LLM vaults are built on top of the proprietary detect engine that detects sensitive data from the training datasets used to build LLMs, ensuring that this data is not inadvertently included in the models themselves,” he added, saying the company has built a proprietary algorithm to detect sensitive information in unstructured data that is being stored in the vault. 

The Need for LLM Vault 

Sharma argues that encryption alone is not enough. “While storing data in the cloud with encryption will safeguard your data from obvious risks, in reality, we need layers. The same data cannot be given to everyone. Instead, you can have an LLM vault that can identify sensitive data while inference and only share non-sensitive versions of the information with LLM,” said Sharma, explaining why the LLM vault matters.

Figure: LLM vault workflow (Source: StackOverflow)

Jeff Barr, vice president of Amazon Web Services, also noted, “The vault protects PII with support for use cases that span analytics, marketing, support, AI/ML, and so forth. For example, you can use it to redact sensitive data before passing it to an LLM.”
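To make the idea of redacting before inference concrete, here is a minimal sketch in Python. It is not Skyflow's detect engine, which is proprietary; the `PATTERNS` table and the `redact` function are hypothetical names, and the two regular expressions are simplified assumptions that would miss most real-world PII.

```python
import re

# Hypothetical, deliberately simple patterns for illustration only.
# A production detector would cover far more data types and formats.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

prompt = "Email jane.doe@example.com about SSN 123-45-6789 today"
safe_prompt = redact(prompt)  # only this version would be sent to the LLM
print(safe_prompt)
```

Redaction is one-way: the model never sees the original values, but the application cannot restore them in the response either.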

Gokul Ramrajan, a tech investor, explained the importance of LLM vaults, saying, “If you think protecting private data was hard with databases, LLMs make it even harder. ‘No rows, no columns, no delete.’ What is needed is a data privacy vault to protect PII, one that polymorphically encrypts and tokenises sensitive data before passing it to an LLM.”
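The tokenisation half of that idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not any vendor's API: `ToyVault`, `protect` and the `tok_` token format are hypothetical, the vault lives in memory, and polymorphic encryption is not shown. The point is that the vault, unlike the model, does have rows, so a value can be deleted.

```python
import re
import secrets

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")  # simplified, illustrative pattern
TOKEN = re.compile(r"tok_[0-9a-f]{8}")

class ToyVault:
    """In-memory stand-in for a data privacy vault (hypothetical)."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenise(self, value: str) -> str:
        # Reuse the same token for a repeated value so references stay consistent.
        if value not in self._value_to_token:
            token = f"tok_{secrets.token_hex(4)}"
            self._value_to_token[value] = token
            self._token_to_value[token] = value
        return self._value_to_token[value]

    def detokenise(self, text: str) -> str:
        # Swap known tokens back; unknown or deleted tokens are left untouched.
        return TOKEN.sub(
            lambda m: self._token_to_value.get(m.group(0), m.group(0)), text
        )

    def delete(self, value: str) -> None:
        # The "delete button": once removed, the token can no longer be resolved.
        token = self._value_to_token.pop(value, None)
        if token is not None:
            del self._token_to_value[token]

def protect(vault: ToyVault, prompt: str) -> str:
    """Replace each email address in the prompt with a vault token."""
    return EMAIL.sub(lambda m: vault.tokenise(m.group(0)), prompt)

vault = ToyVault()
prompt = "Draft a reply to jane.doe@example.com please"
safe = protect(vault, prompt)           # what the LLM would see
restored = vault.detokenise(safe)       # what the authorised user would see
vault.delete("jane.doe@example.com")
after_delete = vault.detokenise(safe)   # token stays opaque after deletion
```

Because the LLM only ever receives the token, deleting the row in the vault is enough to make the original value unrecoverable from anything the model saw.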

A few weeks ago, when Slack started training on user data, Sama Carlos Samame, the co-founder of BoxyHQ, raised a similar concern about organisations that use AI tools and argued that they should have LLM vaults to safeguard their sensitive data.

Going Beyond LLM Vault 

The likes of OpenAI, Anthropic and Cohere are also coming up with methods and features to handle user and enterprise data. For instance, if you are using the OpenAI API, your data won’t be used to train its models. You can also opt out of sharing your data with ChatGPT. Privacy options like these somewhat reduce the need for LLM vaults.

Anthropic, on the other hand, has also adopted strict policies on how it uses user data: it does not train its models on that data unless a user volunteers it or a specific scenario arises in which it collects user data.

Meanwhile, Cohere has collaborated with AI security company Lakera to protect against LLM data leakage by defining new LLM security standards. Together, they have created the LLM Security Playbook and the Prompt Injection Attacks Cheatsheet to address prevalent LLM cybersecurity threats.

There are other techniques, such as Fully Homomorphic Encryption (FHE), which allows computations to be performed directly on encrypted data without the need to decrypt it first. This means the data remains encrypted throughout the entire computation process, and the result is also encrypted.
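A full FHE scheme is too involved for a short example, so the sketch below swaps in the Paillier cryptosystem, which is only additively homomorphic: it can add encrypted numbers but cannot run arbitrary computations. It still shows the core property described above, that a result is computed without decrypting the inputs. The function names are illustrative, and the tiny primes are an assumption made for readability that makes this toy completely insecure.

```python
import math
import secrets

def keygen(p: int, q: int):
    """Build a toy Paillier key pair from two primes (insecure at this size)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1
    mu = pow(lam, -1, n)  # valid shortcut because g = n + 1 (Python 3.8+)
    return (n, g), (lam, mu)

def encrypt(pub, m: int) -> int:
    n, g = pub
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(pub, priv, c: int) -> int:
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add(pub, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)

pub, priv = keygen(1009, 1013)
c_sum = add(pub, encrypt(pub, 20), encrypt(pub, 22))  # computed on ciphertexts only
total = decrypt(pub, priv, c_sum)
print(total)
```

The party performing `add` never holds the private key, so it learns nothing about the two inputs or their sum.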

Sagar Sharma

A software engineer who loves to experiment with new-gen AI. He also happens to love testing hardware, which sometimes crashes. While he revives his crashed system, you can find him reading literature, manga, or watering plants.