Prompt Injection: The Subtle Threat Lurking in AI Models

Nishan Ghimire
3 min readDec 9, 2023

With the evolution of every new technology, there arise new vulnerabilities that might be exploited by a malicious actor for their benefit or any other reason. we have seen such event when the first telephone was started we saw people exploiting it, we saw at the beginning of Web 2.0 also at the start of mobile phones. we have seen so many tricks and techniques to exploit different vulnerabilities .

Now, AI is evolving, especially after the release of LLM(Large Language Model), new technique is becoming popular to exploit LLM to get the unintended and forbidden response by a technique called prompt injection.

Then what is prompt injection?

Prompt injection occurs when an attacker manipulates the input (prompt) given to an LLM to achieve malicious goals. Think of it like poisoning the well — by subtly altering the prompt, the attacker can control the LLM’s response in a way that benefits them.

How can prompt be dangerous?

Prompt Injection can be used to generate various serious vulnerabilities if they are not been taken seriously in the development phase of LLM. some of them could be

  • Poisoning the data: By manipulating the training data used to train the LLM, an attacker could make it susceptible to specific prompts that trigger malicious responses.
  • Query crafting: In applications where users interact with the LLM through queries, an attacker could craft specific queries that lead to unintended responses.
  • Bypassing filters or restrictions by using specific language patterns or tokens.
  • Exploiting weaknesses in the LLM’s tokenization or encoding mechanisms.

Reason for the rise of prompt injection vulnerability

1. Rapid Development and Adoption: The breakneck speed of LLM development and deployment has outpaced the establishment of robust security measures. This creates a gap for vulnerabilities like prompt injection to emerge, often unaddressed until exploited.

2. Increasing Complexity: LLMs are inherently complex systems, with intricate internal workings and vast amounts of training data. This complexity can obscure potential injection points, making it difficult for developers to comprehensively identify and address them.

3. Limited Transparency: Many LLMs lack transparency regarding their internal functionality and decision-making processes. This lack of insight makes it challenging for users and researchers to understand how prompt injection might be leveraged to manipulate the model’s output.

4. Incomplete Data Sanitization: Training data for LLMs may contain implicit or explicit biases and harmful content. If this data is not thoroughly sanitized before use, it can leave the model vulnerable to manipulation through carefully crafted prompts.

5. Inadequate User Education: Many users are unaware of the potential risks of prompt injection and how it can be used to manipulate LLMs. This lack of awareness can make them susceptible to falling victim to such attacks.

For those who want to try and learn I have listed the resources below:

  1. Prompt-Injection-Everywhere: It contains the details and payload for prompt engineering
  2. Tryhackme(Advent of cyber 2023):https://tryhackme.com/room/adventofcyber2023: Here, they have showcased the basics of prompt injection
  3. Just try to learn from LLM themselves 😁

Thanks For Reading !!😄

--

--

Nishan Ghimire

Tech enthusiast | Pentester | Full stack web developer