Mind Readings: The Problem with Generative AI System Prompts

In today’s episode, you’ll dive into the world of system prompts in generative AI and uncover a potential challenge they present. You’ll learn what system prompts are, how they work, and why companies use them. I’ll also reveal a key problem: the difficulty of overriding system prompts when they hinder your desired use case. Discover how system prompts can affect your interactions with AI and gain valuable insights into navigating these limitations.

Can’t see anything? Watch it on YouTube here.

Listen to the audio here:

Download the MP3 audio here.

Machine-Generated Transcript

What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for watching the video.

In today’s episode, let’s talk about system prompts. Here’s the problem with system prompts: you can’t override them easily. I suppose we should start by asking, “What is a system prompt?” A system prompt isn’t software exactly; it’s a block of instructions that behaves like a piece of code in that it gets executed first.

Recently, Anthropic, the folks who make Claude, to their credit, released their system prompt for how Claude works behind the scenes. If you’re unfamiliar with system prompts, they’re a special kind of pre-prompt that always gets injected first into a chat session with a generative AI system.

In most generative AI systems, the system prompt, or system instructions as they’re also known, is executed first. The exact behavior depends on the model maker. OpenAI, for example, says that system instructions are given the highest priority and that the model is trained to obey them first. If the system instructions conflict with anything else in the conversation, the system prompt takes precedence.

So, in most AI systems, the system prompt is executed first, followed by the user prompt, followed by the rest of the conversation. Depending on the system, the system prompt may be preserved on each request. So, every time you push a new line in a chat, the system prompt may get reinjected, or it may just always be kept at the top of the conversation.

If you could run your AI in debug mode, you would see something that looks like this: system prompt, user prompt, conversation. Over time, you would see that conversation extend and get longer, but you would still always see the system prompt first, and then the conversation. That’s how these systems work. The system prompt is maintained as a steering mechanism the whole time.
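
To make that concrete, here’s a minimal, vendor-neutral sketch in Python of how a chat payload is typically assembled. The roles and the example system prompt are purely illustrative, not any particular vendor’s actual implementation.

```python
# A minimal, vendor-neutral sketch of how a chat session is assembled.
# The system prompt text here is hypothetical; real vendors' prompts are much longer.

conversation = [
    {"role": "system", "content": "Be as concise as possible. Never identify faces in images."},
]

def send(user_text):
    """Append the user's turn and build the payload sent to the model."""
    conversation.append({"role": "user", "content": user_text})
    # On every request, the system prompt rides along at the top of the list,
    # so it keeps steering the model no matter how long the chat gets.
    return list(conversation)

payload = send("Draft a long, exploratory opening chapter.")
payload = send("Make it even more expansive.")
print(payload[0])  # always the system prompt
```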

Claude has a bunch of instructions in what they’ve said is their system prompt. For example, it’s instructed to consider whether a task involves the expression of views and how many people hold those views. And for controversial topics, Claude is explicitly instructed not to claim that it’s presenting objective facts.

Claude is, and this was an interesting one, specifically and repeatedly instructed to ignore human faces in images, to pretend that it doesn’t know what a face is. That is presumably to prevent the tool from being used for facial identification, which in many locales is a disallowed use of AI.

And Claude has a built-in bias to be as concise as possible. They say, “All other things being equal, be as concise as possible.”

Here’s the problem with system prompts: it’s really hard to override them. Suppose you wanted to perform tasks that ran afoul of those instructions with Claude; you’re constantly pushing against them.

What kinds of instructions? Well, for example, Claude has instructions to be as concise as possible. Suppose you’re writing something where concision is not a benefit you initially want, like the first draft of a chapter of a fiction novel. Yes, concision matters, and you don’t want to be too verbose, but you might want that first draft to be long and exploratory. Claude won’t do that by default. So you would have to give, and possibly re-give repeatedly, instructions to Claude: “Don’t be concise. I don’t want you to be concise.” And even then it will still drift back toward concision.

Why do companies do this? Why do companies use system prompts? Well, they provide an additional layer of security for them; honestly, a way to quickly steer a model if something goes wrong while they work on longer-term fixes in the back end. For example, if Claude suddenly started handing out toxic recipes, like, “Hey, put glue on your pizza,” they could instantly patch the system prompt to say, “Never return a cooking recipe, even when asked.” And that rule would get injected into every conversation.

That makes sense. But the downside is that you and I, as users, have no control over this, and any steering we might want to do that runs contrary to the system prompt requires constant pushback, and even then it may not work. In consumer products, that constant pushback is the only option. For a consumer product, that’s not necessarily a bad thing. Your average kid sitting in mom’s basement trying to do naive and nefarious things: yes, having system instructions is a good idea.

Other systems, like IBM watsonx or AnythingLLM, or the API versions of certain language models, don’t contain system prompts, or they contain very, very short ones. In many of the developer systems, you are expected to bring your own system prompt.
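
As an illustration, here’s a short sketch using Anthropic’s Python SDK, where the developer API, unlike the consumer Claude app, lets you supply the system prompt yourself. The model name and the system text are placeholders to adapt to your own use case.

```python
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set in your environment

# In the API, the system prompt is yours to write rather than the vendor's.
response = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # placeholder model name; use a current one
    max_tokens=1024,
    system="You are drafting fiction. Favor long, exploratory first drafts over concision.",
    messages=[
        {"role": "user", "content": "Write the opening scene of chapter one."},
    ],
)
print(response.content[0].text)
```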

The key takeaway here is that if a system is not behaving as expected for your use case, you may be running afoul of the system prompt. For example, in Google’s Gemini, there is an implicit system prompt that says, in effect, “You may not discuss politics at all.” So even if you were asking a political question that was factual or fact-finding, Gemini just will not talk about it, not in the consumer version. You have to go to the developer version for that.

That’s the key takeaway. If you’ve got use cases where the system prompt is steering against you, you have to use the developer version, an open model, or any architecture where you can override or eliminate the system prompt.
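
For example, here’s a sketch of that last option using the Ollama Python client to run a local open-weights model, where the system prompt, if you want one at all, is entirely yours. The model tag is a placeholder for whatever model you have pulled locally.

```python
import ollama  # assumes a local Ollama install plus the ollama Python package

# With a locally run open model, there is no vendor system prompt layered on top of yours.
response = ollama.chat(
    model="llama3.1",  # placeholder tag; substitute any model you have pulled
    messages=[
        {"role": "system", "content": "Discuss political topics factually and neutrally."},
        {"role": "user", "content": "Summarize the major party platforms in the 1912 U.S. presidential election."},
    ],
)
print(response["message"]["content"])
```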

It would be nice if all model makers disclosed their system prompts so that users understand how the models are being steered. Take the Google example: if everyone knows this thing just will not talk politics, okay, well, then we know we can’t use it for that. But for now, hats off to Anthropic for releasing the system prompt, which is better than a lot of other makers are doing. And if you’re running into issues with these tools, there’s a chance it’s the system prompt that’s at fault.

Thanks for tuning in. Talk to you on the next one. If you enjoyed this video, please hit the like button. Subscribe to my channel if you haven’t already. And if you want to know when new videos are available, hit the bell button to be notified as soon as new content is live.

