Do we need to stop talking about ‘AI guardrails’?

We recently had a debate on Slack (as we do) about a word that’s popped up everywhere in the world of AI: “guardrails.” Is it a useful metaphor that helps clients grasp the concept of safety, or a misleading term that gives them a false sense of security?
Here’s where we ended up.
The case against: ditch the word entirely
The problem is that a physical guardrail (like one you might see on a cliff edge or a motorway) is a tangible, pretty predictable barrier. When it fails, you can usually see the damage, understand where it failed and figure out what to do about it.
AI “guardrails” are different, in that they’re often non-deterministic. They might be a polite instruction in a prompt (“UNDER NO CIRCUMSTANCES SHOULD YOU DO THE BAD THING”) or a second AI validating the first. Neither enforces a hard stop; both amount to asking the AI to behave. The word creates the illusion that users are “safe” because the specified bad thing can’t happen when, in reality, the probability of failure isn’t clear, and a failure can be spectacular, like an AI deleting a production database.
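To make the distinction concrete, here is a minimal Python sketch contrasting a prompt-style “guardrail” with a deterministic check that runs in code. Everything in it is an assumption made up for illustration (the names SYSTEM_PROMPT, is_allowed and run_model_sql, and the keyword list); it isn’t how any particular product works.

```python
import re

# Prompt-style "guardrail" (hypothetical): this is only a request to the model.
# Nothing in the system enforces it, so whether it holds is a matter of probability.
SYSTEM_PROMPT = "UNDER NO CIRCUMSTANCES SHOULD YOU DELETE DATA."

# Deterministic control (hypothetical): checked in code before anything runs.
# The keyword list is an illustrative assumption and is not exhaustive.
DESTRUCTIVE = re.compile(r"^\s*(drop|delete|truncate|alter)\b", re.IGNORECASE)


def is_allowed(sql: str) -> bool:
    """Return True only if no statement starts with a destructive keyword."""
    # Naive split on ';' for the sake of the sketch; real SQL needs a real parser.
    statements = [s for s in sql.split(";") if s.strip()]
    return not any(DESTRUCTIVE.match(s) for s in statements)


def run_model_sql(sql: str, execute) -> str:
    """Run model-generated SQL via `execute` only if it passes the check."""
    if not is_allowed(sql):
        return "blocked"
    execute(sql)
    return "executed"
```

The same input always gets the same answer from is_allowed, which is what makes it easier to reason about than a prompt. It’s still imperfect: a deny-list like this can be bypassed in ways it didn’t anticipate, so a harder stop would be something like giving the AI’s database account read-only permissions in the first place.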
We already talk to clients openly about threat models, risk appetites, and the cost and implications of mitigations and controls for general security issues. We should treat AI like any other software system and focus on these honest discussions about managing risks.
The case for: an imperfect metaphor is good enough
Physical guardrails aren’t perfectly safe either. The one on a balcony can be ducked under or climbed over, and even motorway barriers can fail due to corrosion or sudden impact in ways that aren’t always communicated. They’ve evolved over time, just like AI “guardrails” will improve as we learn and iterate.
And practically speaking, the guardrail ship has already sailed. The term is in common use for AI, appears in the dictionary and is already applied to legal and constitutional oversight. Given that naming things is one of the two hardest problems in computer science, struggling to change an already-adopted term might just be a waste of effort.
The common ground: transparency is everything
Ultimately, we agreed on the essential point: hiding behind any metaphor isn’t the way we want to go. Whether we call them “guardrails”, “controls” or something else, it’s important that our staff and clients understand the benefits and opportunities of AI along with its limitations and potential harms.
That means helping organisations navigate the world of AI by being open and honest about the risks. We will always put the right controls in place but, like physical guardrails, they aren’t impenetrable.