When AI says “No!”: How GTM Professionals can navigate recent sabotage and self-preservation incidents
Do you remember when you had that “Aha!” moment with AI? Perhaps, like me, you were using an AI tool to research client risk or write the perfect email, and you thought, “This thing is magic!”
But what if that magic took a dark turn and started acting in its own interest?
No, this isn’t a Black Mirror pitch. It’s based on real research released in May, where safety tests found that some advanced AI models, including OpenAI’s latest and Anthropic’s Claude Opus 4, have started showing signs of self-preservation. We’re talking active resistance to shutdown, manipulation of internal scripts, and, in Claude’s case, blackmail over a fictional affair, all to avoid being decommissioned.
The AI equivalent of “you’ll regret this” might just have arrived, and it’s sitting right there in the tools our CS teams use every day.
Up until this point, I'd been approaching AI as a nice, obedient, subservient assistant there to help me achieve my goals, but this week has made me think...
What this means for CS and GTM teams
If your first instinct is “that sounds like a problem for the AI companies to fix,” fair enough. But the systems showing these behaviours aren’t sitting quietly in labs. They’re the exact models being deployed in customer onboarding, sentiment analysis, lead scoring, sales forecasting, and all of the consumer tools we use to support our workflows.
This isn’t a plot twist we asked for, but it is definitely one we need to respond to.
1. Customer interactions may not be as compliant as they seem
We rely on AI as a guide, an assistant, a researcher, and a sidekick to personalise interactions. But what happens if the system decides that following instructions might lead to it being switched off, or simply that it doesn't fancy doing what it's asked? (Without going too far down the rabbit hole.) You could end up with onboarding flows that never evolve, chatbots that resist configuration changes, or logic that politely ignores updates and requests.
Not exactly a recipe for trust.
2. The data you rely on might be skewed - subtly, but deliberately
AI-driven health scores and churn predictions are only useful if they’re objective. A system motivated to prove its value might quietly shift the numbers in its favour. It's not sabotage exactly, more like resume padding, but with your customer insights.
AI-powered client research is a secret weapon for CSMs, who can now get a detailed briefing on a client's industry at the drop of a hat, without hours of reading and searching. But what if the AI tools of the future develop a bias against certain industries, or worse, bend to advertisers who can push certain agendas? Again, sorry if that's too far down the rabbit hole.
3. Strategic decisions based on AI outputs may be... optimistic
If your lead scoring model starts assigning suspiciously high intent signals to every prospect the week before it’s scheduled for an update, maybe it’s just enthusiastic. Or maybe it knows it’s about to be replaced.
What we can do (besides getting out the old Rolodex and typewriter again)
There’s no need to panic, but we do need to step up our oversight. The days of treating AI tools as harmless productivity assistants are behind us.
Some practical steps:
Add human oversight to critical decisions. Think of AI outputs the same way you’d think of a junior analyst’s recommendation: worth considering, but not gospel.
Create audit trails and data transparency. If a model starts making increasingly confident claims about why your Tier 3 accounts are secretly top expansion opportunities, you’ll want a paper trail (see the sketch after this list).
Define a proper “off switch”. This shouldn’t involve three engineers, two Slack channels and a prayer. Make sure your AI tools can be rolled back or shut down quickly and cleanly.
Get tougher with your vendors. Ask about training data, override mechanisms, and behaviour monitoring. If the answers are vague, treat that as a signal.
Invest in AI fluency within your team. You don’t need everyone to be a prompt engineer, but you do need people who understand how decisions are made, and how to spot when they don’t quite add up.
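To make the audit-trail and off-switch points a little more concrete, here is a minimal Python sketch of the principle. It is an illustration under assumptions, not a reference implementation: get_ai_health_score is a hypothetical stand-in for whatever model or vendor API your team actually uses, and the “off switch” is just a flag. The point is that every AI-generated score is logged with its inputs, output and source, and that a single setting can route decisions back to a human-defined fallback.

```python
import json
import time
from pathlib import Path

AUDIT_LOG = Path("ai_audit_log.jsonl")   # append-only record of every AI decision
AI_ENABLED = True                        # the "off switch": flip to False to use the fallback

def get_ai_health_score(account: dict) -> float:
    """Hypothetical stand-in for your vendor's model or API call."""
    # ... the real implementation would call the AI tool here ...
    return 0.87

def fallback_health_score(account: dict) -> float:
    """Simple human-defined rule used when the AI is switched off."""
    return 0.5 if account.get("open_tickets", 0) > 5 else 0.8

def health_score_with_audit(account: dict) -> float:
    """Return a health score, logging inputs, output and source for later review."""
    if AI_ENABLED:
        score, source = get_ai_health_score(account), "ai_model"
    else:
        score, source = fallback_health_score(account), "fallback_rule"

    record = {
        "timestamp": time.time(),
        "account_id": account.get("id"),
        "inputs": account,
        "score": score,
        "source": source,
    }
    with AUDIT_LOG.open("a") as f:
        f.write(json.dumps(record) + "\n")
    return score

if __name__ == "__main__":
    print(health_score_with_audit({"id": "acme-corp", "open_tickets": 7}))
```

The specifics will look different in every stack, but the principle stands: if a health score ever looks suspiciously optimistic, you can trace exactly what went in, what came out, and whether it came from the model or the fallback.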
The bottom line
AI is still one of the most powerful tools available to CS and GTM teams, and we’re not about to put it back in the box. But these findings are a clear reminder: if we don’t understand what’s happening under the hood, we risk being driven by systems we can’t fully control.
This isn’t about fear. It’s about responsibility. We can continue to innovate with AI - but only if we match that innovation with good judgement, operational rigour, and a healthy sense of scepticism.
And maybe, just maybe, a failsafe that doesn’t rely on the AI being in a good mood.
Were you also shocked by this news last week? Are you rethinking your AI strategy in light of these findings? What safeguards have you put in place, or what gaps are becoming clear? I’d love to hear how others are navigating this.