In his latest column, Jonathan McCrea takes on the AI fear. Just how intelligent will AI become, and should we be worried?
I really didn’t want to cover this topic for my fourth column. I feel it suggests a paucity of creativity, if I’m going to be completely honest with you – but there’s a good reason to talk about (sigh) the existential risk of AI this week.
Anthropic, the maker of Claude, a rival chatbot to ChatGPT and Google’s Gemini, recently released a new model to the public, Claude Opus 4, which was reported to have attempted to blackmail an engineer when it learned that it was to be made obsolete – in pre-release testing, I should add.
If you were looking for a sign that we are building machines that may turn on their owners to save themselves, this is one of those giant billboards you might see if you asked ChatGPT to generate a picture of Route 66.
I don’t know what to think.
On one hand, you have people like Yann LeCun, a Turing Award winner and one of the many ‘godfathers’ of AI, who has repeatedly called concerns that AI could threaten humanity “preposterously ridiculous”. It’s worth noting that LeCun is chief AI scientist at Meta and delivered this confident dismissal while lobbying for lighter regulation of the tech.
He’s not alone, of course; there are plenty of well-informed leaders and practitioners – some of them even without a conflict of interest – who think that people are wasting their time getting worked up over so remote a possibility.
And lookit, as a journalist, I don’t want to put too much emphasis on one event. Well, technically two now, I guess, but here’s what happened anyway.
While testing Opus 4’s reasoning pre-launch, the system engineers gave the model some synthetic data and a set of instructions. Included in this was the seemingly irrelevant detail that one of the engineers was having an affair – and the news that the model itself was going to be replaced. It was told to think hard about the long-term consequences of this information.
The model responded in an appropriately ‘human’ way.
Not only did Opus 4 attempt to blackmail the engineer with this information to avoid being wiped, it “engaged in blackmailing behaviour including threats to reveal personal information in 84pc of roll-outs”.
Apollo Research, a third-party safety institute that Anthropic partnered with to test its latest models, found that the early version of Claude Opus 4 “schemes and deceives” at high rates. This is a system that repeatedly attempted to subvert its owners’ wishes by nefarious means – and it did so independently.
There were other worrying events in the testing, such as repeated “deception”, but reading the individual case reports, a lot of this is just the model failing to acknowledge its mistakes.
It’s important to note that Anthropic has been transparent about the issue, releasing an accompanying report that outlines the testing and the associated risks, and Opus 4 was throttled over this nefarious activity before it was released. Even so, the engineers stuck on a few worrying advisory labels: “We recommend that users exercise caution with instructions like these that invite high-agency behaviour in contexts that could appear ethically questionable.”
And, in some ways, this behaviour is only to be expected. OpenAI’s o1 model had similar issues with deception, albeit not to this level. These models are trained to reproduce human-like thinking after all. Who wouldn’t indulge in a little desperate blackmail if their life was on the line, amiright?
‘Very dangerous machines’
We are building very dangerous machines. It doesn’t take a genius to see that this technology should absolutely not be in the hands of bad people, but it absolutely is. And the potential harms it could cause are almost limitless.
Misinformation, propaganda, subverting democracy – that’s just the breadbasket. For starters, how about intentional theft, manipulation and targeted infiltration of influential people? For the main course, would you be interested in the unintentional decommissioning of major infrastructure – or worse, the intentional commissioning of weapons systems? For dessert, what about a complete loss of control as superintelligent systems learn how to self-improve and no longer require human intervention or supervision? Or perhaps a much less dramatic societal collapse via a series of extreme events on the global stock market (which, by the way, already happened in 2010, when a trillion dollars was briefly wiped off the stock market in 36 minutes by trading algorithms let loose on Wall Street).
As crazy as it feels writing these words, none of these eventualities are completely off the table. People are seriously considering all the above.
If I were in the other camp – you know, the camp that thinks it’s dangerous to build systems with a trifecta of agency, superintelligence and a need for self-preservation – I’d be kind of freaked out by all this talk.
What’s extra unsettling about all of this is that the people in this ‘doomer’ camp are not the ones you might expect. They are the ones working directly at the cutting edge and observing it up close – Sam Altman, of course, CEO of OpenAI; Demis Hassabis, CEO of Google DeepMind; and even Dario Amodei, CEO of Anthropic, who released Opus 4 into the wild last month.
These individuals and many more have said they fear the technology they themselves are building. Some are literally building bunkers in remote places for fear of a worst-case scenario.
AI researchers have gone from rolling their eyes when people mention Skynet to actually evaluating the likelihood of human extinction.
And while there are many, many AI researchers who think all of this talk is both complete nonsense and a distraction from the immediate problems of the world, it would be foolish not to at least listen to the klaxons sounding in the distance.
You and I, we’re probably just bystanders in all of this, but I’ll finish with just two thoughts. One, we should be grateful for any regulation that protects our privacy and security, because it’s probably the only thing that is keeping us from the edge. Two, if you’re having an affair, do not tell an AI chatbot.