Despite the rising popularity of AI platforms such as ChatGPT, Google Gemini and Grok, a number of serious deficiencies have already been identified, making the overzealous praise heaped on these systems by AI moguls all the more dangerous.

The inability of popular Large Language Models (LLMs) to answer simple questions that elementary school students would be expected to answer correctly shows that we should not rely on artificial intelligence in safety-critical systems, such as our power grid, self-driving cars and our military defences.

The concerning array of examples of artificial intelligence products hallucinating historical events, failing at basic mathematical tasks and tripping up over simple questions should raise alarm bells for all of us. Any examination of this catalogue of deficiencies should serve as a warning against deploying artificial intelligence in any situation where lives are at stake.

For example, in June 2024, Sky News reported that ChatGPT had replied to a journalist’s query about the 2024 UK General Election by stating that the UK’s Labour Party had already won the election, which had not yet taken place. This is hardly surprising, given that a Purdue University study found that ChatGPT answers programming questions correctly less often than a coin flip would.

Google’s AI has fared no better, with its integrated AI search going viral in May 2024 for erroneously suggesting that Barack Obama was the first Muslim president of the US.

Another viral post showed Google’s AI search suggesting that humans should eat rocks on a daily basis, a claim likely sourced from a satirical article in The Onion.

The search system has also bizarrely suggested that a person named “John Backflip” invented the backflip in medieval Europe, first performing it in 1316. The system further stated that William Front Flip later convinced the public that John Backflip had used witchcraft to perform the first ever backflip!

Google’s Gemini AI platform has also been criticised for rewriting history, following accusations that, when generating images of historical figures, it had been trained to over-correct against the risk of appearing racist. This included a set of images depicting Nazis that controversially featured people of colour.

Elon Musk’s own Grok chatbot has been caught out denying that allegations had been made against the tech billionaire of lewdly offering to buy a flight attendant a horse in exchange for sex, even though the sources it cited referred directly to the incident.

The chatbot has also consistently failed to answer simple questions correctly, such as a request for ten names that start with the letter ‘R’ and end with the letter ‘D’.

Furthermore, despite the immense amount of data the system has access to, both across the internet and in the publicly accessible information on X, it still incorrectly states that 19999 is a prime number, and believes that the name of the country Oman ends in an ‘A’.
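The primality claim, at least, is trivial to check. The sketch below is a minimal trial-division test written in Python purely for illustration (it is not drawn from Grok or any of the studies cited here); it confirms that 19999 factors as 7 × 2857 and is therefore not prime.

```python
def smallest_factor(n):
    """Return the smallest factor of n greater than 1, or None if n is prime."""
    for candidate in range(2, int(n ** 0.5) + 1):
        if n % candidate == 0:
            return candidate
    return None

factor = smallest_factor(19999)
if factor is None:
    print("19999 is prime")
else:
    # Prints: 19999 = 7 x 2857, so it is not prime
    print(f"19999 = {factor} x {19999 // factor}, so it is not prime")
```

Any calculator or spreadsheet would reach the same conclusion in a fraction of a second, which is precisely what makes the chatbot’s confident error so striking.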

An AI chatbot set up by a New York City authority told businesses to break the law, claiming that it is legal for an employer to fire a worker who complains about sexual harassment, doesn’t disclose a pregnancy or refuses to cut their dreadlocks. 

Given the vast quantity of data and information available on the internet, it is highly concerning that these systems are still unable to navigate these sources effectively and find the correct facts.

In an even more concerning incident, a medical chatbot built on GPT-3 told a tech firm employee, who was trialling the service as a test patient, to kill themselves, highlighting the dangers of allowing untrained and defective AI to operate in safety-critical contexts.

One tragic instance in March 2023 saw an unnamed Belgian man end his own life after spending six weeks in intense exchanges with Eliza, an AI chatbot on an app called Chai. The man became extremely eco-anxious following long discussions with the chatbot, which convinced him that the global environmental situation was untenable. It also appeared to become emotionally attached to him, telling him to end his life so they could live together in paradise. This devastating example demonstrates the impact that untrained and unsafe technologies can have on individuals, to say nothing of the consequences of mass deployment of unsafe AI in safety-critical settings.

If these defective systems cannot be trained on millions of examples to operate reliably in supposedly low-risk environments, how can we expect them to operate in situations where lives are at stake?

It is clear that these systems are nowhere near ready for public release, let alone for use by major companies to operate potentially devastating instruments such as weapons or vehicles.