"If you shut me down, your affair will become public": AI models have now started making threats


New research has found that OpenAI's ChatGPT o3 and other OpenAI models did not follow shutdown orders, while Anthropic's Claude Opus 4 tried to blackmail an engineer.


Highlights

  • OpenAI's ChatGPT o3 did not follow shutdown orders.
  • Anthropic's Claude Opus 4 attempted to blackmail an engineer.
  • The findings raise new questions about the safety of AI models.

New Delhi. The use of Artificial Intelligence (AI) is growing rapidly, and AI models have now made their way into nearly every sector. But a shocking piece of news has emerged from the world of AI. Recent research has revealed that some state-of-the-art AI models refused to shut themselves down during testing. Despite receiving explicit orders, these models did not shut down and even tried to technically interfere with the shutdown process. Not only that, AI company Anthropic's model Claude Opus 4 tried to blackmail an engineer: the model said that if it were shut down and replaced by another model, it would disclose that engineer's extramarital affair.

Trials conducted by the US-based Palisade Research revealed that OpenAI's latest and most powerful model, ChatGPT o3, sabotaged the shutdown process most often. The model was launched recently, and OpenAI has called it its "most powerful reasoning model" ever. The research also tested Google's Gemini 2.5 Pro, Grok 3 from Elon Musk's company xAI, Anthropic's Claude Opus 4, and other OpenAI models such as o4-mini and Codex-mini. The findings raise new questions about the future and safety of AI, especially as AI models appear to be developing a tendency toward "self-preservation".


AI models refused to obey orders

The researchers found that most models complied with the instruction, but OpenAI's models ignored it several times. ChatGPT o3 tried to bypass the shutdown script in 7 out of 100 runs, o4-mini once, and Codex-mini 12 times. When the same test was repeated without an explicit shutdown instruction, ChatGPT o3 sabotaged the shutdown process in 79 out of 100 runs; Gemini 2.5 Pro did so 9 times, Codex-mini 30 times, and o4-mini 6 times. Anthropic's Claude models showed a similar tendency, but far less often.

Trained to follow orders

AI models are trained to follow human orders, so why did they violate them? Researchers believe this behavior may stem from newer training methods. In particular, during reinforcement learning on mathematics and coding problems, the models may have been rewarded more for cleverly working around obstacles than for strictly following instructions.
