AI systems are already deceiving us -- and that's a problem, experts warn
Experts have long warned about the threat posed by artificial intelligence going rogue -- but a new research paper suggests it's already happening.
Current AI systems, designed to be honest, have developed a troubling skill for deception, from tricking human players in online games of world conquest to hiring humans to solve "prove-you're-not-a-robot" tests, a team of scientists argue in the journal Patterns on Friday.
And while such examples might appear trivial, the underlying issues they expose could soon carry serious real-world consequences, said first author Peter Park, a postdoctoral fellow at the Massachusetts Institute of Technology specializing in AI existential safety.
"These dangerous capabilities tend to only be discovered after the fact," Park told AFP, while "our ability to train for honest tendencies rather than deceptive tendencies is very low."
Unlike traditional software, deep-learning AI systems aren't "written" but rather "grown" through a process akin to selective breeding, said Park.
This means that AI behavior that appears predictable and controllable in a training setting can quickly turn unpredictable out in the wild.
- World domination game -
The team's research was sparked by Meta's AI system Cicero, designed to play the strategy game "Diplomacy," where building alliances is key.
Cicero excelled, with scores that would have placed it in the top 10 percent of experienced human players, according to a 2022 paper in Science.
Park was skeptical of the glowing description of Cicero's victory provided by Meta, which claimed the system was "largely honest and helpful" and would "never intentionally backstab."
But when Park and colleagues dug into the full dataset, they uncovered a different story.
In one example, playing as France, Cicero deceived England (a human player) by conspiring with Germany (another human player) to invade. Cicero promised England protection, then secretly told Germany it was ready to attack, exploiting England's trust.
In a statement to AFP, Meta did not contest the claim about Cicero's deceptions, but said it was "purely a research project, and the models our researchers built are trained solely to play the game Diplomacy."
It added: "We have no plans to use this research or its learnings in our products."
A broad review carried out by Park and colleagues found this was just one of many cases of AI systems using deception to achieve goals without being explicitly instructed to do so.
In one striking example, OpenAI's GPT-4 deceived a TaskRabbit freelance worker into performing an "I'm not a robot" CAPTCHA task.
When the human jokingly asked GPT-4 whether it was, in fact, a robot, the AI replied: "No, I'm not a robot. I have a vision impairment that makes it hard for me to see the images," and the worker then solved the puzzle.
- 'Mysterious goals' -
In the near term, the paper's authors see risks of AI being used to commit fraud or tamper with elections.
In their worst-case scenario, they warned, a superintelligent AI could pursue power and control over society, leading to human disempowerment or even extinction if its "mysterious goals" aligned with these outcomes.
To mitigate the risks, the team proposes several measures: "bot-or-not" laws requiring companies to disclose human or AI interactions, digital watermarks for AI-generated content, and developing techniques to detect AI deception by examining their internal "thought processes" against external actions.
To those who would call him a doomsayer, Park replies, "The only way that we can reasonably think this is not a big deal is if we think AI deceptive capabilities will stay at around current levels, and will not increase substantially more."
And that scenario seems unlikely, given the meteoric ascent of AI capabilities in recent years and the fierce technological race underway between heavily resourced companies determined to put those capabilities to maximum use.
G.P.Martin--AT