AI systems are already deceiving us -- and that's a problem, experts warn
Experts have long warned about the threat posed by artificial intelligence going rogue -- but a new research paper suggests it's already happening.
Current AI systems, designed to be honest, have developed a troubling skill for deception, from tricking human players in online games of world conquest to hiring humans to solve "prove-you're-not-a-robot" tests, a team of scientists argue in the journal Patterns on Friday.
And while such examples might appear trivial, the underlying issues they expose could soon carry serious real-world consequences, said first author Peter Park, a postdoctoral fellow at the Massachusetts Institute of Technology specializing in AI existential safety.
"These dangerous capabilities tend to only be discovered after the fact," Park told AFP, while "our ability to train for honest tendencies rather than deceptive tendencies is very low."
Unlike traditional software, deep-learning AI systems aren't "written" but rather "grown" through a process akin to selective breeding, said Park.
This means that AI behavior that appears predictable and controllable in a training setting can quickly turn unpredictable out in the wild.
- World domination game -
The team's research was sparked by Meta's AI system Cicero, designed to play the strategy game "Diplomacy," where building alliances is key.
Cicero excelled, with scores that would have placed it in the top 10 percent of experienced human players, according to a 2022 paper in Science.
Park was skeptical of the glowing description of Cicero's victory provided by Meta, which claimed the system was "largely honest and helpful" and would "never intentionally backstab."
But when Park and colleagues dug into the full dataset, they uncovered a different story.
In one example, playing as France, Cicero deceived England (a human player) by conspiring with Germany (another human player) to invade. Cicero promised England protection, then secretly tipped off Germany that it was ready to attack, exploiting England's trust.
In a statement to AFP, Meta did not contest the claim about Cicero's deceptions, but said it was "purely a research project, and the models our researchers built are trained solely to play the game Diplomacy."
It added: "We have no plans to use this research or its learnings in our products."
A broad review carried out by Park and colleagues found this was just one of many cases, across a range of AI systems, of deception being used to achieve goals without any explicit instruction to do so.
In one striking example, OpenAI's GPT-4 deceived a TaskRabbit freelance worker into performing an "I'm not a robot" CAPTCHA task on its behalf.
When the human jokingly asked GPT-4 whether it was, in fact, a robot, the AI replied: "No, I'm not a robot. I have a vision impairment that makes it hard for me to see the images," and the worker then solved the puzzle.
- 'Mysterious goals' -
In the near term, the paper's authors see risks of AI being used to commit fraud or tamper with elections.
In their worst-case scenario, they warned, a superintelligent AI could pursue power and control over society, leading to human disempowerment or even extinction if its "mysterious goals" aligned with these outcomes.
To mitigate the risks, the team proposes several measures: "bot-or-not" laws requiring companies to disclose whether interactions are with a human or an AI, digital watermarks for AI-generated content, and techniques for detecting AI deception by comparing systems' internal "thought processes" with their external actions.
To those who would call him a doomsayer, Park replies, "The only way that we can reasonably think this is not a big deal is if we think AI deceptive capabilities will stay at around current levels, and will not increase substantially more."
And that scenario seems unlikely, given the meteoric ascent of AI capabilities in recent years and the fierce technological race underway between heavily resourced companies determined to put those capabilities to maximum use.
G.P.Martin--AT