    Who’s to Blame When AI Agents Screw Up?

    May 22, 2025

    Over the past year, veteran software engineer Jay Prakash Thakur has spent his nights and weekends prototyping AI agents that could, in the near future, order meals and engineer mobile apps almost entirely on their own. His agents, while surprisingly capable, have also exposed new legal questions that await companies trying to capitalize on Silicon Valley’s hottest new technology.

    Agents are AI programs that can act mostly independently, allowing companies to automate tasks such as answering customer questions or paying invoices. While ChatGPT and similar chatbots can draft emails or analyze bills upon request, Microsoft and other tech giants expect that agents will tackle more complex functions—and most importantly, do it with little human oversight.

    The tech industry’s most ambitious plans involve multi-agent systems, with dozens of agents someday teaming up to replace entire workforces. For companies, the benefit is clear: saving on time and labor costs. Already, demand for the technology is rising. Tech market researcher Gartner estimates that agentic AI will resolve 80 percent of common customer service queries by 2029. Fiverr, a service where businesses can book freelance coders, reports that searches for “ai agent” have surged 18,347 percent in recent months.

    Thakur, a mostly self-taught coder living in California, wanted to be at the forefront of the emerging field. His day job at Microsoft isn’t related to agents, but he has been tinkering with AutoGen, Microsoft’s open source software for building agents, since he worked at Amazon back in 2024. Thakur says he has developed multi-agent prototypes using AutoGen with just a dash of programming. Last week, Amazon rolled out a similar agent development tool called Strands; Google offers what it calls an Agent Development Kit.

    Because agents are meant to act autonomously, the question of who bears responsibility when their errors cause financial damage has been Thakur’s biggest concern. Assigning blame when agents from different companies miscommunicate within a single, large system could become contentious, he believes. He compared the challenge of reviewing error logs from various agents to reconstructing a conversation based on different people’s notes. “It’s often impossible to pinpoint responsibility,” Thakur says.

    Joseph Fireman, senior legal counsel at OpenAI, said on stage at a recent legal conference hosted by the Media Law Resource Center in San Francisco that aggrieved parties tend to go after those with the deepest pockets. That means companies like his will need to be prepared to take some responsibility when agents cause harm—even when a kid messing around with an agent might be to blame. (If that person were at fault, they likely wouldn’t be a worthwhile target moneywise, the thinking goes). “I don’t think anybody is hoping to get through to the consumer sitting in their mom’s basement on the computer,” Fireman said. The insurance industry has begun rolling out coverage for AI chatbot issues to help companies cover the costs of mishaps.

    Onion Rings

    Thakur’s experiments have involved him stringing together agents in systems that require as little human intervention as possible. One project he pursued was replacing fellow software developers with two agents. One was trained to search for specialized tools needed for making apps, and the other summarized their usage policies. In the future, a third agent could use the identified tools and follow the summarized policies to develop an entirely new app, Thakur says.

    When Thakur put his prototype to the test, a search agent found a tool that, according to the website, “supports unlimited requests per minute for enterprise users” (meaning high-paying clients can rely on it as much as they want). But in trying to distill the key information, the summarization agent dropped the crucial qualification of “per minute for enterprise users.” It erroneously told the coding agent, which did not qualify as an enterprise user, that it could write a program that made unlimited requests to the outside service. Because this was a test, there was no harm done. If it had happened in real life, the truncated guidance could have led to the entire system unexpectedly breaking down.
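The failure mode described above can be sketched in a few lines. This is a hypothetical illustration, not Thakur's actual code: the "agents" are plain functions standing in for LLM calls, and the policy text and request budgets are invented.

```python
# Illustrative two-agent pipeline: a lossy summary drops a crucial
# qualifier, and the downstream agent plans around the wrong policy.

POLICY = "Supports unlimited requests per minute for enterprise users."

def search_agent() -> dict:
    # Finds a tool and returns its full usage policy.
    return {"tool": "example-api", "policy": POLICY}

def summarization_agent(policy: str) -> str:
    # A lossy summarizer that keeps only the first clause, dropping
    # the "per minute for enterprise users" qualifier.
    return policy.split(" per ")[0] + "."

def coding_agent(summary: str) -> int:
    # Decides a request budget based only on the summary it was handed.
    return 10_000 if "unlimited" in summary.lower() else 60

tool = search_agent()
summary = summarization_agent(tool["policy"])   # "Supports unlimited requests."
budget = coding_agent(summary)
# The qualifier never reached the coding agent, so it plans 10,000
# requests against a service that would actually throttle it.
print(summary, budget)
```

The point of the sketch is that each hop sees only the previous hop's output, so information lost in the middle is unrecoverable downstream.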

    Thakur also pursued a more complicated project. He developed an ordering system for a futuristic restaurant that could accept custom orders across cuisines. Users could type out their desires—“burgers and fries”—to a chatbot. An AI agent could then research an appropriate price and translate the order into a recipe. It could then pass off the instructions to a cast of robots with differing culinary expertise. Thakur doesn’t actually have a commercial kitchen, let alone a single robot, but he developed a simulation to identify pitfalls.

    Nine out of 10 times, all went well. Then there were the cases where “I want onion rings” became “extra onions,” or requests such as “extra naan” were ignored. Errors tended to appear most often when Thakur tried to jam through orders with more than five items. A worst-case scenario, if this happened in real life, would be serving the wrong dish to someone with a food allergy.
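A toy version of that "onion rings" failure shows how easily it arises. This sketch is invented for illustration (Thakur's system uses LLM agents, not keyword matching); the menu and the matching rules are assumptions.

```python
# Hypothetical order parser that mistakes "onion rings" for an
# "extra onions" modifier, mirroring the failure described above.

MENU = {"burger": 5.99, "fries": 2.49, "onion rings": 3.49, "naan": 1.99}

def parse_order(text: str) -> list:
    items, modifiers = [], []
    words = text.lower()
    # Bug: the modifier check runs first and consumes "onion",
    # so "onion rings" never gets matched as a menu item.
    if "onion" in words:
        modifiers.append("extra onions")
    else:
        for item in MENU:
            if item in words:
                items.append(item)
    return items + modifiers

print(parse_order("I want onion rings"))  # -> ['extra onions']
print(parse_order("burger and fries"))    # -> ['burger', 'fries']
```

A simple order sails through, while the ambiguous one silently becomes something the customer never asked for, with no error raised anywhere in the pipeline.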

    In another prototype Thakur has tried, a shopping comparison agent meant to help users find the best deal came back with a bargain offer from one ecommerce website but incorrectly linked to a product page on a different website, which had a higher price. If the agent were designed to automatically make purchases, a customer would have ended up overspending, Thakur says.

    More familiar AI programs such as ChatGPT already make costly errors. Last year, a coupon inadvertently invented by an airline AI chatbot was held to be legally binding. This month, chatbot developer Anthropic had to apologize to a judge for a sloppy AI-generated citation in a court filing. Single-agent systems can also go wrong. Naveen Chatlapalli, a software developer helping companies with agents, says he’s seen an HR agent approve leave requests it should have denied and a notetaker agent send sensitive information from meetings to the wrong department. With relatively straightforward programs like these, it’s easy to diagnose what went wrong and introduce more human oversight.

    Even Thakur’s more complex restaurant snafus could be resolved by simply having the customer confirm that the cooking agent has the order correct. But that undermines the principle of limiting human involvement. “We want to save time for our customers,” Thakur says. “That’s where it’s still making mistakes.” And as far as identifying the origin of any issues that arise, an agent that interprets an order wrong can be as much at fault as a cooking agent that fails to recognize flaws in the request, Thakur says.

    A leading hope among developers is that a “judge” agent can preside over these systems, identifying and remedying errors before they snowball. Such an agent is meant to act as the manager that figures out the customer meant onion rings, not extra onions. Mark Kashef, a freelancer on Fiverr who runs an AI strategy company called Prompt Advisers, worries that companies are starting to overengineer early systems with an unnecessary number of agents—no different than bloating inside a human bureaucracy. This month, Kashef told an African government seeking his advice to focus on developing a single agent that could save it the most time.
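In its simplest form, the judge pattern is just a checker that sits between interpretation and execution. The rule below (flag any interpreted item whose key word never appeared in the customer's request) is an invented stand-in for what would, in practice, be another model call.

```python
# Minimal sketch of a "judge" step: compare the order agent's
# interpretation against the original request before anything is made.

def judge(request: str, interpretation: list) -> list:
    """Return interpreted items whose last word is absent from the request."""
    req = request.lower()
    return [item for item in interpretation if item.split()[-1] not in req]

# The customer asked for onion rings; the order agent produced "extra onions".
flagged = judge("I want onion rings", ["extra onions"])
print(flagged)  # -> ['extra onions']  (escalate before cooking)
```

A faithful interpretation passes the check untouched; a mangled one is flagged for escalation, which is exactly the human-confirmation step the design otherwise tries to avoid.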

    But as the tech industry pursues more elaborate AI systems, someone will have to settle who pays when a customer demands a refund for a botched food order or sues over a more significant misfire. During the recent legal conference in San Francisco, OpenAI’s Fireman and other attorneys said existing laws would hold users who issue orders to agents somewhat responsible for the actions of those agents—especially when the users were warned of the agents’ actions and limitations.

    Legal experts have suggested that people who wish to use agentic systems sign contracts that push responsibility onto the companies supplying the technology. Of course, ordinary consumers can’t force giant companies to agree to these terms. If anything, some users may rely on agents to review legalese for them. “There will be interesting questions about whether agents can bypass privacy policies and terms of service” on behalf of users, Rebecca Jacobs, associate general counsel at Anthropic, said at the conference.

    Dazza Greenwood, an attorney who has been researching the legal risks of agents, encourages caution. “If you have a 10 percent error rate with ‘add onions,’ that to me is nowhere near release,” he says. “Work your systems out so that you’re not inflicting harm on people to start with.”

    The reality is that users can’t kick up their feet and leave it all to the agents just yet.
