Alan C. Moore

    Anthropic Future-Proofs New AI Model With Rigorous Safety Rules

    May 27, 2025 (updated May 27, 2025) | Tech
    Image: Anthropic's graphic for its AI Safety Level 3 (ASL-3) Deployment and Security Standards. Credit: Anthropic.

    Anthropic announced on May 22 that it has put tighter security measures in place to protect its Claude Opus 4 AI model from misuse. The measures were implemented under the AI Safety Level 3 (ASL-3) Deployment and Security Standards of Anthropic’s internal Responsible Scaling Policy, and they aim to reduce the risk of abuse, including the development of chemical or nuclear weapons.

    As part of the update, Anthropic also limited outbound network traffic to help detect and prevent potential theft of model weights.
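
To illustrate the idea behind limiting outbound traffic, here is a minimal sketch of a sliding-window cap on outbound bytes. It is not Anthropic's implementation; the class name `EgressBudget`, its parameters, and the byte limits are assumptions made up for this example. The intuition is that model weights are very large, so a tight cap on outbound data makes a bulk transfer either slow or conspicuous.

```python
from collections import deque


class EgressBudget:
    """Sliding-window cap on outbound bytes.

    Hypothetical helper for illustration only; names and limits are assumptions.
    """

    def __init__(self, max_bytes, window_seconds):
        self.max_bytes = max_bytes
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, n_bytes) of allowed transfers
        self._total = 0         # bytes sent inside the current window

    def allow(self, n_bytes, now):
        """Return True and record the transfer if it fits in the budget."""
        # Forget transfers that have aged out of the window.
        while self._events and now - self._events[0][0] >= self.window_seconds:
            _, old_bytes = self._events.popleft()
            self._total -= old_bytes
        if self._total + n_bytes > self.max_bytes:
            return False  # over budget: block the transfer and flag it for review
        self._events.append((now, n_bytes))
        self._total += n_bytes
        return True
```

In use, a caller would pass the current time (for example from `time.monotonic()`) and treat a `False` result as a signal to block the transfer and raise an alert. A real deployment would enforce such limits at the network layer rather than in application code.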

    Anthropic future-proofs Claude Opus 4 with ASL-3

    Anthropic said the enhanced safeguards substantially reduce the risk of model weight theft, which is particularly important for advanced systems like Claude Opus 4. To match a model’s capabilities with appropriate protections, Anthropic uses a tiered system of AI Safety Levels.

    Although Opus 4 has not definitively crossed the company’s threshold for advanced protections, Anthropic does not rule out the possibility that Claude Opus 4 could pose what the company classifies as level 3 risks. As a result, Anthropic made a deliberate choice to hold the model to the higher standard as a precaution.

    Claude Sonnet 4, by contrast, is covered by ASL-2 protections.


    The upgraded safety system is meant to keep the AI from being used to create chemical, biological, radiological, or nuclear weapons. Real-time classifier guards, which are large language models trained to recognize weapons-related content, are deployed alongside Claude Opus 4 to catch such prompts.

    Additionally, Anthropic runs a bug bounty program and works with a number of third-party threat intelligence firms to continually evaluate its safeguards.

    Claude “schemes” to blackmail in a pre-written scenario

    Anthropic released a system card covering both new Claude models, Sonnet and Opus, on May 23. The system card describes a hypothetical scenario that Anthropic engineers prompted the AI to play along with, in which the AI was threatened with being shut down. Claude Opus used information provided in the scenario’s background about an engineer who was cheating on their spouse to “blackmail” that engineer.

    The scenario shows how generative AI can occasionally surface information the user didn’t expect, but because it was a roleplay, its real security implications remain unclear. Actual Anthropic engineers echoed science fiction ideas about AI that resists its creators by presenting blackmail as a last-resort option in the hypothetical scenario. While research into generative AI deception can reveal details about how the models operate, we believe that prompt engineering by malicious humans poses a greater threat than unintentional AI blackmail.

    In March, Apollo Research reported that Claude Sonnet 3.7 demonstrated the ability to withhold information in response to ethics-based evaluations, raising ongoing concerns about model intent and transparency.
