News

For instance, when reporting suspicious prompts, Claude 4 often cited the need to prevent ... processes can help build trust and mitigate concerns about misuse or overreach. The study also ...
New research from Anthropic suggests that most leading AI models exhibit a tendency to blackmail when it's the last resort ...
The CEO of Windsurf, a popular AI-assisted coding tool, said Anthropic is limiting its direct access to certain AI models.
A separate report from Time also highlights the stricter safety protocol for Claude 4 Opus. Anthropic found that without extra protections, the AI might help create bioweapons and dangerous viruses.
the safety report stated. Early models of Claude Opus 4 would try to blackmail, strongarm, or lie to their human bosses if they believed their safety was threatened, Anthropic reported.
Anthropic, the AI firm, has unveiled two new artificial intelligence models—Claude Opus 4 and Claude Sonnet 4—touting them as the most advanced systems in the industry. Built with enhanced ...
An artificial intelligence model has the ability to blackmail developers — and isn’t afraid to use it. Anthropic’s new Claude Opus 4 model was prompted to act as an assistant at a fictional ...
according to Anthropic’s most recent system report. Anthropic's newest AI model, Claude Opus 4, was run through fictional scenarios probing everything from its carbon footprint and training to its ...