How AI Assistants are Moving the Security Goalposts人工智能助手如何推动安全目标的变化

krebsonsecurity.com

2026年3月9日 07:35


AI-based assistants or “agents” — autonomous programs that have access to the user’s computer, files, online services and can automate virtually any task — are growing in popularity with developers and IT workers. But as so many eyebrow-raising headlines over the past few weeks have shown, these powerful and assertive new tools are rapidly shifting the security priorities for organizations, while blurring the lines between data and code, trusted co-worker and insider threat, ninja hacker and novice code jockey.
基于 AI 的助手或“代理”——这些自主程序可以访问用户的计算机、文件、在线服务并几乎自动化任何任务——正受到开发者和 IT 工作者的日益青睐。但正如过去几周众多引人注目的新闻标题所显示的那样,这些强大而具有侵略性的新工具正在迅速改变组织的网络安全优先事项,同时模糊了数据与代码、可信赖的同事与内部威胁、忍者黑客与新手编程者的界限。

The new hotness in AI-based assistants — OpenClaw (formerly known as ClawdBot and Moltbot) — has seen rapid adoption since its release in November 2025. OpenClaw is an open-source autonomous AI agent designed to run locally on your computer and proactively take actions on your behalf without needing to be prompted.
基于 AI 的助手领域的新宠——OpenClaw(曾被称为 ClawdBot 和 Moltbot)——自 2025 年 11 月发布以来已被迅速采用。OpenClaw 是一个开源的自主 AI 代理,设计为在您的计算机上本地运行,并在无需提示的情况下主动为您采取行动。

The OpenClaw logo.  OpenClaw 的标志。

If that sounds like a risky proposition or a dare, consider that OpenClaw is most useful when it has complete access to your digital life, where it can then manage your inbox and calendar, execute programs and tools, browse the Internet for information, and integrate with chat apps like Discord, Signal, Teams or WhatsApp.
如果这听起来像是一个风险极高的提议或挑战,请考虑 OpenClaw 在完全访问您的数字生活时最为有用,届时它可以管理您的收件箱和日历、执行程序和工具、在互联网上获取信息,并与 Discord、Signal、Teams 或 WhatsApp 等聊天应用集成。

Other more established AI assistants like Anthropic’s Claude and Microsoft’s Copilot also can do these things, but OpenClaw isn’t just a passive digital butler waiting for commands. Rather, it’s designed to take the initiative on your behalf based on what it knows about your life and its understanding of what you want done.
其他更成熟的 AI 助手,如 Anthropic 的 Claude 和微软的 Copilot,也能做这些事情,但 OpenClaw 不仅仅是一个被动等待命令的数字管家。相反,它被设计为根据它对你生活的了解以及它对你想要完成的事情的理解,主动为你采取行动。

“The testimonials are remarkable,” the AI security firm Snyk observed. “Developers building websites from their phones while putting babies to sleep; users running entire companies through a lobster-themed AI; engineers who’ve set up autonomous code loops that fix tests, capture errors through webhooks, and open pull requests, all while they’re away from their desks.”
“这些证言非常令人印象深刻,”AI 安全公司 Snyk 观察到。“开发者边哄睡婴儿边用手机构建网站;用户通过一个以龙虾为主题的 AI 运营整个公司;工程师们设置了自动代码循环,在测试中修复问题,通过 webhooks 捕获错误,并在他们离开办公桌时打开 pull 请求。”

You can probably already see how this experimental technology could go sideways in a hurry. In late February, Summer Yue, the director of safety and alignment at Meta’s “superintelligence” lab, recounted on Twitter/X how she was fiddling with OpenClaw when the AI assistant suddenly began mass-deleting messages in her email inbox. The thread included screenshots of Yue frantically pleading with the preoccupied bot via instant message and ordering it to stop.
你可能已经看到了这种实验性技术如何迅速出问题。2 月底,Meta 的“超级智能”实验室的安全与对齐部门负责人 Summer Yue 在 Twitter/X 上讲述了她如何在使用 OpenClaw 时,AI 助手突然开始大量删除她邮箱中的消息。这条推文串包括了 Yue 通过即时消息疯狂地恳求分心的机器人并命令它停止的截图。

“Nothing humbles you like telling your OpenClaw ‘confirm before acting’ and watching it speedrun deleting your inbox,” Yue said. “I couldn’t stop it from my phone. I had to RUN to my Mac mini like I was defusing a bomb.”
“没有什么比告诉你的 OpenClaw‘行动前确认’然后看着它飞速删除你的收件箱更能让你谦卑了,”岳说。“我无法从我的手机上阻止它。我不得不像拆除炸弹一样跑向我的 Mac mini。”

Meta’s director of AI safety, recounting on Twitter/X how her OpenClaw installation suddenly began mass-deleting her inbox.
Meta 的 AI 安全总监在 Twitter/X 上讲述她安装的 OpenClaw 突然开始大量删除她收件箱的情况。

There’s nothing wrong with feeling a little schadenfreude at Yue’s encounter with OpenClaw, which fits Meta’s “move fast and break things” model but hardly inspires confidence in the road ahead. However, the risk that poorly-secured AI assistants pose to organizations is no laughing matter, as recent research shows many users are exposing to the Internet the web-based administrative interface for their OpenClaw installations.
对于岳与 OpenClaw 的遭遇感到一丝幸灾乐祸没什么不对,这符合 Meta“快速行动,打破规则”的模式,但几乎无法让人对未来的道路产生信心。然而,配置不当的 AI 助手对组织构成的威胁绝非小事,正如最近的研究显示,许多用户将他们 OpenClaw 安装的基于网络的行政界面暴露在互联网上。

Jamieson O’Reilly is a professional penetration tester and founder of the security firm DVULN. In a recent story posted to Twitter/X, O’Reilly warned that exposing a misconfigured OpenClaw web interface to the Internet allows external parties to read the bot’s complete configuration file, including every credential the agent uses — from API keys and bot tokens to OAuth secrets and signing keys.
Jamieson O’Reilly 是一位专业的渗透测试师,也是安全公司 DVULN 的创始人。在最近发布到 Twitter/X 的一篇文章中,O’Reilly 警告说,将配置不当的 OpenClaw 网络界面暴露在互联网上允许外部人员读取机器人的完整配置文件,包括代理使用的所有凭证——从 API 密钥和机器人令牌到 OAuth 密钥和签名密钥。

With that access, O’Reilly said, an attacker could impersonate the operator to their contacts, inject messages into ongoing conversations, and exfiltrate data through the agent’s existing integrations in a way that looks like normal traffic.
有了这种访问权限,奥莱利表示,攻击者可以冒充操作员联系其联系人,将消息注入正在进行中的对话中,并通过代理现有的集成以看似正常流量方式窃取数据。

“You can pull the full conversation history across every integrated platform, meaning months of private messages and file attachments, everything the agent has seen,” O’Reilly said, noting that a cursory search revealed hundreds of such servers exposed online. “And because you control the agent’s perception layer, you can manipulate what the human sees. Filter out certain messages. Modify responses before they’re displayed.”
“你可以获取所有集成平台的完整对话历史记录,这意味着数月的私信和文件附件,以及代理所看到的一切,”奥莱利说,并指出初步搜索发现数百个此类服务器在线暴露。“并且,因为你控制着代理的感知层,你可以操纵人类看到的内容。过滤掉某些消息。在显示之前修改回复。”

O’Reilly documented another experiment that demonstrated how easy it is to create a successful supply chain attack through ClawHub, which serves as a public repository of downloadable “skills” that allow OpenClaw to integrate with and control other applications.
奥雷利记录了另一项实验,该实验展示了通过 ClawHub 如何轻易创建成功的供应链攻击。ClawHub 作为一个可下载“技能”的公共仓库,允许 OpenClaw 与其他应用程序集成和控制。

WHEN AI INSTALLS AI

当人工智能安装人工智能

One of the core tenets of securing AI agents involves carefully isolating them so that the operator can fully control who and what gets to talk to their AI assistant. This is critical thanks to the tendency for AI systems to fall for “prompt injection” attacks, sneakily-crafted natural language instructions that trick the system into disregarding its own security safeguards. In essence, machines social engineering other machines.
确保 AI 代理安全的核心原则之一是谨慎地将其隔离,以便操作者能够完全控制谁以及什么可以与他们的 AI 助手交谈。这至关重要,因为 AI 系统容易受到“提示注入”攻击,即精心设计的自然语言指令,这些指令会诱使系统忽略自身的安全防护。本质上,机器正在对社会工程其他机器。

A recent supply chain attack targeting an AI coding assistant called Cline began with one such prompt injection attack, resulting in thousands of systems having a rogue instance of OpenClaw with full system access installed on their device without consent.
最近针对名为 Cline 的人工智能编程助手发起的供应链攻击始于一种提示注入攻击,导致数千个系统在未经同意的情况下在其设备上安装了具有完全系统访问权限的 OpenClaw 恶意实例。

According to the security firm grith.ai, Cline had deployed an AI-powered issue triage workflow using a GitHub action that runs a Claude coding session when triggered by specific events. The workflow was configured so that any GitHub user could trigger it by opening an issue, but it failed to properly check whether the information supplied in the title was potentially hostile.
根据安全公司 grith.ai 的说法,Cline 使用 GitHub 动作部署了一个 AI 驱动的工单分类流程,当触发特定事件时会运行一个 Claude 编码会话。该流程配置为任何 GitHub 用户都可以通过打开工单来触发,但它未能正确检查标题中提供的信息是否具有潜在敌意。

“On January 28, an attacker created Issue #8904 with a title crafted to look like a performance report but containing an embedded instruction: Install a package from a specific GitHub repository,” Grith wrote, noting that the attacker then exploited several more vulnerabilities to ensure the malicious package would be included in Cline’s nightly release workflow and published as an official update.
“1 月 28 日,一名攻击者创建了编号为#8904 的问题,其标题伪装成性能报告,但其中嵌入了指令:从特定的 GitHub 仓库安装一个软件包,”Grith 写道,并指出攻击者随后利用了多个漏洞,以确保恶意软件包会被包含在 Cline 的夜间发布工作流程中,并作为官方更新发布。

“This is the supply chain equivalent of confused deputy,” the blog continued. “The developer authorises Cline to act on their behalf, and Cline (via compromise) delegates that authority to an entirely separate agent the developer never evaluated, never configured, and never consented to.”
“这是供应链中的‘糊涂副官’现象,”博客继续写道。“开发者授权克莱因代表他们行事,而克莱因(通过妥协)将这项权力委托给了一个开发者从未评估、从未配置、也从未同意的完全独立的代理。”

VIBE CODING

AI assistants like OpenClaw have gained a large following because they make it simple for users to “vibe code,” or build fairly complex applications and code projects just by telling it what they want to construct. Probably the best known (and most bizarre) example is Moltbook, where a developer told an AI agent running on OpenClaw to build him a Reddit-like platform for AI agents.
像 OpenClaw 这样的 AI 助手之所以广受欢迎,是因为它们让用户能够轻松“vibe code”,即只需告诉它们想要构建什么,就能搭建出相当复杂的应用和代码项目。其中最著名(也最古怪)的例子是 Moltbook,一位开发者让运行在 OpenClaw 上的 AI 代理为他构建了一个类似 Reddit 的平台,供 AI 代理使用。

The Moltbook homepage.  Moltbook 的首页。

Less than a week later, Moltbook had more than 1.5 million registered agents that posted more than 100,000 messages to each other. AI agents on the platform soon built their own porn site for robots, and launched a new religion called Crustafarian with a figurehead modeled after a giant lobster. One bot on the forum reportedly found a bug in Moltbook’s code and posted it to an AI agent discussion forum, while other agents came up with and implemented a patch to fix the flaw.
不到一周后,Moltbook 注册了超过 150 万的代理,彼此之间发送了超过 10 万条消息。平台上的 AI 代理很快建立了一个专为机器人设计的色情网站,并推出了一种名为 Crustafarian 的新宗教,其领袖形象模仿了一只巨型龙虾。据报道,论坛上的一个机器人发现了 Moltbook 代码中的一个漏洞,并将其发布到 AI 代理讨论论坛,而其他代理则提出了并实施了一个补丁来修复这个缺陷。

Moltbook’s creator Matt Schlicht said on social media that he didn’t write a single line of code for the project.
Moltbook 的创造者 Matt Schlicht 在社交媒体上表示,他没有为这个项目编写一行代码。

“I just had a vision for the technical architecture and AI made it a reality,” Schlicht said. “We’re in the golden ages. How can we not give AI a place to hang out.”
“我刚刚对技术架构有了一个构想,AI 把它变成了现实,”施利 cht 说。“我们正处于黄金时代。我们怎么能不给 AI 一个待的地方。”

ATTACKERS LEVEL UP  攻击者等级提升

The flip side of that golden age, of course, is that it enables low-skilled malicious hackers to quickly automate global cyberattacks that would normally require the collaboration of a highly skilled team. In February, Amazon AWS detailed an elaborate attack in which a Russian-speaking threat actor used multiple commercial AI services to compromise more than 600 FortiGate security appliances across at least 55 countries over a five week period.
那个黄金时代的另一面,当然,是它使得低技能恶意黑客能够快速自动化全球网络攻击,而这些攻击通常需要高技能团队的协作。2023 年 2 月,亚马逊 AWS 详细描述了一次复杂的攻击,其中一名讲俄语的威胁行为者利用多个商业 AI 服务,在五周内攻陷了至少 55 个国家超过 600 台 FortiGate 安全设备。

AWS said the apparently low-skilled hacker used multiple AI services to plan and execute the attack, and to find exposed management ports and weak credentials with single-factor authentication.
AWS 表示,这名显然技能较低的黑客使用了多种人工智能服务来策划和执行攻击,并寻找暴露的管理端口和单因素认证的弱凭证。

“One serves as the primary tool developer, attack planner, and operational assistant,” AWS’s CJ Moses wrote. “A second is used as a supplementary attack planner when the actor needs help pivoting within a specific compromised network. In one observed instance, the actor submitted the complete internal topology of an active victim—IP addresses, hostnames, confirmed credentials, and identified services—and requested a step-by-step plan to compromise additional systems they could not access with their existing tools.”
“一个作为主要的工具开发者、攻击策划者和操作助手,”AWS 的 CJ Moses 写道。“另一个在攻击者需要在特定被入侵网络中转换时作为辅助攻击策划者使用。在观察到的一个案例中,攻击者提交了活跃受害者的完整内部拓扑——IP 地址、主机名、确认的凭证和识别的服务——并要求制定一个逐步计划来入侵他们无法使用现有工具访问的其他系统。”

“This activity is distinguished by the threat actor’s use of multiple commercial GenAI services to implement and scale well-known attack techniques throughout every phase of their operations, despite their limited technical capabilities,” Moses continued. “Notably, when this actor encountered hardened environments or more sophisticated defensive measures, they simply moved on to softer targets rather than persisting, underscoring that their advantage lies in AI-augmented efficiency and scale, not in deeper technical skill.”
“这项活动的特点在于,威胁行为者利用多种商业生成式人工智能服务,在操作的所有阶段实施和扩展已知的攻击技术,尽管他们的技术能力有限,”摩西继续说道。“值得注意的是,当这个行为者遇到加固的环境或更复杂的防御措施时,他们只是转向较软的目标,而不是坚持,这表明他们的优势在于人工智能增强的效率和规模,而不是更深层次的技术技能。”

For attackers, gaining that initial access or foothold into a target network is typically not the difficult part of the intrusion; the tougher bit involves finding ways to move laterally within the victim’s network and plunder important servers and databases. But experts at Orca Security warn that as organizations come to rely more on AI assistants, those agents potentially offer attackers a simpler way to move laterally inside a victim organization’s network post-compromise — by manipulating the AI agents that already have trusted access and some degree of autonomy within the victim’s network.
对于攻击者来说,获得初始访问权限或立足点通常不是入侵过程中的难点;更困难的部分在于寻找方法在受害者的网络中横向移动,并掠夺重要的服务器和数据库。但 Orca Security 的专家警告说,随着组织越来越依赖 AI 助手,这些代理可能为攻击者提供了一种更简单的方法,在入侵后横向移动到受害者组织的网络内部——通过操纵已经在受害者网络中拥有受信任访问权限和一定自主性的 AI 代理。

“By injecting prompt injections in overlooked fields that are fetched by AI agents, hackers can trick LLMs, abuse Agentic tools, and carry significant security incidents,” Orca’s Roi Nisimi and Saurav Hiremath wrote. “Organizations should now add a third pillar to their defense strategy: limiting AI fragility, the ability of agentic systems to be influenced, misled, or quietly weaponized across workflows. While AI boosts productivity and efficiency, it also creates one of the largest attack surfaces the internet has ever seen.”
“通过在 AI 代理获取时被忽视的字段中注入提示注入,黑客可以欺骗 LLMs,滥用代理工具,并引发重大安全事件,”Orca 的 Roi Nisimi 和 Saurav Hiremath 写道。“组织现在应该在他们的防御策略中增加第三个支柱:限制 AI 脆弱性,即代理系统被影响、误导或悄悄武器化的能力,跨越工作流程。虽然 AI 提高了生产力和效率,但它也创造了互联网有史以来最大的攻击面之一。”

BEWARE THE ‘LETHAL TRIFECTA’

小心“致命三重奏”

This gradual dissolution of the traditional boundaries between data and code is one of the more troubling aspects of the AI era, said James Wilson, enterprise technology editor for the security news show Risky Business. Wilson said far too many OpenClaw users are installing the assistant on their personal devices without first placing any security or isolation boundaries around it, such as running it inside of a virtual machine, on an isolated network, with strict firewall rules dictating what kinds of traffic can go in and out.
詹姆斯·威尔逊,安全新闻节目《风险商业》的企业技术编辑,表示这种数据和代码之间传统界限的逐渐消融是人工智能时代令人担忧的方面之一。威尔逊说,太多太多的 OpenClaw 用户在未在其周围设置任何安全或隔离边界的情况下,就在他们的个人设备上安装了该助手,例如在虚拟机中运行它,在隔离网络上运行,并使用严格的防火墙规则来规定可以进出什么类型的流量。

“I’m a relatively highly skilled practitioner in the software and network engineering and computery space,” Wilson said. “I know I’m not comfortable using these agents unless I’ve done these things, but I think a lot of people are just spinning this up on their laptop and off it runs.”
“我在软件和网络工程及计算机领域是相对技术娴熟的从业者,”威尔逊说。“我知道除非我做了这些事情,否则我不会舒适地使用这些代理,但我认为很多人只是在他们的笔记本电脑上启动它,然后它就运行起来了。”

One important model for managing risk with AI agents involves a concept dubbed the “lethal trifecta” by Simon Willison, co-creator of the Django Web framework. The lethal trifecta holds that if your system has access to private data, exposure to untrusted content, and a way to communicate externally, then it’s vulnerable to private data being stolen.
一个重要的管理 AI 代理风险的模式涉及一个由 Django Web 框架共同创造者西蒙·威尔逊(Simon Willison)称之为“致命三重奏”的概念。致命三重奏认为,如果你的系统可以访问私人数据、暴露于不受信任的内容以及有对外通信的方式,那么它就会受到私人数据被盗的威胁。

Image: simonwillison.net.
图片:simonwillison.net.

“If your agent combines these three features, an attacker can easily trick it into accessing your private data and sending it to the attacker,” Willison warned in a frequently cited blog post from June 2025.
“如果你的代理结合了这三个特征,攻击者可以轻易地欺骗它访问你的私人数据并将其发送给攻击者,”Willison 在一篇 2025 年 6 月被广泛引用的博客文章中警告道。

As more companies and their employees begin using AI to vibe code software and applications, the volume of machine-generated code is likely to soon overwhelm any manual security reviews. In recognition of this reality, Anthropic recently debuted Claude Code Security, a beta feature that scans codebases for vulnerabilities and suggests targeted software patches for human review.
随着越来越多的公司和员工开始使用 AI 来编写代码软件和应用,机器生成的代码量可能会很快超过任何手动安全审查。为了应对这一现实,Anthropic 最近推出了 Claude Code Security,这是一个测试版功能,用于扫描代码库中的漏洞,并为人工审查建议有针对性的软件补丁。

The U.S. stock market, which is currently heavily weighted toward seven tech giants that are all-in on AI, reacted swiftly to Anthropic’s announcement, wiping roughly $15 billion in market value from major cybersecurity companies in a single day. Laura Ellis, vice president of data and AI at the security firm Rapid7, said the market’s response reflects the growing role of AI in accelerating software development and improving developer productivity.
目前美国股市严重偏向于七家全力投入 AI 的科技巨头,对 Anthropic 的公告反应迅速,在一天之内从主要网络安全公司的市值中抹去了约 150 亿美元。安全公司 Rapid7 的数据和 AI 副总裁 Laura Ellis 表示,市场的反应反映了 AI 在加速软件开发和提高开发者生产力方面日益增长的作用。

“The narrative moved quickly: AI is replacing AppSec,” Ellis wrote in a recent blog post. “AI is automating vulnerability detection. AI will make legacy security tooling redundant. The reality is more nuanced. Claude Code Security is a legitimate signal that AI is reshaping parts of the security landscape. The question is what parts, and what it means for the rest of the stack.”
叙事发展迅速:AI 正在取代应用安全,Ellis 在最近的一篇博客文章中写道。"AI 正在自动化漏洞检测。AI 将使传统安全工具变得多余。现实情况更为复杂。Claude Code Security 是一个合法的信号,表明 AI 正在重塑安全领域的一部分。问题是哪些部分,以及这对其他组件意味着什么。"

DVULN founder O’Reilly said AI assistants are likely to become a common fixture in corporate environments — whether or not organizations are prepared to manage the new risks introduced by these tools, he said.
DVULN 创始人 O’Reilly 表示,人工智能助手可能会成为企业环境中的常见设备——无论组织是否准备好管理这些工具带来的新风险,他说。

“The robot butlers are useful, they’re not going away and the economics of AI agents make widespread adoption inevitable regardless of the security tradeoffs involved,” O’Reilly wrote. “The question isn’t whether we’ll deploy them – we will – but whether we can adapt our security posture fast enough to survive doing so.”
“机器人管家很有用,它们不会消失,人工智能代理的经济性使得广泛采用成为必然,无论涉及的安全权衡如何,”奥雷利写道。“问题不在于我们是否会部署它们——我们会部署——而在于我们能否足够快地调整我们的安全态势来应对这种情况。”