As AI Agents Gain Power, Safety Details Lag Behind

In 2026, AI agents are moving from labs to laptops. Systems like OpenClaw and Moltbook have captured public attention, while industry leaders like OpenAI push their agent features forward. The promise is tangible: software that doesn't just answer questions, but plans, codes, and acts across your digital workspace with minimal oversight.
Yet the MIT AI Agent Index, a new study examining 67 deployed systems, reveals a concerning disparity: developers enthusiastically showcase what their agents can do, but say remarkably little about how they ensure those agents are safe.
The data is stark. Approximately 70% of the agents offer some documentation, and nearly half publish their code. Yet only 19% disclose a formal safety policy, and fewer than one in ten report results from external safety evaluations. The result is what the researchers describe as "lopsided transparency."
The core issue is autonomy. Unlike a chatbot, whose errors stop at a bad reply, an agent can access files, send messages, or alter data. A mistake can ripple through multiple steps, causing real damage. Despite this, most developers do not publicly explain how they test for such failures.
The pattern is clear: capability is broadcast, while guardrails are obscured. Demos and benchmarks are shared freely; details on safety procedures or risk audits are not. As these systems graduate from experiments to integral parts of software engineering and other sensitive workflows, this information gap becomes more significant.
The research doesn't conclude that agentic AI is inherently unsafe. It does, however, signal that the technology's accelerating autonomy is outpacing the structured, public accounting of its risks. The power is on full display. The protocols for containing it are not.
Original source: Read on CNET