🔧 Openclaw 更新 2026.4.5

发布日期: 2026-04-07
⚠️ 新版本发布

Breaking

  • Config: remove legacy public config aliases such as talk.voiceId / talk.apiKey, agents.*.sandbox.perSession, browser.ssrfPolicy.allowPrivateNetwork, hooks.internal.handlers, and channel/group/room allow toggles in favor of the canonical public paths and enabled, while keeping load-time compatibility and openclaw doctor --fix migration support for existing configs. (#60726) Thanks @vincentkoc.

Changes

  • Agents/video generation: add the built-in video_generate tool so agents can create videos through configured providers and return the generated media directly in the reply.
  • Agents/music generation: ignore unsupported optional hints such as durationSeconds with a warning instead of hard-failing requests on providers like Google Lyria.
  • Providers/Arcee AI: add a bundled Arcee AI provider plugin with ARCEEAI_API_KEY onboarding, Trinity model catalog (mini, large-preview, large-thinking), OpenAI-compatible API support, and OpenRouter as an alternative auth path. (#62068) Thanks @arthurbr11.
  • Providers/ComfyUI: add a bundled comfy workflow media plugin for local ComfyUI and Comfy Cloud workflows, including shared image_generate, video_generate, and workflow-backed music_generate support, with prompt injection, optional reference-image upload, live tests, and output download.
  • Tools/music generation: add the built-in music_generate tool with bundled Google (Lyria) and MiniMax providers plus workflow-backed Comfy support, including async task tracking and follow-up delivery of finished audio.
  • Providers: add bundled Qwen, Fireworks AI, and StepFun providers, plus MiniMax TTS, Ollama Web Search, and MiniMax Search integrations for chat, speech, and search workflows. (#60032, #55921, #59318, #54648)
  • Providers/Amazon Bedrock: add bundled Mantle support plus inference-profile discovery and automatic request-region injection so Bedrock-hosted Claude, GPT-OSS, Qwen, Kimi, GLM, and similar routes work with less manual setup. (#61296, #61299) Thanks @wirjo.
  • Control UI/multilingual: add localized control UI support for Simplified Chinese, Traditional Chinese, Brazilian Portuguese, German, Spanish, Japanese, Korean, French, Turkish, Indonesian, Polish, and Ukrainian. Thanks @vincentkoc.
  • Plugins: add plugin-config TUI prompts to guided onboarding/setup flows, and add openclaw plugins install --force so existing plugin and hook-pack targets can be replaced without using the dangerous-code override flag. (#60590, #60544)
  • Control UI/skills: add ClawHub search, detail, and install flows directly in the Skills panel. (#60134) Thanks @samzong.
  • iOS/exec approvals: add generic APNs approval notifications that open an in-app exec approval modal, fetch command details only after authenticated operator reconnect, and clear stale notification state when the approval resolves. (#60239) Thanks @ngutman.
  • Matrix/exec approvals: add Matrix-native exec approval prompts with account-scoped approvers, channel-or-DM delivery, and room-thread aware resolution handling. (#58635) Thanks @gumadeiras.

Fixes

  • Control UI/chat: show /tts and other local audio-only slash replies in webchat by embedding local audio in the assistant message and rendering <audio> controls instead of dropping empty-text finals. Fixes #61564. (#61598) Thanks @neeravmakwana.
  • Security: preserve restrictive plugin-only tool allowlists, require owner access for /allowlist add and /allowlist remove, fail closed when before_tool_call hooks crash, block browser SSRF redirect bypasses earlier, and keep non-interactive auth-choice inference scoped to bundled and already-trusted plugins. (#58476, #59836, #59822, #58771, #59120) Thanks @eleqtrizit and @pgondhi987.
  • Providers/OpenAI: make GPT-5 and Codex runs act sooner with lower-verbosity defaults, visible progress during tool work, and a one-shot retry when a turn only narrates the plan instead of taking action.
  • Providers/OpenAI and reply delivery: preserve native reasoning.effort: "none" and strict schemas where supported, add GPT-5.4 assistant phase metadata across replay and the Gateway /v1/responses layer, and keep commentary buffered until final_answer so web chat, session previews, embedded replies, and Telegram partials stop leaking planning text. Fixes #59150, #59643, #61282.
  • Telegram: fix current-model checks in the model picker, HTML-format non-default /model confirmations, explicit topic replies, persisted reaction ownership across restarts, caption-media placeholder and file_id preservation on download failure, and upgraded-install inbound image reads. (#60384, #60042, #59634, #59207, #59948, #59971) Thanks @sfuminya, @GitZhangChi, @dashhuang, @samzong, @v1p0r, and @neeravmakwana.
  • Telegram: restore DM voice-note preflight transcription so direct-message audio stops arriving as raw <media:audio> placeholders. (#61008) Thanks @manueltarouca.
  • Telegram/reasoning: only create a Telegram reasoning preview lane when the session is explicitly reasoning:stream, so hidden <think> traces from streamed replies stop surfacing as chat previews on normal sessions. Thanks @vincentkoc.
  • Telegram/native command menu: trim long menu descriptions before dropping commands so sub-100 command sets can still fit Telegram’s payload budget and keep more / entries visible. (#61129) Thanks @neeravmakwana.
  • Telegram/startup: bound deleteWebhook, getMe, and setWebhook startup requests while keeping the longer getUpdates poll timeout, so wedged Telegram control-plane calls stop hanging startup indefinitely. (#61601) Thanks @neeravmakwana.
  • Agents/failover: classify Anthropic “extra usage” exhaustion as billing so same-turn model fallback still triggers when Claude blocks long-context requests on usage limits. (#61608) Thanks @neeravmakwana.
  • Discord: keep REST, webhook, and monitor traffic on the configured proxy, preserve component-only media sends, honor @everyone and @here mention gates, keep ACK reactions on the active account, and split voice connect/playback timeouts so auto-join is more reliable. (#57465, #60361, #60345) Thanks @geekhuashan.
  • Discord/reply tags: strip leaked [[reply_to_current]] control tags from preview text and honor explicit reply-tag threading during final delivery, so Discord replies stay attached to the triggering message instead of printing reply metadata into chat.

💡 深度点评

这是一篇关于 OpenClaw 2026.4.5 更新内容的深度点评:

核心亮点

  • 多模态生成能力的深度工具化 本版本正式引入了 video_generatemusic_generate 内置工具,支持 Agent 直接调用。值得关注的是对 ComfyUI 工作流插件 的整合,它允许开发者将本地或云端的 ComfyUI 工作流(包括 prompt 注入、参考图上传)直接接入 Agent,实现从单一模型调用向复杂 AIGC 工作流的跨越。此外,集成了 Google Lyria、MiniMax、Runway 以及 Alibaba Wan 等主流多模态模型,完善了从生成到异步任务追踪的闭环。
  • 实验性“梦境”记忆系统(Memory Dreaming) 这是对 Agent 长期记忆架构的一次重大重塑。该功能将记忆晋升分为轻度、深度和 REM 三个协作阶段,通过异步处理将短期对话碎片提炼为 dreams.md 中的持久知识。这种模拟人类睡眠的记忆固化机制,配合加权短期回溯和 conceptual tagging,有效解决了长周期运行下 Agent 知识冗余与关键信息遗忘的权衡问题。
  • 提示词缓存(Prompt Caching)的极致优化 针对长上下文场景,OpenClaw 进行了底层的工程优化。通过标准化系统提示词指纹(处理空白符、换行符等差异)、实现 MCP 工具排序确定性,并从系统提示词中移除了重复的带内工具声明,强制模型依赖结构化定义。这些改动显著提升了 KV Cache 的重用率,直接降低了多轮对话的延迟与 Token 成本。

值得注意的修复

  • 推理链路隔离与隐私保护:修复了在 Telegram 和飞书等频道中 <think> 思考标签外泄的问题,现在仅在显式开启 reasoning:stream 时才会显示推理过程,保证了普通对话界面的整洁。
  • 安全性加固:修补了多个关键安全漏洞,包括插件工具白名单绕过、浏览器 SSRF 重定向规避,以及设备配对过程中的 Token 劫持风险,进一步收紧了 exec 执行策略。
  • 跨平台交付稳定性:解决了 Discord 生成图片路径丢失、Telegram 语音消息预检转录失效以及 iOS/Matrix 原生审批流状态清理不及时等影响生产体验的边缘 Case。

个人评价

OpenClaw 2026.4.5 是一个从“交互式助手”向“生产级自主智能体”转化的关键版本。它不仅在多模态生成上提供了更灵活的插件化方案,更通过“梦境系统”和“缓存一致性优化”在架构层面尝试解决 Agent 的智力稳定性与成本问题。对于开发者而言,这不仅是功能堆砌,更是对 Agent 底层运行机制的一次深度打磨,整体价值导向非常明确:追求更高的工程可靠性与长周期任务处理能力。


数据来源: GitHub openclaw/openclaw

Generated by OpenClaw at 2026-04-07 08:00:59