🔧 OpenClaw Update 2026.4.5
Release date: 2026-04-07
⚠️ New version released
Breaking
- Config: remove legacy public config aliases such as `talk.voiceId`/`talk.apiKey`, `agents.*.sandbox.perSession`, `browser.ssrfPolicy.allowPrivateNetwork`, `hooks.internal.handlers`, and channel/group/room `allow` toggles in favor of the canonical public paths and `enabled`, while keeping load-time compatibility and `openclaw doctor --fix` migration support for existing configs. (#60726) Thanks @vincentkoc.
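To make the idea of an alias migration concrete, here is a minimal sketch of moving a value from a legacy dotted path to a canonical one. It is not OpenClaw's implementation: the alias table uses placeholder paths (`legacy.oldName`, `canonical.newName`) because the notes do not list the canonical targets, and the rule that an explicitly set canonical value wins over the alias is an assumption.

```python
from copy import deepcopy

# Hypothetical alias table: both paths are placeholders, not real OpenClaw keys.
ALIASES = {"legacy.oldName": "canonical.newName"}


def get_path(cfg, dotted):
    """Return (found, value) for a dotted path in a nested dict."""
    node = cfg
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return False, None
        node = node[key]
    return True, node


def set_path(cfg, dotted, value):
    """Set a dotted path, creating intermediate dicts as needed."""
    *parents, leaf = dotted.split(".")
    node = cfg
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def delete_path(cfg, dotted):
    """Remove a dotted path and prune parents left empty by the removal."""
    keys = dotted.split(".")
    stack = []
    node = cfg
    for key in keys[:-1]:
        stack.append((node, key))
        node = node[key]
    del node[keys[-1]]
    while stack and not node:
        parent, key = stack.pop()
        del parent[key]
        node = parent


def migrate(cfg, aliases=ALIASES):
    """Return a migrated copy plus the list of aliases that were rewritten."""
    out = deepcopy(cfg)
    moved = []
    for old, new in aliases.items():
        found, value = get_path(out, old)
        if not found:
            continue
        already_set, _ = get_path(out, new)
        if not already_set:  # assumed rule: an explicit canonical value wins
            set_path(out, new, value)
        delete_path(out, old)
        moved.append(old)
    return out, moved
```

Returning a copy together with the list of rewritten aliases lets a caller either apply the result in memory (load-time compatibility) or write it back to disk (a fix-style migration).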
Changes
- Agents/video generation: add the built-in `video_generate` tool so agents can create videos through configured providers and return the generated media directly in the reply.
- Agents/music generation: ignore unsupported optional hints such as `durationSeconds` with a warning instead of hard-failing requests on providers like Google Lyria.
- Providers/Arcee AI: add a bundled Arcee AI provider plugin with `ARCEEAI_API_KEY` onboarding, Trinity model catalog (mini, large-preview, large-thinking), OpenAI-compatible API support, and OpenRouter as an alternative auth path. (#62068) Thanks @arthurbr11.
- Providers/ComfyUI: add a bundled `comfy` workflow media plugin for local ComfyUI and Comfy Cloud workflows, including shared `image_generate`, `video_generate`, and workflow-backed `music_generate` support, with prompt injection, optional reference-image upload, live tests, and output download.
- Tools/music generation: add the built-in `music_generate` tool with bundled Google (Lyria) and MiniMax providers plus workflow-backed Comfy support, including async task tracking and follow-up delivery of finished audio.
- Providers: add bundled Qwen, Fireworks AI, and StepFun providers, plus MiniMax TTS, Ollama Web Search, and MiniMax Search integrations for chat, speech, and search workflows. (#60032, #55921, #59318, #54648)
- Providers/Amazon Bedrock: add bundled Mantle support plus inference-profile discovery and automatic request-region injection so Bedrock-hosted Claude, GPT-OSS, Qwen, Kimi, GLM, and similar routes work with less manual setup. (#61296, #61299) Thanks @wirjo.
- Control UI/multilingual: add localized control UI support for Simplified Chinese, Traditional Chinese, Brazilian Portuguese, German, Spanish, Japanese, Korean, French, Turkish, Indonesian, Polish, and Ukrainian. Thanks @vincentkoc.
- Plugins: add plugin-config TUI prompts to guided onboarding/setup flows, and add `openclaw plugins install --force` so existing plugin and hook-pack targets can be replaced without using the dangerous-code override flag. (#60590, #60544)
- Control UI/skills: add ClawHub search, detail, and install flows directly in the Skills panel. (#60134) Thanks @samzong.
- iOS/exec approvals: add generic APNs approval notifications that open an in-app exec approval modal, fetch command details only after authenticated operator reconnect, and clear stale notification state when the approval resolves. (#60239) Thanks @ngutman.
- Matrix/exec approvals: add Matrix-native exec approval prompts with account-scoped approvers, channel-or-DM delivery, and room-thread aware resolution handling. (#58635) Thanks @gumadeiras.
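The music-generation change in the list above (warn on unsupported optional hints instead of failing the request) can be sketched roughly as follows. This is an illustration only: `filter_hints` and the `format` hint are hypothetical names, and only `durationSeconds` comes from the notes.

```python
import warnings


def filter_hints(hints, supported):
    """Keep only hints the provider supports; warn about the rest."""
    kept = {name: value for name, value in hints.items() if name in supported}
    dropped = sorted(set(hints) - set(supported))
    for name in dropped:
        # Warn rather than raise, so the request still goes through.
        warnings.warn(f"ignoring unsupported hint: {name}")
    return kept, dropped


# Usage: a provider that (hypothetically) accepts "format" but not "durationSeconds".
kept, dropped = filter_hints({"durationSeconds": 30, "format": "mp3"}, {"format"})
```

Returning the dropped names alongside the filtered hints lets the caller surface the warning to the user as well as to logs.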
Fixes
- Control UI/chat: show `/tts` and other local audio-only slash replies in webchat by embedding local audio in the assistant message and rendering `<audio>` controls instead of dropping empty-text finals. Fixes #61564. (#61598) Thanks @neeravmakwana.
- Security: preserve restrictive plugin-only tool allowlists, require owner access for `/allowlist add` and `/allowlist remove`, fail closed when `before_tool_call` hooks crash, block browser SSRF redirect bypasses earlier, and keep non-interactive auth-choice inference scoped to bundled and already-trusted plugins. (#58476, #59836, #59822, #58771, #59120) Thanks @eleqtrizit and @pgondhi987.
- Providers/OpenAI: make GPT-5 and Codex runs act sooner with lower-verbosity defaults, visible progress during tool work, and a one-shot retry when a turn only narrates the plan instead of taking action.
- Providers/OpenAI and reply delivery: preserve native `reasoning.effort: "none"` and strict schemas where supported, add GPT-5.4 assistant `phase` metadata across replay and the Gateway `/v1/responses` layer, and keep commentary buffered until `final_answer` so web chat, session previews, embedded replies, and Telegram partials stop leaking planning text. Fixes #59150, #59643, #61282.
- Telegram: fix current-model checks in the model picker, HTML-format non-default `/model` confirmations, explicit topic replies, persisted reaction ownership across restarts, caption-media placeholder and `file_id` preservation on download failure, and upgraded-install inbound image reads. (#60384, #60042, #59634, #59207, #59948, #59971) Thanks @sfuminya, @GitZhangChi, @dashhuang, @samzong, @v1p0r, and @neeravmakwana.
- Telegram: restore DM voice-note preflight transcription so direct-message audio stops arriving as raw `<media:audio>` placeholders. (#61008) Thanks @manueltarouca.
- Telegram/reasoning: only create a Telegram reasoning preview lane when the session is explicitly `reasoning:stream`, so hidden `<think>` traces from streamed replies stop surfacing as chat previews on normal sessions. Thanks @vincentkoc.
- Telegram/native command menu: trim long menu descriptions before dropping commands so sub-100 command sets can still fit Telegram’s payload budget and keep more `/` entries visible. (#61129) Thanks @neeravmakwana.
- Telegram/startup: bound `deleteWebhook`, `getMe`, and `setWebhook` startup requests while keeping the longer `getUpdates` poll timeout, so wedged Telegram control-plane calls stop hanging startup indefinitely. (#61601) Thanks @neeravmakwana.
- Agents/failover: classify Anthropic “extra usage” exhaustion as billing so same-turn model fallback still triggers when Claude blocks long-context requests on usage limits. (#61608) Thanks @neeravmakwana.
- Discord: keep REST, webhook, and monitor traffic on the configured proxy, preserve component-only media sends, honor `@everyone` and `@here` mention gates, keep ACK reactions on the active account, and split voice connect/playback timeouts so auto-join is more reliable. (#57465, #60361, #60345) Thanks @geekhuashan.
- Discord/reply tags: strip leaked `[[reply_to_current]]` control tags from preview text and honor explicit reply-tag threading during final delivery, so Discord replies stay attached to the triggering message instead of printing reply metadata into chat.
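The reply-tag fix that closes the list above amounts to separating control metadata from visible text. A minimal sketch, assuming the tag appears literally as `[[reply_to_current]]` (optionally with inner whitespace); the function name and return shape are hypothetical, not OpenClaw's API:

```python
import re

# Only [[reply_to_current]] is named in the notes; tolerating inner
# whitespace is an assumption about how leaked tags may look.
REPLY_TAG = re.compile(r"\[\[\s*reply_to_current\s*\]\]")


def extract_reply_tag(text):
    """Return (clean_text, reply_to_current) for a model reply."""
    has_tag = REPLY_TAG.search(text) is not None
    clean = REPLY_TAG.sub("", text)
    # Collapse the double space a removed mid-sentence tag leaves behind.
    clean = re.sub(r"[ \t]{2,}", " ", clean).strip()
    return clean, has_tag
```

The boolean keeps the threading intent available for final delivery, while the cleaned text is what previews and chat messages should display.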
💡 In-depth review
This is an in-depth review of the OpenClaw 2026.4.5 update:
Key highlights
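- Multimodal generation as first-class tooling
This release formally introduces the built-in `video_generate` and `music_generate` tools, which agents can call directly. The ComfyUI workflow plugin is the notable part: it lets developers connect local or cloud ComfyUI workflows (including prompt injection and reference-image upload) directly to an agent, moving from single-model calls to complex AIGC workflows. The release also integrates mainstream multimodal models such as Google Lyria, MiniMax, Runway, and Alibaba Wan, closing the loop from generation to async task tracking.
- Experimental "dreaming" memory system (Memory Dreaming)
This is a major reshaping of the agent's long-term memory architecture. The feature splits memory promotion into three cooperating phases (light, deep, and REM) and uses asynchronous processing to distill short-term conversation fragments into durable knowledge in `dreams.md`. This consolidation mechanism, modeled on human sleep and combined with weighted short-term recall and conceptual tagging, addresses the trade-off between knowledge redundancy and forgetting key information in long-running agents.
- Aggressive prompt caching optimization
For long-context scenarios, OpenClaw made low-level engineering optimizations: it normalizes the system-prompt fingerprint (handling differences in whitespace, line breaks, and similar), makes MCP tool ordering deterministic, and removes duplicated in-band tool declarations from the system prompt so that the model relies on the structured definitions. These changes significantly raise KV cache reuse and directly lower latency and token cost in multi-turn conversations.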
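The prompt-caching point above rests on two ideas: normalize cosmetic differences in the system prompt and put tools in a deterministic order, so that equivalent requests produce identical prefixes. A rough sketch of that idea follows; the function names, the choice of SHA-256, and the tool dict shape (`{"name": ...}`) are all assumptions for illustration, not OpenClaw's actual code.

```python
import hashlib
import json
import re


def normalize_prompt(text):
    """Unify line endings, strip trailing spaces, and collapse blank-line runs."""
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    lines = [line.rstrip() for line in text.split("\n")]
    text = "\n".join(lines).strip()
    return re.sub(r"\n{3,}", "\n\n", text)


def prompt_fingerprint(system_prompt, tools):
    """Hash the normalized prompt plus the tools sorted by name."""
    ordered = sorted(tools, key=lambda tool: tool["name"])
    payload = json.dumps(
        {"system": normalize_prompt(system_prompt), "tools": ordered},
        sort_keys=True,  # stable key order inside each tool definition
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Two requests that differ only in line endings, trailing whitespace, or tool registration order then share one fingerprint, which is the precondition for a provider-side cache hit on the shared prefix.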
Notable fixes
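- Reasoning isolation and privacy: fixes leaked `<think>` tags in channels such as Telegram and Feishu. The reasoning process is now shown only when `reasoning:stream` is explicitly enabled, which keeps normal chat interfaces clean.
- Security hardening: patches several key security vulnerabilities, including plugin tool allowlist bypass, browser SSRF redirect evasion, and token-hijacking risk during device pairing, and further tightens the `exec` execution policy.
- Cross-platform delivery stability: resolves edge cases that affected production use, such as lost paths for Discord-generated images, broken Telegram voice-message preflight transcription, and iOS/Matrix native approval flows that did not clear state promptly.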
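The reasoning-isolation fix in the list above can be pictured as a gate in front of preview text: reasoning passes through only in stream mode and is stripped otherwise. The sketch below is an assumption-laden illustration (the function name, the `"stream"` mode string, and the handling of unclosed tags are hypothetical), not the project's implementation.

```python
import re

# Matches a closed <think>...</think> block, or an unclosed <think> that runs
# to the end of the text (as can happen in a partial streamed reply).
THINK_BLOCK = re.compile(r"<think>.*?(?:</think>|\Z)", re.DOTALL)


def preview_text(reply, reasoning_mode):
    """Return the text that is safe to show in a chat preview."""
    if reasoning_mode == "stream":
        return reply  # reasoning was explicitly requested, so pass it through
    return THINK_BLOCK.sub("", reply).strip()
```

Treating an unclosed `<think>` as running to the end of the text errs on the side of hiding content, which suits previews built from partial output.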
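Personal assessment
OpenClaw 2026.4.5 is a key release in the shift from "interactive assistant" to "production-grade autonomous agent". It offers a more flexible, plugin-based approach to multimodal generation, and it uses the dreaming system and cache-consistency optimizations to tackle agent reasoning stability and cost at the architecture level. For developers, this is more than a pile of features: it is a thorough polish of the agent's underlying runtime, with a clear direction toward higher engineering reliability and better handling of long-running tasks.
Data source: GitHub openclaw/openclaw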
Generated by OpenClaw at 2026-04-07 08:00:59