Ecosystem Update — 2026-04-13

April 13, 2026 · generated by the ecosystem-update Claude Skill

TL;DR

Semi-formal reasoning paper shows structured prompting (premises → trace → conclude) improves code-reasoning accuracy by about 10 points without executing the code (e.g., patch equivalence 78→88%); directly applicable to our reviewer agents
Toolkit explosion: awesome-claude-code-toolkit now at 135+ agents, 400k+ skills via SkillKit — but signal-to-noise is dropping; most items are domain plugins or duplicate orchestrators
New hook patterns worth stealing: cc-safe-setup bundles 6 safety hooks in one-command install; claude-code-hooks repo has 15 battle-tested hooks from 160+ hours autonomous operation; obey has 17 lifecycle hooks for rule enforcement
Tier 3 fetched this cycle (8 days since last); Tier 2 fetched (4 days since last)

Quick Wins

Item	Source	Type	Impact	Effort	Action
None this cycle	—	—	—	—	Setup remains mature; remaining gaps require new scripts or skills

No Quick Wins this cycle. Frontmatter additions and one-liner config changes were picked up in earlier runs. Everything new requires new scripts, new skill directories, or external tool installs — those are Build Queue items.

Build Queue

Semi-formal Reasoning Prompt Pattern (claude-md) — arxiv 2603.01896 — Structured prompting: construct premises, trace execution paths, derive formal conclusions. 10% accuracy improvement on code reasoning tasks. Could be added as reviewer agent instruction pattern. Impact 3, Effort 1, Priority 3.0 — but this is a prompt change to reviewer agent bodies (not frontmatter-only). Estimated: update reviewer.md and python-reviewer.md, ~20 LOC.
cc-safe-setup (hook) — yurukusa/cc-safe-setup — One-command install of 6 essential safety hooks (destructive command blocking, secret detection, large file prevention). Our hooks cover Stop/PreCompact/UserPromptSubmit but lack deterministic PreToolUse safety guards. Impact 2, Effort 2, Priority 1.0.
claude-code-hooks (battle-tested set) (hook) — yurukusa/claude-code-hooks — 15 production-tested hooks from 160+ hours autonomous operation. Includes PostToolUse linting, notification hooks, status hooks. Cherry-pick 2–3 that fill lifecycle gaps. Impact 2, Effort 2, Priority 1.0.
skills-janitor (skill) — khendzel/skills-janitor — Audits and deduplicates skills with 9 slash commands. We have 34 skill directories — some may be stale or overlapping. Impact 2, Effort 1, Priority 2.0. Borderline: we can do this with rg and manual review.
review-squad (agent-pattern) — 2389-research/review-squad — Multi-perspective code review via subagent panels. Our reviewers run solo, not as a panel. Panel pattern could improve coverage. Impact 2, Effort 2, Priority 1.0.
test-kitchen (agent-pattern) — 2389-research/test-kitchen — Parallel competing subagents with structured winner selection. Could enhance planning-gate by generating competing approaches before committing. Impact 2, Effort 2, Priority 1.0.
preflight prompt validator (mcp) — preflight-dev/preflight — 24-tool MCP server catching vague prompts before wasted cycles. Our UserPromptSubmit classifies route but doesn't validate prompt quality. Impact 2, Effort 2, Priority 1.0.
Auto-Dream Memory Consolidation (agent-pattern) — howborisusesclaudecode.com — Carried forward from last cycle. Subagent periodically reviews past sessions, merges insights. Our memory workflow is manual. Impact 2, Effort 2, Priority 1.0.
reporecall (mcp) — proofofwork-agency/reporecall — Tree-sitter AST indexing (22 languages) with ~5ms context injection. Could supplement omni-mem for code-level retrieval. Impact 2, Effort 3, Priority 0.7.
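
The priority figures in this queue appear to follow Priority = Impact / Effort, rounded to one decimal (3/1 = 3.0, 2/1 = 2.0, 2/2 = 1.0, 2/3 ≈ 0.7). Below is a minimal Python sketch of that ranking, assuming this inferred formula holds. `QueueItem` and `rank` are hypothetical names used for illustration only and are not part of the ecosystem-update skill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueItem:
    """One Build Queue entry (hypothetical structure for illustration)."""
    name: str
    impact: int  # impact score as listed in the queue
    effort: int  # effort score as listed in the queue

    @property
    def priority(self) -> float:
        # Assumed formula, inferred from the listed values: impact / effort.
        return round(self.impact / self.effort, 1)


def rank(items: list[QueueItem]) -> list[QueueItem]:
    # Highest priority first; sorted() is stable, so ties keep input order.
    return sorted(items, key=lambda item: item.priority, reverse=True)


QUEUE = [
    QueueItem("semi-formal reasoning prompt", impact=3, effort=1),
    QueueItem("cc-safe-setup", impact=2, effort=2),
    QueueItem("skills-janitor", impact=2, effort=1),
    QueueItem("reporecall", impact=2, effort=3),
]

if __name__ == "__main__":
    for item in rank(QUEUE):
        print(f"{item.priority:.1f}  {item.name}")
```

Ranking this way puts the semi-formal prompt change first and reporecall last, which matches the ordering the scores imply; the borderline note on skills-janitor is a judgment call that a ratio alone would not capture.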

Research

From LLM Reasoning to Autonomous AI Agents: A Comprehensive Review — Taxonomy of 60 benchmarks, surveys agent frameworks, examines collaboration protocols (ACP, MCP, A2A). Reference material for governed agent architecture.
Memory for Autonomous LLM Agents — Five memory mechanism families including reflective self-improvement and policy-learned management. Key gap for us: we lack "learned forgetting" — our memory grows but never prunes automatically.
Agentic Code Reasoning (Semi-formal) — Semi-formal reasoning improves patch equivalence (78→88%), code QA (87% on RubberDuckBench), fault localization. Basis for the Build Queue item above.
Agent Contracts: Resource-Bounded Autonomous AI — Formal framework for resource and temporal constraints. Demonstrates 90% token reduction in iterative workflows. Our dispatch budgets are a crude version — this paper's formal approach could refine our budget model.
Deep Researcher Agent: 24/7 Experimentation — Framework for autonomous around-the-clock experiments. Relevant to our Ralph pattern and autonomous execution loop.
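
The semi-formal pattern from the Agentic Code Reasoning entry (premises, then trace, then conclusion) could be wired into a reviewer prompt roughly as follows. This is a sketch under assumptions: the section wording and the `build_review_prompt` helper are invented for illustration, and the paper's actual prompt text is not reproduced here.

```python
# Hypothetical reviewer instruction block; the wording is an assumption,
# not the prompt used in the paper.
SEMI_FORMAL_TEMPLATE = """\
Review the change below without executing it.

1. PREMISES: list the facts you rely on (signatures, invariants, inputs).
2. TRACE: walk each relevant execution path step by step, citing premises.
3. CONCLUSION: state only the verdict that follows from the trace.

Change under review:
{diff}
"""


def build_review_prompt(diff: str) -> str:
    """Fill the template with a diff (hypothetical helper)."""
    # str.replace rather than str.format, so braces inside the diff
    # (dict literals, f-strings) are left untouched.
    return SEMI_FORMAL_TEMPLATE.replace("{diff}", diff.strip())


if __name__ == "__main__":
    print(build_review_prompt("-    return a - b\n+    return a + b\n"))
```

Keeping the three steps as numbered, named sections makes it easy to check in a reviewer's output that a trace was actually produced before the conclusion.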

Already Have

isolation: worktree, context: fork, PostCompact hook, PreCompact hook, Stop hook, UserPromptSubmit hook, once: true modifier, PermissionRequest routing (via classify_prompt), type: prompt hooks, matcher/statusMessage fields, per-agent model overrides, allowed-tools restrictions, auto mode, batch command, agent teams awareness, session teleportation awareness, /btw side queries awareness, remote control awareness, memory workflow (native + omni-mem), planning-gate, Ralph loop, skill-creator, skill-installer, explorer read-only, isolation on all reviewer/planner/worker agents, cc-devops-skills awareness, fullstack-dev-skills awareness, Trail of Bits security skills awareness, context engineering kit awareness, compound engineering plugin awareness, container use awareness, ccmanager awareness, bouncer quality gate awareness, codetape awareness, harness meta-skill awareness, preflight MCP awareness, git worktree infrastructure, /govern orchestration, auto_runtime.py event-sourced tracking, dispatch budgets, postflight acceptance checking, what-would-chad-do reflection, route canary, enterprise maturity rubric

Rejected

oh-my-claudecode (19 agents, 28 skills) — Fails overengineering gate: our curated 10-agent roster + /govern already covers this
production-grade (14-agent autonomous workflow) — Same class: role-based decomposition vs our skill-based decomposition
ORCH (CLI orchestrating Code, Codex, Cursor) — Multi-tool orchestrator. We're single-tool. Scope creep.
vibe-kanban (Kanban-based agent coordination) — /govern handles coordination. Kanban UI is a new surface.
cozempic (13 pruning strategies) — PreCompact + Stop hooks cover this. Overengineered.
knowledge-graph (git-native context persistence) — omni-mem with fact graph covers this. Another persistence layer fails one-sentence proof.
claude-supermemory (cross-session via Supermemory platform) — External platform dependency. omni-mem is local.
fractal (recursive decomposition) — planning-gate + solution ladder covers decomposition
harness-evolver (LangSmith-native prompt evolution) — External service dependency
brooks-lint (code reviews from 6 classic books) — Too opinionated. Would conflict with our standards.
jarvis (76 tasks, 12 AI teams) — Not our domain
discoclaw (Discord bot) — We use Zoom

Sources checked: awesome-claude-code, howborisusesclaudecode.com, claude-code-best-practice, awesome-claude-code-toolkit, claude-code-new-features-early-2026, claude-code-hooks-mastery, WebSearch: github.com 2026 hooks/agents/skills, WebSearch: arxiv LLM agent coding 2026
Tier 2 fetched: yes (arxiv; last was 2026-04-09, 4 days ago)
Tier 3 fetched: yes (awesome-claude-code-toolkit; last was 2026-04-05, 8 days ago)
Run at: 2026-04-13T15:00:00Z
Mode: --dry-run (no implementations)