Ecosystem Update — 2026-04-13

April 13, 2026 · curated by Chad Simon · 27 items reviewed

Highlights

Semi-formal reasoning paper shows structured prompting (premises → trace → conclude) improves code reasoning 10% without execution — directly applicable to
Toolkit explosion: awesome-claude-code-toolkit now at 135+ agents, 400k+ skills via SkillKit — but signal-to-noise is dropping; most items are domain plugins or duplicate orchestrators
New hook patterns worth stealing: cc-safe-setup bundles 6 safety hooks in one-command install; claude-code-hooks repo has 15 battle-tested hooks from 160+ hours autonomous operation; obey has 17 lifecycle hooks for rule enforcement
Tier 3 fetched this cycle (8 days since last); Tier 2 fetched (4 days since last)

Quick Wins (implemented today)

_None this cycle._ Setup remains mature; the remaining gaps require new scripts or skills.

New Tools, Skills & Patterns

Semi-formal Reasoning Prompt Pattern claude-md

arXiv

Structured prompting: construct premises, trace execution paths, derive formal conclusions. 10% accuracy improvement on code reasoning tasks. Could be added as a reviewer agent instruction pattern. Impact 3, Effort 1, Priority 3.0, though this is a prompt change to reviewer agent bodies (not frontmatter-only). Estimated: update reviewer.md and python-reviewer.md, ~20 LOC
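A minimal sketch of how the premises → trace → conclude structure might look as a reviewer instruction block. The three stages come from the summary above; the template wording and the `build_semiformal_prompt` helper are illustrative assumptions, not the paper's actual prompt.

```python
from textwrap import dedent

# Hypothetical template: stage names follow the summary above,
# but the exact wording is an assumption, not the paper's prompt.
SEMIFORMAL_TEMPLATE = dedent("""\
    Review the change below without executing it.

    1. PREMISES: list the facts you rely on (inputs, invariants, types).
    2. TRACE: walk each relevant execution path step by step, citing premises.
    3. CONCLUSION: state only the verdict that follows from the trace.

    Code under review:
    {code}
    """)


def build_semiformal_prompt(code: str) -> str:
    """Return the reviewer instruction with the code snippet embedded."""
    return SEMIFORMAL_TEMPLATE.format(code=code.strip())


if __name__ == "__main__":
    print(build_semiformal_prompt("def f(x):\n    return x + 1\n"))
```

Keeping the template in one constant would let reviewer.md and python-reviewer.md share the same wording, which fits the ~20 LOC estimate.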
cc-safe-setup hook

github.com/yurukusa/cc-safe-setup

One-command install of 6 essential safety hooks (destructive command blocking, secret detection, large file prevention). Our hooks cover Stop/PreCompact/UserPromptSubmit but lack deterministic PreToolUse safety guards. Impact 2, Effort 2, Priority 1.0
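To make the gap concrete, here is a hedged sketch of a deterministic PreToolUse guard for destructive commands. It assumes the hook receives a JSON event on stdin with `tool_name` and `tool_input.command` fields and that exit code 2 blocks the tool call; both should be checked against the current Claude Code hooks documentation. The deny-list patterns are illustrative and are not cc-safe-setup's rules.

```python
import json
import re
import sys
from typing import Optional, Tuple

# Illustrative deny-list (assumption): extend or replace for real use.
DESTRUCTIVE_PATTERNS = [
    # rm with both a recursive and a force flag, in any order (-rf, -fr, -Rf)
    re.compile(r"\brm\s+-(?=\w*[rR])(?=\w*f)\w+"),
    re.compile(r"\bgit\s+push\b.*--force\b"),
    re.compile(r"\bgit\s+reset\s+--hard\b"),
]

BLOCK_EXIT_CODE = 2  # assumed "block this tool call" exit code


def find_violation(command: str) -> Optional[str]:
    """Return the first matching deny-list pattern, or None if the command is clean."""
    for pattern in DESTRUCTIVE_PATTERNS:
        if pattern.search(command):
            return pattern.pattern
    return None


def decide(event: dict) -> Tuple[int, str]:
    """Map a hook event to (exit_code, message). Only Bash calls are inspected."""
    if event.get("tool_name") != "Bash":
        return 0, ""
    command = event.get("tool_input", {}).get("command", "")
    hit = find_violation(command)
    if hit:
        return BLOCK_EXIT_CODE, f"Blocked by safety hook: matched {hit!r}"
    return 0, ""


def run(stream=sys.stdin) -> int:
    """Entry point for a hook script: read the event, report, return the exit code."""
    code, message = decide(json.load(stream))
    if message:
        print(message, file=sys.stderr)
    return code
```

A hook script would end with `sys.exit(run())`. Keeping `decide` free of I/O makes the guard easy to unit-test without spawning a process.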
claude-code-hooks (battle-tested set) hook

github.com/yurukusa/claude-code-hooks

15 production-tested hooks from 160+ hours autonomous operation. Includes PostToolUse linting, notification hooks, status hooks. Cherry-pick 2–3 that fill lifecycle gaps. Impact 2, Effort 2, Priority 1.0
skills-janitor skill

github.com/khendzel/skills-janitor

Audits and deduplicates skills with 9 slash commands. We have 34 skill directories — some may be stale or overlapping. Impact 2, Effort 1, Priority 2.0. Borderline: we can do this with rg and manual review
review-squad agent-pattern

github.com/2389-research/review-squad

Multi-perspective code review via subagent panels. Our reviewers run solo, not as a panel. Panel pattern could improve coverage. Impact 2, Effort 2, Priority 1.0
test-kitchen agent-pattern

github.com/2389-research/test-kitchen

Parallel competing subagents with structured winner selection. Could enhance planning-gate by generating competing approaches before committing. Impact 2, Effort 2, Priority 1.0
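The competing-approaches idea can be sketched in a few lines. This is not test-kitchen's implementation: the planners are stand-in callables where real subagents would go, and `run_competition`, `Candidate`, and the scoring function are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Candidate:
    author: str
    plan: str
    score: float


def run_competition(
    task: str,
    planners: Dict[str, Callable[[str], str]],
    score: Callable[[str], float],
) -> List[Candidate]:
    """Run every planner in parallel and return candidates, best first."""
    with ThreadPoolExecutor(max_workers=max(1, len(planners))) as pool:
        futures = {name: pool.submit(fn, task) for name, fn in planners.items()}
        plans = {name: future.result() for name, future in futures.items()}
    candidates = [Candidate(name, plan, score(plan)) for name, plan in plans.items()]
    # Highest score wins; author name breaks ties so the result is deterministic.
    return sorted(candidates, key=lambda c: (-c.score, c.author))


if __name__ == "__main__":
    planners = {
        "minimal": lambda task: f"patch {task}",
        "thorough": lambda task: f"refactor, then patch {task} with tests",
    }
    ranked = run_competition("bug-1", planners, score=lambda plan: float("tests" in plan))
    print(ranked[0].author)
```

An explicit scoring function with a deterministic tie-break is what would make the winner selection structured rather than ad hoc.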
preflight prompt validator mcp

github.com/preflight-dev/preflight

24-tool MCP server that catches vague prompts before they waste cycles. Our UserPromptSubmit hook classifies the route but doesn't validate prompt quality. Impact 2, Effort 2, Priority 1.0
Auto-Dream Memory Consolidation agent-pattern

howborisusesclaudecode.com

Subagent periodically reviews past sessions, merges insights. Our memory workflow is manual. Impact 2, Effort 2, Priority 1.0
reporecall mcp

github.com/proofofwork-agency/reporecall

Tree-sitter AST indexing (22 languages) with ~5ms context injection. Impact 2, Effort 3, Priority 0.7
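The Priority figures in this section are all consistent with Impact divided by Effort, rounded to one decimal place. That formula is inferred from the numbers, not stated in the digest. A small sketch for re-deriving and ranking the scores, where the `priority` helper is a hypothetical name:

```python
def priority(impact: int, effort: int) -> float:
    """Inferred scoring rule: impact / effort, rounded to one decimal place."""
    if effort <= 0:
        raise ValueError("effort must be positive")
    return round(impact / effort, 1)


# (name, impact, effort) for a few entries listed in this section.
ITEMS = [
    ("Semi-formal Reasoning Prompt Pattern", 3, 1),
    ("cc-safe-setup", 2, 2),
    ("skills-janitor", 2, 1),
    ("reporecall", 2, 3),
]

if __name__ == "__main__":
    for name, impact, effort in sorted(ITEMS, key=lambda item: -priority(item[1], item[2])):
        print(f"{priority(impact, effort):>4}  {name}")
```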

Research Worth Reading

From LLM Reasoning to Autonomous AI Agents: A Comprehensive Review

arXiv

Taxonomy of 60 benchmarks, surveys agent frameworks, examines collaboration protocols (ACP, MCP, A2A). Reference material for governed agent architecture
Memory for Autonomous LLM Agents

arXiv

Five memory mechanism families including reflective self-improvement and policy-learned management. Key gap for us: we lack "learned forgetting" — memory grows but never prunes automatically
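As a rough illustration of automatic pruning, the sketch below swaps the paper's learned policy for a fixed recency-and-usage heuristic. The `MemoryEntry` fields, the 30-day half-life, and the scoring rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class MemoryEntry:
    text: str
    last_used: datetime
    hits: int  # how often the entry has been retrieved


def retention_score(entry: MemoryEntry, now: datetime, half_life_days: float = 30.0) -> float:
    """Usage count decayed exponentially by age; higher means keep."""
    age_days = (now - entry.last_used).total_seconds() / 86400
    return (1 + entry.hits) * 0.5 ** (age_days / half_life_days)


def prune(entries: List[MemoryEntry], now: datetime, keep: int) -> List[MemoryEntry]:
    """Keep the `keep` highest-scoring entries, best first."""
    ranked = sorted(entries, key=lambda e: retention_score(e, now), reverse=True)
    return ranked[:keep]


if __name__ == "__main__":
    now = datetime(2026, 4, 13)
    entries = [
        MemoryEntry("old, unused", now - timedelta(days=120), hits=0),
        MemoryEntry("recent", now - timedelta(days=1), hits=0),
        MemoryEntry("old but popular", now - timedelta(days=60), hits=9),
    ]
    for entry in prune(entries, now, keep=2):
        print(entry.text)
```

A learned policy would replace `retention_score` with a model trained on which memories proved useful; the pruning loop around it could stay the same.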
Agentic Code Reasoning (Semi-formal)

arXiv

Semi-formal reasoning improves patch equivalence (78→88%), code QA (87% on RubberDuckBench), fault localization. Basis for the Build Queue item above
Agent Contracts: Resource-Bounded Autonomous AI

arXiv

Formal framework for resource and temporal constraints. Demonstrates 90% token reduction in iterative workflows. Our dispatch budgets are a crude version of this; the paper's formal approach could refine our budget model
Deep Researcher Agent: 24/7 Experimentation

arXiv

Framework for autonomous around-the-clock experiments. Relevant to Ralph pattern and autonomous execution loop

Considered, Not Adopting

Items reviewed and explicitly declined this cycle, with the reason. Curation discipline matters more than coverage.

oh-my-claudecode
production-grade — (14-agent autonomous workflow) — Same class: role-based decomposition vs skill-based decomposition
ORCH — (CLI orchestrating Code, Codex, Cursor) — Multi-tool orchestrator. We're single-tool. Scope creep
vibe-kanban — Kanban UI is a new surface
cozempic — (13 pruning strategies) — PreCompact + Stop hooks cover this. Overengineered
knowledge-graph — Another persistence layer; fails the one-sentence proof
claude-supermemory — (cross-session via Supermemory platform) — External platform dependency
fractal — (recursive decomposition) — planning-gate + solution ladder covers decomposition
harness-evolver — (LangSmith-native prompt evolution) — External service dependency
brooks-lint — (code reviews from 6 classic books) — Too opinionated. Would conflict with standards
jarvis — (76 tasks, 12 AI teams) — Not our domain
discoclaw — (Discord bot) — We use Zoom
