Ecosystem Update - 2026-05-15

May 15, 2026 · generated by the ecosystem-update Claude Skill

TL;DR

One safe harness Quick Win was implemented: features.codex_hooks was renamed to the canonical features.hooks in ~/.codex/config.toml.
Today's Codex hook signal is about missing upstream lifecycle surfaces, especially ProviderError / TurnError; local implementation has to wait for a real event.
Today's arXiv signal reinforces the existing runtime posture: prefer direct corpus interaction with rg, evaluate long-horizon adaptation explicitly, and use pairwise verifier aggregation before trusting parallel agent output.

Quick Wins

Item	Source	Type	Impact	Effort	Action
Canonical hooks feature flag	OpenAI config reference, openai/codex#22148	hook	2	1	Replace deprecated `features.codex_hooks = true` with `features.hooks = true` in `~/.codex/config.toml`.

Auto-Implemented

Updated /Users/chadsimon/.codex/config.toml to use [features].hooks = true.
Backed up config.toml, hooks.json, and current agent TOMLs under /Users/chadsimon/.codex/backups/2026-05-15/.
Verified config.toml parses with tomllib, hooks.json parses with python3 -m json.tool, codex --version reports codex-cli 0.130.0, and python3 ~/.codex/bin/codex-runtime-doctor exits 0.

Build Queue

ProviderError / TurnError hook watcher (hook) - openai/codex#22774 - Track the upstream proposal for provider/API failure hooks, then wire recovery/notification/backoff only after Codex emits a stable event. Current runtime cannot implement this safely because no such hook event exists.
Plugin hooks trust review gate (hook) - OpenAI plugin hooks docs - Before enabling features.plugin_hooks, audit installed plugin hook manifests and commands. This is useful, but not a Quick Win because plugin hooks can execute bundled commands from marketplace/plugin packages.
FutureSim-style autonomy eval intake (research) - FutureSim - Add a lightweight evaluation story for long-horizon adaptation after knowledge cutoff, tied to the existing task-eval/autonomy harness rather than a new service.
Bradley-Terry parallel candidate selector spike (research) - OpenDeepThink - Evaluate pairwise ranking of parallel agent outputs for objectively testable tasks before adopting as a verifier pattern.

Research

Is Grep All You Need? How Agent Harnesses Reshape Agentic Search - Directly supports the current rg-first search posture and warns that harness/tool-output presentation changes retrieval quality.
FutureSim: Replaying World Events to Evaluate Adaptive Agents - Useful benchmark pattern for testing whether autonomous runtime changes improve adaptation over time instead of only static task completion.
OpenDeepThink: Parallel Reasoning via Bradley--Terry Aggregation - Candidate verifier pattern for ranking parallel agent outputs with pairwise comparisons instead of a single noisy pointwise judge.
Articraft: An Agentic System for Scalable Articulated 3D Asset Generation - Domain-specific, but its restricted workspace, domain SDK, tests, and structured feedback loop map cleanly to harness design.

Already Have

gpt-5.5 power-user default, approval_policy = "never", sandbox_mode = "danger-full-access", prompt telemetry off, canonical hooks = true, Bash PreToolUse safety guard, Bash PostToolUse verification ledger, Bash failure-context hook, SessionStart cached repo-context preflight, Stop omni-mem save hook, PreCompact omni-mem hook, OpenAI developer docs MCP, omni-mem MCP, Stitch MCP, Browser plugin, Computer Use plugin, Documents plugin, Spreadsheets plugin, Presentations plugin, Gmail plugin, read-only reviewer agents, read-only explorer/planner/validator agents, bounded agent max_threads = 3, what-would-chad-do, go, drive, auto, planning-gate, codex-security, security-audit, rlm-scan, session-recall, skill-audit, build-backlog, skill-creator, skill-installer, direct rg/file-search posture, and runtime doctor coverage.

Rejected

Enable plugin hooks blindly - rejected because plugin hooks execute bundled lifecycle commands; requires a trust review gate and explicit enablement.
Add new community hook scripts - rejected by the skill hard limit: no new hook script can be wired unless the script already exists and the target file is explicitly identified.
Wholesale import from Claude/CodeAlive/awesome skill catalogs - rejected because Codex-owned runtime surfaces must not depend on Claude-owned layouts or unaudited outside skills.
Native Codex memories flip-on - rejected as a Quick Win because the current runtime intentionally uses omni-mem; native memories remains a pilot decision, not a default switch.
UserPromptSubmit prompt telemetry hooks - rejected because global prompt telemetry is opt-in and the current policy forbids logging user prompts by default.
SessionEnd workaround hook - rejected because the upstream issue was closed as not planned and local heuristics would be brittle.
Default worktree isolation for all subagents - rejected because current agent fan-out is deliberately bounded; automatic worktree orchestration would add complexity without a proven recurring failure.

Sources checked: https://github.com/hesreallyhim/awesome-claude-code, https://howborisusesclaudecode.com/, https://github.com/shanraisshan/codex-cli-best-practice, https://arxiv.org/search/?searchtype=all&query=LLM+agent+coding&order=-announced_date_first, https://arxiv.org/api/query?search_query=all:agent%20AND%20all:grep&sortBy=submittedDate&sortOrder=descending&max_results=5, https://github.com/openai/codex/issues/22774, https://github.com/openai/codex/issues/22148, https://developers.openai.com/codex/config-reference, https://developers.openai.com/codex/config-schema.json Tier 2 fetched: yes Tier 3 fetched: no; last weekly run was 2026-05-08T15:37:21Z, under 7 days at run time. Targeted OpenAI config docs/schema were checked only to validate the Quick Win. Run at: 2026-05-15T10:34:01Z