
Ecosystem Update - 2026-05-15

May 15, 2026 · curated by Chad Simon · 16 items reviewed

Highlights

  • One safe harness Quick Win was implemented: features.codex_hooks was renamed to the canonical features.hooks in ~/.codex/config.toml
  • Today's Codex hook signal is about missing upstream lifecycle surfaces, especially ProviderError / TurnError; local implementation has to wait until Codex actually emits such an event
  • Today's arXiv signal reinforces the existing runtime posture: prefer direct corpus interaction with rg, evaluate long-horizon adaptation explicitly, and use pairwise verifier aggregation before trusting parallel agent output

Quick Wins (implemented today)

  • Canonical hooks feature flag (hook)
    Replace deprecated features.codex_hooks = true with features.hooks = true in ~/.codex/config.toml
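
The rename above can be sketched as a small text-level migration. This is a minimal sketch, not the tool that made the change: the function name migrate_hooks_flag is hypothetical, and it assumes the flag is written either under a [features] table or as a top-level dotted key. It works line by line and does not parse TOML fully, so multi-line values that begin with "[" would confuse it.

```python
import re


def migrate_hooks_flag(toml_text: str) -> str:
    """Rename features.codex_hooks to features.hooks, leaving other lines untouched.

    Hypothetical helper: handles the [features] table form and the
    top-level dotted-key form only.
    """
    table = ""  # current table header; "" means we are still at top level
    out = []
    for line in toml_text.splitlines(keepends=True):
        stripped = line.strip()
        if stripped.startswith("["):
            # Remember which table we are in, ignoring a trailing comment.
            table = stripped.split("#")[0].strip()
        elif table == "[features]":
            line = re.sub(r"^(\s*)codex_hooks(\s*=)", r"\1hooks\2", line)
        elif table == "":
            line = re.sub(
                r"^(\s*)features\.codex_hooks(\s*=)", r"\1features.hooks\2", line
            )
        out.append(line)
    return "".join(out)
```

Running it on the file contents and diffing the result before writing back keeps the change reviewable; a key named codex_hooks under any other table is deliberately left alone.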

New Tools, Skills & Patterns

  • ProviderError / TurnError hook watcher (hook)
    openai/codex#22774 - Track the upstream proposal for provider/API failure hooks, then wire recovery/notification/backoff only after Codex emits a stable event. Current runtime cannot implement this safely because no such hook event exists
  • Plugin hooks trust review gate (hook)
    OpenAI plugin hooks docs - Before enabling features.plugin_hooks, audit installed plugin hook manifests and commands. This is useful, but not a Quick Win because plugin hooks can execute bundled commands from marketplace/plugin packages
  • FutureSim-style autonomy eval (intake)
    FutureSim - Add a lightweight evaluation story for long-horizon adaptation after knowledge cutoff, tied to the existing task-eval/autonomy harness rather than a new service
  • Bradley-Terry parallel candidate selector (spike)
    OpenDeepThink - Evaluate pairwise ranking of parallel agent outputs for objectively testable tasks before adopting as a verifier pattern
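
For the selector spike, pairwise ranking can be prototyped with a textbook Bradley-Terry fit. The sketch below uses the standard minorization-maximization update and is not taken from OpenDeepThink; the function name bradley_terry, the wins matrix layout, and the small prior (virtual wins added to every pair so no strength collapses to zero) are all assumptions made for illustration.

```python
def bradley_terry(wins, iterations=200, prior=0.1):
    """Estimate Bradley-Terry strengths from a pairwise win matrix.

    wins[i][j] is how many times candidate i was preferred over candidate j.
    `prior` adds virtual wins in both directions for every pair (an assumption
    for numerical stability, not part of the basic model).
    Returns strengths normalized to sum to 1.
    """
    n = len(wins)
    p = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Total (smoothed) wins of candidate i.
            w_i = sum(wins[i][j] + prior for j in range(n) if j != i)
            # Comparisons involving i, weighted by the current strengths.
            denom = sum(
                (wins[i][j] + wins[j][i] + 2 * prior) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            updated.append(w_i / denom)
        total = sum(updated)
        p = [x / total for x in updated]
    return p


def pick_best(wins):
    """Index of the candidate with the highest estimated strength."""
    strengths = bradley_terry(wins)
    return max(range(len(strengths)), key=strengths.__getitem__)
```

In a spike, wins would be filled by asking a judge (or a test suite) to compare candidate outputs two at a time; the open question for adoption is whether the pairwise ordering agrees with objective test results more often than a single pointwise score does.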

Research Worth Reading

  • Is Grep All You Need? How Agent Harnesses Reshape Agentic Search
    - Directly supports the current rg-first search posture and warns that harness/tool-output presentation changes retrieval quality
  • FutureSim: Replaying World Events to Evaluate Adaptive Agents
    - Useful benchmark pattern for testing whether autonomous runtime changes improve adaptation over time instead of only static task completion
  • OpenDeepThink: Parallel Reasoning via Bradley-Terry Aggregation
    - Candidate verifier pattern for ranking parallel agent outputs with pairwise comparisons instead of a single noisy pointwise judge
  • Articraft: An Agentic System for Scalable Articulated 3D Asset Generation
    - Domain-specific, but its restricted workspace, domain SDK, tests, and structured feedback loop map cleanly to harness design
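
Since the first paper stresses that how tool output is presented affects retrieval quality, one way to keep an rg-first posture consistent is to parse ripgrep's JSON output into a fixed shape before it reaches the agent. This is a minimal sketch: the function names parse_rg_json and search are hypothetical, it assumes rg is installed and on PATH, and it keeps only the path, line number, and line text of each match.

```python
import json
import subprocess


def parse_rg_json(output: str):
    """Turn `rg --json` output into (path, line_number, text) tuples."""
    hits = []
    for raw in output.splitlines():
        if not raw.strip():
            continue
        event = json.loads(raw)
        if event.get("type") != "match":
            continue  # skip begin/end/summary/context events
        data = event["data"]
        # rg reports "text" for valid UTF-8; non-UTF-8 data arrives as "bytes".
        path = data["path"].get("text", "")
        text = data["lines"].get("text", "").rstrip("\n")
        hits.append((path, data["line_number"], text))
    return hits


def search(pattern: str, root: str = "."):
    """Run ripgrep under `root` and return parsed matches."""
    proc = subprocess.run(
        ["rg", "--json", "--", pattern, root],
        capture_output=True,
        text=True,
    )
    # rg exits with 1 when nothing matched; other non-zero codes are errors.
    if proc.returncode not in (0, 1):
        raise RuntimeError(proc.stderr.strip())
    return parse_rg_json(proc.stdout)
```

Keeping the parser separate from the subprocess call means the presentation layer can be tested against canned output without needing a corpus or rg on the test machine.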

Considered, Not Adopting

Items reviewed and explicitly declined this cycle, with the reason. Curation discipline matters more than coverage.

  • Enable plugin hooks blindly - rejected because plugin hooks execute bundled lifecycle commands; requires a trust review gate and explicit enablement
  • Add new community hook scripts - rejected by the skill hard limit: no new hook script can be wired unless the script already exists and the target file is explicitly identified
  • Wholesale import from Claude/CodeAlive/awesome skill catalogs - rejected because Codex-owned runtime surfaces must not depend on Claude-owned layouts or unaudited outside skills
  • Native Codex memories flip-on - rejected because native memories remain a pilot decision, not a default switch
  • UserPromptSubmit prompt telemetry hooks - rejected because global prompt telemetry is opt-in and the current policy forbids logging user prompts by default
  • SessionEnd workaround hook - rejected because the upstream issue was closed as not planned and local heuristics would be brittle
  • Default worktree isolation for all subagents - rejected because current agent fan-out is deliberately bounded; automatic worktree orchestration would add complexity without a proven recurring failure
