Ecosystem Update - 2026-05-31
Highlights
- Safe Quick Win implemented: wired the existing quiet session_startup.py readiness check into SessionStart for startup/resume/clear/compact
- Tier 1 community sources were stale today; the useful workflow patterns are already covered locally or require repo-specific scripts
- Official Codex remains at stable 0.135.0 while this machine reports codex-cli 0.133.0; main-branch commits show thread archive, multi-agent assignment, and request-input work to watch after a stable release
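The version gap noted above can be checked mechanically. The following is a minimal sketch, assuming the installed version arrives as the string this machine reports and the stable version is read from the release tag; the function names are hypothetical and nothing here invokes the CLI.

```python
import re


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the first dotted numeric version found in text."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_behind(installed: str, stable: str) -> bool:
    """True when the installed version sorts strictly before the stable one."""
    return parse_version(installed) < parse_version(stable)


# Strings taken from this update: the locally reported version and the release tag.
installed_report = "codex-cli 0.133.0"
stable_tag = "rust-v0.135.0"

print(is_behind(installed_report, stable_tag))  # prints True
```

Comparing integer tuples rather than raw strings avoids lexical surprises such as "0.9.0" sorting after "0.10.0".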
Quick Wins (implemented today)
- SessionStart runtime validation hook wiring hook - Add existing session_startup.py to hooks.json with `startup
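One way to confirm the wiring covers all four session sources is a small coverage check. This is a sketch only: the JSON layout below (a "hooks" object keyed by event name, each entry carrying a "matcher" and a list of command hooks) is an assumption for illustration and should be verified against the official hooks documentation; the function name and the example command are hypothetical.

```python
import json

# The four sources named in this update.
REQUIRED_SOURCES = {"startup", "resume", "clear", "compact"}


def covered_sources(config: dict, script_name: str) -> set[str]:
    """Return the SessionStart sources whose entry runs script_name."""
    covered: set[str] = set()
    for entry in config.get("hooks", {}).get("SessionStart", []):
        runs_script = any(
            script_name in hook.get("command", "")
            for hook in entry.get("hooks", [])
        )
        if runs_script:
            # Assumed convention: sources are joined with "|" in the matcher.
            covered.update(s for s in entry.get("matcher", "").split("|") if s)
    return covered


# Hypothetical hooks.json content; the schema is assumed, not confirmed.
example = json.loads("""
{
  "hooks": {
    "SessionStart": [
      {
        "matcher": "startup|resume|clear|compact",
        "hooks": [{"type": "command", "command": "python3 session_startup.py"}]
      }
    ]
  }
}
""")

missing = REQUIRED_SOURCES - covered_sources(example, "session_startup.py")
print(sorted(missing))  # prints []
```

A non-empty result would list the session sources that start without the readiness check.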
New Tools, Skills & Patterns
- Codex 0.135.0 stable upgrade and smoke runtime https://github.com/openai/codex/releases/tag/rust-v0.135.0 - Current CLI is still 0.133.0; 0.135.0 adds richer codex doctor, named permission profile display, packaged zsh helper discovery, SDK sandbox presets, and TUI fixes. Queued because upgrading the installed CLI crosses a runtime authority boundary
- Thread archive CLI command intake Codex-md https://github.com/openai/codex/commit/3e7baa00e43419967d90d6ad9cef40f58d5ac89f - Main now has thread archive CLI command work. Compare it to codex-session-search and the native conversation-history migration decision once it lands in a stable release
- PermissionRequest policy hook design hook https://developers.openai.com/codex/hooks - Codex now documents PermissionRequest; the current harness blocks dangerous direct Bash via PreToolUse, but has no approval-request policy script. Build only after defining which approval prompts should be allowed, denied, or delegated
- Subagent lifecycle ledger hooks hook https://developers.openai.com/codex/hooks - Official docs support SubagentStart and SubagentStop; the current harness has read-only reviewers and bounded workers but no subagent lifecycle evidence hook. This needs a small existing script first, so it is not a Quick Win
- Plugin-bundled hook trust review follow-up plugin/hook https://developers.openai.com/codex/hooks - Docs now say enabled plugins can bundle lifecycle hooks using the normal trust flow. Current config intentionally has plugin_hooks = false; review this only after a Codex upgrade, because blindly enabling plugin hooks changes the trust surface
- SpecBench spec-level reasoning eval adapter https://arxiv.org/abs/2605.30314 - SpecBench targets incomplete or flawed design proposals before implementation. It maps directly to alignment-grill, planning-gate, and AgentOps acceptance criteria
- LACUNA typed action admission spike
- Cross-trajectory memory eval https://arxiv.org/abs/2605.28224 - The paper separates memory abstraction from inference strategy
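The PermissionRequest item above asks for a definition of which approval prompts are allowed, denied, or delegated before any script is built. A minimal sketch of such a decision table follows; every rule pattern is a hypothetical placeholder, and the function takes a plain command string rather than modeling the actual hook payload, whose shape is not assumed here.

```python
from dataclasses import dataclass
from enum import Enum
from fnmatch import fnmatchcase


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    DELEGATE = "delegate"  # hand the prompt to a human reviewer


@dataclass(frozen=True)
class Rule:
    pattern: str  # shell-style wildcard matched against the whole command
    decision: Decision


# Hypothetical rules for illustration only; the first matching rule wins.
RULES = (
    Rule("rm -rf *", Decision.DENY),
    Rule("git status*", Decision.ALLOW),
    Rule("git push*", Decision.DELEGATE),
)


def decide(command: str, rules=RULES, default=Decision.DELEGATE) -> Decision:
    """Return the decision of the first matching rule, else the default."""
    for rule in rules:
        if fnmatchcase(command, rule.pattern):
            return rule.decision
    return default


print(decide("git status --short").value)  # prints allow
```

Defaulting unmatched commands to delegation keeps the table fail-safe: anything not explicitly classified still reaches a person.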
Research Worth Reading
- SpecBench: Evaluating Specification-Level Reasoning for Software Engineering LLM Agents - Directly relevant to catching PRD/spec gaps before broad implementation
- Locally Coherent, Globally Incoherent: Bounding Compositional Incoherence in Multi-Component LLM Agents - Useful framing for checking whether multiple reviewers or workers produce locally plausible but globally inconsistent conclusions
- LACUNA: Safe Agents as Recursive Program Holes - Relevant to typed action admission, tool/data bounds, and retry-on-diagnostics loops
- When Does Memory Help Multi-Trajectory Inference for Tool-Use LLM Agents? - Supports testing memory promotion/retrieval by strategy instead of assuming all memory helps all retries
- Plant, Persist, Trigger: Sleeper Attack on Large Language Model Agents - Reinforces keeping untrusted source ingestion, memory writes, and reusable skills behind strict admission
- OR-Space: A Full-Lifecycle Workspace Benchmark for Industrial Optimization Agents - Good model for workspace-level evals with persistent files, revisions, and grounded explanations
Considered, Not Adopting
Items reviewed and explicitly declined this cycle, with the reason. Curation discipline matters more than coverage.
- Auto-upgrade Codex to 0.135.0 - rejected as a Quick Win because upgrading the installed CLI crosses the runtime authority boundary
- Enable plugin hooks or remove plugin_hooks = false automatically - rejected because current policy intentionally keeps plugin hooks disabled, and enabling them changes hook trust behavior
- Wire PermissionRequest, SubagentStart, or SubagentStop without existing scripts - rejected by the Quick Win hard limit; these need purpose-built scripts first
- Enable native Codex memories from community advice
- Enable experimental_request_user_input from main-branch work - rejected because AGENTS.md forbids enabling under-development features globally
- Wholesale import of community skill or plugin catalogs - rejected because outside skills require strict audit and the local skill set already covers the recurring workflows
- Global auto-format hooks - rejected because safe formatter hooks need repo-specific commands and existing scripts