Ecosystem Update - 2026-05-10

May 10, 2026 · generated by the ecosystem-update Claude Skill

TL;DR

One safe Quick Win was implemented: the read-only OpenAI developer docs MCP server now opts into parallel tool calls.
Today's strongest new research signals are about constraint drift, skill least privilege, and judge-policy invariance; all map to the harness as evaluator or audit work, not immediate runtime rewrites.
The current setup already covers the high-value community patterns: subagents, read-only reviewers, hooks, omni-mem lifecycle hooks, skill audit, plugins, browser/Gmail/docs tooling, and autonomous runtime wrappers.

Quick Wins

Item	Source	Type	Impact	Effort	Action
OpenAI developer docs MCP parallel calls	https://github.com/shanraisshan/codex-cli-best-practice/blob/main/best-practice/codex-mcp.md and https://developers.openai.com/codex/config-schema.json	mcp	2	1	Auto-implemented `supports_parallel_tool_calls = true` for `[mcp_servers.openaiDeveloperDocs]`

Auto-Implemented

Updated /Users/chadsimon/.codex/config.toml so the official docs MCP server can run independent read-only tool calls in parallel.
Backed up config.toml, hooks.json, and all agent TOMLs to /Users/chadsimon/.codex/backups/2026-05-10/ before editing.
Verified config.toml, hooks.json, and all /Users/chadsimon/.codex/agents/*.toml parse after the change.

Build Queue

PreToolUse/PostToolUse Bash policy gates (hook) - https://github.com/shanraisshan/codex-cli-best-practice/blob/main/best-practice/codex-hooks.md - Current hooks cover SessionStart, Stop, and PreCompact, but no Bash tool guard exists. Worth building only after defining a small allow/deny policy backed by an existing script; the skill hard limit blocks adding hook wiring before that script exists.
Agent-scoped MCP allowlists (mcp) - https://github.com/shanraisshan/codex-cli-best-practice/blob/main/best-practice/codex-mcp.md - Current config exposes many MCP servers globally. A future pass should inventory actual tool use and narrow high-risk servers to specific agents instead of blanket access.
Constraint decay regression check (research) - https://arxiv.org/abs/2605.06445 - Add a small eval that catches architecture, database, and interface constraints drifting during backend/codegen tasks.
SkillScope-style least-privilege audit (skill) - https://arxiv.org/abs/2605.05868 - Extend the existing codex-skill-audit --strict workflow so imported skills must declare their expected tool, file, and network access before they are trusted.
Judge policy invariance check (research) - https://arxiv.org/abs/2605.06161 - Add a validator/evaluator variant that perturbs review policy wording and flags unstable verdicts on the same artifact.
Maintenance score for agent-generated code (research) - https://arxiv.org/abs/2605.06464 - Fold maintainability signals into evaluate or refactor rather than adding a new service.
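
The Bash policy gate in the first queue item needs a policy script before any hook wiring. A minimal sketch of such a script follows. Everything in it is an assumption for illustration: the `BashPolicy` class, the allowlisted programs, and the deny patterns are hypothetical, and how a Codex hook would deliver the command or consume the decision is deliberately left out.

```python
import re
import shlex
from dataclasses import dataclass, field


@dataclass
class BashPolicy:
    # Hypothetical policy values; none of these come from the harness.
    allow_programs: set = field(
        default_factory=lambda: {"ls", "cat", "git", "rg"}
    )
    deny_patterns: list = field(
        default_factory=lambda: [
            ("recursive rm", r"\brm\s+-\w*r"),
            ("forced git push", r"\bgit\s+push\b.*--force"),
            # Conservative: also rejects harmless arguments containing these characters.
            ("shell chaining or substitution", r"[;&|`]|\$\("),
        ]
    )


def evaluate_command(command, policy=None):
    """Return ("allow" | "deny", reason). Deny rules win over the allowlist."""
    policy = policy or BashPolicy()
    for label, pattern in policy.deny_patterns:
        if re.search(pattern, command):
            return "deny", f"matched deny rule: {label}"
    try:
        tokens = shlex.split(command)
    except ValueError as exc:  # for example, an unbalanced quote
        return "deny", f"unparseable command: {exc}"
    if not tokens:
        return "deny", "empty command"
    if tokens[0] not in policy.allow_programs:
        return "deny", f"program not on allowlist: {tokens[0]}"
    return "allow", "program on allowlist and no deny rule matched"
```

Checking deny rules before the allowlist means a command such as `git push --force` is rejected even though `git` is allowlisted. Rejecting chaining characters outright is blunt, but it keeps the first-token allowlist from being bypassed by `ls && ...` style commands.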

Research

MASPO: Joint Prompt Optimization for LLM-based Multi-Agent Systems - Relevant to route-manifest and agent prompt tuning, but should stay research until a bounded benchmark proves lift.
Constraint Decay: The Fragility of LLM Agents in Backend Code Generation - Directly relevant to planning-gate and acceptance checks for architecture-heavy work.
SkillScope: Toward Fine-Grained Least-Privilege Enforcement for Agent Skills - Directly relevant to outside-skill trust decisions and the existing strict skill audit.
Beyond Accuracy: Policy Invariance as a Reliability Test for LLM Safety Judges - Useful for reviewer/evaluator stability checks before trusting automated verdicts.
To What Extent Does Agent-generated Code Require Maintenance? An Empirical Study - Supports adding maintainability gates to post-build evaluation instead of only correctness gates.
VibeServe: Can AI Agents Build Bespoke LLM Serving Systems? - Interesting multi-agent loop study, but not directly actionable for the local harness today.
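
The policy-invariance idea maps to a small evaluator harness: run the same judge on the same artifact under several rewordings of the review policy and flag disagreement. The sketch below is an assumption-laden illustration, not the paper's method. The `policy_invariance` function and the `judge(policy, artifact)` callable signature are hypothetical.

```python
from collections import Counter


def policy_invariance(judge, artifact, policy_variants):
    """Run judge(policy, artifact) for each wording and report verdict agreement."""
    if not policy_variants:
        raise ValueError("need at least one policy variant")
    verdicts = [judge(policy, artifact) for policy in policy_variants]
    counts = Counter(verdicts)
    majority, majority_count = counts.most_common(1)[0]
    return {
        "verdicts": verdicts,
        "majority": majority,
        # Fraction of variants that agree with the most common verdict.
        "agreement": majority_count / len(verdicts),
        # Stable only when every wording yields the same verdict.
        "stable": len(counts) == 1,
    }
```

In practice `judge` would wrap a reviewer or validator agent call. A result with `stable` set to False suggests the verdict depends on policy phrasing rather than on the artifact, so it should not be trusted without a second look.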

Already Have

Codex-owned AGENTS.md contract, model = "gpt-5.5", approval_policy = "never", sandbox_mode = "danger-full-access", prompt telemetry off, web_search = "live", codex_hooks = true, goals = true, plugins enabled, OpenAI developer docs MCP, omni-mem MCP and lifecycle hooks, SessionStart cached repo context hook, Stop memory save hook, PreCompact memory hook, read-only explorer/planner/reviewer/validator agents, Python and TypeScript reviewers, workspace-write worker and chad-twin agents, agent concurrency caps, official/bundled plugin marketplaces, Browser/Gmail/Documents/Presentations/Spreadsheets plugins, skill-audit, session-recall, auto, drive, govern, planning-gate, rlm-scan, memory-adaptation, and prior ecosystem state dedupe.

Rejected

Wholesale import from awesome-claude-code - rejected: the repo is currently in an index rebuild state, and the Codex contract requires copying or rewriting useful Claude behavior into Codex-owned surfaces rather than depending on it at runtime.
New daemon or orchestration layer for hooks - rejected: overengineered. Current hooks.json, existing scripts, and Codex hook primitives are sufficient until a concrete recurring failure proves otherwise.
Enable native Codex memories immediately - rejected: previously evaluated; it conflicts with the current omni-mem default memory workflow unless there is an explicit pilot plan.
Add broad GitHub/Slack/Notion/MCP connectors from community skill directories - rejected: no proof that current CLI, web, and installed plugin surfaces are insufficient for today's workflow.
Policy-doc edits as Quick Wins - rejected by skill hard limit. Any AGENTS.md contract changes must be explicit user-directed work.

Sources checked: https://github.com/hesreallyhim/awesome-claude-code, https://howborisusesclaudecode.com/, https://github.com/shanraisshan/codex-cli-best-practice, https://github.com/shanraisshan/codex-cli-best-practice/blob/main/best-practice/codex-hooks.md, https://github.com/shanraisshan/codex-cli-best-practice/blob/main/best-practice/codex-mcp.md, https://arxiv.org/search/?searchtype=all&query=LLM+agent+coding&order=-announced_date_first, https://developers.openai.com/codex/config-schema.json, web search: "Codex new hooks agents skills site:github.com 2026"
Tier 2 fetched: yes
Tier 3 fetched: no - skipped because the last Tier 3 run was 2026-05-08T15:37:21Z, inside the 7-day window
omni-mem: available; run summary saved
Run at: 2026-05-10T10:33:44Z
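
The Tier 3 skip follows from simple timestamp arithmetic: the last Tier 3 run (2026-05-08T15:37:21Z) is under two days before this run (2026-05-10T10:33:44Z), well inside the 7-day window. A minimal sketch of that check follows; the function names are hypothetical, and treating a gap of exactly 7 days as due for a fetch is an assumption.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(stamp):
    """Parse a timestamp like 2026-05-10T10:33:44Z into an aware UTC datetime."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def should_fetch_tier3(last_run, now, window=timedelta(days=7)):
    """Fetch only when the previous Tier 3 run is at least `window` old."""
    # Assumption: the boundary is inclusive, so exactly 7 days triggers a fetch.
    return parse_utc(now) - parse_utc(last_run) >= window


elapsed = parse_utc("2026-05-10T10:33:44Z") - parse_utc("2026-05-08T15:37:21Z")
print(elapsed)  # 1 day, 18:56:23
print(should_fetch_tier3("2026-05-08T15:37:21Z", "2026-05-10T10:33:44Z"))  # False
```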