Ecosystem Update - 2026-05-20

May 20, 2026 · generated by the ecosystem-update Claude Skill

TL;DR

The stable Codex 0.132.0 release shipped today and was applied locally behind config backups and post-upgrade checks; the local CLI now reports codex-cli 0.132.0.
The strongest local hardening win was profile-scoped: conservative profiles now disable login-shell semantics without touching the power-user default.
Today's research signal is that harness quality matters as much as model quality: clean code reduces agent operating cost, and enterprise SaaS failures cluster around setup/integration before business logic.

Quick Wins

Item	Source	Type	Impact	Effort	Action
Stable Codex `0.132.0` upgrade	openai/codex release, npm `@openai/codex`	Codex-md	3	1	Upgrade from `0.131.0` to the stable `0.132.0` release and smoke-test the harness.
Conservative profile login-shell hardening	Codex config reference, codex-cli-best-practice	Codex-md	2	1	Add `allow_login_shell = false` only to opt-in conservative profiles.
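
A smoke test for the upgrade can start from the version string itself. The following is a minimal sketch, not part of the actual harness: `parse_codex_version` and `is_at_least` are hypothetical helper names, and the only input format assumed is the `codex-cli 0.132.0` string that `codex --version` reports.

```python
import re


def parse_codex_version(output: str) -> tuple[int, int, int]:
    """Extract (major, minor, patch) from text such as 'codex-cli 0.132.0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"no semantic version found in {output!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def is_at_least(output: str, minimum: tuple[int, int, int]) -> bool:
    """Return True when the reported version is at or above the minimum."""
    # Tuples compare element by element, so (0, 132, 0) > (0, 131, 9).
    return parse_codex_version(output) >= minimum


if __name__ == "__main__":
    print(is_at_least("codex-cli 0.132.0", (0, 132, 0)))
```

In a real check the string would come from the CLI, for example `subprocess.run(["codex", "--version"], capture_output=True, text=True).stdout`, so that a failed or partial upgrade is caught before the rest of the harness tests run.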

Auto-Implemented

Backed up config.toml, hooks.json, and all current agent TOMLs under /Users/chadsimon/.codex/backups/2026-05-20/.
Ran codex update, which executed npm install -g @openai/codex; codex --version now reports codex-cli 0.132.0.
Updated /Users/chadsimon/.codex/config.toml so [profiles.conservative] and [profiles.conservative-auto-review] set allow_login_shell = false.
Verified config.toml parses with tomllib and hooks.json parses with python3 -m json.tool.
Verified base, conservative, and conservative-auto-review profiles with TERM=xterm-256color codex --strict-config doctor --summary --ascii; all returned 13 ok | 1 idle | 2 notes | 0 warn | 0 fail.
Verified posture and local harness tests: python3 ~/.codex/bin/codex_config_posture.py --mode warn passed, and PYTHONPATH=/Users/chadsimon/.codex/bin python3 -m unittest test_codex_config_posture test_codex_agentops_contract test_codex_task_manager passed 31 tests.
python3 ~/.codex/bin/auto_runtime.py contract-check still reports one pre-existing hard issue: MEMORY_CITATIONS_REQUIRED for memory.records.objective:init. This is unrelated to today's config edit and CLI update, but should be cleared before relying on AgentOps closure reports.

Build Queue

codex exec resume --output-schema adapter (Codex-md) - Codex 0.132.0 release - Existing local searches found output_schema use in autoconfig, but no resumed automation path. Add schema-preserving resume support where long-running codex exec jobs need structured closure after resumption.
Permission prompt miner for conservative profiles (Codex-md) - How Boris Uses Claude Code, Codex config reference - Boris's /fewer-permission-prompts pattern maps to Codex on-request profiles, but should be implemented as a report-only analyzer over local approvals/rules before changing allowlists.
Agent-view style session control plane intake (agent-pattern) - How Boris Uses Claude Code - Codex has subagents, goals, and session state, but no first-class local "all sessions by status" view. Evaluate whether existing codex doctor --json, state DBs, and goal state are enough before adding a new dashboard.
Notify skill legacy path cleanup (skill) - How Boris Uses Claude Code notifications - Current notify skill docs still reference legacy ~/.Codex/bin/notify_done.sh; update the skill body in an explicit skill-maintenance pass, not as a Quick Win.
SaaS setup/integration failure eval (research) - SaaSBench - Add a small harness eval that fails agents for premature closure during environment setup, dependency wiring, or service integration, not just business-logic tests.
Agent monitor evasion regression (research) - SLEIGHT-Bench - Current reviewer barriers are useful, but monitor-evasion examples suggest adding targeted prompts/tests for state manipulation, ambiguous user intent, and covert deployment/exfiltration patterns.
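
For the permission prompt miner item above, a report-only analyzer could be as small as a frequency count over previously approved commands. The sketch below is an assumption-heavy illustration: the input is a plain list of command strings, because the real location and format of local approval records are not described here, and `command_prefix` and `mine_prompts` are hypothetical names. It only reports candidates and never edits an allowlist.

```python
import shlex
from collections import Counter


def command_prefix(command: str, depth: int = 2) -> str:
    """Reduce a shell command to its first `depth` tokens, e.g. 'git status'."""
    tokens = shlex.split(command)
    return " ".join(tokens[:depth])


def mine_prompts(approved_commands: list[str], min_count: int = 3) -> list[tuple[str, int]]:
    """Return (prefix, count) pairs approved at least `min_count` times."""
    counts = Counter(command_prefix(cmd) for cmd in approved_commands)
    # most_common() sorts by descending count; the filter keeps only frequent prefixes.
    return [(prefix, n) for prefix, n in counts.most_common() if n >= min_count]


if __name__ == "__main__":
    # Hypothetical approval history, used only to exercise the function.
    history = [
        "git status",
        "git status --short",
        "git diff HEAD",
        "git status -sb",
        "npm test",
        "npm test -- --watch",
        "npm test",
    ]
    for prefix, n in mine_prompts(history):
        print(f"{prefix}: approved {n} times")
```

Keeping the output as a ranked report leaves the allowlist decision with a human reviewer, which is the point of running the analyzer before changing any rules.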

Research

Does Code Cleanliness Affect Coding Agents? - New May 19 paper: cleaner code did not change pass rate in the reported minimal-pair setup, but reduced tokens and file revisits, making refactor discipline a direct harness-efficiency lever.
SLEIGHT-Bench: A Benchmark of Evasion Attacks Against Agent Monitors - Relevant to reviewer/validator barriers because it tests whether monitoring agents catch covert harmful objectives across full transcripts.
SaaSBench: Exploring the Boundaries of Coding Agents in Long-Horizon Enterprise SaaS Engineering - Strong fit for planning-gate and /auto: the paper reports that most failures happen before deep business logic, reinforcing setup and integration checks.
1GC-7RC: One Graphic Card - Seven Research Challenges! - Useful shape for bounded GPU/task-budget evals; relevant to local Forge/autoconfig work but not a Quick Win.
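
The SaaSBench finding above suggests a harness check that looks at where a run failed, not only whether business-logic tests passed. The following is a minimal sketch under stated assumptions: the phase names, the `RunRecord` shape, and both functions are hypothetical and do not come from the paper or from the existing local harness.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumed phase order: setup and integration come before business logic.
PHASES = (
    "environment_setup",
    "dependency_wiring",
    "service_integration",
    "business_logic",
)


@dataclass
class RunRecord:
    """Hypothetical summary of one agent run."""
    claimed_done: bool
    passed: dict[str, bool] = field(default_factory=dict)


def first_failed_phase(run: RunRecord) -> Optional[str]:
    """Return the earliest phase that failed or was never checked, else None."""
    for phase in PHASES:
        if not run.passed.get(phase, False):
            return phase
    return None


def premature_closure(run: RunRecord) -> bool:
    """True when the agent declared completion while some phase had not passed."""
    return run.claimed_done and first_failed_phase(run) is not None


if __name__ == "__main__":
    run = RunRecord(
        claimed_done=True,
        passed={"environment_setup": True, "business_logic": True},
    )
    print(first_failed_phase(run), premature_closure(run))
```

Treating a missing phase result as a failure is a deliberate choice in this sketch: a run that skipped dependency wiring but passed business-logic tests is still reported at the earlier phase.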

Already Have

gpt-5.5 power-user default, approval_policy = "never", sandbox_mode = "danger-full-access", prompt telemetry off, live web search, schema-linked config.toml
features.hooks = true, features.plugins = true, features.goals = true, features.prevent_idle_sleep = true, features.plugin_hooks = false, destructive app tools disabled by default
OpenAI developer docs MCP, omni-mem MCP, Stitch and Kickstarter MCP entries
Browser/Chrome/Computer Use/Documents/Spreadsheets/Presentations/Gmail/OpenAI Developers plugins
Bash PreToolUse guard, Bash PostToolUse verification ledger and failure-context hooks, SessionStart repo-context and config-posture hooks, Stop and PreCompact omni-mem hooks
read-only explorer/planner/reviewer/python-reviewer/typescript-reviewer/validator agents, scoped worker and chad-twin agents, bounded agent thread/depth/runtime caps
profiles.review, profiles.conservative, profiles.conservative-auto-review, conservative profile login-shell suppression, global placeholder pokegen disable
session-recall, rlm-scan, planning-gate, auto, drive, go, codex-security, security-audit, codex-runtime-doctor, what-would-chad-do
stable codex-cli 0.132.0

Rejected

Enable service_tier = "priority" globally - rejected because it changes cost/performance posture globally; use explicit /fast or per-session selection instead.
Enable native Codex memories automatically - rejected again because local policy currently uses omni-mem as the default memory system and features.memories = false is intentional.
Enable plugin_hooks automatically - rejected because plugin-bundled hooks are executable code and still need trust review before global enablement.
Wholesale install Boris/Thariq Claude skills - rejected because they target ~/.claude and Claude-specific command surfaces; adapt useful patterns into Codex-owned skills after audit.
Clone Agent View as a new daemon now - rejected as overengineering until existing Codex session/goal state proves insufficient for a read-only status report.
Edit AGENTS.md as a Quick Win - rejected by the ecosystem-update hard limit; constitutional policy changes require explicit direction.

Sources checked:
https://github.com/hesreallyhim/awesome-claude-code
https://howborisusesclaudecode.com/
https://github.com/shanraisshan/codex-cli-best-practice
https://developers.openai.com/codex/config-reference
https://github.com/openai/codex/releases
https://github.com/openai/codex/releases/tag/rust-v0.132.0
https://www.npmjs.com/package/@openai/codex
https://arxiv.org/search/?searchtype=all&query=LLM+agent+coding&order=-announced_date_first
https://arxiv.org/abs/2605.20049
https://arxiv.org/abs/2605.16626
https://arxiv.org/abs/2605.17526
https://arxiv.org/abs/2605.17046

Tier 2 fetched: yes.
Tier 3 fetched: yes; explicit daily crawl requested.
omni-mem write: saved memory a155327a-17d7-4b92-b7e4-87289b50991b.
Run at: 2026-05-20T10:33:24Z