Ecosystem Update - 2026-05-20

May 20, 2026 · curated by Chad Simon · 18 items reviewed

Highlights

Official Codex 0.132.0 shipped today and was safely applied; local CLI now reports codex-cli 0.132.0
The strongest local hardening win was profile-scoped: conservative profiles now disable login-shell semantics without touching the power-user default
Today's research signal is that harness quality matters as much as model quality: clean code reduces agent operating cost, and enterprise SaaS failures cluster around setup/integration before business logic

Stable Codex 0.132.0 upgrade · Codex-md

github.com/openai/codex

Upgrade from 0.131.0 to the stable 0.132.0 release and smoke-test the harness

Conservative profile login-shell hardening · Codex-md

developers.openai.com

Add allow_login_shell = false only to opt-in conservative profiles

codex exec resume --output-schema adapter · Codex-md

github.com/openai/codex

Codex 0.132.0 release - Local searches found output_schema used in autoconfig, but no automation path that resumes a run. Add schema-preserving resume support where long-running codex exec jobs need structured closure after resumption

Permission prompt miner for conservative profiles · Codex-md

howborisusesclaudecode.com

Agent-view style session control plane intake · agent-pattern

howborisusesclaudecode.com

How Boris Uses Claude Code - Codex has subagents, goals, and session state, but no first-class local "all sessions by status" view. Evaluate whether existing codex doctor --json, state DBs, and goal state are enough before adding a new dashboard

Notify skill legacy path cleanup · skill

howborisusesclaudecode.com

Update the skill body in an explicit skill-maintenance pass, not as a Quick Win

SaaS setup/integration failure eval

arXiv

SaaSBench - Add a small harness eval that fails agents for premature closure during environment setup, dependency wiring, or service integration, not just business-logic tests
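A minimal sketch of such an eval, under stated assumptions: the event-log shape, the phase names, and the function `premature_closure_failures` are hypothetical and are not taken from SaaSBench or from Codex. The idea is that a run fails if the agent declares completion while any required setup or integration phase has not yet passed its check.

```python
from dataclasses import dataclass

# Hypothetical phase names; a real harness would define its own.
REQUIRED_PHASES = ("environment_setup", "dependency_wiring", "service_integration")


@dataclass(frozen=True)
class Event:
    kind: str        # "phase_passed" or "agent_closed"
    phase: str = ""  # set only for "phase_passed" events


def premature_closure_failures(events: list[Event]) -> list[str]:
    """Return the required phases still unverified when the agent first closed the task.

    If the agent never closes, unverified phases are reported as well,
    so an abandoned run does not pass by accident.
    """
    passed: set[str] = set()
    for event in events:
        if event.kind == "phase_passed":
            passed.add(event.phase)
        elif event.kind == "agent_closed":
            break  # only checks verified before the first closure count
    return [phase for phase in REQUIRED_PHASES if phase not in passed]
```

A run passes this gate only when the returned list is empty. Business-logic tests would run as a separate gate, so a failure here points specifically at setup or integration rather than at application code.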
Agent monitor evasion regression

arXiv

SLEIGHT-Bench - Current reviewer barriers are useful, but monitor-evasion examples suggest adding targeted prompts/tests for state manipulation, ambiguous user intent, and covert deployment/exfiltration patterns

Does Code Cleanliness Affect Coding Agents?

arXiv

New May 19 paper: cleaner code did not change pass rate in the reported minimal-pair setup, but it reduced tokens and file revisits, making refactor discipline a direct harness-efficiency lever

SLEIGHT-Bench: A Benchmark of Evasion Attacks Against Agent Monitors

arXiv

Relevant to reviewer/validator barriers because it tests whether monitoring agents catch covert harmful objectives across full transcripts

SaaSBench: Exploring the Boundaries of Coding Agents in Long-Horizon Enterprise SaaS Engineering

arXiv

Strong fit for planning-gate and /auto: the paper reports that most failures happen before deep business logic, reinforcing setup and integration checks

1GC-7RC: One Graphic Card - Seven Research Challenges!

arXiv

Useful shape for bounded GPU/task-budget evals; relevant to local Forge/autoconfig work but not a Quick Win

Items reviewed and explicitly declined this cycle, with the reason. Curation discipline matters more than coverage.

Enable service_tier = "priority" globally - rejected because it changes cost/performance posture globally; use explicit /fast or per-session selection instead
Enable native Codex memories automatically
Enable plugin_hooks automatically - rejected because plugin-bundled hooks are executable code and still need trust review before global enablement
Wholesale install Boris/Thariq Claude skills - rejected because they target ~/.claude and Claude-specific command surfaces; adapt useful patterns into Codex-owned skills after audit
Clone Agent View as a new daemon now - rejected as overengineering until existing Codex session/goal state proves insufficient for a read-only status report
Edit AGENTS.md as a Quick Win - rejected by the ecosystem-update hard limit; constitutional policy changes require explicit direction