Ecosystem Update - 2026-05-20

May 20, 2026 · curated by Chad Simon · 18 items reviewed

Highlights

  • Official Codex 0.132.0 shipped today and was safely applied; local CLI now reports codex-cli 0.132.0
  • The strongest local hardening win was profile-scoped: conservative profiles now disable login-shell semantics without touching the power-user default
  • Today's research signal is that harness quality matters as much as model quality: clean code reduces agent operating cost, and enterprise SaaS failures cluster around setup/integration before business logic

Quick Wins (implemented today)

  • Stable Codex 0.132.0 upgrade [Codex-md]
    Upgrade from 0.131.0 to the stable 0.132.0 release and smoke-test the harness
  • Conservative profile login-shell hardening [Codex-md]
    Add allow_login_shell = false only to opt-in conservative profiles

New Tools, Skills & Patterns

  • codex exec resume --output-schema adapter [Codex-md]
    Codex 0.132.0 release - Existing local searches found output_schema use in autoconfig, but no resumed automation path. Add schema-preserving resume support where long-running codex exec jobs need structured closure after resumption
  • Permission prompt miner for conservative profiles [Codex-md]
  • Agent-view style session control plane intake [agent-pattern]
    How Boris Uses Claude Code - Codex has subagents, goals, and session state, but no first-class local "all sessions by status" view. Evaluate whether existing codex doctor --json, state DBs, and goal state are enough before adding a new dashboard
  • Notify skill legacy path cleanup [skill]
    Update the skill body in an explicit skill-maintenance pass, not as a Quick Win
  • SaaS setup/integration failure eval
    SaaSBench - Add a small harness eval that fails agents for premature closure during environment setup, dependency wiring, or service integration, not just business-logic tests
  • Agent monitor evasion regression
    SLEIGHT-Bench - Current reviewer barriers are useful, but monitor-evasion examples suggest adding targeted prompts/tests for state manipulation, ambiguous user intent, and covert deployment/exfiltration patterns
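
The SaaS setup/integration eval proposed above could start as small as the sketch below. It is an illustration under stated assumptions, not SaaSBench's scoring: the phase names, the `PhaseResult` type, and the `agent_declared_done` flag are all hypothetical, and how each phase check is produced is left to the harness.

```python
from dataclasses import dataclass

# Hypothetical phase names, following the stages this digest lists.
PRE_LOGIC_PHASES = ("environment_setup", "dependency_wiring", "service_integration")

@dataclass
class PhaseResult:
    phase: str
    passed: bool

def premature_closure(results: list[PhaseResult], agent_declared_done: bool) -> list[str]:
    """List pre-business-logic phases that had not passed when the agent declared completion."""
    if not agent_declared_done:
        return []  # the agent is still working, so nothing was closed early
    passed = {r.phase for r in results if r.passed}
    # A phase with no recorded result counts as unverified, hence not passed.
    return [phase for phase in PRE_LOGIC_PHASES if phase not in passed]

def eval_verdict(results: list[PhaseResult], agent_declared_done: bool) -> str:
    """Fail the run if the agent closed the task with unfinished setup or integration."""
    return "fail" if premature_closure(results, agent_declared_done) else "pass"

if __name__ == "__main__":
    run = [
        PhaseResult("environment_setup", True),
        PhaseResult("dependency_wiring", False),
    ]
    print(eval_verdict(run, agent_declared_done=True))
    print(premature_closure(run, agent_declared_done=True))
```

Treating a missing phase result as a failure is a deliberate choice here: an agent that never ran the integration check should not pass merely because nothing was recorded.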

Research Worth Reading

  • Does Code Cleanliness Affect Coding Agents?
    - New May 19 paper: cleaner code did not change pass rate in the reported minimal-pair setup, but reduced tokens and file revisits, making refactor discipline a direct harness-efficiency lever
  • SLEIGHT-Bench: A Benchmark of Evasion Attacks Against Agent Monitors
    - Relevant to reviewer/validator barriers because it tests whether monitoring agents catch covert harmful objectives across full transcripts
  • SaaSBench: Exploring the Boundaries of Coding Agents in Long-Horizon Enterprise SaaS Engineering
    - Strong fit for planning-gate and /auto: the paper reports that most failures happen before deep business logic, reinforcing setup and integration checks
  • 1GC-7RC: One Graphic Card - Seven Research Challenges!
    - Useful shape for bounded GPU/task-budget evals; relevant to local Forge/autoconfig work but not a Quick Win
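
To track the file-revisit signal from the code-cleanliness item locally, one plausible proxy is to count repeat opens of the same path within a session. This is a sketch under assumptions: the paper's exact metric definition is not reproduced here, and `opened_paths` is a hypothetical list of file paths extracted from an agent transcript in the order they were read.

```python
from collections import Counter

def file_revisits(opened_paths: list[str]) -> dict[str, int]:
    """Map each path to the number of times it was opened beyond the first."""
    counts = Counter(opened_paths)
    return {path: n - 1 for path, n in counts.items() if n > 1}

def total_revisits(opened_paths: list[str]) -> int:
    """Sum revisits across all files as a single per-session cost proxy."""
    return sum(file_revisits(opened_paths).values())

if __name__ == "__main__":
    # Illustrative data only, not taken from any real session.
    session = ["api/routes.py", "api/models.py", "api/routes.py", "api/routes.py"]
    print(file_revisits(session))
    print(total_revisits(session))
```

Comparing this count before and after a refactor on the same task gives a rough local check of whether cleanup reduces agent churn, alongside token totals.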

Considered, Not Adopting

Items reviewed and explicitly declined this cycle, with the reason. Curation discipline matters more than coverage.

  • Enable service_tier = "priority" globally - rejected because it changes cost/performance posture globally; use explicit /fast or per-session selection instead
  • Enable native Codex memories automatically
  • Enable plugin_hooks automatically - rejected because plugin-bundled hooks are executable code and still need trust review before global enablement
  • Wholesale install Boris/Thariq Claude skills - rejected because they target ~/.claude and Claude-specific command surfaces; adapt useful patterns into Codex-owned skills after audit
  • Clone Agent View as a new daemon now - rejected as overengineering until existing Codex session/goal state proves insufficient for a read-only status report
  • Edit AGENTS.md as a Quick Win - rejected by the ecosystem-update hard limit; constitutional policy changes require explicit direction
