Ecosystem Update - 2026-05-27

May 27, 2026 · curated by Chad Simon · 15 items reviewed

Highlights

One safe Quick Win was implemented: restored the official Codex config schema header in ~/.codex/config.toml, clearing the local config-posture hard violation
Tier 1 community sources had no new GitHub commits since yesterday's run; their useful patterns are already covered locally or require repo-specific design
Today's strongest research signals are SEC-bench Pro and Verus-SpecGym: both reinforce evidence-backed closure and intent/spec validation rather than new global automation

Config schema header repair Codex-md

developers.openai.com

Restore #:schema https://developers.openai.com/codex/config-schema.json as the first line of ~/.codex/config.toml
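The repair above amounts to a first-line check on the config file. Below is a minimal sketch of that check, assuming only the header string quoted in this item. The function name `ensure_schema_header` is hypothetical and is not part of any Codex tooling.

```python
# Hypothetical helper: the header string comes from this item; the rest is a sketch.
SCHEMA_HEADER = "#:schema https://developers.openai.com/codex/config-schema.json"


def ensure_schema_header(text: str) -> str:
    """Return config text with SCHEMA_HEADER as its first line."""
    lines = text.splitlines()
    if lines and lines[0].strip() == SCHEMA_HEADER:
        return text  # already compliant; leave the content untouched
    # Drop any misplaced or stale schema directive lines before prepending.
    body = [ln for ln in lines if not ln.lstrip().startswith("#:schema")]
    return "\n".join([SCHEMA_HEADER, *body]) + "\n"
```

To apply it, read ~/.codex/config.toml, pass the text through the function, and write the file back only if the result differs. The sketch works line by line, so a line inside a multi-line TOML string that happens to start with #:schema would also be dropped.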

SEC-bench Pro security-eval adapter

arXiv

https://arxiv.org/abs/2605.26548v1 - Long-horizon software-security tasks map cleanly to codex-security, security-audit, and AgentOps closure gates. Build only a thin intake/eval adapter if the benchmark artifacts are reproducible enough to run locally

Verus-SpecGym intent/spec validation intake

arXiv

https://arxiv.org/abs/2605.26457v1 - The paper targets a real verification failure mode: generated formal specs can be machine-checkable while not matching user intent. Fold this into planning-gate/eval design for formal or contract-heavy tasks before adding any new runtime hook

RepoMirage-style repository perturbation checks

arXiv

https://arxiv.org/abs/2605.26177v1 - Useful for future rlm-scan and reviewer evals because it tests whether agents use repository structure robustly instead of overfitting path/name cues

Conservative auto-review profile naming cleanup Codex-md

developers.openai.com

https://developers.openai.com/codex/config-reference#configtoml - The current approvals_reviewer = "guardian_subagent" setting is accepted by the schema as a legacy alias, but the docs prefer auto_review; normalize it later, behind a focused profile compatibility check

Helicase: Uncertainty-Guided Supply Chain Knowledge Graph Construction with Autonomous Multi-Agent LLMs

arXiv

Relevant to source synthesis and confidence accounting, but a knowledge-graph layer would be too heavy for today's harness without a concrete repeated failure

SEC-bench Pro: Can Language Models Solve Long-Horizon Software Security Tasks?

arXiv

Good candidate benchmark for security-agent closure quality and long-horizon exploit/patch workflows

Verus-SpecGym: An Agentic Environment for Evaluating Specification Autoformalization

arXiv

Directly relevant to checking whether formal acceptance criteria preserve user intent, not just verifier success

RepoMirage: Probing Repository Context Reasoning in Code Agents with Perturbations

arXiv

Supports future evals for cached repo context and codebase-localization robustness

Items reviewed and explicitly declined this cycle, with the reason. Curation discipline matters more than coverage.

Wholesale import from Awesome Claude Code, Boris workflows, or community Codex skill catalogs - rejected: outside skills require strict audit and the local skill library already covers the recurring workflows
Global auto-format PostToolUse hook - rejected as a Quick Win: the skill forbids adding hooks that require new scripts, and formatting is repo-specific
Enable native Codex memories
Default worktree isolation for all agents - rejected: useful for selected large migrations, but a global behavior change would add coordination overhead without a proven current bottleneck
Build a Helicase-style source knowledge graph service - rejected: current WebFetch/search plus report/state files satisfy today's source synthesis needs
Normalize guardian_subagent to auto_review immediately - rejected as a Quick Win: the official schema still accepts the legacy value for compatibility, so this is cleanup rather than a hard defect