Codex App: From Task to Verified Change

00 / Cover

Codex App: From Task to Verified Change

A product demo script for putting goal, context, execution, and verification into one development flow.

The point is not speed. The point is trustworthy delivery.

01 / Brief

Put the task boundary on screen first

The demo starts with what the human actually asked for, not with a model capability claim.

task.zhaphar.local · brief
localhost:3000/tasks/codex-demo

Fix slide print pagination drift

Goal: keep PDF output aligned with browser preview.

Constraint: no chart library, no wider reading column.

  • Accept: every page is 16:9
  • Check: Playwright screenshots
  • Output: remaining risk notes
Browser artifact: user-facing task brief.
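The acceptance line "every page is 16:9" can be written as a small check. This is a minimal sketch, not the demo's actual test: `PageSize`, `isSixteenByNine`, and `findOffRatioPages` are hypothetical names, and it assumes page dimensions have already been measured by some other step (for example from the Playwright screenshots).

```typescript
// Hypothetical record for one measured page; the numbers are assumed to come
// from whatever step measures the rendered or exported pages.
interface PageSize {
  width: number;
  height: number;
}

// True when width:height equals 16:9 within a small rounding tolerance.
function isSixteenByNine(page: PageSize, tolerance: number = 0.001): boolean {
  if (page.width <= 0 || page.height <= 0) {
    return false;
  }
  return Math.abs(page.width / page.height - 16 / 9) <= tolerance;
}

// Returns the 1-based numbers of the pages that break the acceptance rule.
function findOffRatioPages(pages: PageSize[]): number[] {
  const failing: number[] = [];
  pages.forEach((page, index) => {
    if (!isSixteenByNine(page)) {
      failing.push(index + 1);
    }
  });
  return failing;
}
```

An empty result means every measured page meets the acceptance rule. Any page numbers returned belong in the remaining risk notes.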

02 / Time

The time goes into context and proof

This breaks the lazy story that the Agent only writes code. Reading, changing, and proving are different jobs.

Task input: 6
Context reading: 22
Change execution: 11
Verification feedback: 18
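To make the split concrete, the four figures can be turned into shares of the total. This is a rough sketch that assumes all four numbers use the same unit, which the slide does not state. `phases` and `toShares` are hypothetical names.

```typescript
// The four figures from the slide, assumed to share one unit.
const phases: Array<[string, number]> = [
  ["Task input", 6],
  ["Context reading", 22],
  ["Change execution", 11],
  ["Verification feedback", 18],
];

// Converts each value into its fraction of the overall total.
function toShares(entries: Array<[string, number]>): Array<[string, number]> {
  let total = 0;
  entries.forEach((entry) => {
    total += entry[1];
  });
  return entries.map((entry): [string, number] => [
    entry[0],
    total === 0 ? 0 : entry[1] / total,
  ]);
}

const phaseShares = toShares(phases);
```

Under that assumption the total is 57. Context reading and verification feedback together account for 40 of 57, about 70%, while change execution accounts for 11 of 57, about 19%.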

03 / Decision

Route first, then delegate

Not every task should be automated the same way. Risk and verifiability decide the level of supervision.

High / Checkable

High risk · high verifiability

Tests have your back. Change freely, verify to close out.

High / Fuzzy

High risk · low verifiability

Add observability first. Do not rush to touch it.

Low / Checkable

Low risk · high verifiability

Automate first. Hand it off to the Agent in bulk.

Low / Fuzzy

Low risk · low verifiability

Docs and renames need a quick human glance.
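The four quadrants above can be read as a lookup from risk and verifiability to a level of supervision. The sketch below is one way to encode that table. The type names and `routeTask` are hypothetical, and the returned strings paraphrase the slide rather than quote any product setting.

```typescript
type Risk = "high" | "low";
type Verifiability = "checkable" | "fuzzy";

// Maps a quadrant to the supervision advice given on the slide.
function routeTask(risk: Risk, verifiability: Verifiability): string {
  if (risk === "high") {
    return verifiability === "checkable"
      ? "change freely, then verify to close out"
      : "add observability before touching it";
  }
  return verifiability === "checkable"
    ? "automate first and hand off in bulk"
    : "delegate, then give it a quick human glance";
}
```

Keeping the routing in one function makes the rule easy to inspect: the decision depends only on the two inputs, not on who is asking or how large the change is.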

04 / Context

Context should point back to files

This slide shows that the Agent is not guessing at the project; it is working inside traceable boundaries.

workspace · context
  1. app/sessions.css
  2. components/session-slide-visual.tsx
  3. tests/content/sessions.test.ts
  4. content/sessions/codex-app-demo
  • stage: 1280px × 720px
  • footer and folio stay fixed
  • artifact layouts own one evidence area
  • print parity is part of the contract
Editor artifact: files and local rules stay visible.
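One way to make "traceable boundaries" checkable is to compare each changed path against the files listed on the slide. This is a sketch under assumptions: `contextPaths` copies the four entries shown, the last entry is treated as a directory, and `isInContext` is a hypothetical helper rather than part of any tool.

```typescript
// The four context entries from the slide. The last one is assumed to be a
// directory, so paths beneath it also count as in scope.
const contextPaths: string[] = [
  "app/sessions.css",
  "components/session-slide-visual.tsx",
  "tests/content/sessions.test.ts",
  "content/sessions/codex-app-demo",
];

// True when the changed path is a listed entry or sits under a listed directory.
function isInContext(changedPath: string, allowed: string[]): boolean {
  return allowed.some(
    (entry) => changedPath === entry || changedPath.indexOf(entry + "/") === 0,
  );
}
```

A change that touches a path outside this list is not necessarily wrong, but it is worth calling out in the handoff because it falls outside the context the human saw.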

05 / Maturity

From read-only to replayable

This frames product value as a maturing workflow, not a one-time model trick.

  1. 2026-02

    Read-only context

    Understand first, then speak.

  2. 2026-04

    Replayable evidence chain

    Terminal and browser checks join the same flow.

  3. 2026-06

    Verification-driven delivery

    Current: close with tests and risk notes.

06 / Proof

Close with command results

The terminal slide carries the final claim: what ran, what passed, and what remains uncovered.

pnpm · exit 0
$ pnpm test
184 pass
$ pnpm typecheck
No TypeScript errors
$ pnpm build
✓ Compiled successfully
Terminal artifact: command, result, and scope.
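The claim "what ran, what passed, and what remains uncovered" can be captured as a small record and a plain-text summary. The sketch below assumes each command's exit code and one-line result are collected elsewhere. `CommandResult` and `summarizeProof` are hypothetical names, and the uncovered items are supplied by the author rather than detected automatically.

```typescript
// One command that was run, with its exit code and a one-line result.
interface CommandResult {
  command: string;
  exitCode: number;
  summary: string;
}

// Builds a plain-text proof note: one line per command, an overall verdict,
// and whatever the author lists as still uncovered.
function summarizeProof(results: CommandResult[], uncovered: string[]): string {
  const lines = results.map(
    (r) => `${r.exitCode === 0 ? "PASS" : "FAIL"} ${r.command}: ${r.summary}`,
  );
  if (results.length === 0) {
    lines.push("No commands recorded.");
  } else if (results.every((r) => r.exitCode === 0)) {
    lines.push("All commands exited 0.");
  } else {
    lines.push("At least one command failed.");
  }
  lines.push(
    uncovered.length > 0
      ? `Not covered: ${uncovered.join("; ")}`
      : "Not covered: nothing recorded",
  );
  return lines.join("\n");
}
```

Fed the three commands from the slide, each with exit code 0, this yields three PASS lines followed by the verdict and the uncovered list, so the scope of the proof travels with the result.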

07 / Closing

Hand the result back to the human

The handoff is not 'I fixed it'. It puts the change, proof, and remaining risk back into human judgment.

  • What changed: the boundary is clear
  • What passed: the evidence is repeatable
  • What remains: the risk is visible

Trustworthy delivery matters more than fast generation.