Source: raw/OpenAI_Codex_lead_on_the_new_shape_of_product_work_Andrew_Ambrosino.md (youtube.com/watch?v=P3KDebPTUrw) — Lenny’s Podcast interview with Andrew Ambrosino, product & engineering lead for the Codex app at OpenAI.

An operator-perspective interview (ai-podcasts looser bar — opinion mixed with observation) on how AI is reshaping product work, from the person building the tool a lot of the world now builds with. Context stats: nearly 100% of OpenAI uses Codex weekly (not just engineers), 5M+ weekly active users, 6× usage growth since January. The throughline: implementation is no longer the expensive part of building software — taste and curation are.

Key Takeaways

  • The inversion: implementation is cheap, taste is the bottleneck. The old process de-risked expensive implementation up front (research → docs → prototypes, because building was costly). Now anyone can stand up any feature, so you get ~90 uncoordinated prototypes of the same idea — and the expensive work moves to curation: “what’s good about these? what should we fold in? how should we frame this?” Ambrosino: “The implementation is actually not the expensive part anymore. It’s, dare I say, taste.”
  • “PRDs are dead” is wrong: pick the right medium. A document is right for product clarity on a vague area; a prototype is right to stress-test an interaction pattern. The trap: implementation is now so cheap that a polished prototype over-anchors and reads as production-ready when it is actually early exploration (the “primal mark”: the first mark anchors everything downstream). The medium used to signal where you were in the process; medium and stage are now decoupled, so you must say the stage out loud.
  • Why AI is still bad at design. Four reasons: (1) design is harder to grade, because taste is a human feedback loop, unlike “does the code compile?”; (2) labs invest in code because it accelerates AI research, and design isn’t in that flywheel; (3) design needs novelty and culture (you don’t want a model that outputs Linear’s website every time; software wants known patterns, design wants randomness); (4) the abstraction layer between visual design and codebase semantics is still out of reach: a rebrand isn’t “update 263 components,” it’s a change to the semantic meaning those components share.
  • Role collapse, calibrated. “Your role is the average of what you spend your time on”: if you did mostly PM work this month, you’re a PM for now (the “member of technical staff” idea). But eliminating roles entirely is dangerous: it discards real disciplines with knowable best practices. “You can use Excel, but you cannot be on the finance team.” The IC and management tracks both converge on managing work and managing agents, at different granularities; even an IC isn’t “typing code character by character” anymore.
  • “Zone defense” for product. In a world of bottom-up chaos (everyone throwing prototypes around), two product people working too closely is a bad signal — you want company coverage: spread out, find the gaps, let taste-makers steer from inception. Top-down year-long planning doesn’t work.
  • Build for the next model, not this one. The Codex app that shipped in February would have failed had it shipped in November; the only difference was the models. The method: list every feature you might want, prototype them all, ship the ones that are ready, let the rest bake, and retry each when models leap. Operator → Atlas → Codex/ChatGPT browser is the same feature re-released with more intelligence. “We were too AGI-pilled for the moment.”
  • Planning under fast model progress. Shorter-term work needs more detail; 9-month plans stay deliberately hazy because any precision added to them is false precision. Whether a feature is good depends on whether the model is smart enough yet — not on the feature’s shape.
  • Codex as an OpenClaw-like agent OS. Beyond coding: a scheduled daily-brief task that triages 3,000 Slack channels (steered conversationally: “next time, deemphasize this workstream”); computer use that took over his machine to click through the Google Cloud console (Pub/Sub setup) when there was no connector; and Codex building itself a Premiere Pro extension to edit the launch videos. The pattern: connect to or drive specialty tools through connectors, computer use, or extensions rather than rebuilding them; Codex is the “home base” that opens Excel/Premiere and hands off.
  • “Loops are so last week.” The frontier moved from loops to supervised-vs-unsupervised code generation and harness engineering. The blockers to full autonomy: models increase complexity (“please make the models better at deleting code”), and teaching a model which features to build, ignore, or group is unsolved. Not yet a “set up a loop that listens to Twitter/Slack/email and improves the app” world — “but we’re trying to make it happen.”
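
The “build for the next model” method above (prototype everything, ship what is ready, retry the rest at each model leap) can be sketched as a small backlog loop. This is an illustrative sketch, not anything described in the interview: `Prototype`, `ready_threshold`, `retry_on_model_leap`, and the numeric scores are all hypothetical, and in practice the `evaluate` step would be a human taste review or an eval suite rather than a single number.

```python
from dataclasses import dataclass, field


@dataclass
class Prototype:
    """A feature idea that is either shipped or still baking (hypothetical)."""
    name: str
    # Assumed score in [0, 1] at which the feature counts as ready to ship.
    ready_threshold: float
    shipped: bool = False
    # (model_name, score) pairs recorded at every retry.
    history: list = field(default_factory=list)


def retry_on_model_leap(backlog, model_name, evaluate):
    """Re-run every unshipped prototype against a new model.

    `evaluate(prototype, model_name)` is supplied by the caller and
    returns a score in [0, 1]. Returns the names that became ready.
    """
    newly_ready = []
    for proto in backlog:
        if proto.shipped:
            continue  # Already released; nothing to retry.
        score = evaluate(proto, model_name)
        proto.history.append((model_name, score))
        if score >= proto.ready_threshold:
            proto.shipped = True
            newly_ready.append(proto.name)
    return newly_ready


# Made-up scores for two made-up model generations.
scores = {
    "model-a": {"browser-agent": 0.6, "daily-brief": 0.7},
    "model-b": {"browser-agent": 0.9, "daily-brief": 0.9},
}


def lookup(proto, model_name):
    return scores[model_name][proto.name]


backlog = [Prototype("browser-agent", 0.8), Prototype("daily-brief", 0.5)]
print(retry_on_model_leap(backlog, "model-a", lookup))  # ['daily-brief']
print(retry_on_model_leap(backlog, "model-b", lookup))  # ['browser-agent']
```

The feature definitions never change between the two calls; only the model does, which is the point of the takeaway: the same shelved prototype can cross its threshold purely because the intelligence underneath improved.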

Why it matters

This is the operator-perspective companion to the wiki’s agentic-coding cluster — it restates, from inside OpenAI, the same theses the wiki tracks from the Claude side: that the bottleneck has moved from writing code to judgment/taste, and that capability (not feature design) gates what ships. It pairs especially tightly with Dan Shipper’s “AI Paradox” (same podcast): Shipper’s “every agent needs a human who cares about it” and Ambrosino’s “taste is the scarce skill” are the same observation from two seats. Claims here are competitor-tool (Codex, not Claude) and operator-opinion — tagged and surfaced per the ai-podcasts bar, not treated as benchmarked fact.

Try It

  • Invest in taste/curation, not implementation speed. If anyone can build anything, your edge is knowing which of the 90 prototypes is right and how to frame it — make that the skill you grow.
  • Label the process stage explicitly. When you share a prototype, say “this is exploration, not ship-ready” — the polish no longer signals the stage, so the words have to.
  • Keep a prototype graveyard. Build the ambitious thing that doesn’t quite work yet, shelve it, and re-run it at every model leap — the shape may be right and only the intelligence missing.
  • Try the daily-brief automation. A scheduled agent that triages your channels/inbox and lets you steer it conversationally (“deemphasize this, surface that”) is the highest-leverage personal pattern he cites — the same shape as scheduled tasks / OpenClaw-style agents.
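
One way to picture the steerable daily brief is as a ranking step whose per-workstream weights change with feedback. The sketch below is an assumption-heavy illustration, not how Codex implements it: `BriefSteering`, `build_brief`, the `urgency` field, and the scaling factors are hypothetical, and turning a conversational instruction such as “deemphasize this workstream” into a `deemphasize` call is left to the agent and is out of scope here.

```python
from collections import defaultdict


class BriefSteering:
    """Per-workstream weights adjusted by user feedback (hypothetical)."""

    def __init__(self):
        # Every workstream starts at a neutral weight of 1.0.
        self.weights = defaultdict(lambda: 1.0)

    def deemphasize(self, workstream, factor=0.5):
        self.weights[workstream] *= factor

    def surface(self, workstream, factor=2.0):
        self.weights[workstream] *= factor


def build_brief(messages, steering, top_n=3):
    """Rank messages by urgency times workstream weight; keep the top few.

    Each message is a dict with 'workstream', 'text', and 'urgency',
    where urgency is assumed to come from an earlier triage step.
    """
    ranked = sorted(
        messages,
        key=lambda m: m["urgency"] * steering.weights[m["workstream"]],
        reverse=True,
    )
    return [m["text"] for m in ranked[:top_n]]


# Made-up messages standing in for triaged channel activity.
messages = [
    {"workstream": "billing", "text": "Invoice bug", "urgency": 0.9},
    {"workstream": "launch", "text": "Video cut ready", "urgency": 0.6},
    {"workstream": "infra", "text": "Queue lag", "urgency": 0.5},
]

steering = BriefSteering()
print(build_brief(messages, steering, top_n=2))  # ['Invoice bug', 'Video cut ready']

steering.deemphasize("billing")  # "next time, deemphasize this workstream"
print(build_brief(messages, steering, top_n=2))  # ['Video cut ready', 'Queue lag']
```

Keeping the steering state separate from the ranking function means feedback persists across scheduled runs, so each correction shapes every later brief rather than just the next one.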

Open Questions

  • Operator opinion, self-selecting users. OpenAI’s own tool, used by employees who “self-select for figuring out the next thing” — Ambrosino himself flags that “the stuff that works here is not going to work with everybody.” Treat the usage stats and the role-collapse read as situated, not universal.
  • Competitor-tool perspective. This is Codex, not Claude; included per the ai-podcasts looser bar as a second-observation source for the agentic-coding theses, not as a Claude product claim.