Source: raw/OpenAI_Codex_lead_on_the_new_shape_of_product_work_Andrew_Ambrosino.md (youtube.com/watch?v=P3KDebPTUrw) — Lenny’s Podcast interview with Andrew Ambrosino, product & engineering lead for the Codex app at OpenAI.
An operator-perspective interview on how AI is reshaping product work, from the person building the tool a lot of the world now builds with. It is included under the looser ai-podcasts bar, so opinion is mixed with observation. Context stats: nearly 100% of OpenAI employees use Codex weekly (not just engineers), Codex has 5M+ weekly active users, and usage has grown 6× since January. The throughline: implementation is no longer the expensive part of building software; taste and curation are.
Key Takeaways
- The inversion: implementation is cheap, taste is the bottleneck. The old process de-risked expensive implementation up front (research → docs → prototypes, because building was costly). Now anyone can stand up any feature, so you get ~90 uncoordinated prototypes of the same idea — and the expensive work moves to curation: “what’s good about these? what should we fold in? how should we frame this?” Ambrosino: “The implementation is actually not the expensive part anymore. It’s, dare I say, taste.”
- “PRDs are dead” is wrong: pick the right medium. A document is right for getting product clarity on a vague area; a prototype is right for stress-testing an interaction pattern. The trap: implementation is now so cheap that a polished prototype over-anchors and reads as production-ready when it is actually early exploration (the “primal mark”: the first mark anchors everything downstream). The medium used to imply where you were in the process. Medium and stage are now decoupled, so you must say the stage out loud.
- Why AI is still bad at design. Four reasons: (1) design is harder to grade, because taste is a human feedback loop, unlike “does the code compile?”; (2) labs invest in code because it accelerates AI research, and design isn’t in that flywheel; (3) design needs novelty and culture (you don’t want a model that outputs Linear’s website every time; software wants known patterns, design wants randomness); (4) the abstraction layer between visual design and codebase semantics is still out of reach: a rebrand isn’t “update 263 components,” it’s a change to the semantic meaning those components share.
- Role collapse, calibrated. “Your role is the average of what you spend your time on”: if you did mostly PM work this month, you’re a PM for now (the “member of technical staff” idea). But eliminating roles entirely is dangerous, because it discards real disciplines with knowable best practices. “You can use Excel, but you cannot be on the finance team.” The IC and management tracks both converge on managing work and managing agents, just at different granularities; even an IC isn’t “typing code character by character” anymore.
- “Zone defense” for product. In a world of bottom-up chaos (everyone throwing prototypes around), two product people working too closely is a bad signal — you want company coverage: spread out, find the gaps, let taste-makers steer from inception. Top-down year-long planning doesn’t work.
- Build for the next model, not this one. The Codex app shipped in February would have failed in November — the only difference was the models. The method: list every feature you might want, prototype them all, ship the ones that are ready, let the rest bake, and re-try each when models leap. Operator → Atlas → Codex/ChatGPT browser is the same feature re-released with more intelligence. “We were too AGI-pilled for the moment.”
- Planning under fast model progress. Shorter-term work needs more detail; 9-month plans stay deliberately hazy because any precision added to them is false precision. Whether a feature is good depends on whether the model is smart enough yet — not on the feature’s shape.
- Codex as an OpenClaw-like agent OS. Beyond coding, three examples: a scheduled daily-brief task that triages 3,000 Slack channels and is steered conversationally (“next time, deemphasize this workstream”); computer use that took over his machine to click through the Google Cloud console (Pub/Sub setup) when there was no connector; and Codex building itself a Premiere Pro extension to edit the launch videos. The pattern: connect to or drive specialty tools through connectors, computer use, or extensions rather than rebuilding them. Codex is the “home base” that opens Excel or Premiere and hands off.
- “Loops are so last week.” The frontier moved from loops to supervised-vs-unsupervised code generation and harness engineering. The blockers to full autonomy: models increase complexity (“please make the models better at deleting code”), and teaching a model which features to build, ignore, or group is unsolved. Not yet a “set up a loop that listens to Twitter/Slack/email and improves the app” world — “but we’re trying to make it happen.”
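The “build for the next model” method above (keep every prototype, ship what is ready, re-try the rest at each model leap) can be sketched as a small loop. This is an illustration only, not how the Codex team tracks its backlog: `retry_shelved`, `make_check`, the feature names, and the numeric capability thresholds are all hypothetical stand-ins for a real evaluation of each prototype against a new model.

```python
from typing import Callable, Iterable

def retry_shelved(
    shelved: Iterable[str],
    works_now: Callable[[str], bool],
) -> tuple[list[str], list[str]]:
    """Re-try every shelved prototype; return (ready to ship, still shelved)."""
    ship, keep = [], []
    for name in shelved:
        (ship if works_now(name) else keep).append(name)
    return ship, keep

# Hypothetical stand-in for "is the model smart enough yet?".
# In practice this would be a human judgment or an eval run, not a number.
REQUIRED = {
    "inline review": 0.4,
    "browser agent": 0.7,
    "self-improving loop": 0.95,
}

def make_check(model_capability: float) -> Callable[[str], bool]:
    """Build a check for one model release (capability scale is made up)."""
    return lambda name: model_capability >= REQUIRED[name]

shelved = list(REQUIRED)

# First model leap: only the least demanding prototype is ready.
ship, shelved = retry_shelved(shelved, make_check(0.5))
print(ship)     # ['inline review']

# Next leap: the same shelved prototypes are re-tried unchanged.
ship, shelved = retry_shelved(shelved, make_check(0.8))
print(ship)     # ['browser agent']
print(shelved)  # ['self-improving loop']
```

The point of the shape is that the prototypes themselves never change between runs; only the check does, which mirrors the claim that the feature was right and the intelligence was missing.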
Why it matters
This is the operator-perspective companion to the wiki’s agentic-coding cluster — it restates, from inside OpenAI, the same theses the wiki tracks from the Claude side: that the bottleneck has moved from writing code to judgment/taste, and that capability (not feature design) gates what ships. It pairs especially tightly with Dan Shipper’s “AI Paradox” (same podcast): Shipper’s “every agent needs a human who cares about it” and Ambrosino’s “taste is the scarce skill” are the same observation from two seats. Claims here are competitor-tool (Codex, not Claude) and operator-opinion — tagged and surfaced per the ai-podcasts bar, not treated as benchmarked fact.
Related
- The AI Paradox — Dan Shipper (Lenny’s Podcast) — same show, adjacent thread: more automation → more humans, taste/curation as the durable human role, the super-agent pattern.
- Why Coding Is Solved (Boris Cherny) — the Claude-side version of “implementation is cheap”; read against Ambrosino’s “taste is the bottleneck.”
- Agentic Coding and Returns to Expertise — the thesis that AI coding raises the value of judgment/expertise; “your role is the average of what you do” is the same idea.
- From Vibe Coding to Agentic Engineering (Karpathy) — the maturation arc from prompt-and-pray to engineered harnesses; “loops are so last week / harness engineering” lands here.
- Reflecting on a Year of Claude Code — the product-evolution counterpart (Claude Code’s own “build for the next model” history).
Try It
- Invest in taste/curation, not implementation speed. If anyone can build anything, your edge is knowing which of the 90 prototypes is right and how to frame it — make that the skill you grow.
- Label the process stage explicitly. When you share a prototype, say “this is exploration, not ship-ready” — the polish no longer signals the stage, so the words have to.
- Keep a prototype graveyard. Build the ambitious thing that doesn’t quite work yet, shelve it, and re-run it at every model leap — the shape may be right and only the intelligence missing.
- Try the daily-brief automation. A scheduled agent that triages your channels and inbox and lets you steer it conversationally (“deemphasize this, surface that”) is the highest-leverage personal pattern Ambrosino cites. It has the same shape as scheduled tasks and OpenClaw-style agents.
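As a rough illustration of the daily-brief pattern, the steering step can be modeled as per-workstream weights that persist between runs. This is a minimal sketch under stated assumptions, not Codex's implementation: the `DailyBrief` class, the `urgency` scores, and the message fields are hypothetical, and a real agent would derive both the scores and the steering from natural language rather than from explicit numbers.

```python
from collections import defaultdict

class DailyBrief:
    """Ranks messages for a brief; steering feedback adjusts future rankings."""

    def __init__(self) -> None:
        # Every workstream starts at neutral weight 1.0 (hypothetical scale).
        self.weights: defaultdict[str, float] = defaultdict(lambda: 1.0)

    def steer(self, workstream: str, factor: float) -> None:
        """Apply feedback: factor < 1 deemphasizes, factor > 1 surfaces more."""
        self.weights[workstream] *= factor

    def build(self, messages: list[dict], top_n: int = 3) -> list[str]:
        """Return the top_n message texts by urgency times workstream weight."""
        ranked = sorted(
            messages,
            key=lambda m: m["urgency"] * self.weights[m["workstream"]],
            reverse=True,
        )
        return [m["text"] for m in ranked[:top_n]]

# Made-up sample data standing in for triaged channel messages.
messages = [
    {"workstream": "billing", "urgency": 0.9, "text": "Billing outage follow-up"},
    {"workstream": "launch", "urgency": 0.8, "text": "Launch video review"},
    {"workstream": "infra", "urgency": 0.5, "text": "Pub/Sub quota warning"},
]

brief = DailyBrief()
print(brief.build(messages, top_n=2))
# ['Billing outage follow-up', 'Launch video review']

# "Next time, deemphasize this workstream."
brief.steer("billing", 0.5)
print(brief.build(messages, top_n=2))
# ['Launch video review', 'Pub/Sub quota warning']
```

Keeping the feedback as state on the brief, rather than as a one-off filter, is what makes the steering carry over to the next scheduled run.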
Open Questions
- Operator opinion, self-selecting users. OpenAI’s own tool, used by employees who “self-select for figuring out the next thing” — Ambrosino himself flags that “the stuff that works here is not going to work with everybody.” Treat the usage stats and the role-collapse read as situated, not universal.
- Competitor-tool perspective. This is Codex, not Claude; included per the ai-podcasts looser bar as a second-observation source for the agentic-coding theses, not as a Claude product claim.