The New Shape of Product Work — Andrew Ambrosino (OpenAI Codex) on Lenny's Podcast

Source: raw/OpenAI_Codex_lead_on_the_new_shape_of_product_work_Andrew_Ambrosino.md (youtube.com/watch?v=P3KDebPTUrw) — Lenny’s Podcast interview with Andrew Ambrosino, product & engineering lead for the Codex app at OpenAI.

An operator-perspective interview (ai-podcasts looser bar — opinion mixed with observation) on how AI is reshaping product work, from the person building the tool a lot of the world now builds with. Context stats: nearly 100% of OpenAI uses Codex weekly (not just engineers), 5M+ weekly active users, 6× usage growth since January. The throughline: implementation is no longer the expensive part of building software — taste and curation are.

Key Takeaways

The inversion: implementation is cheap, taste is the bottleneck. The old process de-risked expensive implementation up front (research → docs → prototypes, because building was costly). Now anyone can stand up any feature, so you get ~90 uncoordinated prototypes of the same idea — and the expensive work moves to curation: “what’s good about these? what should we fold in? how should we frame this?” Ambrosino: “The implementation is actually not the expensive part anymore. It’s, dare I say, taste.”
“PRDs are dead” is wrong: pick the right medium. A document is right for product clarity on a vague area; a prototype is right to stress-test an interaction pattern. The trap: implementation is now so cheap that a polished prototype over-anchors and reads as production-ready when it’s actually early exploration (the “primal mark”: the first mark anchors everything downstream). The medium used to imply where you were in the process; polish no longer carries that signal, so you must say the stage out loud.
Why AI is still bad at design. Four reasons: (1) design is harder to grade, because taste is a human feedback loop, unlike “does the code compile?”; (2) labs invest in code because it accelerates AI research (design isn’t in that flywheel); (3) design needs novelty and culture (you don’t want a model that outputs Linear’s website every time; software wants known patterns, design wants randomness); (4) the abstraction layer between visual design and codebase semantics is still out of reach (a rebrand isn’t “update 263 components”; it’s a change to the semantic meaning those components share).
Role collapse, calibrated. “Your role is the average of what you spend your time on”: if you did mostly PM work this month, you’re a PM for now (the “member of technical staff” idea). But eliminating roles entirely is dangerous: it discards real disciplines with knowable best practices. “You can use Excel, but you cannot be on the finance team.” The IC and management tracks both converge on managing work and managing agents, just at different granularities; even an IC isn’t “typing code character by character” anymore.
“Zone defense” for product. In a world of bottom-up chaos (everyone throwing prototypes around), two product people working too closely is a bad signal — you want company coverage: spread out, find the gaps, let taste-makers steer from inception. Top-down year-long planning doesn’t work.
Build for the next model, not this one. The Codex app that shipped in February would have failed had it shipped in November; the only difference was the models. The method: list every feature you might want, prototype them all, ship the ones that are ready, let the rest bake, and re-try each when models leap. Operator → Atlas → Codex/ChatGPT browser is the same feature re-released with more intelligence. “We were too AGI-pilled for the moment.”
Planning under fast model progress. Shorter-term work needs more detail; 9-month plans stay deliberately hazy because any precision added to them is false precision. Whether a feature is good depends on whether the model is smart enough yet — not on the feature’s shape.
Codex as an OpenClaw-like agent OS. Beyond coding, he cites three uses: a scheduled daily-brief task that triages 3,000 Slack channels (steered conversationally: “next time, deemphasize this workstream”); computer use that took over his machine to click through the Google Cloud console (Pub/Sub setup) when there was no connector; and Codex building itself a Premiere Pro extension to edit the launch videos. The pattern: connect to or drive specialty tools rather than rebuilding them (via connectors, computer use, or extensions); Codex is the “home base” that opens Excel or Premiere and hands off.
“Loops are so last week.” The frontier moved from loops to supervised-vs-unsupervised code generation and harness engineering. The blockers to full autonomy: models increase complexity (“please make the models better at deleting code”), and teaching a model which features to build, ignore, or group is unsolved. Not yet a “set up a loop that listens to Twitter/Slack/email and improves the app” world — “but we’re trying to make it happen.”
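
The “build for the next model” method above (prototype everything, ship what is ready, let the rest bake, re-try at each model leap) can be sketched as a small backlog loop. This is a minimal illustration, not anything described in the interview: the Prototype class, the retry_backlog function, the single numeric capability score, and the threshold values are all hypothetical. In practice “ready” is a human judgment made by trying the prototype, which the numeric threshold only stands in for.

```python
from dataclasses import dataclass, field


@dataclass
class Prototype:
    """A feature idea kept in the backlog (hypothetical structure)."""
    name: str
    required_capability: float  # assumed 0-1 stand-in for "is the model smart enough yet?"
    shipped: bool = False
    attempts: list[str] = field(default_factory=list)  # models it was re-tried against


def retry_backlog(backlog: list[Prototype], model_name: str, capability: float) -> list[Prototype]:
    """Re-try every unshipped prototype against a new model; return the newly ready ones."""
    newly_ready = []
    for proto in backlog:
        if proto.shipped:
            continue  # already shipped, nothing to re-try
        proto.attempts.append(model_name)
        if capability >= proto.required_capability:
            proto.shipped = True
            newly_ready.append(proto)
    return newly_ready


# Made-up thresholds and model names, purely for illustration.
backlog = [
    Prototype("daily brief", required_capability=0.5),
    Prototype("computer use", required_capability=0.8),
]
first = retry_backlog(backlog, "model-a", 0.6)
second = retry_backlog(backlog, "model-b", 0.85)
print([p.name for p in first])   # ['daily brief']
print([p.name for p in second])  # ['computer use']
```

The feature's shape never changes between the two calls; only the capability passed in does, which is the point of the takeaway.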

Why it matters

This is the operator-perspective companion to the wiki’s agentic-coding cluster — it restates, from inside OpenAI, the same theses the wiki tracks from the Claude side: that the bottleneck has moved from writing code to judgment/taste, and that capability (not feature design) gates what ships. It pairs especially tightly with Dan Shipper’s “AI Paradox” (same podcast): Shipper’s “every agent needs a human who cares about it” and Ambrosino’s “taste is the scarce skill” are the same observation from two seats. Claims here are competitor-tool (Codex, not Claude) and operator-opinion — tagged and surfaced per the ai-podcasts bar, not treated as benchmarked fact.

Related

The AI Paradox — Dan Shipper (Lenny’s Podcast) — same show, adjacent thread: more automation → more humans, taste/curation as the durable human role, the super-agent pattern.
Why Coding Is Solved (Boris Cherny) — the Claude-side version of “implementation is cheap”; read against Ambrosino’s “taste is the bottleneck.”
Agentic Coding and Returns to Expertise — the thesis that AI coding raises the value of judgment/expertise; “your role is the average of what you do” is the same idea.
From Vibe Coding to Agentic Engineering (Karpathy) — the maturation arc from prompt-and-pray to engineered harnesses; “loops are so last week / harness engineering” lands here.
Reflecting on a Year of Claude Code — the product-evolution counterpart (Claude Code’s own “build for the next model” history).

Try It

Invest in taste/curation, not implementation speed. If anyone can build anything, your edge is knowing which of the 90 prototypes is right and how to frame it — make that the skill you grow.
Label the process stage explicitly. When you share a prototype, say “this is exploration, not ship-ready” — the polish no longer signals the stage, so the words have to.
Keep a prototype graveyard. Build the ambitious thing that doesn’t quite work yet, shelve it, and re-run it at every model leap — the shape may be right and only the intelligence missing.
Try the daily-brief automation. A scheduled agent that triages your channels/inbox and lets you steer it conversationally (“deemphasize this, surface that”) is the highest-leverage personal pattern he cites — the same shape as scheduled tasks / OpenClaw-style agents.
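
As a rough sketch of the daily-brief pattern in the last item, the toy class below ranks updates and lets feedback change the ranking on the next run. Everything here is an assumption made for illustration: the DailyBrief class, its steer and build methods, the urgency scores, and the sample updates are hypothetical, and no real Slack or Codex API is used. A real agent would interpret the natural-language steer itself; the explicit numeric factor only stands in for that.

```python
from collections import defaultdict


class DailyBrief:
    """Toy triage: rank updates by urgency times a steerable per-workstream weight."""

    def __init__(self) -> None:
        # Every workstream starts with a neutral weight of 1.0.
        self.weights: dict[str, float] = defaultdict(lambda: 1.0)

    def steer(self, workstream: str, factor: float) -> None:
        """Record feedback such as 'deemphasize this workstream' (factor < 1)."""
        self.weights[workstream] *= factor

    def build(self, updates: list[tuple[str, str, float]], limit: int = 3) -> list[str]:
        """updates holds (workstream, summary, urgency); return the top summaries."""
        ranked = sorted(updates, key=lambda u: u[2] * self.weights[u[0]], reverse=True)
        return [f"[{ws}] {summary}" for ws, summary, _ in ranked[:limit]]


brief = DailyBrief()
updates = [  # made-up sample data
    ("billing", "Invoice export is failing", 0.9),
    ("onboarding", "New copy ready for review", 0.6),
    ("infra", "Scheduled maintenance on Friday", 0.5),
]
print(brief.build(updates, limit=2))  # billing ranks first
brief.steer("billing", 0.5)           # "next time, deemphasize this workstream"
print(brief.build(updates, limit=2))  # billing drops out of the top two
```

The design choice worth noting is that steering is stored as state, so a correction given once carries over to every later brief rather than being repeated each day.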

Open Questions

Operator opinion, self-selecting users. OpenAI’s own tool, used by employees who “self-select for figuring out the next thing” — Ambrosino himself flags that “the stuff that works here is not going to work with everybody.” Treat the usage stats and the role-collapse read as situated, not universal.
Competitor-tool perspective. This is Codex, not Claude; included per the ai-podcasts looser bar as a second-observation source for the agentic-coding theses, not as a Claude product claim.

Jonathon's AI Wiki
