Issue #700
Essential Reading For Engineering Leaders
Friday 20th March’s issue is presented by WorkOS
How To Test AI Agents That Never Produce the Same Output Twice
Nick Nisi built npx workos, an AI agent that writes auth into your codebase. But testing an agent that never produces the same output twice required a completely different approach: evals.
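For readers new to the idea, here is a minimal sketch of what an eval can look like when exact-match assertions are not an option. It is not the approach from the linked post: `fake_agent`, the two checks, and `pass_rate` are hypothetical names invented for illustration. The idea shown is to run the agent several times, check properties every valid output should have, and report the fraction of runs that pass.

```python
import random
from typing import Callable, List

Check = Callable[[str], bool]

def pass_rate(agent: Callable[[], str], checks: List[Check], runs: int = 20) -> float:
    """Run the agent `runs` times; return the fraction of runs passing every check."""
    passed = 0
    for _ in range(runs):
        output = agent()
        if all(check(output) for check in checks):
            passed += 1
    return passed / runs

def fake_agent() -> str:
    # Hypothetical stand-in for a non-deterministic agent: the function
    # name varies between runs, so comparing outputs verbatim would fail.
    name = random.choice(["handler", "callback", "on_login"])
    return f"def {name}(request):\n    return authenticate(request)\n"

# Property checks (assumed for this sketch): they describe what any
# acceptable output must contain, not what it must equal.
checks: List[Check] = [
    lambda out: out.startswith("def "),
    lambda out: "authenticate(" in out,
]

if __name__ == "__main__":
    print(f"pass rate: {pass_rate(fake_agent, checks):.0%}")
```

In practice the score is usually compared against a threshold, so a regression shows up as a drop in pass rate instead of a single failed assertion.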
Technical Leaders Make These 4 Common Storytelling Mistakes
— Wes Kao
tl;dr: “(1) Over-reliance on technical details: Real-life is non-linear, but stories are linear. (2) Trying to remember too many tactics: Don’t try to remember a list of storytelling tips and strategies. (3) Too much backstory: Start right before you almost get eaten by a bear. (4) Trying to tell a story that’s too long.”
Leadership Management
Exploit vs Explore
— Mike Fisher
tl;dr: “For leaders, this maps uncomfortably well to the way teams behave under pressure. When metrics are strong and customers are happy, exploration often feels like a luxury. When things are going poorly, it feels irresponsible. In both cases, the instinct is to exploit harder, to optimize the known, to squeeze more value out of the current system.”
Leadership Management
WorkOS FGA: The Authorization Layer For AI Agents
— Aaron Tainter, Pavan Kulkarni
tl;dr: “Authentication proves an agent’s identity. Authorization defines its blast radius. Most agents today inherit a user’s full access token, turning a helpful assistant into a confused deputy that can leak production secrets to a shared Slack channel. This post digs into why that happens and how WorkOS FGA solves this by scoping the blast radius with resource-level permissions.”
Promoted by WorkOS
Security Agents
The Future Of Software Engineering With Anthropic
— Akash Bajwa
tl;dr: “A roundtable with Anthropic’s Ash Prabaker and engineering leaders from Stripe, NVIDIA, Microsoft, Google DeepMind, xAI, Apple, Scale AI, and Peter Steinberger explored how AI is reshaping software engineering - shifting workflows toward eval-driven development, agent-led coding, and new bottlenecks in long-horizon tasks, context, and regulation.”
Leadership Management
“Keep your fears to yourself, but share your courage with others.” — Edsger W. Dijkstra
Measuring Agents In Production
— Murat Demirbas
tl;dr: “This 2025 December paper, ‘Measuring Agents in Production’, cuts through the reality behind the hype. It surveys 306 practitioners and conducts 20 in-depth case studies across 26 domains to document what is actually running in live environments. The reality is far more basic, constrained, and human-dependent.”
Productivity Agents
Scaling Postgres Connections With PgBouncer
— Ben Dicken
tl;dr: “The Postgres process-per-connection model breaks down at scale. PgBouncer fixes this, but tuning it well is the hard part. This deep dive covers the three pooling modes, how to size your connection chain from max_client_conn down to max_connections, and real tuning examples for small, large, and single-tenant setups. Everything you need to configure PgBouncer with confidence.”
Promoted by PlanetScale
PostgreSQL
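As a rough illustration of the sizing idea in the blurb above, the sketch below checks whether a PgBouncer setup fits inside Postgres's connection limit. `max_client_conn` and `default_pool_size` are real PgBouncer settings and `max_connections` is a real Postgres setting; the `PoolPlan` class, the `pools` count, the `reserved` headroom, and the simplified formula (it ignores `reserve_pool_size` and `max_db_connections`) are assumptions for this example, not the article's method.

```python
from dataclasses import dataclass

@dataclass
class PoolPlan:
    max_client_conn: int    # PgBouncer: client connections it will accept
    default_pool_size: int  # PgBouncer: server connections per (database, user) pool
    pools: int              # assumed number of distinct (database, user) pairs
    max_connections: int    # Postgres: total backend connection limit
    reserved: int = 3       # assumed headroom kept for admin sessions

def server_connections(plan: PoolPlan) -> int:
    """Upper bound on backend connections PgBouncer may open (simplified)."""
    return plan.default_pool_size * plan.pools

def fits(plan: PoolPlan) -> bool:
    """True if the pools stay within Postgres's limit minus the headroom."""
    return server_connections(plan) <= plan.max_connections - plan.reserved

def fan_in(plan: PoolPlan) -> float:
    """How many clients can share each backend connection at full load."""
    return plan.max_client_conn / server_connections(plan)

if __name__ == "__main__":
    plan = PoolPlan(max_client_conn=1000, default_pool_size=20,
                    pools=4, max_connections=100)
    print(server_connections(plan), fits(plan), fan_in(plan))
```

A high fan-in only works well when clients hold server connections briefly, which is why the pooling mode matters alongside the numbers.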
The PERFECT Code Review: How to Reduce Cognitive Load While Improving Quality
— Daniil Bastrich
tl;dr: “I’ve distilled a healthy, sustainable review process into an acronym: PERFECT. It prioritizes what truly matters - from business logic and edge cases to reliability and readability - while keeping subjective opinions in check. Here is how you can apply these principles to bring structure, clarity, and consistency to your code reviews.”
CodeReview
Rob Pike’s 5 Rules Of Programming
— Rob Pike
tl;dr: “(1) You can’t tell where a program is going to spend its time. (2) Measure before optimizing. (3) Fancy algorithms are slow when n is small, and n is usually small. (4) Prefer simple algorithms and data structures. (5) Data dominates.”
BestPractices
Lessons From Building Claude Code: How We Use Skills
— Thariq Shihipar
tl;dr: “A common misconception we hear about skills is that they are ‘just markdown files’, but the most interesting part of skills is that they’re not just text files. They’re folders that can include scripts, assets, data, etc. that the agent can discover, explore and manipulate. In Claude Code, skills also have a wide variety of configuration options including registering dynamic hooks.”



