Tun Shwe and Jeremy Frenay at Lenses argue that MCP servers fail in production not because of missing auth, but because teams expose tools to agents as if they were human-facing APIs. Security starts at interface design—fewer tools, constrained inputs, minimal data exposure—before a single line of OAuth code.
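To make the pattern concrete, here is a generic sketch of interface-level constraint—not code from the talk; the report names, limit, and return shape are all illustrative assumptions:

```python
# Hypothetical narrow agent-facing tool. Instead of a broad, human-style
# tool that accepts free-form SQL, inputs are validated against an
# allow-list and only the fields the agent needs are returned.

ALLOWED_REPORTS = {"daily_orders", "failed_payments"}  # assumed report names
MAX_LIMIT = 100

def get_report(report: str, limit: int = 10) -> dict:
    """Constrained inputs, minimal data exposure."""
    if report not in ALLOWED_REPORTS:
        raise ValueError(f"unknown report: {report!r}")
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    # Return a minimal, structured payload rather than raw rows.
    return {"report": report, "rows": [], "truncated": False}
```

The security property lives in the signature itself: an agent that can only name a report from an enum cannot be prompt-injected into running arbitrary queries.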
Simon Willison argues that coding agents become trustworthy when you stop reviewing their code line-by-line and start demanding proof: red-green TDD, runtime smoke tests, conformance suites, and sandboxed execution. The shift from human review to automated verification is what makes agent autonomy viable.
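A minimal illustration of the red-green loop—the function and test names are invented for this sketch, not taken from Willison's examples:

```python
# Red-green verification: the test is written first and fails (red);
# the agent's job is to produce code that turns it green. The passing
# suite, not line-by-line human review, is the proof of correctness.

def slugify(title: str) -> str:
    """Implementation written only after the test below existed and failed."""
    return "-".join(title.lower().split())

def test_slugify():
    # The 'red' test: fails until slugify behaves as specified.
    assert slugify("Hello World") == "hello-world"

test_slugify()  # running it at import time doubles as a runtime smoke check
```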
Amplifon built a centralized registry system for MCP servers and A2A agents across 26 countries and 10,000+ stores. The architecture—registries, metadata, blueprints, CI/CD-driven discovery—offers a concrete answer to the enterprise agent sprawl problem most organizations haven't started solving.
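One way such a registry entry and discovery query might look—the field names, URL, and filter logic are assumptions for illustration, not Amplifon's actual schema:

```python
# Sketch of registry-driven discovery: CI/CD publishes metadata for each
# MCP server or A2A agent, and consumers query the registry instead of
# hard-wiring endpoints — the antidote to agent sprawl.

REGISTRY_ENTRY = {
    "name": "store-inventory",
    "kind": "mcp-server",          # vs. "a2a-agent"
    "version": "1.4.0",
    "endpoint": "https://mcp.example.internal/inventory",  # placeholder
    "owner_team": "retail-platform",
    "countries": ["IT", "US"],     # illustrative subset
}

def find(registry: list, kind: str, country: str) -> list:
    """Discovery: filter registry entries by kind and country served."""
    return [e for e in registry
            if e["kind"] == kind and country in e["countries"]]
```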
Deloitte's Tech Trends data shows 93% of enterprise AI spend goes to technology and tooling while just 7% funds culture, change management, and learning. Bill Briggs argues this imbalance directly explains why fewer than 30% of agentic pilots reach production at scale.
Stefano Fiorucci trained a small open-source model to outperform GPT-5 Mini at tic-tac-toe using reinforcement learning with verifiable rewards. The key lesson: environment design—reward signals, opponent calibration, batch sizing—determines whether RL training succeeds or collapses.
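The "verifiable" part of the reward is checkable in a few lines—this is a generic sketch of the pattern, with assumed reward values, not Fiorucci's actual training code:

```python
# Verifiable reward for tic-tac-toe RL: the reward is computed by checking
# the model's move against the game rules, so no learned reward model is
# needed. The shaping values here are illustrative assumptions.

def is_legal(board: list, move: int) -> bool:
    """A move is legal if it targets an empty cell on the 9-cell board."""
    return 0 <= move < 9 and board[move] == " "

def reward(board: list, move: int, won: bool) -> float:
    if not is_legal(board, move):
        return -1.0              # hard penalty keeps outputs parseable and legal
    return 1.0 if won else 0.1   # small shaping bonus for any legal move
```

This is where environment design bites: a reward that is too sparse (only the win signal) or too generous (rewarding illegal output) is exactly what makes training collapse.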
Enterprise agents struggle less from weak models than from human-shaped interfaces, raw observability data, and unsafe workflows. Andre Elizondo argues the real work happens earlier: transform the data, constrain the tools, and build evaluation loops that make production decisions inspectable and safer.
Andrej Karpathy rode in a near-perfect Waymo demo in 2014. It took a decade to become a paid product. That demo-to-product gap—not model architecture—is the binding constraint across self-driving, humanoid robots, and AI-powered education.
Mihail Eric's Stanford class on AI-native engineering reveals why multi-agent workflows fail without test contracts, consistent codebases, and incremental scaling—and why managing agents is really just managing people, with less forgiveness.
Emergent hit 7 million apps in 8 months by betting that the moat in AI coding isn't generation—it's verification, deployment, and the full software lifecycle. 80% of their users have zero programming knowledge.
As AI agents gain tool access and long-horizon autonomy, the bottleneck shifts from model intelligence to governance—permissions, guardrails, monitoring, and liability. That's where job displacement becomes real.
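A minimal sketch of what that governance layer looks like in practice—the policy names and decisions are hypothetical, not drawn from any specific product:

```python
# Permission gate for agent tool calls: every call is checked against a
# policy and logged before execution, so the limit on autonomy is set by
# governance, not by model capability.

AUDIT_LOG = []
POLICY = {
    "read_file": "allow",
    "send_email": "require_approval",
    "delete_db": "deny",
}

def gate(tool: str, approved: bool = False) -> bool:
    """Return True only if policy permits executing this tool call."""
    decision = POLICY.get(tool, "deny")  # default-deny for unknown tools
    ok = decision == "allow" or (decision == "require_approval" and approved)
    AUDIT_LOG.append({"tool": tool, "decision": decision, "executed": ok})
    return ok
```

Default-deny plus an audit trail is the load-bearing choice: it converts "what did the agent do?" from forensics into a query, which is also where liability questions get answered.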