🧭 Issue 5: Building Platforms People Want to Use — Lessons from Developer Experience

Published under: Command Line to Boardroom

Here’s a hard truth most infrastructure teams ignore:

If developers avoid your platform, it doesn’t matter how technically sound it is.

You can have perfect IaC, rock-solid SLAs, hardened security — but if it takes 10 steps to spin up a service and every change request hits a ticket queue?

They’ll bypass it. Shadow infra will rise. Your platform will remain unused.

In this issue, I break down what it really takes to build a platform developers actually want to use — and why successful infra leaders must think like product managers, not gatekeepers.

Why Developer Experience (DX) is Infra Leadership's New Mandate

Modern engineering orgs move fast. But here's the paradox:

  • Devs want autonomy, not hand-holding.
  • Infra teams want governance, not chaos.
  • Product teams want speed, not red tape.

What can we do to reconcile all three?

Stop building platforms that control. And start building platforms that enable.

That means shifting from "we provide infra" → "we design experiences."

🏗️ 5 Traits of Developer-Friendly Platforms

1️⃣ Self-Service by Default

🧪 If every deployment or namespace requires a JIRA ticket, you’re not building a platform — you’re managing a bottleneck.

Great platforms offer:

  • Templates
  • Paved paths
  • One-click environments

✅ Let devs launch what they need — within safe guardrails.

🔸 What It Means:

Developers should be able to provision what they need without waiting for ops. That means eliminating the JIRA ping-pong just to:

  • Create a namespace
  • Launch a new microservice
  • Request a TLS certificate

🧠 Why It Matters:

Manual approvals add overhead with every new team; self-service costs roughly the same whether you serve ten teams or a hundred.

If your developers are blocked by humans for routine infra, you’ve hard-coded friction into your org.

🛠️ Infra Practice:

  • Build internal developer portals (IDPs) with Git-backed templates
  • Offer UI + CLI + API access (give them choice)
  • Use RBAC + policies to enforce compliance, not block it
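
The "enforce compliance, not block it" idea can be sketched as a guardrail check: requests inside the policy envelope are auto-approved with no human in the loop, and everything else gets an actionable explanation instead of a ticket. A minimal sketch — the policy values and field names are illustrative, not a real API:

```python
# Sketch: guardrail validation for a self-service namespace request.
# Policy limits and request fields below are illustrative assumptions.

ALLOWED_ENVS = {"dev", "staging", "prod"}
MAX_DEV_CPU = 8  # cores a team can self-provision without review

def validate_namespace_request(req: dict) -> list[str]:
    """Return a list of policy violations; an empty list means auto-approve."""
    violations = []
    if req.get("env") not in ALLOWED_ENVS:
        violations.append(f"env must be one of {sorted(ALLOWED_ENVS)}")
    if req.get("env") == "dev" and req.get("cpu", 0) > MAX_DEV_CPU:
        violations.append(f"dev namespaces are capped at {MAX_DEV_CPU} CPUs")
    if not req.get("owner"):
        violations.append("owner team is required for cost attribution")
    return violations

# A compliant request sails through with no human in the loop:
ok = validate_namespace_request({"env": "dev", "cpu": 4, "owner": "payments"})
# A non-compliant one gets explanations, not a ticket queue:
bad = validate_namespace_request({"env": "qa", "cpu": 16})
```

The design point: the policy is code, so it answers instantly and consistently — a human only enters the loop for genuine exceptions.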

🧠 Leadership Parallel:

Great infra teams enable autonomy with safety, not control through tickets.

2️⃣ Golden Paths, Not Golden Cages

🧭 You can’t force adoption with mandates. But you can incentivize adoption with great defaults.

Your golden path should be:

  • Faster than DIY
  • Secure by default
  • Pre-approved for compliance

✅ Make “the right way” the easiest way.

🔸 What It Means:

A golden path is a pre-optimized, secure, pre-approved way to do something — but it's optional.

Developers can deviate. But sticking to the paved road should be the easiest, fastest, most supported choice.

⚠️ What Fails:

Too many infra teams create locked-down frameworks and call them platforms. Developers bypass them using their own Terraform scripts or personal AWS keys.

✅ What Works:

  • Offer starter templates (IaC + CI/CD + observability pre-baked)
  • Document expected outcomes, not just tools
  • Provide paved paths that work out-of-the-box
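
As a rough sketch of the starter-template idea, here's a toy scaffolder that stamps out a new service from a paved-path template. The directory layout, file contents, and `{{service_name}}` placeholder syntax are invented for illustration, not a real tool:

```python
# Sketch: stamp a new service from a golden-path starter template.
# Template files and placeholder syntax are illustrative assumptions.
from pathlib import Path

TEMPLATE = {
    "README.md": "# {{service_name}}\nPaved-path service: CI/CD and "
                 "observability come pre-wired.\n",
    "ci/pipeline.yaml": "service: {{service_name}}\nstages: [build, test, deploy]\n",
    "observability/alerts.yaml": "service: {{service_name}}\nslo: 99.9\n",
}

def scaffold(service_name: str, dest: Path) -> list[Path]:
    """Render every template file with the service name filled in."""
    created = []
    for rel, body in TEMPLATE.items():
        target = dest / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(body.replace("{{service_name}}", service_name))
        created.append(target)
    return created
```

The point is that IaC, CI/CD, and observability arrive pre-baked, so the paved road is genuinely faster than starting from scratch — which is what makes it a path rather than a cage.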

🧠 Leadership Parallel:

Don’t build walls. Build highways. The goal is influence by utility — not control by enforcement.

3️⃣ Clear, Discoverable Documentation

📚 If your platform requires Slack messages to explain it — it’s broken.

Great DX means:

  • READMEs for every module
  • Portal-based discovery
  • Shortcuts for common tasks

✅ Documentation is not an afterthought — it's a core UX layer.

🔸 What It Means:

Documentation is not a nice-to-have. It’s part of the interface to your platform.

Just like an API, if the documentation is vague, outdated, or hidden in someone's head — it's broken.

🧭 Signs of Bad DX:

  • Teams/Slack ping: “Hey, how do I deploy?”
  • Onboarding doc last updated 9 months ago
  • Trivial knowledge passed only via screen share

✅ What Good Looks Like:

  • Central platform portal or dev wiki
  • Code samples for every paved path
  • Task-based guides: “How to spin up X in < 5 mins”

🧠 Leadership Parallel:

Every undocumented capability is a liability. Platforms scale through clarity, not charisma.

4️⃣ Feedback Loops with Developers

🗣️ You’re not building infra for yourself.

You’re building it for users — internal devs.

Regular feedback channels:

  • UX surveys
  • Roadmap reviews
  • Office hours

✅ Treat platform engineering like product development — listen, iterate, improve.

🔸 What It Means:

If you’re not talking to your users, you’re building in a vacuum.

Your internal developers are your customers — and their needs evolve faster than your sprints.

📢 Feedback Channels:

  • Monthly platform office hours
  • #platform-feedback channels
  • 2-question NPS surveys (“How likely are you to recommend X?”)
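
For reference, the arithmetic behind that survey is simple: NPS is the percentage of promoters (scores 9–10) minus the percentage of detractors (scores 0–6), with passives (7–8) ignored:

```python
# Compute Net Promoter Score from 0-10 "how likely to recommend" answers.
# NPS = %promoters (9-10) minus %detractors (0-6); passives (7-8) ignored.

def nps(scores: list[int]) -> float:
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100.0 * (promoters - detractors) / len(scores)

# e.g. nps([10, 9, 8, 7, 6, 3]) -> 2 promoters, 2 detractors of 6 -> 0.0
```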

✅ Behavior to Encourage:

  • Public product roadmaps
  • Changelogs that developers read
  • Transparent prioritization (“We heard you. This is coming next.”)

🧠 Leadership Parallel:

Great platforms aren't perfect. They're responsive. Platforms are never finished — they’re products in iteration.

5️⃣ Measure What Matters (and Share It)

📈 Want to prove your platform works? Start tracking:

  • Time to first deployment
  • % of teams using paved paths
  • Ticket volume vs. self-service volume
  • Platform NPS (yes, really)

✅ If you're not measuring DX, you're not improving it.

🔸 What It Means:

You can’t improve what you don’t measure.

And developers won’t trust your platform unless you can prove its value.

🧮 Metrics to Track:

  • ⏱️ Time to first deployment
  • 📉 Ticket reduction through self-service
  • 🧪 % of apps using golden paths
  • 🗣️ Platform NPS / satisfaction scores
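
A sketch of how two of those numbers might be derived from deployment records — the record fields (`team`, `paved_path`, `hours_to_first_deploy`) are an invented schema, not a real telemetry format:

```python
# Sketch: derive two platform metrics from deployment records.
# The record fields are illustrative assumptions, not a real schema.
from statistics import median

def paved_path_adoption(deploys: list[dict]) -> float:
    """Percent of deployments that used the golden path."""
    on_path = sum(1 for d in deploys if d["paved_path"])
    return 100.0 * on_path / len(deploys)

def median_time_to_first_deploy(deploys: list[dict]) -> float:
    """Median hours from onboarding to each team's first deployment."""
    firsts: dict[str, float] = {}
    for d in deploys:
        t = d["hours_to_first_deploy"]
        firsts[d["team"]] = min(t, firsts.get(d["team"], t))
    return median(firsts.values())
```

Numbers like these are what turn a quarterly scorecard from anecdotes into evidence.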

🎯 Share your impact

  • Quarterly platform scorecards
  • Celebrate milestones (“We automated X, saved Y hours”)
  • Ask “What’s slowing you down?” instead of just reporting uptime

🧠 Leadership Parallel:

Infra success is no longer just uptime. It's velocity, autonomy, and adoption.

🔁 Platform Engineering ≠ Infra Engineering

Traditional infrastructure teams focus on uptime, access, and provisioning.

Platform engineering goes further:

  • Builds abstractions
  • Offers APIs
  • Manages experiences

And the best part? A well-designed platform doesn’t slow teams down — it enables velocity with governance.

A good platform is like a great city:

  • It has roads, lights, zoning — but still lets you go anywhere.
  • It encourages good behavior without enforcing it through bureaucracy.
  • It evolves by listening to its citizens.

So if you’re building internal tooling, infra, or cloud services:

Don’t ask, “Is it compliant?” Ask, “Will they love using it?”

Because developers vote with their behavior. And if your platform doesn’t feel like a product — it’ll be treated like a problem.


📚 What I Read This Week: Code Researcher (Microsoft Research)

This week, I explored Code Researcher, a compelling research paper from Microsoft that introduces a new class of agents for systems-level bug fixing — and I believe it’s a major step forward for AI-assisted infrastructure work.

TL;DR:

Code Researcher is the first deep research agent built for navigating large systems codebases like the Linux kernel. Unlike typical code agents that work on small repositories and clean bug descriptions, Code Researcher takes on crash reports, digs through code and commit history, and performs multi-step reasoning to suggest accurate patches — just like a human expert would.

🔍 Why This Mattered to Me (and You)

Most code LLMs today are good at scripting, but they struggle with large, messy systems code — the kind many infra leaders deal with.

Code Researcher flips the script by:

  • Tracing control and data flow across files
  • Using regex-based commit searches to trace regressions
  • Performing causal reasoning from crash reports to fixes
  • Synthesizing patches across multi-file, multi-author codebases
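
The regex-based commit search resembles git's pickaxe (`git log -G<pattern>`): scan each commit's diff for a pattern to find where a suspect change was introduced. A toy analogue over in-memory commit data — this is my illustration of the technique, not the paper's actual implementation, and the data shape is invented:

```python
# Toy analogue of `git log -G<pattern>`: find commits whose diff
# matches a regex. Commit data here is invented for illustration.
import re

def commits_touching(pattern: str, history: list[dict]) -> list[str]:
    """Return SHAs (newest first) of commits whose diff matches the regex."""
    rx = re.compile(pattern)
    return [c["sha"] for c in history if rx.search(c["diff"])]

history = [
    {"sha": "c3", "diff": "+ if (!ptr) return -EINVAL;"},
    {"sha": "c2", "diff": "+ kfree(ptr);\n+ use(ptr);"},  # use-after-free introduced here
    {"sha": "c1", "diff": "+ ptr = kmalloc(sz, GFP_KERNEL);"},
]
# Searching for frees of `ptr` points at c2 as the regression candidate:
suspects = commits_touching(r"kfree\(ptr\)", history)
```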

In short: it acts more like a thoughtful investigator than an autocomplete engine.

📈 What Impressed Me

  • In evaluations on Linux kernel crashes, Code Researcher achieved a 58% resolution rate, significantly outperforming SWE-agent (37.5%)
  • It explored ~10x more files per bug than other agents — showing true “deep research”
  • It even used commit history to reach the same fix point that real developers identified
  • Generalized well to another large codebase: FFmpeg

My Takeaway as a Platform Leader:

This is the beginning of code agents that think before they type.

The real lesson here isn’t about bug fixing — it’s about designing AI systems that can trace context, reason across abstraction boundaries, and build informed hypotheses.

And if you’re leading infra or platform engineering at scale, you might want to start thinking of your systems as not just code, but knowledge graphs that need deep reasoning agents — not just fast coders.

🧾 Recommended For:

  • Platform Engineers, Kernel Developers, SREs
  • Anyone exploring AI-assisted debugging, traceability, or code intelligence
  • Leaders designing AI-first observability or incident response platforms


🔜 Next up in Issue 6: “How Not to Burn Out Your Infra Team — Leading Without Becoming the Bottleneck”
