Platform Engineering & Developer Experience

Platform as a Product

The single most important mindset shift in platform engineering is treating the Internal Developer Platform as a product, not an IT service. The distinction is not semantic. A service is delivered reactively — teams file tickets, the platform team fulfils them. A product is built proactively around user needs, ships iteratively with a public roadmap, measures adoption and satisfaction, and competes (internally) on value delivered. Teams that fail to make this shift end up building technically excellent platforms that nobody uses, while engineers route around them with bespoke scripts and shadow AWS accounts.

Netflix, Stripe, and Airbnb all run their platform organisations with dedicated product managers embedded in the platform team — not as overhead, but as the people most responsible for adoption rate and developer NPS. Google's internal Platform team has published that a 1-point improvement in developer satisfaction scores correlates with measurable reductions in toil hours across the engineering org. The ROI math is straightforward once you frame the platform as a product with identifiable users.

Know Your Users: Segmenting Internal Customers

Developer platforms serve heterogeneous audiences. A senior SRE running a latency-critical trading system has entirely different needs from a data scientist spinning up a one-off Spark job. Treating them identically produces a platform that serves neither well. Use the same techniques product managers apply externally:

User research: structured interviews, shadowing sessions, and surveys. Run quarterly developer surveys with questions drawn from the SPACE framework (Satisfaction and well-being, Performance, Activity, Communication and collaboration, Efficiency and flow). A 10-question survey sent to 200 engineers every quarter gives you directional signal you cannot get from metrics alone.
Persona definition: define 3-5 internal personas — e.g., "Green-field microservice developer", "ML Practitioner on GPU nodes", "Legacy migration team", "On-call SRE". Each persona has distinct pain points, preferred toolchains, and adoption blockers.
Jobs-to-be-done: rather than building features, identify the jobs developers are trying to accomplish. "I need to ship a new service to production without learning Kubernetes internals" is a job. It implies a golden path. "I need to know why my service is slow right now" is a different job — it implies an observability portal, not a deployment pipeline.

Internal users are simultaneously your most captive audience and your harshest critics. They have no competitive alternative (within the company), but they have the most dangerous escape hatch: writing everything themselves and calling it a "platform need". Your adoption rate is a direct signal of whether your product is solving real problems or adding bureaucratic overhead.

Building a Platform Roadmap

A platform roadmap is not a list of infrastructure upgrades. It is a prioritised sequence of user outcomes, with the platform capabilities needed to deliver them. Structure it in three horizons:

Now (0-3 months): bugs, critical gaps blocking adoption, things that are actively driving shadow IT. High-confidence, low-scope work committed to specific milestones.
Next (3-9 months): new capabilities validated through user research and pain-point data. Each item should trace back to a persona and a job-to-be-done. Include estimated impact (number of teams unblocked, hours of toil removed).
Later (9-18 months): strategic bets — multi-tenancy improvements, new cloud primitives, AI-assisted development workflows. Explicitly marked as directional and subject to change.
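
The three horizons can be sketched as a small data model in which every roadmap item carries its persona and job-to-be-done, so traceability is enforced by the structure itself. This is a minimal illustration, not a prescribed tool: the `RoadmapItem` fields, the sample items, and the choice to place an item starting at exactly month 3 in "Next" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoadmapItem:
    outcome: str          # the user outcome, not an infrastructure task
    persona: str          # which internal persona benefits
    job_to_be_done: str   # the job this item serves
    start_month: int      # months from today until work is expected to start
    teams_unblocked: int  # estimated impact


def horizon(item):
    """Bucket an item into Now / Next / Later by expected start month."""
    if item.start_month < 3:
        return "Now"
    if item.start_month < 9:
        return "Next"
    return "Later"  # anything from month 9 onward is treated as directional


def group_by_horizon(items):
    grouped = {"Now": [], "Next": [], "Later": []}
    for item in items:
        grouped[horizon(item)].append(item)
    # Within each horizon, list the highest estimated impact first.
    for bucket in grouped.values():
        bucket.sort(key=lambda i: i.teams_unblocked, reverse=True)
    return grouped


# Hypothetical items; personas and jobs echo the examples in this lesson.
items = [
    RoadmapItem("Fix flaky scaffold CI template",
                "Green-field microservice developer",
                "Ship a new service without learning Kubernetes internals", 0, 12),
    RoadmapItem("Self-service GPU node pools",
                "ML Practitioner on GPU nodes",
                "Run a training job without filing a ticket", 4, 5),
    RoadmapItem("Latency drill-down in the portal",
                "On-call SRE",
                "Know why my service is slow right now", 5, 9),
    RoadmapItem("AI-assisted development workflows",
                "Green-field microservice developer",
                "Ship a new service without learning Kubernetes internals", 12, 20),
]

roadmap = group_by_horizon(items)
for name, bucket in roadmap.items():
    print(name, [i.outcome for i in bucket])
```

Because the persona and job fields are required, an item that cannot name either one cannot be added, which is a cheap way to keep pure infrastructure upgrades from creeping onto the roadmap unexamined.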

Publish the roadmap in your developer portal (a Backstage TechDocs page is a common choice). Make it read-only for consumers but openly linkable so teams can plan against it. A common failure is a roadmap that lives in a Confluence page nobody finds, updated quarterly by the platform PM alone. Treat roadmap updates like product release notes: announce them in Slack and demo them at the engineering all-hands.

Figure: the platform product lifecycle. Discovery drives prioritisation, build delivers capabilities, and metrics feed back into the next cycle. Each persona shapes different roadmap items.

Adoption Over Mandates

The greatest mistake platform teams make is mandating their platform. "You must use our CI system by Q3 or your deployments will be blocked" is a policy failure disguised as a technical standard. It generates resentment, drives workarounds, and most importantly, it masks the signal you actually need: why are teams not adopting voluntarily?

The correct model is what Netflix calls "freedom and responsibility": the platform team makes the golden path the easiest and most rewarding path, and teams are free to deviate — but deviations are visible, documented, and carry explicit ownership costs. In practice this means:

Opt-in with gradients: make the first 80% of the journey effortless. The Backstage scaffold creates a repository, CI pipeline, and Kubernetes manifests in under 5 minutes. The team that chose to hand-roll their own pipeline spent 3 engineer-days. That contrast, visible to engineering leadership, is more powerful than any mandate.
Socialise success stories: a monthly "Platform Wins" Slack digest featuring three teams that shipped faster because of a platform capability is worth more than a quarterly all-hands deck. Peer recognition drives adoption in engineering cultures.
Make the off-ramp explicit: document the deviation process. If a team legitimately cannot use the standard database module (compliance reason, exotic workload), give them a clear path to get an exception, with a review cycle. An undocumented escape hatch is how you end up with 40 snowflake configurations 18 months later.
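
One way to keep the off-ramp explicit is a small deviation register that records the owner and reason for each exception and flags those whose review is overdue. The sketch below is illustrative only: the `Deviation` fields, the 180-day default review cycle, and the sample entries are assumptions, not a documented process.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Deviation:
    team: str
    standard: str        # the golden-path component being bypassed
    reason: str          # e.g. compliance requirement, exotic workload
    owner: str           # who carries the ownership cost
    approved_on: date
    review_every_days: int = 180  # assumed review cycle

    def next_review(self):
        return self.approved_on + timedelta(days=self.review_every_days)


def overdue(deviations, today):
    """Return the deviations whose scheduled review date has passed."""
    return [d for d in deviations if d.next_review() < today]


# Hypothetical register entries.
register = [
    Deviation("payments", "standard database module",
              "compliance: data residency", "payments-leads", date(2024, 1, 10)),
    Deviation("ml-research", "standard CI pipeline",
              "exotic workload: GPU build agents", "ml-infra", date(2024, 6, 1)),
]

for d in overdue(register, today=date(2024, 9, 1)):
    print(f"{d.team}: review of '{d.standard}' exception was due {d.next_review()}")
```

A register like this turns "40 snowflake configurations" into a countable list with named owners, which is the difference between a documented exception and an escape hatch.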

Track your platform adoption funnel exactly as a product team tracks a user funnel: awareness (does the team know the capability exists?), activation (have they tried it?), retention (are they still using it 90 days later?), and advocacy (are they recommending it to other teams?). A Backstage plugin that surfaces per-team adoption scores on a dashboard makes this visible to leadership without a manual reporting process.
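
The funnel described above can be computed from a per-team status record. This is a rough sketch under stated assumptions: the `TeamStatus` flags would in practice come from portal analytics or survey data, and the sample teams are invented for illustration.

```python
from dataclasses import dataclass

STAGES = ("awareness", "activation", "retention", "advocacy")


@dataclass(frozen=True)
class TeamStatus:
    team: str
    aware: bool         # knows the capability exists
    activated: bool     # has tried it
    retained_90d: bool  # still using it 90 days later
    advocate: bool      # recommends it to other teams


def funnel(teams):
    """Count teams per stage; a team only counts if it passed every earlier stage."""
    counts = dict.fromkeys(STAGES, 0)
    for t in teams:
        flags = (t.aware, t.activated, t.retained_90d, t.advocate)
        for stage, passed in zip(STAGES, flags):
            if not passed:
                break
            counts[stage] += 1
    return counts


def conversion(counts, total):
    """Stage-to-stage conversion; awareness is measured against all teams."""
    rates = {}
    previous = total
    for stage in STAGES:
        rates[stage] = counts[stage] / previous if previous else 0.0
        previous = counts[stage]
    return rates


# Hypothetical data for four teams.
teams = [
    TeamStatus("checkout", True, True, True, True),
    TeamStatus("search", True, True, False, False),
    TeamStatus("billing", True, False, False, False),
    TeamStatus("legacy-erp", False, False, False, False),
]

counts = funnel(teams)
rates = conversion(counts, total=len(teams))
print(counts)
print({stage: round(rate, 2) for stage, rate in rates.items()})
```

The stage with the lowest conversion rate shows where to invest: low awareness calls for communication, low activation for onboarding, and low retention for fixing the capability itself.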

Measuring Platform Success: Beyond Uptime

Platform SLOs are necessary but not sufficient. A platform with 99.9% uptime that nobody uses is failing. Instrument the following signal categories:

Adoption metrics: share of active services on the golden path (target: >70% of new services within 90 days of capability launch), monthly active users of the developer portal, and percentage of teams using self-service provisioning rather than manual ticket requests.
Efficiency metrics: time-to-first-deploy for a net-new service (target: <2 hours for a greenfield microservice), lead time from commit to production (DORA), change failure rate (DORA).
Satisfaction metrics: quarterly SPACE survey score (Satisfaction dimension), platform-specific NPS question ("How likely are you to recommend the platform to a colleague on another team?"), direct feedback channels (a dedicated Slack channel triaged weekly by the platform PM).
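
Three of the signals above reduce to simple arithmetic, sketched here. The survey answers, deploy timings, and service counts are invented sample data. NPS uses the conventional definition (percentage of 9-10 answers minus percentage of 0-6 answers), and the P90 uses the nearest-rank method, which is one of several percentile definitions.

```python
def nps(scores):
    """Net Promoter Score from 0-10 answers: % promoters (9-10) minus % detractors (0-6)."""
    if not scores:
        raise ValueError("no survey responses")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return round(100 * (promoters - detractors) / len(scores))


def p90(values):
    """90th percentile using the nearest-rank method."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = (9 * len(ordered) + 9) // 10  # integer form of ceil(0.9 * n), 1-based
    return ordered[rank - 1]


def share(part, whole):
    """Fraction of `whole` represented by `part`, or 0.0 when there is nothing to count."""
    return part / whole if whole else 0.0


# Hypothetical quarter of data.
survey_answers = [10, 9, 9, 8, 7, 6, 3, 10, 9, 5]
first_deploy_hours = [0.5, 1.0, 1.5, 1.2, 3.0, 0.8, 1.1, 0.9, 1.7, 2.5]
new_services, on_golden_path = 50, 36

print("Platform NPS:", nps(survey_answers))
print("Time-to-first-deploy P90 (hours):", p90(first_deploy_hours))
print("Golden-path share of new services:", share(on_golden_path, new_services))
```

With these sample numbers the P90 misses a 2-hour target even though most deploys are fast, which is why a tail percentile is a more honest target than an average.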

Wire these into a live dashboard visible to the entire engineering organisation, not just the platform team. Transparency creates accountability and surfaces adoption blockers that would otherwise remain invisible to leadership.

# Backstage catalog-info.yaml for the platform portal itself
# The platform team dogfoods the platform — always register it in the catalog
apiVersion: backstage.io/v1alpha1
kind: System
metadata:
  name: internal-developer-platform
  title: Internal Developer Platform
  description: The platform team's own product — catalog, golden paths, self-service infra.
  annotations:
    backstage.io/techdocs-ref: dir:.
    github.com/project-slug: acme-org/idp
  labels:
    product-owner: platform-team
    tier: "0"  # tier-0 = highest criticality; platform outage = all teams blocked
spec:
  owner: platform-team
  domain: platform
---
# Example OKR embedded in TechDocs (roadmap.md linked from the catalog):
# Objective: Make every new microservice production-ready in under 2 hours.
# KR1: 80% of new services created via Backstage scaffold by end of quarter.
# KR2: Time-to-first-deploy P90 < 2 hours (measured via Backstage scaffolder events).
# KR3: Developer NPS for deployment tooling >= +40 (from current +22).

Platform OKRs and Organisational Alignment

Platform teams frequently struggle with justifying headcount and budget because their output is infrastructural — it is felt as absence of pain rather than presence of features. Use OKRs anchored to developer outcomes, not platform capabilities delivered:

Wrong: "Ship Backstage 1.20 upgrade and add three new scaffolder templates." (output)
Right: "Reduce time-to-first-production-deployment for new services from 3 days to 4 hours (roughly a 94% reduction in elapsed time) while maintaining a change failure rate below 2%." (outcome)
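
An outcome-style key result can be tracked as progress from a baseline toward a target, with the guardrail checked separately. The sketch below assumes "3 days" means 72 elapsed hours; the current measurement and the current change failure rate are hypothetical values chosen for illustration.

```python
def kr_progress(baseline, target, current):
    """Fraction of the way from baseline to target, clamped to the range [0, 1]."""
    if baseline == target:
        raise ValueError("baseline and target must differ")
    raw = (baseline - current) / (baseline - target)
    return max(0.0, min(1.0, raw))


BASELINE_HOURS = 72.0   # "3 days", assumed to mean elapsed hours
TARGET_HOURS = 4.0
current_hours = 21.0    # hypothetical mid-quarter measurement

CFR_LIMIT = 0.02        # guardrail: change failure rate below 2%
current_cfr = 0.015     # hypothetical

progress = kr_progress(BASELINE_HOURS, TARGET_HOURS, current_hours)
guardrail_ok = current_cfr < CFR_LIMIT

print(f"KR progress: {progress:.0%}, guardrail met: {guardrail_ok}")
```

Reporting the guardrail alongside progress keeps the team from hitting the speed target by shipping riskier changes.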

Align platform OKRs to the engineering organisation's top-level OKRs. If the company OKR is "double engineering velocity", the platform team's OKR cascade should show exactly how their roadmap contributes. This is the language that secures platform headcount in budget cycles.

The most dangerous failure mode for a platform-as-a-product org is the platform team becoming a bottleneck by owning too much. If every team that wants to customise a golden path has to open a PR on the platform team's repository and wait for a review, adoption will stall. The solution is clear ownership boundaries: the platform team owns the interfaces (the scaffold contract, the Crossplane API), while product teams can extend and customise within those interfaces without platform involvement. Think API, not gate.