Primitive Wedges
Platforms are earned. Primitives come first.
And just like that - we’re back on Substack. I’ll save the detail on all that for another time. For now, this post is about something I’ve been thinking about a lot lately.
Every other day I see another X thread or Product Hunt post promising the next great AI developer platform.
Big technology shifts create big reactions from founders. When the surface area gets messy, the answer gets packaged as something bigger: AI engineering clouds, agentic SDLC platforms, software factories, control planes for generated work.
But most companies still have to earn their way in through one narrow capability first.
In this post, I’ll cover what I mean by a primitive, why the next really large AI dev tool companies are likely to start with one, where I think those primitives are forming, and how to test whether a narrow product is a wedge or just a feature.
Note: this does not mean every dev tool company should start this way. Most probably will not, and plenty of excellent dev tool companies will never become primitives. That is fine. But the next really big AI dev tool companies are more likely to start by owning one small, painful, unavoidable capability.
They’ll own what I call a Primitive Wedge - a narrow capability that becomes the default way developers do one important thing, then expands after the dependency is real.
Own the verb. Become the default. Expand from there.
What is a primitive?
A primitive is not just a basic feature.
In computer science, the real primitives are not frameworks or tools. They are the irreducible building blocks we keep recombining: data, computation, control, abstraction, state, communication, and composition.
Most software engineering is choosing better ways to package, constrain, connect, and operate those primitives.
Data becomes tables, files, embeddings, ASTs, event logs, DOM trees, Git commits, and vector indexes.
Computation becomes functions, algorithms, jobs, evals, transformations, and search.
Control becomes workflows, retries, scheduling, branching, exceptions, concurrency, and state machines.
State becomes memory, databases, caches, queues, sessions, logs, and system state.
Abstraction becomes functions, types, modules, APIs, services, protocols, and packages.
Communication becomes HTTP, RPC, pub/sub, sockets, webhooks, queues, APIs, and tool calls.
Composition becomes pipelines, dependency graphs, Unix pipes, plugin systems, service orchestration, and package ecosystems.
That is the foundation.
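Software engineering then adds the operational primitives that make real systems usable, including the ones this post comes back to later: runtime, naming, persistence, identity, security, contracts, and logs.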
These aren’t academic primitives, but they are fundamental to building real software.
A system that ignores them may work as a project but fail as a product.
Software value tends to move upward as lower-level complexity gets packaged into interfaces other people can trust.
Each step hides some mess and creates a new surface area for others to build on. A data structure makes bits usable. An API makes a module usable by another team. A platform makes a system usable by an ecosystem.
That’s the path from primitive to infrastructure. The most valuable developer products usually win by turning one painful lower-level problem into a dependable higher-level interface.
Look at Stripe. The primitive is not “payments software” as a category. It is the clean interface Stripe gave developers for a messy bundle of state, identity, security, contracts, communication, and money movement. Supabase works the same way at a higher layer: it packages database, auth, storage, realtime, and generated APIs into one composable backend surface. Even naming primitives work like this. A URL, a package name, or a stable API identifier is small, but once other systems depend on it, it becomes infrastructure.
A useful definition:
A primitive is something that:
is reusable across many use cases
has a clear interface
composes with other things
hides complexity without hiding too much control
becomes more valuable when others build on it
feels like infrastructure rather than an app
The primitive is not the app on the surface. It is the thing underneath that other workflows can reuse: secure remote code execution, agent memory, tool calling, eval infrastructure, sandboxed browser automation, agent identity, or observable workflow state.
That distinction matters because platforms are built around dependencies, not feature labels.
Platforms have to be earned
Why?
In a word, dependency.
Platform is an easy word to reach for. It makes companies feel bigger. But markets are typically less generous.
A company becomes a platform only after enough other companies depend on it that switching starts to feel like surgery.
Stripe made payments suck less before it became the financial operating system for the internet. Twilio gave developers one API to send an SMS before it became a communications platform. Vercel started with zero-config frontend and Jamstack deployments, then made the develop-preview-ship workflow feel natural before expanding into a broader platform for building, deploying, and running web apps.
Primitive first. Dependency second. Platform third.
It is tempting to skip the first two steps, especially given the speed at which we can build with AI.
Agents need to write code, run it, test it, deploy it, observe it, request approval, access secrets, remember context, coordinate with tools, and notify humans when something breaks.
The surface area is connected, so the founder instinct is to build the whole thing.
But that’s how you become useful in demos and non-essential in production.
Breadth makes the roadmap look serious.
Depth creates dependency.
A great question to ask is: what capability becomes unavoidable if AI changes how software gets built?
Once a company earns trust in one workflow, it can credibly expand into adjacent primitives.
Secure execution expands into testing, evals, and agent runtime.
Context retrieval expands into organisational memory.
Verification expands into quality control.
Observability expands into governance.
Permissioning expands into the control plane for non-human work.
Back to Vercel. It earned developer trust through deploys, previews, and the workflow around shipping web apps. Sandbox extends that trust into safe execution: short-lived Firecracker microVMs for running untrusted or AI-generated code with their own filesystem and network.
Cloudflare first became useful by making websites faster, safer, and more reliable: CDN, DNS, DDoS protection, WAF, and the network layer most teams did not want to build themselves. That trusted global network later became a place to run code through Workers. Containers and the Sandbox SDK extend the same platform into isolated Linux environments for running untrusted code, managing files, background processes, services, and agent workloads.
Owning the first workflow gives you a foothold in the adjacent primitive because the users and trust are already there.
A word of caution: new primitives get pulled in two directions. Startups try to own them as wedges. Earned platforms, and now also frontier model providers, try to absorb them as extensions.
If you are starting from zero, the market will make you prove it.
What happens when code is cheap?
Most technology waves create abundance in one part of the system, then expose scarcity somewhere else.
Scarcity moves.
Cloud made compute easier to access, which made deployment, reliability, and coordination matter more. Open source made building blocks abundant, then security, packaging, and maintenance became harder. Mobile made app creation easier, then discovery and retention became the choke points.
AI makes plausible code drafts abundant.
So where does scarcity move?
Trust. Context. Safe execution. Verification. Permissioning. Observability. Human judgement.
The adoption curve is already ahead of the trust curve. Stack Overflow’s 2025 Developer Survey found that 84% of respondents use or plan to use AI tools, while 46% said they do not trust the accuracy of AI tool output and only 3% said they highly trust it. The top frustration was that AI solutions are almost right, but not quite, cited by 66% of developers. Debugging generated code taking more time came next, cited by 45%.
That is the wedge.
Not another code generator.
The gap between adoption and trust.
Cheap creation pushes value toward control. Acting agents make boundaries matter. Instant code generation raises the premium on verification.
I would stop asking: what can AI generate?
That question feels crowded to me.
So what questions matter more?
1. Secure execution: where does generated code run?
The first primitive is Secure Execution.
Start with the blunt one.
AI can write code.
Great.
Where does it run?
More provocatively: what could break when AI-generated work enters a real company?
Companies must be careful about running generated code in production, on a developer’s machine with access to secrets, or in any workflow where one bad command can ruin the afternoon, or much worse.
It needs a sandbox: a controlled environment where code can execute, packages can be installed, files can change, commands can run, and production systems stay untouched.
Boring, yes.
Usually a good sign.
This is the computation primitive meeting the runtime primitive.
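To make the idea concrete, here is a minimal sketch of the shape of the job, assuming the generated code is a Python snippet. The names `run_untrusted` and `RunResult` are mine, not any vendor's API. It runs the snippet in a throwaway directory, with the parent's environment variables stripped and a hard timeout. That is process-level hygiene only, not real isolation; the sandbox products in this category add containers or microVMs on top.

```python
import os
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunResult:
    exit_code: Optional[int]  # None means the run was killed on timeout
    stdout: str
    stderr: str
    timed_out: bool


def run_untrusted(code: str, timeout_s: float = 5.0) -> RunResult:
    """Run a Python snippet away from the caller's files and secrets.

    Hypothetical helper for illustration. It does NOT block network access
    or filesystem reads outside the working directory.
    """
    with tempfile.TemporaryDirectory() as workdir:
        try:
            proc = subprocess.run(
                [sys.executable, "-I", "-c", code],  # -I: isolated mode
                cwd=workdir,                 # file writes land in a temp dir
                env={"PATH": os.defpath},    # parent env (and secrets) not inherited
                capture_output=True,
                text=True,
                timeout=timeout_s,           # runaway code gets killed
            )
        except subprocess.TimeoutExpired:
            return RunResult(None, "", "", True)
        return RunResult(proc.returncode, proc.stdout, proc.stderr, False)
```

Calling `run_untrusted("print(1 + 1)")` returns a `RunResult` with the captured output, and a snippet that sleeps past the timeout comes back with `timed_out` set instead of hanging the caller. The interface is the point: code in, bounded result out.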
The category is already forming. Modal describes AI code sandboxes as ephemeral runtime environments for LLM-generated code because generated code can hallucinate packages, crash environments, or create security vulnerabilities when executed unsafely. Daytona positions itself as secure infrastructure for running AI-generated code, with fast sandbox creation; isolated runtime protection; file, Git, LSP, and execute APIs; stateful snapshots; and agent/eval support.
The sandbox starts as a feature.
Then it becomes the container for action.
This is what a primitive looks like before the category name has settled.
Coding agents, test agents, data agents, and automation workflows that touch code all need execution.
The job to own: run untrusted work safely.
Narrow, painful, unavoidable.
Once you own safe execution, the expansion paths are natural: testing, evals, debugging, education, data analysis, agent runtime, and internal automation.
That’s the path from primitive to infrastructure.
2. Context: what does the system know?
The second primitive is Operational Memory.
We are all quick to blame models for being dumb when we don’t get the results we want. Most often it’s not a model problem.
The docs are stale. Architecture decisions live in someone’s head. The roadmap is in Linear. Customer complaints are in Zendesk. Incident history is buried in logs. API examples are out of date. The reason a weird function exists was explained once in Slack by an engineer who left eighteen months ago.
Then everyone blames the AI for mediocre work.
Sometimes the bottleneck is the organisation’s memory, not the model.
You can’t get good output from bad context.
That creates a new infrastructure layer: repo maps, living documentation, codebase memory, internal knowledge graphs, decision logs, dependency maps, context retrieval, and agent memory.
This is the data primitive, the naming primitive, and the persistence primitive getting repackaged for AI systems.
It is not enough for an agent to generate the code, the test, the migration, the pull request, and the documentation. It also needs a live understanding of the system it is changing: the decisions already made, the names that mean something, the dependencies that matter, and the constraints that are not sitting neatly in the docs.
It needs operational memory.
You can see companies circling this from different angles. Sourcegraph approaches it from the codebase: code search and code intelligence that give AI assistants context across local and remote repositories. Glean approaches it from company knowledge: permissions-aware search across documents, tickets, messages, and the other places work actually lives. Letta and Zep approach it from the agent side: durable memory and state that survive past a single context window.
The job to own: give the system the right context at the right time.
The durable value is the layer that knows enough about the work to make the prompt matter.
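A toy version of "the right context at the right time" might look like the sketch below. Everything in it is an assumption for illustration: the `Snippet` type, the sample corpus, and the scoring, which is plain word overlap where a real system would use embeddings, code intelligence, and permission checks.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Snippet:
    source: str  # where the knowledge lives, e.g. "slack" or "adr"
    text: str


def tokens(text: str) -> set:
    """Lowercased word set; identifiers like legacy_rounding stay whole."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def retrieve(query: str, corpus: list, k: int = 2) -> list:
    """Return up to k snippets sharing at least one word with the query."""
    q = tokens(query)
    scored = [(len(q & tokens(s.text)), s) for s in corpus]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [s for score, s in ranked[:k] if score > 0]


# Hypothetical organisational memory scattered across tools.
CORPUS = [
    Snippet("adr", "We retry billing webhooks three times because the provider drops events"),
    Snippet("slack", "legacy_rounding exists because invoices before the migration used cents"),
    Snippet("zendesk", "Customers complain that export to CSV times out on large accounts"),
]

hits = retrieve("why does legacy_rounding exist in invoices", CORPUS)
```

Here `hits` contains only the Slack explanation, which is exactly the kind of context an agent would never find in the docs. The scoring is disposable. The durable part is knowing where the answers live and who is allowed to see them.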
3. Verification: should this output be trusted?
The third primitive is Verification.
Right now, the market is still a little intoxicated by generation.
But generation is getting cheap. The expensive question is whether any of it should be trusted.
Tests, static analysis, security scans, policy checks, code review assistance, regression detection, runtime validation, and evals all matter more when generation gets cheap.
This is where types, contracts, tests, permissions, and observability start to converge.
Does the code work? Did it break another part of the system? Did it introduce a security issue? Did it violate an internal pattern? Did it duplicate logic? Did it increase maintenance cost? Did it solve the actual problem?
AI-generated code has one dangerous property: it looks finished before anyone understands it.
A blank screen is honest. It tells you thinking still needs to happen. A generated pull request can hide missing thinking behind confident syntax.
That’s why verification becomes the bottleneck.
Tools like CodeRabbit, Qodo, and Greptile are better examples of the verification primitive than another generator. They sit in the pull request and turn review into a repeatable quality layer: repository context, diff inspection, bug and security signals, team rules, suggested fixes, and better routing to human reviewers.
The useful question is simple: is this output safe, correct, and worth keeping?
The cheaper generation gets, the scarcer judgement feels.
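As a sketch of what a repeatable quality layer means at its very smallest, the gate below checks generated Python before a human ever looks at it. The rule sets are hypothetical team rules I made up for the example, and `verify` is not how any of the tools above work internally. It only shows the shape: source in, verdict and findings out.

```python
import ast
from dataclasses import dataclass, field

BANNED_CALLS = {"eval", "exec"}  # hypothetical team rule
BANNED_IMPORTS = {"pickle"}      # hypothetical team rule


@dataclass
class Verdict:
    ok: bool
    findings: list = field(default_factory=list)


def verify(source: str) -> Verdict:
    """Static checks only: does it parse, and does it break a team rule?"""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return Verdict(False, [f"syntax error on line {exc.lineno}"])

    findings = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BANNED_CALLS:
                findings.append(f"line {node.lineno}: call to {node.func.id}()")
        elif isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in BANNED_IMPORTS:
                    findings.append(f"line {node.lineno}: import of {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in BANNED_IMPORTS:
                findings.append(f"line {node.lineno}: import from {node.module}")
    return Verdict(not findings, findings)
```

A check like this answers only the cheapest questions on the list. Whether the change solved the actual problem still needs tests, evals, and a reviewer, which is why the layer has room to grow.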
4. Permissioning: what is the agent allowed to do?
The fourth primitive is Delegated Authority.
This is where the unsexy stuff starts to matter.
Agents are wonderful but weird.
They reason, call tools, make decisions, use context, skip steps, fail in odd ways, and sometimes get the right answer for the wrong reason.
Can this agent read customer data? Can it write to the database? Can it deploy? Can it access production logs? Can it send an email? Can it create a pull request? Can it merge one? Can it call this API? Can it spend money? Can it act without approval?
These questions are the difference between useful automation and organisational chaos.
This is identity, naming, security, and contracts becoming a product surface.
While AI only suggests things, permissions are annoying.
Once AI starts doing things, permissions become the product.
Arcade frames the developer version as agent authorisation: tools have OAuth scopes, users authorise specific actions, and agents act on behalf of users without owning broad access directly.
Keycard.ai aims to become the control plane for agent access: resolving agent identity, enforcing policy, issuing scoped credentials, supporting delegation between agents, and leaving an audit trail for tool calls.
The product details will vary by workflow, but the need is the same: agents need scoped, revocable, auditable access instead of a pile of long-lived keys and overpowered service accounts.
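Scoped, revocable, auditable access can be sketched in a few lines. This is an illustration under my own assumptions, not Arcade's or Keycard's model: `Grant`, `Authorizer`, and the scope strings are hypothetical names. What matters is that every check is bounded by scope and time, and that every decision leaves a record.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Grant:
    """A hypothetical delegation: an agent acting for a human, within limits."""
    agent_id: str
    on_behalf_of: str
    scopes: frozenset   # actions the agent may take, e.g. {"pr:create"}
    expires_at: float   # Unix timestamp; short-lived by design
    revoked: bool = False


@dataclass
class Authorizer:
    audit_log: list = field(default_factory=list)

    def check(self, grant: Grant, action: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        if grant.revoked:
            decision = "denied:revoked"
        elif now >= grant.expires_at:
            decision = "denied:expired"
        elif action not in grant.scopes:
            decision = "denied:out_of_scope"
        else:
            decision = "allowed"
        # Every decision is recorded, including denials.
        self.audit_log.append(
            {"at": now, "agent": grant.agent_id, "for": grant.on_behalf_of,
             "action": action, "decision": decision}
        )
        return decision == "allowed"
```

With a grant scoped to `pr:create`, the agent can open a pull request but a `pr:merge` check is denied and logged. Flipping `revoked` cuts access without rotating a shared key.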
The job to own: delegated authority for non-human workers.
Auth has typically felt like a boring primitive. That is much less true now.
Even so, boring is often a good thing.
Boring primitives become huge because they sit underneath everything else.
5. Observability: what did the agent actually do?
The fifth primitive is Agent Traces.
Traditional observability tells you what software systems are doing: logs, metrics, traces, errors, latency, uptime.
Agentic software stretches that model.
The system is more than just the application. The system is the durable path the work takes through models, tools, policies, humans, and failure.
What did the agent do? Which tools did it call? Which context did it use? What did it ignore? What did it cost? What data did it touch? Where did the workflow fail? Can we replay it? Can we stop it from happening again?
This is the log primitive becoming a trace primitive, then becoming a governance primitive.
Tools like LangSmith treat conversation threads, tools, sub-agent delegation, and memory as first-class observability concepts.
Anthropic’s engineering framing is more explicit: Managed Agents separate the brain from the hands and the session. The session is an append-only durable log, the harness is the loop that calls Claude and routes tool calls, and the sandbox is the execution environment where Claude can run code and edit files.
The useful shift is that traces become the new logs.
In normal software, logs tell you what happened.
In agentic software, traces help you understand what happened, why it happened, and whether you should let it happen again.
The job to own: show me what the non-human worker did and why it mattered.
Companies using agents will need this because you can’t trust what you cannot inspect.
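Here is a minimal sketch of a trace that can answer those questions, assuming a simple append-only list of events per run. The `Trace` and `TraceEvent` names and the event kinds are my own and are not taken from LangSmith or Anthropic.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TraceEvent:
    step: int
    kind: str   # assumed kinds: "model_call", "tool_call", "approval", "error"
    name: str
    cost_usd: float = 0.0
    detail: str = ""


class Trace:
    """Append-only record of one agent run; events are never edited."""

    def __init__(self, run_id: str):
        self.run_id = run_id
        self._events = []

    def record(self, kind: str, name: str, cost_usd: float = 0.0, detail: str = "") -> TraceEvent:
        event = TraceEvent(len(self._events), kind, name, cost_usd, detail)
        self._events.append(event)
        return event

    def tools_called(self) -> list:
        return [e.name for e in self._events if e.kind == "tool_call"]

    def total_cost(self) -> float:
        return sum(e.cost_usd for e in self._events)

    def first_failure(self):
        """The earliest error event, or None if the run was clean."""
        return next((e for e in self._events if e.kind == "error"), None)

    def to_jsonl(self) -> str:
        """One JSON object per line, ready for storage or replay tooling."""
        return "\n".join(json.dumps(asdict(e)) for e in self._events)
```

Which tools did it call, what did it cost, and where did it fail each become a one-line query over the same log. The governance questions come later, but they all start from a record like this.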
6. Orchestration: how does work move?
The sixth primitive is Durable Orchestration.
People hear workflow automation and think of the old category: Zapier, cron jobs, queues, triggers, approvals, internal tools.
AI introduces much more non-determinism.
Now the workflow includes humans, agents, APIs, policies, retries, approvals, exceptions, and rollbacks. An agent does the first pass. A human approves. Another agent tests. A policy blocks one action. A message gets sent. A ticket gets created. A deployment happens. A rollback sits ready.
At that point, you are managing coordinated action.
This is control flow and composition showing up as a software engineering primitive.
And the non-deterministic nature of AI makes the work less predictable.
Temporal provides a useful reference point here. It argues that AI applications and agents are distributed systems on steroids and need workflows that can survive crashes, retry tool calls, preserve state, support humans in the loop, and resume where they left off. Temporal maps chains, graphs, and agentic loops to workflows, tool calls to activities, memory to durable workflow state, and human review to signals, updates, and queries.
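To show what surviving crashes and resuming means mechanically, here is a deliberately tiny sketch in plain Python. It does not use the Temporal SDK and should not be read as how Temporal works internally. The assumptions are a JSON checkpoint file, named steps, JSON-serialisable results, and a fixed retry count; `run_workflow` is a hypothetical helper.

```python
import json
from pathlib import Path


def run_workflow(steps, checkpoint: Path, max_attempts: int = 3) -> dict:
    """Run (name, fn) steps in order, saving state after each success.

    Each fn receives the results so far and returns a JSON-serialisable value.
    Re-running with the same checkpoint skips steps that already finished.
    """
    if checkpoint.exists():
        state = json.loads(checkpoint.read_text())
    else:
        state = {"done": [], "results": {}}

    for name, fn in steps:
        if name in state["done"]:
            continue  # finished in an earlier run; skip on resume
        for attempt in range(1, max_attempts + 1):
            try:
                state["results"][name] = fn(state["results"])
                break
            except Exception:
                if attempt == max_attempts:
                    raise  # checkpoint on disk still holds the last good step
        state["done"].append(name)
        checkpoint.write_text(json.dumps(state))
    return state
```

If an agent drafts a patch and a flaky test step then needs three attempts, the draft is not regenerated, and a later re-run with the same checkpoint repeats nothing. A human approval would be one more step that blocks until a signal arrives, which this sketch leaves out.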
The agent alone is rarely the system.
The job to own: moving work between humans, agents, and tools without losing control.
Some companies here will look like agent runtimes. Some will look like durable execution engines, task queues, approval layers, or coordination infrastructure.
The label matters less than the job: software work is getting more distributed, and distributed work needs orchestration.
Primitive bundling
There is another threat, often existential, for founders.
Primitive bundling.
Frontier model providers are selling more than intelligence now. They are packaging the runtime around intelligence.
Claude Managed Agents is a prebuilt, configurable agent harness that runs in managed infrastructure. It provides environments where Claude can read files, run commands, browse the web, and execute code securely. It also covers bash, file, web, and MCP tools, preserved event history, stateful sessions, and long-running asynchronous work.
That’s a direct claim on the agent runtime.
The model provider is basically saying: here is the session, harness, and sandbox around the model.
It’s not hard to imagine them expanding into future adjacencies.
But bundling isn’t destiny.
One counterargument is model and harness agnosticism. Many teams won’t want their agent runtime, evals, memory, permissions, audit trail, and workflow state trapped inside one model provider’s managed environment. They’ll want to switch models, compare providers, route work by cost or latency, run some jobs locally, keep sensitive workloads in their own cloud, and preserve the workflow even when the underlying model changes.
That’s important because the model is not always the system of record. In most software organisations, the system of record is the codebase, issue tracker, CI pipeline, deployment environment, policy layer, approval trail, and production runtime. The model participates in that system.
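The agnosticism argument is easier to see in code. Below is a minimal sketch of routing by cost, latency, and data sensitivity, assuming every backend is reduced to the same small interface. The `Backend` fields, the numbers, and the provider labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Backend:
    name: str                       # hypothetical provider label
    cost_per_call_usd: float
    p50_latency_ms: int
    runs_locally: bool              # True if data never leaves your own infrastructure
    complete: Callable[[str], str]  # the only thing the workflow depends on


def route(backends, *, max_latency_ms: int, sensitive: bool = False) -> Backend:
    """Pick the cheapest backend that meets the latency and data constraints."""
    candidates = [
        b for b in backends
        if b.p50_latency_ms <= max_latency_ms and (b.runs_locally or not sensitive)
    ]
    if not candidates:
        raise LookupError("no backend satisfies the constraints")
    return min(candidates, key=lambda b: b.cost_per_call_usd)


# Stand-in backends; a real one would wrap a provider SDK or a local model.
BACKENDS = [
    Backend("hosted-large", 0.03, 900, False, lambda p: "large: " + p),
    Backend("hosted-small", 0.002, 300, False, lambda p: "small: " + p),
    Backend("local-small", 0.0, 1500, True, lambda p: "local: " + p),
]
```

With these numbers, a routine job with a one-second budget goes to `hosted-small`, while a sensitive job with a looser budget stays on `local-small`. The workflow code never changes when a backend is swapped, which is the property teams are trying to protect.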
A good question to ask yourself is whether your primitive becomes more valuable as the world gets more multi-model, multi-agent, and multi-harness.
If your primitive is generic enough to become part of the default model runtime, you may be building a feature inside someone else’s platform.
If your primitive is workflow-specific, domain-specific, compliance-heavy, deeply integrated, model-agnostic, harness-agnostic, or tied to a distribution channel the model provider doesn’t own, you’re on more solid ground and more resistant to disintermediation.
Ask who has the most natural right to own this once the market understands it.
Sometimes the answer is the model provider.
Sometimes it’s the neutral layer everyone needs because no one wants the model provider to own the whole workflow.
This should make founders more honest about the primitive they choose.
What should founders look for over the next five years?
My bet: the next wave of AI dev tool budgets moves toward trust, constraints, inspection, and safe shipment.
Here is where I would look.
1. Sandboxes become a default dependency for agent products
Any product that lets agents write, run, test, or modify code will need a safe execution layer. The sandbox stops being a dev-only convenience and becomes part of the product contract: where work happens before it is trusted.
2. Context layers become part of the software supply chain
The winning systems will know the repo, the company, the user, the policy, and the reason past decisions were made. Context becomes infrastructure because every agent workflow is only as good as the world it can see.
3. Verification moves closer to creation
Code review, tests, evals, security checks, and policy enforcement become more important when the marginal cost of generation keeps falling. The review layer cannot stay downstream if the creation layer is producing work continuously.
4. Agent traces become governance artifacts
The audit question will no longer be whether the system returned an answer. It will be what it did, what it touched, and why it was allowed. Traces become the evidence layer for companies that need to trust non-human work.
5. Identity vendors move aggressively into agent control planes
The old question was which human can access which system. The new question is which agent can act for which human, under which policy, for how long, and with what record left behind.
6. Earned platforms keep expanding into adjacent primitives
The companies already sitting inside trusted workflows have a better shot at extending, but only where the extension feels earned. Distribution helps, but trust still has to map to the new job.
7. Foundation model providers keep expanding
The generic pieces will get pulled closer to the model. Sessions, tools, sandboxes, memory, and traces are too close to the default agent runtime for model providers to ignore.
Summary
The next five years of dev tools will be shaped by companies that own small, unavoidable actions.
The thing everyone needs but would rather avoid building. The thing underneath the workflow. The thing called again and again. The thing that becomes invisible when it works and painful when it breaks.
The opportunity is in the verbs underneath the surface layer.
Run code safely. Retrieve context. Verify output. Delegate authority. Observe behaviour. Coordinate work.
These are primitives for AI-native development.
Most dev tools in this market will not become primitives, and they do not need to. But if you are trying to build one of the rare companies that does, resist the urge to build the whole future.
Find the one painful capability other workflows will have to build on.
Then try to own it.
A transparent plug
Quick plug: there is an adjacent problem we are building around at devtune.ai.
It is not one of the dev tool primitives above. It is the distribution problem around them.
If you sell developer tools, AI systems are starting to shape how buyers discover, compare, and shortlist products. The old distribution stack was search, social, community, docs, analysts, and word of mouth. The new one still includes all of that, but now it is mediated by models that retrieve, summarise, rank, and recommend.
DevTune helps dev tool companies understand that surface area: how they appear in AI answers, which sources get retrieved, how their positioning is interpreted, where competitors are winning the narrative, and what should change across docs, content, messaging, and technical proof. The goal is not to game models. It is to make sure the right evidence exists, is legible, and shows up when buyers ask the questions that matter.
If that is becoming a live problem for your team, take a look or send me a note.