Edge & Fluid Compute

What runs at the edge. Vercel's fluid runtime explained.

For years, you had two server choices: a long-running Node process on a VM, or short-lived serverless functions on a platform like AWS Lambda. Then came edge runtimes (run JS in 300+ cities, milliseconds from the user) and now Fluid Compute (one process handles many concurrent requests, like Node, but elastic). Knowing which to pick for which job is one of the most impactful architecture calls you make in 2026.

What "edge" actually means

"The edge" is shorthand for the data center physically closest to the user. A request from a user in Mumbai to your origin in Virginia takes roughly 250ms round trip. An edge function running in a Mumbai POP can respond in about 5ms. That difference compounds when your page makes many small requests.

Edge runtimes (Vercel Edge, Cloudflare Workers, Deno Deploy, etc.) are built on the V8 isolate model rather than full Node containers. An isolate is a sandboxed JavaScript context that is cheap to spin up and starts in a few milliseconds or less. The catch: you only get a subset of the Node API. There is no fs and there are no native modules, so many npm packages do not work.

What you get: fetch, Web Streams, Web Crypto, TextEncoder, URL, Request / Response. Standard web APIs.
What you give up: file system, child processes, most native bindings, raw TCP sockets, parts of process.
Hard limits: tight memory budgets (often 128 MB), tight CPU budgets (often around 50ms of CPU time per invocation, which is not the same as wall-clock time), small bundle sizes (often a few MB).

app/api/hello/route.ts

// Opt into edge with one line.
export const runtime = "edge";

export async function GET(req: Request) {
  const url = new URL(req.url);
  const name = url.searchParams.get("name") ?? "world";
  return new Response("Hello, " + name + "!", {
    headers: { "content-type": "text/plain" },
  });
}

Same code, different runtime

On Vercel, the difference between an edge function and a Node function is one export. The function code itself stays the same as long as you only use web-standard APIs.
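
As a small illustration of "web-standard APIs only", here is a sketch of a base64url encoder built from TextEncoder and btoa. The helper name toBase64Url is hypothetical. Because it touches nothing Node-specific, the same function should behave identically in an edge isolate and in a Node function.

```typescript
// Hypothetical helper: base64url-encode a string using only web-standard globals.
function toBase64Url(text: string): string {
  // Encode to UTF-8 bytes first so non-ASCII input is handled correctly.
  const bytes = new TextEncoder().encode(text);

  // btoa expects a "binary string" where each char code is one byte.
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }

  // Convert standard base64 to the URL-safe alphabet and drop padding.
  return btoa(binary)
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
}
```

If you catch yourself reaching for a Node-only module in a helper like this, that is usually the sign the route can no longer switch runtimes with a one-line change.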

Where edge wins (and where it loses)

Edge is built for one thing: tiny, fast, stateless requests. It is brilliant for:

Auth checks and redirects (look at the cookie, return 302 or 200).
A/B testing and feature flag evaluation.
Geo personalization (return different content per country).
Image optimization, OG image generation.
Streaming AI responses (low TTFB matters here).
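
To make the first item concrete, here is a minimal sketch of an edge-style auth gate. The cookie name, login path, and function names are assumptions for illustration. It only checks that a session cookie is present; a real check would also verify the cookie's value.

```typescript
// Hypothetical names: adjust the cookie name and login path to your app.
const SESSION_COOKIE = "session";
const LOGIN_PATH = "/login";

// Returns true when the Cookie header carries a non-empty session cookie.
function hasSessionCookie(cookieHeader: string | null): boolean {
  if (!cookieHeader) return false;
  const prefix = SESSION_COOKIE + "=";
  return cookieHeader
    .split(";")
    .map((part) => part.trim())
    .some((part) => part.startsWith(prefix) && part.length > prefix.length);
}

// Look at the cookie, return 302 (to the login page) or 200.
function checkAuth(req: Request): Response {
  if (!hasSessionCookie(req.headers.get("cookie"))) {
    const loginUrl = new URL(LOGIN_PATH, req.url);
    return Response.redirect(loginUrl.toString(), 302);
  }
  return new Response("ok", { status: 200 });
}
```

There is no database call and no heavy dependency here, which is what makes this kind of check a good fit for an isolate.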

It is the wrong choice for:

Anything that needs heavy npm dependencies (Prisma, Sharp, Puppeteer).
Long-running work (LLM agent loops, large file processing).
Tight, latency-sensitive database queries to a fixed-region DB. A fast function in Mumbai is no help if your Postgres is in Virginia. Every query now pays the full trip to Virginia, roughly 250ms.

Latency triangle

Function + database round-trip = your real user latency. Putting the function at the edge while leaving the database centralized often makes things worse. Co-locate, or use a globally distributed DB.
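
The latency triangle is easy to check with arithmetic. The sketch below uses the Mumbai/Virginia figures from earlier and assumes, purely for illustration, a 2ms hop between a function and a database in the same region and three sequential queries per request. The type and function names are hypothetical.

```typescript
// Where the function runs, expressed as the two legs of the triangle.
interface Placement {
  userToFunctionMs: number; // user <-> function round trip
  functionToDbMs: number;   // function <-> database round trip, per query
}

// Total latency for one request that issues its queries one after another.
function totalLatencyMs(placement: Placement, sequentialQueries: number): number {
  return placement.userToFunctionMs + sequentialQueries * placement.functionToDbMs;
}

// Edge function in Mumbai, database in Virginia.
const edgeFarFromDb: Placement = { userToFunctionMs: 5, functionToDbMs: 250 };

// Node function in Virginia, next to the database (2ms is an assumed figure).
const regionalNearDb: Placement = { userToFunctionMs: 250, functionToDbMs: 2 };

const edgeTotal = totalLatencyMs(edgeFarFromDb, 3);      // 5 + 3 * 250 = 755
const regionalTotal = totalLatencyMs(regionalNearDb, 3); // 250 + 3 * 2 = 256
```

With these assumed numbers the "slow" regional function is roughly three times faster end to end, because the user pays the long hop once instead of once per query.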

Node functions: the "classic" default

Node functions run real Node.js in a container. You get the full ecosystem: Prisma, Sharp, image libraries, anything on npm. The cost is a slower cold start (a few hundred ms when scaling up) and being pinned to a single region (or a few regions if you pay for it).

In the traditional serverless model, each request gets its own container. If you have 100 concurrent users, you spin up 100 containers. They each idle while waiting for I/O (a database query, an LLM call), burning compute time you pay for. This was the model for a decade. It worked, but it was wasteful for I/O-bound apps.

Fluid Compute: the "always-on Lambda"

Fluid Compute is Vercel's 2025 rethink of serverless. Instead of one container per request, a single Node instance handles many concurrent requests, the way a long-running Node server does. The runtime stays warm. When traffic spikes, Vercel spins up additional instances. When it drops, they scale down.

Far fewer cold starts: a warm instance handles requests with no spin-up. A cold start only happens on the first request to a brand-new instance.
Lower cost for I/O-bound work: while one request is waiting for an LLM token, the same instance is happily handling three other requests. You are billed mainly for active CPU time rather than wall-clock time, so idle waiting costs very little.
Full Node compatibility: it is still real Node.js, so all your npm deps work. No edge subset.
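
The multiplexing idea is ordinary Node concurrency. The sketch below is not Vercel's implementation, just a simulation in a single process: three fake handlers each wait on I/O (a 20ms timer standing in for a database or LLM call), and a counter records how many were in flight at once. All names here are hypothetical.

```typescript
let inFlight = 0;
let peakInFlight = 0;

// Stand-in for waiting on a database query or an LLM token.
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// A fake request handler that spends almost all of its time waiting.
async function handleRequest(id: number): Promise<string> {
  inFlight++;
  peakInFlight = Math.max(peakInFlight, inFlight);
  await sleep(20); // the event loop is free to start other handlers here
  inFlight--;
  return "done " + id;
}

// Start three requests without waiting for any of them to finish first.
async function simulate(): Promise<number> {
  await Promise.all([handleRequest(1), handleRequest(2), handleRequest(3)]);
  return peakInFlight;
}
```

Calling simulate() drives peakInFlight to 3: each handler runs until its first await, then yields, so one process holds all three requests open at once. The flip side is that module-level variables such as inFlight are shared by every request on the instance, so treat them as shared state.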

app/api/chat/route.ts

// No special opt-in needed: Fluid is the default for Node functions on Vercel.
// The runtime keeps the instance warm and lets multiple in-flight requests share it.

import { streamText, convertToModelMessages } from "ai";

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: "anthropic/claude-opus-4-5",
    messages: convertToModelMessages(messages),
  });

  // While this stream is open, the same instance accepts more incoming requests.
  return result.toUIMessageStreamResponse();
}

Why Fluid is huge for AI

LLM responses are slow (5 to 30+ seconds) and almost entirely I/O. Under classic serverless that is 30 seconds of billed duration per request, most of it spent waiting. Under Fluid, the same instance can serve dozens of streams in parallel while you pay mostly for the CPU time actually used. AI apps are the killer use case.
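
To put rough numbers on that comparison, here is a back-of-the-envelope sketch. It assumes a 30-second streamed response that uses only half a second of actual CPU, and it ignores memory and per-invocation charges. Both the figures and the function names are illustrative assumptions, not Vercel pricing.

```typescript
// One request's profile: how long it stays open vs. how much CPU it really uses.
interface RequestProfile {
  wallClockSeconds: number;
  activeCpuSeconds: number;
}

// Classic model: each request holds its own container for the full duration.
function billedSecondsClassic(requests: number, profile: RequestProfile): number {
  return requests * profile.wallClockSeconds;
}

// CPU-time model: only the time spent actually computing is counted.
function billedCpuSecondsFluid(requests: number, profile: RequestProfile): number {
  return requests * profile.activeCpuSeconds;
}

// Assumed profile for a streamed LLM response.
const llmStream: RequestProfile = { wallClockSeconds: 30, activeCpuSeconds: 0.5 };

const classicSeconds = billedSecondsClassic(100, llmStream); // 100 * 30  = 3000
const fluidSeconds = billedCpuSecondsFluid(100, llmStream);  // 100 * 0.5 = 50
```

The exact ratio depends entirely on how I/O-bound your handler is. A CPU-heavy handler would show almost no difference.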

How to decide

Use this decision tree as a starting point.

Does the request need a quick decision (under 50ms) and no heavy deps? Edge. Auth, redirects, geo, A/B.
Is it I/O-bound (DB queries, LLM calls, external HTTP)? Node + Fluid. Especially anything streaming.
Is it CPU-bound or long-running (image processing, video encoding, agentic loops)? Node function with a longer timeout, or move it to a background worker / cron / queue.
Does it need persistent connections (WebSockets, pub/sub)? A separate long-running server. Serverless and edge are both poor fits.
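
If it helps to see the tree as code, here is one way to sketch it. The Workload fields, the target labels, and the function name are hypothetical. Checking persistent connections first and falling back to Node + Fluid are assumptions made so that every workload gets exactly one answer.

```typescript
interface Workload {
  quickDecision: boolean;            // needs an answer in under ~50ms
  heavyDeps: boolean;                // e.g. Prisma, Sharp, Puppeteer
  ioBound: boolean;                  // DB queries, LLM calls, external HTTP
  cpuBoundOrLongRunning: boolean;    // image/video processing, agentic loops
  needsPersistentConnection: boolean; // WebSockets, pub/sub
}

type Target =
  | "edge"
  | "node-fluid"
  | "node-long-timeout-or-background"
  | "long-running-server";

function pickRuntime(w: Workload): Target {
  // Checked first (an assumption): neither serverless nor edge fits this case.
  if (w.needsPersistentConnection) return "long-running-server";
  if (w.quickDecision && !w.heavyDeps) return "edge";
  if (w.ioBound) return "node-fluid";
  if (w.cpuBoundOrLongRunning) return "node-long-timeout-or-background";
  // Fallback (an assumption): treat Node + Fluid as the general default.
  return "node-fluid";
}
```

For example, an auth check (quick, no heavy deps) maps to "edge", while a streaming chat route (I/O-bound, not a quick decision) maps to "node-fluid".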

A quick benchmark mental model

Rough numbers to keep in your head:

# Cold start
Edge isolate         ~5ms
Node Fluid           ~0ms (warm) / ~200ms (cold)
Classic Lambda Node  ~300-1000ms (cold)

# Memory ceiling (per isolate / per instance)
Edge isolate         ~128 MB
Node Fluid           512 MB - 4 GB (configurable)

# Concurrency per instance
Edge isolate         many (isolates are cheap)
Node Fluid           dozens of in-flight requests
Classic Lambda       1 per container
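
The concurrency figures translate directly into instance counts. This sketch reuses the 100 concurrent users from earlier and assumes, for illustration only, that "dozens" means 24 in-flight requests per Fluid instance. The function name is hypothetical.

```typescript
// How many instances are needed to hold a given number of in-flight requests.
function instancesNeeded(concurrentRequests: number, perInstanceConcurrency: number): number {
  if (perInstanceConcurrency < 1) {
    throw new RangeError("perInstanceConcurrency must be at least 1");
  }
  return Math.ceil(concurrentRequests / perInstanceConcurrency);
}

const classicInstances = instancesNeeded(100, 1); // one request per container: 100
const fluidInstances = instancesNeeded(100, 24);  // assumed 24 per instance: 5
```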

The bottom line: edge for tiny stateless work, Fluid for everything else. Classic single-request-per-container serverless is increasingly a legacy pattern on Vercel.

Quick quiz

Why is putting a function at the edge sometimes worse than running it in a single region?

Recap

Edge runtimes use V8 isolates, run close to users, give you web-standard APIs and tight resource limits.
Node functions give you the full npm ecosystem, run in a region, and historically suffered from cold starts and per-request containers.
Fluid Compute is Vercel's new default: warm Node instances multiplex concurrent requests, slashing cost for I/O-bound work.
Edge wins for short, stateless, latency-critical work (auth, redirects, geo). Fluid Node wins for I/O-heavy, dep-heavy, or streaming work.
Always co-locate functions with their databases. A fast function plus a slow DB call equals a slow user experience.