Your API routes should just enqueue

The bug that doesn’t exist anymore

Here’s a class of bug I used to ship and no longer can: a user clicks “Tailor my resume,” the route calls an LLM inline, the model takes 28 seconds, the serverless function hits its 30-second wall, the request 504s, and — this is the worst part — the model did finish, the user just never got the result, and on a metered plan I got billed for 28 seconds of a function doing nothing but waiting.

In RoleReady, that bug is architecturally impossible. Not “we added a timeout guard.” Impossible, because route handlers are not allowed to do the work. They enqueue it. There are 204 route files in the app and the canonical ones all obey the same contract.

The contract

Every long-running route does exactly five things, in order:

1. Authenticate the caller.
2. Validate the input (Zod).
3. Check the user owns the thing they’re acting on.
4. Pre-create an ai_tasks row with status pending.
5. Send an Inngest event with that task’s ID, and return the ID.

export async function POST(req: Request) {
  const { userId } = await withAuth(req);
  const { jobId } = ParseTailorBody.parse(await req.json());
  await assertOwnsJob(userId, jobId);

  // pre-create the row BEFORE enqueuing so the client has something to poll
  const task = await createAiTask({ userId, kind: 'tailor_resume', status: 'pending' });
  await inngest.send({
    name: 'ai/tailor.resume',
    data: { existingTaskId: task.id, jobId, userId },
  });

  return Response.json({ aiJobId: task.id }, { status: 202 });
}

Note the 202. The route is telling the truth: accepted, not done. The actual work lives in a separate Inngest function:

export const tailorResume = inngest.createFunction(
  { id: 'ai-tailor-resume', retries: 3, onFailure: markTaskFailed },
  { event: 'ai/tailor.resume' },
  async ({ event, step }) => {
    const intel = await step.run('job-intel', () => loadJobIntelligence(event.data.jobId));
    const result = await step.run('baml', () => b.TailorResume(/* ... */));
    await step.run('persist', () => saveTailoredResume(event.data.existingTaskId, result));
  },
);

The client polls GET /api/ai-jobs/{aiJobId} and reads pending → running → complete (or failed) off the row. There are 33 of these worker functions in the codebase. Every one of them runs with retries: 3 and an onFailure handler that flips the task to failed with a reason the UI can show.
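The client side of that loop can be sketched roughly as follows. This is a minimal illustration, not the app's actual client: the AiTaskView shape, the failureReason field name, the backoff numbers, and the helper names (isTerminal, pollDelayMs, pollAiTask) are all assumptions. Only the status values and the endpoint path come from the description above.

```typescript
// Hypothetical view of an ai_tasks row; only the status values come from the article.
type TaskStatus = 'pending' | 'running' | 'complete' | 'failed';

interface AiTaskView {
  id: string;
  status: TaskStatus;
  failureReason?: string; // assumed field name for the reason the UI shows
}

// A task is terminal once the worker has finished or given up.
function isTerminal(status: TaskStatus): boolean {
  return status === 'complete' || status === 'failed';
}

// Capped exponential backoff: 500ms, 1s, 2s, 4s, then 5s for every later attempt.
function pollDelayMs(attempt: number, baseMs = 500, capMs = 5000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Polls until the task reaches a terminal state or attempts run out.
// fetchTask is injected so the loop does not depend on a particular HTTP client.
async function pollAiTask(
  aiJobId: string,
  fetchTask: (id: string) => Promise<AiTaskView>,
  maxAttempts = 60,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<AiTaskView> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const task = await fetchTask(aiJobId);
    if (isTerminal(task.status)) return task;
    await sleep(pollDelayMs(attempt));
  }
  throw new Error(`ai task ${aiJobId} did not reach a terminal state`);
}

// Possible usage in the browser (the response body shape is an assumption):
// const task = await pollAiTask(aiJobId, async (id) => {
//   const res = await fetch(`/api/ai-jobs/${id}`);
//   if (!res.ok) throw new Error(`poll failed: ${res.status}`);
//   return (await res.json()) as AiTaskView;
// });
```

Capping the attempts matters for the same reason the row matters: the client should also reach a terminal state, even if that state is "gave up waiting."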

Why the pre-created row matters

The subtle decision is creating the ai_tasks row in the route, before the event is sent, and passing its ID into the event as existingTaskId. The naive version lets the worker create its own row when it starts. That opens a race: the route returns, the client immediately polls, and the row doesn’t exist yet — so the UI flashes “not found” before it flashes “running.” Pre-creating closes the gap. The ID exists the instant the client has it.

It also means the failure path is honest. If Inngest never even picks up the event, the row still sits there as pending, and a sweeper can age it out to failed instead of it vanishing silently. A task that was accepted should always have a terminal state.
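A sweeper along those lines might look like the sketch below. It is written as a pure function over in-memory rows so the rule is easy to see. The AiTaskRow shape, the ten-minute cutoff, and the failure message are assumptions for illustration, not values from RoleReady.

```typescript
type TaskStatus = 'pending' | 'running' | 'complete' | 'failed';

// Hypothetical row shape; the column names are assumptions.
interface AiTaskRow {
  id: string;
  status: TaskStatus;
  createdAt: Date;
  failureReason?: string;
}

// Assumed cutoff: the article does not say how long a task may stay pending.
const PENDING_TIMEOUT_MS = 10 * 60 * 1000;

// Returns a new list in which every task that has been pending longer than
// the timeout is marked failed. All other rows are returned unchanged.
function sweepStalePending(
  tasks: readonly AiTaskRow[],
  now: Date,
  timeoutMs: number = PENDING_TIMEOUT_MS,
): AiTaskRow[] {
  return tasks.map((task): AiTaskRow => {
    const ageMs = now.getTime() - task.createdAt.getTime();
    if (task.status === 'pending' && ageMs > timeoutMs) {
      return { ...task, status: 'failed', failureReason: 'never picked up by a worker' };
    }
    return task;
  });
}
```

Against a real database this would more likely be a single scheduled UPDATE that filters on status and creation time. The in-memory version only demonstrates the rule: pending plus too old becomes failed, and nothing else is touched.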

What this buys you

No request ever waits on a model. The slowest thing a route does is one INSERT and one event send, which is milliseconds of work rather than tens of seconds. Wall-clock cost on the serverless platform drops to near zero per AI action.
Retries are free and invisible. A transient model 500 is a retried step, not a user-facing error, and steps that already succeeded are not re-run. The user sees “running” the whole time.
Progress is real. The spinner is backed by a database row that actually changes state, not a setTimeout pretending.
The expensive code runs in one place. Billing and quota are enforced once, on the enqueue route (consumeFeatureQuotaIfAllowed), so there’s no path where a worker does paid work for a user who’s over their limit.
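The last point, gating quota at the enqueue route, can be sketched like this. Everything here is a hypothetical stand-in: consumeQuotaIfAllowed only imitates the idea behind consumeFeatureQuotaIfAllowed (whose real signature isn't shown here), the in-memory QuotaStore replaces whatever the app actually persists, and the 429 status is an assumption.

```typescript
// Hypothetical in-memory quota store: usage count per user plus a shared limit.
interface QuotaStore {
  used: Map<string, number>;
  limit: number;
}

// Checks and consumes in one step, so there is no gap between
// "allowed" and "charged". Returns false when the user is at the limit.
function consumeQuotaIfAllowed(store: QuotaStore, userId: string): boolean {
  const used = store.used.get(userId) ?? 0;
  if (used >= store.limit) return false;
  store.used.set(userId, used + 1);
  return true;
}

type EnqueueResult =
  | { status: 202; aiJobId: string }
  | { status: 429; error: string }; // assumed status code for "over quota"

// The gate runs before the task row exists and before any event is sent,
// so a worker never sees work for a user who is over the limit.
function enqueueWithQuota(
  store: QuotaStore,
  userId: string,
  createTask: () => string,
  sendEvent: (taskId: string) => void,
): EnqueueResult {
  if (!consumeQuotaIfAllowed(store, userId)) {
    return { status: 429, error: 'quota exceeded' };
  }
  const aiJobId = createTask();
  sendEvent(aiJobId);
  return { status: 202, aiJobId };
}
```

With a real database the check and the increment would need to be atomic, for example a single conditional UPDATE or a transaction, so two concurrent requests cannot both slip under the limit.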

The rule that keeps it honest

The reason this doesn’t rot is that it’s written down as law, not convention. The repo’s AGENTS.md — which both I and any AI agent working in the codebase must follow — states it flatly:

App routes enqueue; Inngest executes. No route-local fire-and-forget; no inline BAML in handlers for canonical long tasks; pre-create the ai_tasks row and return aiJobId.

There’s exactly one sanctioned exception: the interactive command parser that turns “schedule a follow-up with Stripe next Tuesday” into a typed plan runs inline, because it’s a synchronous conversation turn and billing is gated on the execution route it produces, not the parse. Every other long task enqueues.

When you don’t need this

I’d be lying if I said every app needs Inngest. If your slowest route is a 200ms database query, this is overhead you’ll resent. The pattern earns its complexity precisely when you have work that is (a) slower than a request should be, (b) flaky enough to want retries, and (c) something the user wants progress on. Long-running AI features are all three at once, which is why every one of them in the product goes through it and nothing else does.

The test is simple: if you’ve ever shipped a route that calls a model inline and prayed, you already have the bug. Enqueue it.