Chat Island: Reliable AI Chat Without Streaming

April 2, 2026 · 7 min read
Part of Idearc

A small island floating above a chaotic sea of interconnected components, representing an isolated piece of state protected from the surrounding system.

Every time the chat broke, Claude recommended streaming as the fix. I kept saying no.

My hesitation wasn't about complexity. Streaming commits you to a paradigm where the response is already rendering in the UI before the server is done with it. If you want to intercept the response, detect tool calls, run follow-up queries, or validate before showing anything, streaming makes all of that awkward. Complete responses keep the server in charge until there's something worth sending.

Choosing complete responses didn't eliminate complexity. It moved it. And where it landed was chat state inside a server-rendered app that refreshes its component tree whenever the AI writes data. That's where this gets interesting, and that's what I'm calling the Chat Island.

Two Ways to Build AI Chat

Two terms that matter for the rest of this post.

Streaming sends tokens to the client as they arrive. The model starts generating, the SSE connection opens, and the UI renders incrementally. The user sees something immediately.

Complete response waits. The server calls the model, waits for the full reply, does whatever it needs to do with it, then sends a single JSON response. The client renders once.
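On the client, the whole exchange collapses into one request and one state update. A minimal sketch of that side (the `/api/chat` route and response shape mirror this app's; the injectable `fetchImpl` parameter is my addition, purely for testability):

```typescript
// Minimal sketch of the client side of a complete-response chat:
// one request out, one JSON payload back, one render.
type ChatResponse = { reply: string; actions_taken: string[] }

async function sendChatMessage(
  message: string,
  fetchImpl: typeof fetch = fetch
): Promise<ChatResponse> {
  const res = await fetchImpl('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message }),
  })
  if (!res.ok) throw new Error(`chat request failed: ${res.status}`)
  // By the time this resolves, the server has already run tool calls
  // and validation; the client renders exactly once.
  return (await res.json()) as ChatResponse
}
```

No append logic, no partial messages: the response is either a complete `ChatResponse` or an error.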

Streaming wins on feel. Everything else gets worse. Here's the tradeoff:

|  | Streaming | Complete response |
| --- | --- | --- |
| Perceived latency | Low — first token fast | Higher — full wait |
| Server route complexity | High — SSE, chunked handling | Low — standard JSON |
| Client state complexity | High — partial renders, append logic | Low — single state update |
| Testability | Hard — async chunks, timing-sensitive | Straightforward |
| Error handling | Complex — mid-stream failures | Standard try/catch |
| Response interception | Awkward — response is already rendering | Clean — full control before the user sees anything |
| Cancellation | User sees a half-rendered response get cut off | Response either arrives complete or doesn't arrive |

For most AI chat, that feel advantage is enough. In a tool-use-heavy app where the server needs to act on the model's output before the user sees it, it isn't. The interception row is the one that mattered to me.

Why Complete Responses

Complete responses give you control. Everything else follows from that.

Control. The server owns the entire lifecycle of the request. Before the user sees anything, you can run tool calls, execute database writes, detect empty replies and inject fallbacks, and validate the response. Nothing is in flight on the client while you're still working. The user gets the finished thing.

Simplicity. A complete-response route handler is just a POST endpoint. Call the model, do your work, return JSON. No SSE setup, no chunked transfer encoding, no client-side append logic. Primitives every developer already knows.

Testability. The entire request lifecycle is synchronous and server-side. Mock the model, assert the response. No timing sensitivity, no partial state to simulate, no async token stream to fake.
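A sketch of what that looks like in practice. `ChatModel` and `handleChat` here are hypothetical stand-ins, not the app's actual types; the point is that a complete-response handler tests like any ordinary async function:

```typescript
// Hypothetical stand-in for the model client (not the real SDK type).
interface ChatModel {
  sendMessage(message: string): Promise<{ text: string }>
}

// A stripped-down complete-response handler: call the model, wait for
// the full reply, normalize it, return plain data.
async function handleChat(model: ChatModel, message: string) {
  const result = await model.sendMessage(message)
  const reply = result.text.trim() || 'Done. What else can I help with?'
  return { reply, actions_taken: [] as string[] }
}

// Mocking the model is a three-line object. No streams, no timers,
// no partial state to simulate.
const fakeModel: ChatModel = {
  async sendMessage() {
    return { text: '  Here is your answer.  ' }
  },
}
```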

Here's the route:

```ts
const result = await withGeminiTiming('chat', () => chat.sendMessage(message))
const functionCalls = result.response.functionCalls()

if (functionCalls && functionCalls.length > 0) {
  // execute tool calls: add_features, update_feature, delete_feature
  // write results to the database
  const followUp = await withGeminiTiming('chat:followup', () =>
    chat.sendMessage(functionResponses)
  )
  const reply = followUp.response.text().trim() || 'Done. What else can I help with?'
  await supabase.from('conversation_messages').insert([
    { role: 'user', message },
    { role: 'assistant', message: reply },
  ])
  return NextResponse.json({ reply, actions_taken })
}

// no tool calls — plain text reply
const reply = result.response.text().trim()
await supabase.from('conversation_messages').insert([
  { role: 'user', message },
  { role: 'assistant', message: reply },
])
return NextResponse.json({ reply, actions_taken: [] })
```

The wait is the only real hurdle. The rest of this post is about managing it.

The Problem You Inherit

In Next.js App Router, refreshing server-side data means calling router.refresh(). It re-runs the server component tree and pushes fresh props down to the client. It doesn't know or care that your chat is in the middle of a request.

If ChatProvider lives inside that tree, its state gets rebuilt. The thinking indicator vanishes. The cancel button disappears. The input stays disabled with no explanation. Any response that arrives after the refresh lands in a context that has effectively reset.

In a tool-use-heavy app, this breaks things. The AI assistant doesn't just answer questions. It adds features, updates the idea, detects dependencies, and writes to the database. Some requests chain multiple AI calls and take up to 10 seconds. From the user's seat it was maddening. Chat worked sometimes, did nothing other times, and occasionally responded to something I had already moved on from.

This is obvious in hindsight.

```mermaid
flowchart TD
  Page["IdeaPage\n(Server Component)"]
  WT["WorkspaceTabs"]
  CP["ChatProvider ⚠️\n(inside re-render boundary)"]
  WTI["WorkspaceTabsInner"]
  Conv["ConversationPanel"]
  Page --> WT --> CP --> WTI --> Conv
  WTI -- "router.refresh()" --> Page
  style CP fill:#e74c3c,color:#fff
  style Page fill:#2c3e50,color:#fff
```

Chat Island

Move ChatProvider outside the component tree that refreshes. It needs to exist on an island, above the blast radius of router.refresh().

When the AI completes a request that wrote data, here's what happens:

  1. ChatProvider fires onActionsCompleted(), a callback passed in by the parent
  2. The parent shell fetches fresh data directly from Supabase and updates local state
  3. The server component tree never re-renders

ChatProvider knows nothing about workspace data. The moment it does, refreshes step all over chat. The callback is the boundary.

```mermaid
flowchart TD
  Page["IdeaPage\n(Server Component)"]
  subgraph island["Chat Island (never re-renders)"]
    WT["WorkspaceTabs\n(shell + local state)"]
    CP["ChatProvider ✓"]
    WTI["WorkspaceTabsInner"]
    Conv["ConversationPanel"]
  end
  DB[(Supabase)]
  Page --> WT
  WT --> CP --> WTI --> Conv
  WTI -- "onActionsCompleted()" --> WT
  WT -- "direct fetch" --> DB
  style CP fill:#27ae60,color:#fff
  style island fill:#1a2a3a,color:#fff
```

router.refresh() is never called.

```tsx
'use client'

import { useCallback, useState } from 'react'
// App-specific Supabase browser client; import path assumed.
import { createClient } from '@/lib/supabase/client'

export function WorkspaceTabs(props: WorkspaceTabsProps) {
  const [localFeatures, setLocalFeatures] = useState(props.features)
  const [localCompetitors, setLocalCompetitors] = useState(props.competitors)

  const handleActionsCompleted = useCallback(async (actions: string[]) => {
    const supabase = createClient()
    const { data } = await supabase
      .from('features')
      .select('*, feature_dependencies!feature_id(*)')
      .eq('idea_id', props.ideaId)
    if (data) setLocalFeatures(data)
  }, [props.ideaId])

  return (
    <ChatProvider ideaId={props.ideaId} onActionsCompleted={handleActionsCompleted}>
      <WorkspaceTabsInner
        {...props}
        localFeatures={localFeatures}
        localCompetitors={localCompetitors}
        // ...
      />
    </ChatProvider>
  )
}
```

That boundary is the Chat Island. The rest of the app updates around the chat, not through it.

The Generation Counter

Once ChatProvider is isolated, a new problem surfaces. The user can cancel a request, but the HTTP fetch has already left. You can't call it back.

The generation counter handles this. Every sendMessage call increments a counter and captures its value:

```ts
const myGeneration = ++generationRef.current
```

When the response arrives, it checks in:

```ts
if (generationRef.current !== myGeneration) return
```

Anything that should invalidate an in-flight request just bumps the counter:

| Event | Effect |
| --- | --- |
| User cancels | Counter increments |
| 45s timeout fires | Counter increments |
| User sends a new message | Counter increments |

The late response arrives, checks the counter, finds a mismatch, and returns. No state corruption. No stuck UI. No AbortController.
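Stripped of React, the whole mechanism is a mutable counter and one comparison. A sketch, where a plain object stands in for the ref:

```typescript
// A plain mutable box stands in for useRef.
const generationRef = { current: 0 }
let lastReply: string | null = null

async function sendMessage(
  text: string,
  callModel: (text: string) => Promise<string>
) {
  // Capture this request's generation before awaiting.
  const myGeneration = ++generationRef.current
  const reply = await callModel(text)
  // A cancel, a timeout, or a newer message bumped the counter:
  // this reply is stale, so drop it.
  if (generationRef.current !== myGeneration) return
  lastReply = reply
}

function cancel() {
  // Invalidating the in-flight request is just an increment.
  generationRef.current++
}
```

The request that was cancelled still resolves; it just finds a counter it no longer matches and writes nothing.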

Ref Mirrors and the Zero-Dep sendMessage

When a React context value changes, every component reading from that context re-renders. Trigger that mid-request and you disrupt the state managing the response.

The root cause: sendMessage had isStreaming in its dependency array. Every setIsStreaming(true) recreated the function, issued a new context value, and triggered the cascade.

```ts
const sendMessage = useCallback(async (text: string) => {
  if (!isStreaming) { /* ... */ }
}, [isStreaming]) // recreated every time isStreaming changes
```

The fix is a ref mirror. State stays reactive for the UI. The ref stays in sync. sendMessage reads the ref and drops the dependency entirely.

```ts
const isStreamingRef = useRef(false)
useEffect(() => { isStreamingRef.current = isStreaming }, [isStreaming])

const sendMessage = useCallback(async (text: string) => {
  if (isStreamingRef.current) return
  // ...
}, []) // empty deps. stable reference, forever.
```

The same pattern applies to onActionsCompleted. It's a prop that changes on every render. Mirror it into a ref, read the ref, deps stay empty.

Two things in a chat context that must never be confused: reactive UI state that drives rendering, and stable function references that components depend on. Refs are the bridge.
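Outside React, the mechanics of that bridge are just a stable closure reading through a mutable box. A sketch (the render simulation is mine, not app code):

```typescript
// The box survives; its contents are swapped on every "render".
type ActionsCallback = (actions: string[]) => void
const callbackRef: { current: ActionsCallback } = { current: () => {} }
const log: string[] = []

// Created once, never recreated when the prop changes:
// the moral equivalent of an empty dependency array.
const notifyActions = (actions: string[]) => callbackRef.current(actions)

// Simulate two renders, each passing a fresh callback prop.
callbackRef.current = (a) => log.push(`render1:${a.join(',')}`)
notifyActions(['add_features'])
callbackRef.current = (a) => log.push(`render2:${a.join(',')}`)
notifyActions(['update_feature'])
```

`notifyActions` never changes identity, yet each call reaches the latest callback. That is all a ref mirror does.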

Building the Experience on a Solid Foundation

Once this is in place, the latency argument mostly disappears.

The case for streaming is instant feedback. The user sees tokens arriving and knows something is happening. But that problem is solvable without streaming. A thinking indicator that evolves over time communicates progress just as well, and it's a handful of lines.

| Elapsed | Message |
| --- | --- |
| 0s | Thinking... |
| 5s | Working on it... |
| 15s | Almost there... |

No timers in the component. No special state. Just a number and some thresholds.
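As a pure function, the indicator is this small (the name `thinkingMessage` is mine; the thresholds match the table above):

```typescript
// Elapsed seconds in, progress message out. The component just calls
// this with a ticking number; no timers or state live in here.
function thinkingMessage(elapsedSeconds: number): string {
  if (elapsedSeconds >= 15) return 'Almost there...'
  if (elapsedSeconds >= 5) return 'Working on it...'
  return 'Thinking...'
}
```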

The cancel button appears after 4 seconds and calls cancelStreaming(), which bumps the generation counter. The in-flight response arrives, finds a mismatch, and returns. The UI resets cleanly.

A 45-second timeout does the same. Bumps the counter, sets a visible error, no stuck input.

What started as an awkward wait is now a responsive, pleasant experience. Not because we added streaming. Because we managed the wait.

Own Your Chat State

Chat state doesn't behave like app data. App data refreshes when the server says so. Chat state belongs to the user. It accumulates, it persists, and it should never be disrupted by something happening elsewhere in the app.

The Chat Island is just that principle made concrete. ChatProvider sits outside the component tree that refreshes. It communicates through callbacks, not through shared state. The app updates around it, not through it.

Get that boundary right and everything else slots in cleanly: the generation counter, the ref mirrors, the thinking indicator. Get it wrong and no amount of UX polish will make the chat feel reliable.

This is part 1 of a 2-part series. Part 2 is a deeper dive and covers the full implementation: ChatContext, the WorkspaceTabs restructure, the test suite, and a reference PRD you can adapt for your own app.

Welcome to Chat Island.

tool-calling · ai-chat · nextjs · state-management · server-components · chat-architecture · streaming · software-architecture