Chapter 9

Streaming and Suspense: Progressive Rendering in Practice

Traditional server-side rendering has a fundamental problem: it is blocking. The server must finish rendering the entire page before it can send a single byte to the browser. If the page includes a database query that takes 800ms, the user stares at a blank screen or spinner for 800ms before seeing any content at all.

Streaming solves this. It allows the server to begin sending data to the client before rendering is complete, enabling progressive page presentation. The user sees meaningful content almost immediately — the page fills in as data becomes available, rather than appearing all at once after the slowest query completes.

HTTP Chunked Transfer Encoding: The Foundation of Streaming

Over HTTP/1.1, streaming relies on the chunked transfer encoding mechanism. In an ordinary response, the server declares a Content-Length header before transmitting the body, so it must know the full size up front. Chunked transfer encoding removes this requirement by sending the body as a series of length-prefixed chunks (HTTP/2 and HTTP/3 offer the same capability through their own framing instead of this header):

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: text/html

1a\r\n                          ← Size of first chunk (hexadecimal)
<html><head>...</head>\r\n      ← First chunk content
...\r\n                         ← More chunks follow...
0\r\n                           ← Zero-length chunk signals end of stream
\r\n

As soon as the browser receives any chunk, it can begin parsing and rendering — it does not wait for the complete response. Next.js exploits this mechanism to split the page into multiple chunks and send each chunk as soon as its data is ready.
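To make the framing concrete, here is a minimal TypeScript sketch of how a body could be split into chunks by hand. The helper names encodeChunk and encodeChunkedBody are illustrative only; they are not part of Node.js or Next.js, and in practice the HTTP server does this framing for you.

```typescript
const encoder = new TextEncoder()

// Frame one chunk: size in hexadecimal, CRLF, the data itself, CRLF.
// The size is measured in bytes of the UTF-8 encoding, not in characters.
function encodeChunk(data: string): string {
  const byteLength = encoder.encode(data).length
  return `${byteLength.toString(16)}\r\n${data}\r\n`
}

// Frame a whole body. Empty parts are skipped, because a zero-length
// chunk is reserved for signalling the end of the stream.
function encodeChunkedBody(parts: string[]): string {
  const framed = parts.filter((part) => part.length > 0).map(encodeChunk)
  return framed.join('') + '0\r\n\r\n'
}
```

Because the size counts bytes rather than characters, the sketch measures the encoded text instead of using the string length; a single "é" is framed with a size of 2.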

On the server, this maps onto stream APIs. The Next.js App Router uses the Web Streams API (ReadableStream) internally to implement streamed HTML rendering. The connection between the rendering model and the wire protocol is direct: each Suspense boundary becomes a stream segment.
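The idea of emitting segments as they become ready can be sketched with the Web Streams API directly. This is not how Next.js implements streaming internally; the Segment type and the streamPage and collect helpers are hypothetical names used only for illustration.

```typescript
// Hypothetical segment: an id plus a promise for its rendered HTML.
interface Segment {
  id: string
  html: Promise<string>
}

// Emit the shell immediately, then each segment as soon as its HTML resolves.
// If any segment rejects, the returned promise rejects and the stream errors.
function streamPage(shell: string, segments: Segment[]): ReadableStream<string> {
  return new ReadableStream<string>({
    async start(controller) {
      controller.enqueue(shell)
      await Promise.all(
        segments.map(async (segment) => {
          const html = await segment.html
          controller.enqueue(`<div data-segment="${segment.id}">${html}</div>`)
        }),
      )
      controller.close()
    },
  })
}

// Read the stream to the end and return the chunks in arrival order.
async function collect(stream: ReadableStream<string>): Promise<string[]> {
  const reader = stream.getReader()
  const chunks: string[] = []
  while (true) {
    const result = await reader.read()
    if (result.done) break
    chunks.push(result.value)
  }
  return chunks
}
```

In this model the order of the chunks depends on when each promise settles, not on where the segment sits in the page, which mirrors the behaviour described above.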

Suspense Boundaries: Telling Next.js Where to Cut

React's <Suspense> component is the control interface for streaming. It tells React: "this part of the UI might not be ready yet; use the fallback as a placeholder until it is."

In Next.js App Router, <Suspense> boundaries double as cut points in the HTML stream:

// app/dashboard/page.tsx
import { Suspense } from 'react'
import { RevenueChart } from './RevenueChart'
import { RecentOrders } from './RecentOrders'
import { TopProducts } from './TopProducts'
import { ChartSkeleton, OrdersSkeleton, ProductsSkeleton } from './Skeletons'

export default function DashboardPage() {
  return (
    <div className="dashboard">
      <h1>Analytics Overview</h1>

      {/* Each Suspense boundary is an independent stream segment */}
      <Suspense fallback={<ChartSkeleton />}>
        <RevenueChart />
      </Suspense>

      <div className="grid grid-cols-2">
        <Suspense fallback={<OrdersSkeleton />}>
          <RecentOrders />
        </Suspense>

        <Suspense fallback={<ProductsSkeleton />}>
          <TopProducts />
        </Suspense>
      </div>
    </div>
  )
}

// app/dashboard/RevenueChart.tsx
import { db } from '@/lib/db'

// Async Server Component — triggers Suspense boundary
export async function RevenueChart() {
  // Assume this query takes ~600ms
  const data = await db.order.groupBy({
    by: ['month'],
    _sum: { amount: true },
    orderBy: { month: 'asc' },
  })

  return (
    <div className="chart">
      {data.map(d => (
        <div
          key={d.month}
          style={{ height: `${(d._sum.amount ?? 0) / 100}px` }}
          className="chart-bar"
        />
      ))}
    </div>
  )
}

The rendering sequence proceeds as follows:

~0ms: Next.js sends the HTML header and the static page skeleton — including the HTML for ChartSkeleton, OrdersSkeleton, and ProductsSkeleton. The user sees the page structure immediately.
~300ms: The RecentOrders query completes. Next.js streams the real content for that section, along with a small inline <script> that swaps it into the place of the skeleton. The swap does not have to wait for the page's JavaScript bundle to load.
~500ms: TopProducts completes, streamed and swapped in.
~600ms: RevenueChart completes, streamed and swapped in.

The user sees the page structure at 0ms and watches content fill in progressively. Without streaming, they would see nothing until 600ms — when the slowest query finishes.
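The arithmetic behind this timeline can be sketched as a small model. The Section type and both helpers are hypothetical, and the latencies are the assumed figures from the example above, not measurements.

```typescript
interface Section {
  name: string
  latencyMs: number
}

// With streaming, each section appears when its own data is ready,
// so the reveal order is simply the sections sorted by latency.
function streamingTimeline(sections: Section[]): Section[] {
  return [...sections].sort((a, b) => a.latencyMs - b.latencyMs)
}

// Without streaming, nothing is sent until the slowest section finishes.
function blockingFirstContentMs(sections: Section[]): number {
  return Math.max(0, ...sections.map((section) => section.latencyMs))
}

const dashboard: Section[] = [
  { name: 'RevenueChart', latencyMs: 600 },
  { name: 'RecentOrders', latencyMs: 300 },
  { name: 'TopProducts', latencyMs: 500 },
]

const revealOrder = streamingTimeline(dashboard).map((section) => section.name)
const blockingMs = blockingFirstContentMs(dashboard)
```

Here revealOrder is RecentOrders, TopProducts, RevenueChart, and blockingMs is 600: the same total work, but the first real content arrives at 300ms instead of 600ms.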

loading.tsx: Route-Level Suspense Shortcut

Next.js provides a file convention that creates a top-level Suspense boundary automatically: create loading.tsx in a route directory and it becomes the fallback for a Suspense boundary wrapping that route's entire page content:

// app/dashboard/loading.tsx
export default function DashboardLoading() {
  return (
    <div className="dashboard animate-pulse">
      <div className="h-8 w-48 bg-gray-200 rounded mb-6" />
      <div className="h-64 bg-gray-200 rounded mb-4" />
      <div className="grid grid-cols-2 gap-4">
        <div className="h-48 bg-gray-200 rounded" />
        <div className="h-48 bg-gray-200 rounded" />
      </div>
    </div>
  )
}

loading.tsx provides a coarse-grained loading state for the entire route: it shows while the page component itself, or anything in its tree that is not wrapped in a nested <Suspense> boundary, is still pending. For finer granularity, add <Suspense> boundaries within the page itself.

The two approaches are complementary: loading.tsx for the route-level entry state, inline <Suspense> for independent sections that should load at different times.

Skeleton UI Design Principles

The quality of skeleton screens directly affects perceived performance. A well-designed skeleton communicates "content is coming" rather than "something broke." Key principles:

// components/skeletons/CardSkeleton.tsx
export function CardSkeleton() {
  return (
    // 1. Match exact dimensions of real content to prevent layout shift (CLS)
    <div className="card h-[240px] w-full">
      <div className="animate-pulse space-y-3 p-4">
        {/* 2. Shapes hint at content type (wide bar = heading, narrower = body text) */}
        <div className="h-4 bg-gray-200 rounded w-3/4" />
        <div className="h-3 bg-gray-200 rounded w-full" />
        <div className="h-3 bg-gray-200 rounded w-5/6" />
        {/* 3. Represent action areas */}
        <div className="pt-4 flex gap-2">
          <div className="h-8 bg-gray-200 rounded w-24" />
          <div className="h-8 bg-gray-200 rounded w-16" />
        </div>
      </div>
    </div>
  )
}

A shimmer animation, a highlight that sweeps across the skeleton, can signal loading more clearly than a uniform pulsing background:

/* globals.css */
@keyframes shimmer {
  0% { background-position: -200% 0; }
  100% { background-position: 200% 0; }
}

.skeleton-shimmer {
  background: linear-gradient(
    90deg,
    #f0f0f0 25%,
    #e0e0e0 50%,
    #f0f0f0 75%
  );
  background-size: 200% 100%;
  animation: shimmer 1.5s infinite;
}

The most important principle: the skeleton must have the same dimensions as the real content. If the skeleton is smaller and the content is larger, the page layout shifts when content loads — causing a high Cumulative Layout Shift (CLS) score that hurts both user experience and Core Web Vitals.
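To get a feel for the cost of a mismatched skeleton, the score of a single layout shift can be estimated as impact fraction times distance fraction, which is how the CLS metric defines it. The sketch below is a simplified model that assumes a full-width block moving vertically; the type and function names are hypothetical.

```typescript
interface Viewport {
  width: number
  height: number
}

interface Block {
  top: number
  height: number
}

// Clip a vertical span to the visible viewport.
function visibleSpan(top: number, height: number, viewportHeight: number): [number, number] {
  const start = Math.min(Math.max(top, 0), viewportHeight)
  const end = Math.min(Math.max(top + height, 0), viewportHeight)
  return [start, end]
}

// Score for one full-width block that moves down by shiftPx.
function verticalShiftScore(viewport: Viewport, block: Block, shiftPx: number): number {
  if (shiftPx === 0) return 0
  const [beforeStart, beforeEnd] = visibleSpan(block.top, block.height, viewport.height)
  const [afterStart, afterEnd] = visibleSpan(block.top + shiftPx, block.height, viewport.height)
  const overlap = Math.max(0, Math.min(beforeEnd, afterEnd) - Math.max(beforeStart, afterStart))
  // Impact fraction: union of the before and after areas, relative to the viewport.
  const unionHeight = beforeEnd - beforeStart + (afterEnd - afterStart) - overlap
  const impactFraction = unionHeight / viewport.height
  // Distance fraction: how far the block moved, relative to the largest viewport dimension.
  const distanceFraction = Math.abs(shiftPx) / Math.max(viewport.width, viewport.height)
  return impactFraction * distanceFraction
}

// A 140px skeleton replaced by 240px content pushes the block below it down by 100px.
const score = verticalShiftScore({ width: 1200, height: 800 }, { top: 140, height: 200 }, 100)
```

With these assumed numbers the single shift scores roughly 0.031. A few such shifts on one page are enough to approach the 0.1 threshold that is generally treated as the upper bound for a good CLS, while a skeleton with matching dimensions scores 0.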

Nested Suspense: Granular Loading States

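Suspense boundaries resolve independently of their siblings, and they can also be nested. A nested boundary that is still pending does not hold back its parent: the parent reveals as soon as its own content is ready and shows the inner fallback in place. The reverse does not hold, since inner content cannot appear before the parent boundary has revealed. The product page below uses sibling boundaries so that no section waits on another: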

// app/shop/[id]/page.tsx
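import { Suspense } from 'react'
// ProductHero, ProductDescription, ProductReviews, PriceAndStock, Recommendations,
// and the skeleton components are assumed to be imported from local modules.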

export default async function ProductDetailPage({
  params,
}: {
  params: Promise<{ id: string }>
}) {
  const { id } = await params

  return (
    <div className="product-detail">
      {/* Hero section — fast, rendered synchronously in the page component */}
      <ProductHero id={id} />

      <div className="grid grid-cols-3">
        <div className="col-span-2">
          {/* Product description — medium latency */}
          <Suspense fallback={<DescriptionSkeleton />}>
            <ProductDescription id={id} />
          </Suspense>

          {/* User reviews — slower (aggregation query) */}
          <Suspense fallback={<ReviewsSkeleton />}>
            <ProductReviews id={id} />
          </Suspense>
        </div>

        <aside>
          {/* Real-time price and stock — slowest, but highest priority for user */}
          <Suspense fallback={<PriceSkeleton />}>
            <PriceAndStock id={id} />
          </Suspense>

          {/* Recommendations — least critical, can load last */}
          <Suspense fallback={<RecommendationsSkeleton />}>
            <Recommendations id={id} />
          </Suspense>
        </aside>
      </div>
    </div>
  )
}

Each <Suspense> boundary resolves independently. ProductDescription streams in when its data is ready, without waiting for ProductReviews. Each section gives the user instant feedback: "I'm loading; I'll be here shortly."
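The page above uses sibling boundaries. When boundaries are genuinely nested, a useful rule of thumb is that a boundary's content becomes visible at the later of two moments: when its own data is ready, and when its parent boundary has revealed. The sketch below models that rule; the Boundary type, the revealTimes helper, and the inner component names ReviewPhotos and ReviewReplies are hypothetical.

```typescript
interface Boundary {
  name: string
  readyAtMs: number
  children?: Boundary[]
}

// Compute when each boundary's real content becomes visible.
// A child can never be revealed before its parent.
function revealTimes(
  boundary: Boundary,
  parentRevealMs = 0,
  out: Map<string, number> = new Map<string, number>(),
): Map<string, number> {
  const revealAt = Math.max(boundary.readyAtMs, parentRevealMs)
  out.set(boundary.name, revealAt)
  for (const child of boundary.children ?? []) {
    revealTimes(child, revealAt, out)
  }
  return out
}

const reviews: Boundary = {
  name: 'ProductReviews',
  readyAtMs: 400,
  children: [
    { name: 'ReviewPhotos', readyAtMs: 250 }, // ready early, but waits for the parent
    { name: 'ReviewReplies', readyAtMs: 700 }, // parent shows its fallback until 700ms
  ],
}

const times = revealTimes(reviews)
```

In this model ReviewPhotos appears at 400ms together with its parent, while ReviewReplies shows its own fallback inside the already visible reviews section until 700ms.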

The Waterfall Problem and How Promise.all Solves It

Streaming eliminates the waiting-for-each-other problem between independent components. But if a single component makes multiple sequential requests internally, the waterfall problem still exists within that component:

// ❌ Waterfall: three requests run in sequence
// Total time = 300 + 500 + 200 = 1000ms
export async function DashboardStats({ userId }: { userId: string }) {
  const user = await getUser(userId)         // 300ms — then waits for this to finish
  const orders = await getOrders(userId)     // 500ms — then waits for this
  const stats = await getStats(userId)       // 200ms

  return <StatsDisplay user={user} orders={orders} stats={stats} />
}

// ✅ Parallel with Promise.all: all requests fire simultaneously
// Total time = max(300, 500, 200) = 500ms
export async function DashboardStats({ userId }: { userId: string }) {
  const [user, orders, stats] = await Promise.all([
    getUser(userId),
    getOrders(userId),
    getStats(userId),
  ])

  return <StatsDisplay user={user} orders={orders} stats={stats} />
}

Take this further by splitting independent data sources into separate components with their own Suspense boundaries:

// ✅ Optimal: independent Suspense boundaries, truly concurrent
export default function Dashboard({ userId }: { userId: string }) {
  return (
    <div>
      <Suspense fallback={<UserSkeleton />}>
        <UserInfo userId={userId} />     {/* await getUser internally — 300ms */}
      </Suspense>
      <Suspense fallback={<OrdersSkeleton />}>
        <OrderList userId={userId} />    {/* await getOrders internally — 500ms */}
      </Suspense>
      <Suspense fallback={<StatsSkeleton />}>
        <StatsPanel userId={userId} />   {/* await getStats internally — 200ms */}
      </Suspense>
    </div>
  )
}

All three components' async operations execute concurrently on the server. Each streams to the client independently as it completes. The user sees StatsPanel at 200ms, UserInfo at 300ms, and OrderList at 500ms — rather than seeing all three together at 1000ms (sequential) or 500ms (Promise.all in one component). Progressive loading provides better perceived performance even when total data transfer time is the same.
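The examples above assume the requests are independent. When one request needs the result of another, a possible approach is to chain only the dependent call and still start everything else immediately. The Fetchers interface and getTeamStats below are hypothetical stand-ins, not functions from this chapter's codebase.

```typescript
interface Fetchers {
  getUser(userId: string): Promise<{ id: string; teamId: string }>
  getOrders(userId: string): Promise<string[]>
  getTeamStats(teamId: string): Promise<number>
}

async function loadDashboard(api: Fetchers, userId: string) {
  // Start the user request right away and keep the promise.
  const userPromise = api.getUser(userId)

  // getOrders does not depend on the user, so it starts in parallel.
  // getTeamStats is chained onto userPromise, so only that call waits.
  // All promises are handed to Promise.all together, which means a
  // rejection in any of them is observed rather than left unhandled.
  const [user, orders, teamStats] = await Promise.all([
    userPromise,
    api.getOrders(userId),
    userPromise.then((resolved) => api.getTeamStats(resolved.teamId)),
  ])

  return { user, orders, teamStats }
}
```

Under this arrangement the total time is the larger of the getOrders latency and the combined getUser plus getTeamStats latency, rather than the sum of all three.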

The `use` Hook: Reading Promises in Client Components

Sometimes you need to handle a Promise in a Client Component — typically one that was initiated by a Server Component. React 19's use Hook allows a Client Component to "suspend" during rendering while waiting for a Promise to resolve:

// app/notifications/page.tsx — Server Component
import { Suspense } from 'react'
import { NotificationList } from './NotificationList'
import { fetchNotifications } from '@/lib/notifications'

export default function NotificationsPage() {
  // Important: do NOT await — pass the Promise itself to the Client Component
  const notificationsPromise = fetchNotifications()

  return (
    <Suspense fallback={<div>Loading notifications...</div>}>
      <NotificationList notificationsPromise={notificationsPromise} />
    </Suspense>
  )
}

// app/notifications/NotificationList.tsx — Client Component
'use client'

import { use } from 'react'

interface Notification {
  id: string
  message: string
  read: boolean
  timestamp: string
}

export function NotificationList({
  notificationsPromise,
}: {
  notificationsPromise: Promise<Notification[]>
}) {
  // use() suspends this component's rendering until the Promise resolves.
  // During suspension, the nearest Suspense boundary shows its fallback.
  const notifications = use(notificationsPromise)

  return (
    <ul className="notification-list">
      {notifications.map(n => (
        <li
          key={n.id}
          className={`notification ${n.read ? 'opacity-50' : 'font-medium'}`}
        >
          <span>{n.message}</span>
          <time>{n.timestamp}</time>
        </li>
      ))}
    </ul>
  )
}

The use Hook differs from await in important ways: it works inside Client Components (which cannot be async functions; only Server Components can), it can be called conditionally and inside loops (unlike other Hooks), and it integrates with both Suspense and Error Boundaries. The Promise is initiated on the server (where fetchNotifications may have direct database access), then handed to the Client Component as a prop, so data fetching starts as early as possible. The value the Promise resolves to must be serializable, because it crosses the boundary between server and client.
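The general idea of "suspending" on a Promise can be illustrated without React. The sketch below is a conceptual model only: createResource is a hypothetical helper, and how React's use tracks Promises internally is an implementation detail that this does not claim to reproduce.

```typescript
type Status<T> =
  | { state: 'pending' }
  | { state: 'fulfilled'; value: T }
  | { state: 'rejected'; reason: unknown }

// Wrap a promise so it can be read synchronously.
// While pending, read() throws a promise the caller can wait on and retry.
function createResource<T>(promise: Promise<T>) {
  let status: Status<T> = { state: 'pending' }

  const tracked: Promise<void> = promise.then(
    (value) => {
      status = { state: 'fulfilled', value }
    },
    (reason) => {
      status = { state: 'rejected', reason }
    },
  )

  return {
    read(): T {
      if (status.state === 'fulfilled') return status.value
      if (status.state === 'rejected') throw status.reason // an error boundary's job
      throw tracked // still pending: a Suspense-like caller shows a fallback
    },
  }
}
```

A renderer built on this model would catch the thrown promise, show a fallback, and call read() again once it settles, which is loosely the division of labour between a suspended component and its nearest Suspense boundary.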

Complete Real-World Example: RSC + Suspense + Streaming

Here is the full architecture pattern applied to a real analytics dashboard:

// app/analytics/page.tsx
import { Suspense } from 'react'
import { cookies } from 'next/headers'
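import { redirect } from 'next/navigation'
// LiveVisitorCount, KeyMetrics, TrendChart, GeoMap, EventsTable, and the
// skeleton components are assumed to be imported from local modules.

// cookies() already opts this route into dynamic rendering;
// force-dynamic just makes that explicit.
export const dynamic = 'force-dynamic'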

export default async function AnalyticsPage() {
  const cookieStore = await cookies()
  const token = cookieStore.get('auth_token')?.value
  if (!token) redirect('/login')

  // Page renders the skeleton immediately, all Suspense fallbacks stream first
  return (
    <div className="analytics-dashboard">
      <header className="dashboard-header">
        <h1>Analytics</h1>
        {/* Fast data — appears quickly */}
        <Suspense fallback={<span className="skeleton-shimmer w-24 h-4 inline-block" />}>
          <LiveVisitorCount token={token} />
        </Suspense>
      </header>

      {/* Key metrics — medium latency */}
      <div className="metrics-grid">
        <Suspense fallback={<MetricsSkeleton count={4} />}>
          <KeyMetrics token={token} />
        </Suspense>
      </div>

      <div className="charts-section">
        {/* Trend chart — slower (aggregation query) */}
        <Suspense fallback={<ChartSkeleton height={300} />}>
          <TrendChart token={token} period="7d" />
        </Suspense>

        {/* Geographic heatmap — slowest (geospatial aggregation) */}
        <Suspense fallback={<MapSkeleton />}>
          <GeoMap token={token} />
        </Suspense>
      </div>

      {/* Detail table — allow it to load last */}
      <Suspense fallback={<TableSkeleton rows={10} />}>
        <EventsTable token={token} />
      </Suspense>
    </div>
  )
}

// app/analytics/KeyMetrics.tsx
import { fetchMetrics } from '@/lib/analytics'

export async function KeyMetrics({ token }: { token: string }) {
  // Parallel fetch for all metrics — not sequential
  const [pageViews, sessions, bounceRate, avgDuration] = await Promise.all([
    fetchMetrics(token, 'pageviews'),
    fetchMetrics(token, 'sessions'),
    fetchMetrics(token, 'bounce_rate'),
    fetchMetrics(token, 'avg_duration'),
  ])

  return (
    <div className="grid grid-cols-4 gap-4">
      <MetricCard label="Page Views" value={pageViews} />
      <MetricCard label="Sessions" value={sessions} />
      <MetricCard label="Bounce Rate" value={`${bounceRate}%`} />
      <MetricCard label="Avg. Duration" value={`${avgDuration}s`} />
    </div>
  )
}

The user experience of this dashboard: the full layout skeleton appears at ~0ms. The live visitor count appears at ~100ms. Key metrics appear at ~400ms (the longest of the four parallel metric fetches). The trend chart appears at ~800ms. The geo map appears at ~1200ms (it's the slowest — geospatial aggregation is expensive). The detail table appears whenever it's ready. At no point does the user see a blank page or a single full-page spinner.

Combining Error Boundaries with Suspense

Suspense handles the loading state. Error Boundaries handle the error state. Together they form complete async UI management:

// components/AsyncBoundary.tsx — Client Component
'use client'

import { Component, Suspense } from 'react'
import type { ReactNode } from 'react'

interface ErrorBoundaryState {
  hasError: boolean
  error?: Error
}

class ErrorBoundary extends Component<
  { children: ReactNode; fallback: ReactNode },
  ErrorBoundaryState
> {
  constructor(props: { children: ReactNode; fallback: ReactNode }) {
    super(props)
    this.state = { hasError: false }
  }

  static getDerivedStateFromError(error: Error) {
    return { hasError: true, error }
  }

  render() {
    if (this.state.hasError) {
      return this.props.fallback
    }
    return this.props.children
  }
}

// Convenience wrapper combining both
export function AsyncBoundary({
  children,
  loadingFallback,
  errorFallback,
}: {
  children: ReactNode
  loadingFallback: ReactNode
  errorFallback: ReactNode
}) {
  return (
    <ErrorBoundary fallback={errorFallback}>
      <Suspense fallback={loadingFallback}>
        {children}
      </Suspense>
    </ErrorBoundary>
  )
}

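Usage at the call site becomes clean and consistent. Note that the Retry button in this example is a placeholder: it has no click handler, and the boundary above never clears hasError, so a working retry needs additional reset logic: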

<AsyncBoundary
  loadingFallback={<MetricsSkeleton count={4} />}
  errorFallback={<div className="error-card">Failed to load metrics. <button>Retry</button></div>}
>
  <KeyMetrics token={token} />
</AsyncBoundary>
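Because KeyMetrics awaits Promise.all, a single rejected fetch rejects the whole component, and the boundary replaces all four cards with the error fallback. If showing partial results is preferable, one option is Promise.allSettled. The settleMetrics helper below is a hypothetical sketch of that idea, with null standing in for a metric that failed to load.

```typescript
// Run every fetcher in parallel and keep whatever succeeded.
async function settleMetrics<T>(
  fetchers: Record<string, () => Promise<T>>,
): Promise<Record<string, T | null>> {
  const entries = Object.entries(fetchers)
  const results = await Promise.allSettled(entries.map(([, fetcher]) => fetcher()))

  const metrics: Record<string, T | null> = {}
  results.forEach((result, index) => {
    const [name] = entries[index]
    // A rejected fetch becomes null so the caller can render a placeholder card.
    metrics[name] = result.status === 'fulfilled' ? result.value : null
  })
  return metrics
}
```

Whether a partially filled grid is better than a single error card is a product decision; the trade-off is that failures handled this way no longer reach the Error Boundary.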

Summary

Streaming leverages HTTP chunked transfer encoding to let the server send data as it becomes available, rather than waiting for the entire page to render. Suspense boundaries are the interface through which React tells Next.js where to cut the stream. loading.tsx provides a route-level Suspense shortcut for the initial loading state. Avoiding data waterfalls requires Promise.all for multiple requests within a single component, and splitting independent data sources into separate components with their own Suspense boundaries. The use Hook extends the Suspense model to Client Components that receive Promises as props. Applied together — RSC for data access, Suspense for boundary declaration, and streaming for progressive delivery — this system enables user-perceived performance that approaches native application responsiveness while retaining the SEO and accessibility benefits of server rendering.
