
How to Scale a Next.js App to 100k Users Without Breaking the Bank

A practical playbook for scaling Next.js apps to 100k users using ISR, multi-layer caching, CDN optimization, and backend tuning -- all for under $500/month.

Abhishek Patel · 9 min read



Your Next.js App Won't Scale Itself

Scaling a Next.js application to handle 100,000 concurrent users isn't about throwing money at bigger servers. It's about understanding where your bottlenecks live and eliminating them systematically. I've taken Next.js apps from choking at 5,000 users to comfortably serving 150,000 -- and the cost increase was under 40%. The secret? Most performance problems are self-inflicted: uncached API calls, bloated bundles, unoptimized images, and SSR pages that should be static.

This guide covers the exact playbook I use. We'll go layer by layer -- rendering strategy, caching, CDN, backend, and infrastructure -- with real numbers and cost estimates at each step.

What Does "Scaling Next.js" Actually Mean?

Definition: Scaling a Next.js application means optimizing its rendering pipeline, caching layers, and infrastructure so it serves more concurrent users with acceptable response times (under 200ms for pages, under 100ms for API routes) without proportionally increasing costs.

Scaling has two dimensions: vertical (bigger machines) and horizontal (more machines). Next.js apps benefit heavily from horizontal scaling because the framework already supports stateless rendering. But before you add servers, you need to make sure each server is doing as little unnecessary work as possible.

Step 1: Audit Your Rendering Strategy

The single biggest scaling lever in Next.js is choosing the right rendering strategy per page. Most teams default to SSR everywhere and wonder why their servers melt under load.

  1. Identify static pages -- marketing pages, blog posts, documentation, product listings that change infrequently. These should use Static Generation (SSG) or Incremental Static Regeneration (ISR).
  2. Move dynamic-but-cacheable pages to ISR -- product pages, user profiles, dashboards with minute-level freshness. Set revalidation periods of 60-300 seconds.
  3. Reserve SSR for truly dynamic content -- personalized pages, real-time data, authenticated views that can't be cached at the edge.
  4. Use client-side rendering for user-specific widgets -- shopping carts, notification bells, profile menus. Render the page shell server-side, hydrate user data client-side.

Pro tip: A single SSR page that takes 100ms to render can handle about 10 requests/second per CPU core. Switch it to ISR with a 60-second revalidation and one render serves every request for that page until the next revalidation -- effectively unlimited traffic. At 100k users, that's the difference between 50 servers and 3.
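In the App Router, moving a page to ISR is a one-line change. A sketch, assuming a hypothetical product listing page (the route path and the `getProducts` helper are illustrative):

```typescript
// app/products/page.tsx -- hypothetical ISR page
// Regenerate this page in the background at most once every 300 seconds.
export const revalidate = 300;

// Illustrative data-fetching helper -- replace with your real query.
async function getProducts(): Promise<{ id: number; name: string }[]> {
  return [{ id: 1, name: "Widget" }];
}

export default async function ProductsPage() {
  const products = await getProducts();
  return (
    <ul>
      {products.map((p) => (
        <li key={p.id}>{p.name}</li>
      ))}
    </ul>
  );
}
```

Every request between revalidations is served from the cached HTML; only one background render runs per window.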

Step 2: Implement Multi-Layer Caching

Caching is the force multiplier. Every cache hit is a request your server never processes. You need caching at four layers:

CDN / Edge Cache

Put Cloudflare, CloudFront, or Vercel's edge network in front of everything. Static assets get cached automatically. For ISR pages, the CDN serves stale content while revalidation happens in the background. A properly configured CDN handles 95%+ of requests without touching your origin.

Application-Level Cache

Use Redis or Memcached for computed data that's expensive to generate. Database query results, API response aggregations, session data. A Redis cache with 100MB of memory can eliminate thousands of database queries per second.

// Example: cache an expensive query in Redis with a 5-minute TTL
import { Redis } from 'ioredis';
const redis = new Redis(process.env.REDIS_URL);
// `db` is assumed to be your ORM client (e.g. a Prisma instance)

async function getPopularProducts() {
  const cached = await redis.get('popular-products');
  if (cached) return JSON.parse(cached);

  const products = await db.product.findMany({
    orderBy: { sales: 'desc' },
    take: 50,
  });

  // 'EX', 300 = expire the key after 300 seconds
  await redis.set('popular-products', JSON.stringify(products), 'EX', 300);
  return products;
}

In-Memory Cache

For data that changes rarely and is read constantly -- feature flags, configuration, navigation menus -- cache directly in the Node.js process memory. Libraries like lru-cache work well here. Zero network latency, zero serialization cost.
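lru-cache is the right tool in production, but the core idea fits in a few lines. A minimal hand-rolled sketch (no size limits or LRU eviction, which is what the library adds):

```typescript
// Minimal in-process TTL cache: reads cost zero network round-trips.
class TTLCache<V> {
  private store = new Map<string, { value: V; expires: number }>();
  constructor(private ttlMs: number) {}

  get(key: string): V | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expires) {
      this.store.delete(key); // lazily evict expired entries on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.store.set(key, { value, expires: Date.now() + this.ttlMs });
  }
}

// Cache feature flags for 60 seconds -- refetched at most once a minute.
const flagCache = new TTLCache<boolean>(60_000);
flagCache.set("new-checkout", true);
```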

Browser Cache

Set proper Cache-Control headers. Static assets should have long TTLs with content hashes in filenames. API responses that don't change per-user can carry s-maxage for CDN caching with stale-while-revalidate for seamless updates.
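For the API-response case, an App Router route handler can set those headers itself. A sketch -- the route path and payload are illustrative:

```typescript
// app/api/popular/route.ts -- hypothetical route handler.
// s-maxage lets the CDN cache the response for 5 minutes;
// stale-while-revalidate serves the stale copy while it refreshes.
export async function GET() {
  const products = [{ id: 1, name: "Widget" }]; // placeholder data
  return new Response(JSON.stringify(products), {
    headers: {
      "Content-Type": "application/json",
      "Cache-Control": "public, s-maxage=300, stale-while-revalidate=60",
    },
  });
}
```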

Step 3: Optimize Your Bundle

A bloated JavaScript bundle doesn't just slow down the client -- it increases server-side rendering time and memory usage. Every kilobyte matters at scale.

  • Analyze your bundle -- run next build and check the output. Any page over 200KB of JS needs attention.
  • Dynamic imports -- lazy-load heavy components (charts, editors, modals) with next/dynamic. If a component isn't visible on initial load, don't ship it.
  • Replace heavy libraries -- swap moment.js (300KB) for date-fns (tree-shakeable). Replace lodash with individual imports or native methods.
  • Tree-shake aggressively -- use named imports, enable sideEffects: false in package.json for your own code.
  • Image optimization -- use next/image with proper sizing. Serve WebP/AVIF. A single unoptimized 5MB hero image costs more bandwidth than your entire JS bundle.
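The library-swap bullet is often the quickest win: many lodash helpers have direct native equivalents that cost nothing in the bundle. A sketch:

```typescript
// Native replacements for common lodash helpers -- zero bundle cost.
const users = [
  { name: "Ada", role: "admin" },
  { name: "Lin", role: "dev" },
  { name: "Sam", role: "dev" },
];

// _.uniq(xs)  ->  [...new Set(xs)]
const roles = [...new Set(users.map((u) => u.role))];

// _.groupBy(xs, fn)  ->  a small reduce (or Object.groupBy in newer runtimes)
const byRole = users.reduce<Record<string, string[]>>((acc, u) => {
  (acc[u.role] ??= []).push(u.name);
  return acc;
}, {});
```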

Step 4: Backend and Database Tuning

Your Next.js frontend can be perfectly optimized, but if your API routes hit a slow database, nothing else matters.

Connection Pooling

Serverless and edge functions create new database connections per invocation. Without pooling, 1,000 concurrent users means 1,000 database connections -- and most databases choke at 100-200. Use PgBouncer for PostgreSQL or Prisma Accelerate for managed pooling.
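Alongside a pooler, make sure each container reuses one client rather than constructing a new one per request. A common Node pattern is to stash the client on `globalThis` -- sketched here with a generic factory, where `createDbClient` stands in for something like `new PrismaClient()`:

```typescript
// Reuse a single database client across module reloads and warm invocations.
type DbClient = { query: (sql: string) => void };

// Stand-in for your real client constructor (e.g. `new PrismaClient()`).
function createDbClient(): DbClient {
  return { query: (sql) => console.log(`executing: ${sql}`) };
}

const globalForDb = globalThis as unknown as { db?: DbClient };

// First import creates the client; later imports and hot reloads reuse it.
export const db = globalForDb.db ?? (globalForDb.db = createDbClient());
```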

Query Optimization

Enable query logging and find your slowest queries. Add indexes on columns used in WHERE, JOIN, and ORDER BY clauses. Select only the columns you need rather than SELECT *. Batch N+1 queries with a data loader.
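The N+1 point deserves a sketch: instead of one query per item, collect the IDs and fetch them in a single batch. Here `fetchUsersByIds` is a stand-in for one `WHERE id IN (...)` query:

```typescript
// Batch N+1 lookups: one IN-list query instead of N single-row queries.
type User = { id: number; name: string };

// Stand-in for `SELECT id, name FROM users WHERE id IN (...)`.
async function fetchUsersByIds(ids: number[]): Promise<User[]> {
  const table: Record<number, string> = { 1: "Ada", 2: "Lin", 3: "Sam" };
  return ids.map((id) => ({ id, name: table[id] }));
}

async function usersForPosts(
  posts: { authorId: number }[],
): Promise<Map<number, User>> {
  const ids = [...new Set(posts.map((p) => p.authorId))]; // dedupe first
  const users = await fetchUsersByIds(ids);               // single round-trip
  return new Map(users.map((u) => [u.id, u]));
}
```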

Read Replicas

For read-heavy applications (most web apps), route read queries to replicas and writes to the primary. This alone can 3-5x your database throughput. Most ORMs support read/write splitting natively.
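If your ORM doesn't split reads and writes for you, the routing logic is small. A sketch with round-robin replica selection (the connection strings are placeholders):

```typescript
// Route writes to the primary, reads round-robin across replicas.
type Conn = { url: string };

const primary: Conn = { url: "postgres://primary.internal/app" };  // placeholder
const replicas: Conn[] = [
  { url: "postgres://replica-1.internal/app" },                    // placeholder
  { url: "postgres://replica-2.internal/app" },                    // placeholder
];

let next = 0;
function pickConnection(isWrite: boolean): Conn {
  if (isWrite || replicas.length === 0) return primary; // writes hit the primary
  const conn = replicas[next % replicas.length];        // reads rotate replicas
  next += 1;
  return conn;
}
```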

Step 5: Horizontal Scaling Infrastructure

| Platform | Cost at 100k Users | Auto-Scaling | Cold Starts | Best For |
|---|---|---|---|---|
| Vercel | $150-400/mo | Automatic | Minimal (Edge) | Teams wanting zero DevOps |
| AWS ECS/Fargate | $200-500/mo | Target tracking | None (always-on) | Teams needing full control |
| AWS Lambda + CloudFront | $50-200/mo | Automatic | 100-500ms | Spiky traffic patterns |
| Kubernetes (EKS) | $300-800/mo | HPA/KEDA | None | Multi-service architectures |
| Coolify/Self-hosted | $80-150/mo | Manual/limited | None | Cost-sensitive solo devs |

Watch out: Vercel's pricing scales with function invocations, not just traffic. A misconfigured ISR page that revalidates every second can generate millions of invocations. Always set sensible revalidation periods and monitor your usage dashboard.

Step 6: CDN and Edge Optimization

At 100,000 users, your CDN isn't optional -- it's your primary serving layer. Configure it properly:

  1. Cache static assets forever -- Next.js hashes filenames, so set Cache-Control: public, max-age=31536000, immutable for /_next/static/.
  2. Cache ISR pages at the edge -- CloudFront or Cloudflare should cache HTML responses with the s-maxage header Next.js sets.
  3. Use edge middleware sparingly -- middleware runs on every request. Keep it lightweight: authentication checks, redirects, geo-routing. Don't do database queries in middleware.
  4. Enable compression -- Brotli at the edge reduces transfer sizes by 15-20% over gzip. Most CDNs support it natively.
  5. Configure origin shield -- a single point-of-presence that aggregates cache misses before they hit your origin. Reduces origin load by 50-80% during cache warming.

Cost Breakdown: Scaling to 100k Users

| Component | Service | Monthly Cost | Notes |
|---|---|---|---|
| Compute | 3x c6g.large (ECS) | $150 | ARM instances, 2 vCPU, 4GB RAM each |
| CDN | CloudFront | $50-80 | ~2TB transfer/month |
| Cache | ElastiCache (Redis) | $40 | cache.t4g.small |
| Database | RDS PostgreSQL | $100 | db.r6g.large with read replica |
| Load Balancer | ALB | $25 | Fixed + LCU charges |
| Monitoring | CloudWatch + Grafana | $30 | Custom metrics and dashboards |
| **Total** | | $395-425/mo | Compare to $2,000+ without optimization |

That's under $0.005 per user per month. The key savings come from ISR eliminating 95% of SSR compute and Redis eliminating 80% of database queries.

Frequently Asked Questions

How many concurrent users can a single Next.js server handle?

A single Next.js server on a 2-vCPU machine handles roughly 500-2,000 concurrent users depending on your rendering strategy and page complexity. SSR-heavy apps sit at the low end. Apps using ISR and proper caching reach the high end. With a CDN absorbing static requests, even a modest server handles significant traffic because most requests never reach the origin.

Should I use Vercel or self-host for high traffic?

Vercel is excellent up to moderate traffic and for teams without DevOps experience. At 100k+ users, self-hosting on AWS (ECS or EKS) typically costs 40-60% less and gives you full control over caching, networking, and scaling behavior. The tradeoff is operational complexity -- you need someone who understands container orchestration and CDN configuration.

Is ISR better than SSR for scaling?

Almost always. ISR generates a static page, caches it, and serves it until the revalidation period expires. One render serves millions of requests. SSR renders every request individually, consuming CPU proportionally to traffic. Use SSR only for pages that must reflect real-time or per-user data that can't tolerate even a few seconds of staleness.

How do I handle database connections at scale?

Use a connection pooler like PgBouncer or Prisma Accelerate. Without pooling, each serverless function or container opens its own database connection, quickly exhausting limits. PgBouncer in transaction mode lets thousands of application connections share 50-100 actual database connections, which is typically plenty for 100k users.

What's the cheapest way to scale Next.js?

Maximize static generation and ISR to minimize compute. Use Cloudflare (free tier handles massive traffic) as your CDN. Deploy on a single well-tuned VPS ($20-40/month) behind the CDN. Add Redis ($15/month) for API caching. This setup handles 100k users for under $100/month if your pages are mostly static or ISR with long revalidation windows.

How do I monitor performance at scale?

Track three metrics religiously: Time to First Byte (TTFB), server CPU utilization, and cache hit ratio. Use Next.js built-in analytics or integrate with Datadog/Grafana. Set alerts for TTFB exceeding 500ms and CPU above 70%. A dropping cache hit ratio is your earliest warning of scaling problems -- investigate before users notice.

Start With the Highest Leverage

Don't try to implement everything at once. The order matters. First, audit your rendering strategy and move pages to ISR where possible -- this alone might solve your scaling problem. Second, add a CDN if you don't have one. Third, implement Redis caching for expensive queries. Only after these three steps should you consider horizontal scaling. Most Next.js apps that "need" 10 servers actually need 2 servers and better caching. Measure before you scale, cache before you compute, and static before dynamic.


Written by

Abhishek Patel

Infrastructure engineer with 10+ years building production systems on AWS, GCP, and bare metal. Writes practical guides on cloud architecture, containers, networking, and Linux for developers who want to understand how things actually work under the hood.
