
Guide

Expand your crawl budget — 7 levers that actually work

Seven engineering levers that expand crawl budget on JavaScript sites: sitemap sharding, canonical discipline, pre-rendering, cache headers, internal linking, response-code hygiene, and redirect cleanup.

15 min read · Procedure: 45 min plan · Intermediate · Updated

Introduction

If you are still building the mental model of how crawl budget works on JavaScript sites, start with crawl budget fundamentals — this expansion guide assumes you already know which signals throttle versus reward crawl.

Crawl budget is not directly allocatable: you cannot ask Google for more. You can, however, remove signals that tell Googlebot to throttle your site and add signals that reward additional crawl.

The seven levers below are ordered by impact relative to effort, highest first. The first three move the needle on most sites within two sprints; the remaining four are long-tail improvements.

Step-by-step

How to: Expand crawl budget

  1. Shard the sitemap by priority

    Split one sitemap into three shards: fresh canonicals (lastmod changes daily), stable canonicals (weekly), and long tail (monthly). Googlebot fetches each shard as a separate file, so the small, fast-changing shard can be re-read often without re-processing the long tail.
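
As a minimal sketch of the sharding step, the Python below buckets pages by how recently they changed and renders one urlset per shard plus a sitemap index. The day thresholds, the `sitemap-<shard>.xml` file names, and the function names are assumptions made for illustration, not values taken from this guide.

```python
from datetime import datetime, timedelta
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def shard_for(lastmod: datetime, now: datetime) -> str:
    """Pick a shard from the page's age. Thresholds are illustrative assumptions."""
    age = now - lastmod
    if age <= timedelta(days=1):
        return "fresh"
    if age <= timedelta(days=7):
        return "stable"
    return "longtail"


def build_shards(pages, now):
    """Group (url, lastmod) pairs into the three shards."""
    shards = {"fresh": [], "stable": [], "longtail": []}
    for url, lastmod in pages:
        shards[shard_for(lastmod, now)].append((url, lastmod))
    return shards


def render_urlset(entries) -> str:
    """Render one shard as a sitemap <urlset> document."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in entries:
        node = ET.SubElement(urlset, "url")
        ET.SubElement(node, "loc").text = url
        ET.SubElement(node, "lastmod").text = lastmod.date().isoformat()
    return ET.tostring(urlset, encoding="unicode")


def render_index(base_url: str, shard_names) -> str:
    """Render a <sitemapindex> that points at each shard file (hypothetical names)."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for name in shard_names:
        node = ET.SubElement(index, "sitemap")
        ET.SubElement(node, "loc").text = f"{base_url}/sitemap-{name}.xml"
    return ET.tostring(index, encoding="unicode")
```

Calling `build_shards(pages, datetime.now())` on a list of `(url, lastmod)` pairs and writing each `render_urlset` result to its own file gives three shards that can be regenerated on different schedules.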

  2. Enforce canonical discipline on filter URLs

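    Emit rel=canonical pointing at the canonical filter URL on every filter permutation. rel=canonical is a hint rather than a directive, and Googlebot still has to fetch a permutation once to see it, but duplicates are then consolidated and recrawled far less often. On large catalogs this can take millions of URLs out of regular crawling without hurting discoverability. The URL graph compresses to what actually ranks.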
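
One way to make every permutation agree on a single canonical is to derive it mechanically from the request URL. The sketch below assumes a hypothetical allow-list of filter parameters that define a canonical page (`category` and `brand`); everything else, such as sort order or pagination, is dropped, and the surviving parameters are sorted so that parameter order does not create new URLs.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical allow-list: only these filters produce a page worth ranking.
CANONICAL_PARAMS = {"category", "brand"}


def canonical_url(url: str) -> str:
    """Drop non-canonical filters and the fragment, and sort what remains."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query)
        if key in CANONICAL_PARAMS
    )
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link> element to emit in the page head."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'
```

With this in place, `?sort=price&brand=acme&category=running` and `?category=running&brand=acme&page=3` both resolve to the same canonical URL.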

  3. Pre-render the canonical set

    Route crawler traffic on canonical URLs to a pre-rendered cache. Render queue depth drops, and Googlebot can crawl more URLs per cycle because each one is cheaper.
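
The routing decision itself is small. The following sketch assumes the edge layer can see the User-Agent header and has a set of canonical paths available; the crawler pattern, the backend labels, and the function name are illustrative assumptions. User-Agent matching alone can be spoofed, so a production setup would normally verify crawler identity as well.

```python
import re

# Illustrative pattern only; a real deployment would maintain a fuller list.
CRAWLER_UA = re.compile(r"googlebot|bingbot|duckduckbot|yandex", re.IGNORECASE)


def choose_backend(user_agent: str, path: str, canonical_paths: set) -> str:
    """Send crawler requests for canonical URLs to the pre-rendered cache."""
    is_crawler = bool(CRAWLER_UA.search(user_agent or ""))
    if is_crawler and path in canonical_paths:
        return "prerender-cache"
    # Browsers, and crawler hits on non-canonical URLs, go to the normal origin.
    return "origin"
```

Restricting the pre-render route to the canonical set keeps the cache small and avoids spending render capacity on URLs that should not rank anyway.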

  4. Tune cache headers for recrawl efficiency

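    Set Cache-Control max-age together with the Last-Modified and ETag validators so bot revalidation is cheap. When a crawler sends If-Modified-Since or If-None-Match and the snapshot has not changed, the server can answer 304 Not Modified with no body, which is the cheapest response to serve. Combined with pre-rendering, most recrawl hits can become 304s.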

    text
    HTTP/1.1 200 OK
    Content-Type: text/html; charset=utf-8
    Cache-Control: public, max-age=3600, stale-while-revalidate=86400
    Vary: User-Agent
    Last-Modified: Tue, 21 Apr 2026 09:15:00 GMT
    ETag: "snapshot-2d41f5a"
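
The 304 path that these validators enable can be sketched as a single function. This is a simplified illustration, not a full HTTP implementation: it assumes request headers arrive as a plain dict with canonical header casing, and the function name `revalidate` is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def revalidate(request_headers: dict, etag: str, last_modified: datetime) -> int:
    """Return 304 if the client's cached copy is still current, else 200."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # When If-None-Match is present it takes precedence over If-Modified-Since.
        tags = {tag.strip() for tag in if_none_match.split(",")}
        # Weak comparison: ignore a leading W/ on the client's validators.
        tags = {tag[2:] if tag.startswith("W/") else tag for tag in tags}
        return 304 if "*" in tags or etag in tags else 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return 200  # Unparseable date: fall back to a full response.
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution, so drop microseconds first.
        if last_modified.replace(microsecond=0) <= since:
            return 304
    return 200
```

A 304 response carries no body, so the pre-rendered snapshot is never read from the cache or sent over the wire when nothing has changed.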

  5. Re-weight internal links on high-priority pages

    High-priority pages should link to other high-priority pages within 1-2 clicks. PageRank flows through internal links, so a page 4 clicks deep from the homepage receives little of it and, on large sites, tends to be crawled rarely.
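
Click depth is straightforward to measure once the internal link graph has been exported. The sketch below assumes the graph is available as a dict mapping each page to the pages it links to; it runs a breadth-first search from the homepage and lists priority pages that sit deeper than a chosen limit. The function names and the default limit of 2 are assumptions for illustration.

```python
from collections import deque


def click_depths(links: dict, home: str) -> dict:
    """Breadth-first search: minimum number of clicks from home to each page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def too_deep(links: dict, home: str, priority_pages, max_depth: int = 2) -> list:
    """Priority pages deeper than max_depth, or not reachable from home at all."""
    depths = click_depths(links, home)
    return sorted(
        page
        for page in priority_pages
        if depths.get(page, float("inf")) > max_depth
    )
```

Pages returned by `too_deep` are candidates for new links from the homepage, category hubs, or other high-priority pages.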

  6. Fix soft 404s and 3xx redirect chains

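    Search Console's Page indexing report (formerly Index Coverage) surfaces soft 404s: URLs that return 200 OK but look like error or empty pages. Each one wastes a crawl slot. Redirect chains longer than 1 hop also burn budget, because every hop is a separate fetch. Cleaning both is mechanical but time-consuming.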
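
For the redirect half of this step, chains can be found offline from an exported redirect map before touching server config. The sketch below assumes the map is a dict of source path to target path; the function names and the hop limit are illustrative assumptions. It follows each redirect to its end, guards against loops, and reports every source whose chain is longer than one hop.

```python
def resolve_chain(redirects: dict, start: str, max_hops: int = 10) -> list:
    """Follow redirects from start and return every URL visited, in order."""
    chain = [start]
    seen = {start}
    while chain[-1] in redirects and len(chain) - 1 < max_hops:
        target = redirects[chain[-1]]
        chain.append(target)
        if target in seen:
            break  # Redirect loop: stop instead of cycling.
        seen.add(target)
    return chain


def long_chains(redirects: dict, max_allowed_hops: int = 1) -> dict:
    """Map each source URL to its full chain when it exceeds the hop limit."""
    flagged = {}
    for source in redirects:
        chain = resolve_chain(redirects, source)
        if len(chain) - 1 > max_allowed_hops:
            flagged[source] = chain
    return flagged
```

Each flagged source can then be rewritten to point straight at the last URL in its chain, which collapses the chain to a single hop.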

  7. Remove client-side blocking on hydration

    Large JavaScript bundles that block rendering stall the render queue. Pre-rendering bypasses this for crawler traffic, but for non-pre-rendered pages, hydration performance still caps crawl rate. Audit the top 20 URL patterns for bundle size.

FAQ

Questions engineers ask about this guide

How long before these changes show results?

Sitemap sharding and canonical discipline produce visible change in 2-4 weeks. Pre-rendering and cache headers take 1-2 weeks. Internal linking changes propagate slowest: 4-12 weeks, depending on link graph size.

Can I ask Google to raise my crawl rate directly?

The old Crawl Rate setting in Search Console was removed. Google now adjusts automatically based on host load and site health. The levers above are the only way to signal "we can handle more."

Does removing pages free up crawl budget?

Yes, if the removed pages were zero-value. Removing soft 404s and thin pages concentrates crawl on the remaining URLs. Do not remove pages that have any organic traffic or inbound links.

Editorial trust

Written by the ostr.io engineering team. We build and run pre-rendering infrastructure for more than 200 engineering teams, which is where the numbers and code samples on this page come from.

Last updated . Editorial scope and review policy: About prerender.info.