Astro Pagefind Search Tutorial

Add fast, static, client-side search to an Astro site using Pagefind. Learn how the index is built, how to wire up the UI, how to scope and weight content, and pitfalls around build order and large sites.

4 min read · By Codeloom
Beginner 8 min read

What you'll learn

  • Why Pagefind is a great fit for static Astro sites
  • How the post-build indexing step works
  • How to add a working search UI in a few lines
  • How to scope, weight, and filter results
  • Pitfalls with build order, large sites, and CSP

Prerequisites

  • An Astro site that builds successfully

What and Why

Most static sites do not need a search server. They need a small index file the browser can query directly. Pagefind is a static search library that reads your built HTML, generates a compact, chunked index, and ships a tiny client that fetches only the chunks it needs at query time. No backend, no API key, no monthly bill.

For an Astro site that already builds to static HTML, Pagefind is the path of least resistance. You add a post-build step and a couple of dozen lines of UI code, and search just works.

Mental Model

Pagefind runs after your site is built, not during. You point it at the dist/ directory; it walks every HTML file, extracts the indexable text (everything inside data-pagefind-body elements or, when no page uses that attribute, the page body), and emits a pagefind/ directory inside dist/ containing index chunks, WASM, and JS files.

The client script lazy-loads chunks of the index based on the query. A search for “react” only downloads the chunks that contain words starting with those letters. That is why Pagefind scales to thousands of pages without a server: most users only ever touch a few kilobytes.
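To make the chunking idea concrete, here is a toy sketch in plain JavaScript. It is not Pagefind's real index format or loading logic: the two-letter prefix scheme, the `chunks` map, and the `loadChunk` and `search` helpers are all assumptions invented for illustration. It only shows why a single query needs a small slice of the index.

```javascript
// Toy model: the index is split into chunks keyed by a two-letter prefix.
// In a real deployment each chunk would be a separate file fetched over HTTP.
const chunks = {
  cs: { css: ['/posts/css-grid/', '/posts/css-layers/'], csv: ['/posts/csv-export/'] },
  re: { react: ['/posts/react-islands/'], redirects: ['/posts/redirects/'] },
  as: { astro: ['/posts/astro-intro/'] },
};

// Records which chunks this session has "downloaded".
const loaded = new Set();

function loadChunk(prefix) {
  loaded.add(prefix);
  return chunks[prefix] ?? {};
}

function search(query) {
  const term = query.trim().toLowerCase();
  if (term.length < 2) return [];
  // Only the chunk matching the first two letters is touched.
  const chunk = loadChunk(term.slice(0, 2));
  const urls = new Set();
  for (const [word, pages] of Object.entries(chunk)) {
    if (word.startsWith(term)) pages.forEach((page) => urls.add(page));
  }
  return [...urls];
}

console.log(search('react')); // pages found in the "re" chunk
console.log([...loaded]);     // the chunks loaded so far
```

The "cs" and "as" chunks are never read for this query, which is the property that lets a static index grow without every visitor paying for all of it.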

Hands-on Example

Install and add a build step:

npm install -D pagefind

Update package.json:

{
  "scripts": {
    "build": "astro build && pagefind --site dist"
  }
}

Mark your post content as indexable in your layout:

---
const { title } = Astro.props;
---
<article data-pagefind-body>
  <h1>{title}</h1>
  <slot />
</article>

Add a search UI in src/pages/search.astro:

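---
---
<input id="q" type="search" placeholder="Search posts..." />
<ul id="results"></ul>

<!-- is:inline stops Astro from bundling this script; /pagefind/pagefind.js only exists after the index step. -->
<script is:inline type="module">
  const pagefind = await import('/pagefind/pagefind.js');
  await pagefind.init();
  const q = document.getElementById('q');
  const out = document.getElementById('results');
  q.addEventListener('input', async () => {
    const query = q.value;
    const { results } = await pagefind.search(query);
    // Load the data for the first ten results in parallel.
    const items = await Promise.all(results.slice(0, 10).map((r) => r.data()));
    // Drop stale results if the user kept typing while we were loading.
    if (q.value !== query) return;
    out.replaceChildren();
    for (const data of items) {
      const a = document.createElement('a');
      a.href = data.url;
      // textContent keeps indexed text from being parsed as HTML.
      a.textContent = data.meta.title;
      const li = document.createElement('li');
      li.appendChild(a);
      out.appendChild(li);
    }
  });
</script>

Build time: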
astro build  ->  dist/*.html
                     |
                     v
                pagefind --site dist
                     |
                     v
                dist/pagefind/{index chunks, wasm, js}

Runtime:
user types "css"
      |
      v
pagefind.js -> fetch only chunks for "cs*"
      |
      v
results list rendered in browser
Build-time indexing and runtime querying

Common Pitfalls

The first pitfall is running Pagefind before Astro builds. The index needs dist/ to exist. Chain astro build && pagefind in a single script.

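The second is forgetting data-pagefind-body. Without it, Pagefind indexes the whole page body apart from a few elements it skips by default, so navigation or footers built from generic markup can end up in the index. Be explicit, and apply the attribute consistently: once any page uses data-pagefind-body, pages without it are left out of the index.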

The third is large sites with many duplicates. Pagefind handles 10k+ pages fine, but only if each page has distinct content. Boilerplate-heavy sites bloat the index. Use data-pagefind-ignore on shared sections.

The fourth is Content Security Policy. Pagefind loads WASM and JS dynamically. If your script-src directive does not allow 'self' and 'wasm-unsafe-eval', init will fail. Add the right directives.
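As a minimal sketch of what such a policy could look like, the snippet below assembles a header value in JavaScript. The `buildCsp` helper is hypothetical, and the `default-src` entry is an assumption about the rest of your policy; only the two `script-src` sources come from the pitfall described above.

```javascript
// Hypothetical helper: joins a map of directives into a
// Content-Security-Policy header value.
function buildCsp(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(' ')}`)
    .join('; ');
}

const csp = buildCsp({
  // Assumed baseline for everything not listed explicitly.
  'default-src': ["'self'"],
  // 'self' allows /pagefind/pagefind.js; 'wasm-unsafe-eval' allows the WASM module to compile.
  'script-src': ["'self'", "'wasm-unsafe-eval'"],
});

console.log(csp);
```

How the header is actually sent depends on your host, so treat the string as the part to copy into whatever headers file or server config you already use.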

Finally, the dev server does not run the indexer, so /pagefind/pagefind.js is missing and search appears broken under astro dev. Test against a real build, for example with a small npm script that runs the build and then astro preview.

Best Practices

Mark each indexable region with data-pagefind-body and noise regions with data-pagefind-ignore. The cleaner the input, the better the ranking.

Use Pagefind filters (data-pagefind-filter="category:CSS") to power faceted UI like checkboxes for tags or categories. The client API exposes these out of the box.
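A sketch of a filtered query might look like the following. The `searchInCategory` name, its `limit` parameter, and the `category` filter key are assumptions chosen to match the attribute example above; the Pagefind module is passed in as an argument so the function can be tried with a stub outside the browser.

```javascript
// `pagefind` is the module loaded from /pagefind/pagefind.js.
// `category` must match a value indexed via data-pagefind-filter="category:...".
async function searchInCategory(pagefind, query, category, limit = 10) {
  // Only send a filter when a category is actually selected.
  const options = category ? { filters: { category } } : {};
  const { results } = await pagefind.search(query, options);
  // Each result loads its own data lazily, so fetch the first few in parallel.
  const items = await Promise.all(results.slice(0, limit).map((r) => r.data()));
  return items.map((data) => ({ url: data.url, title: data.meta.title }));
}

// Browser usage (not executed here):
//   const pagefind = await import('/pagefind/pagefind.js');
//   const hits = await searchInCategory(pagefind, 'grid', 'CSS');
```

Wiring this to checkboxes is then a matter of reading the checked value and passing it as `category`.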

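Set custom weights with data-pagefind-weight on summaries or other key passages to push them up the ranking. Weights are relative to the default of 1.0 for body text, and headings already rank above body text by default, so a modest value such as 1.5 on a summary is usually enough.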

Run Pagefind in CI as part of npm run build. Treat the pagefind/ directory as a build artifact, not source. Do not commit it.

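For very large sites, consider chunk size tuning and a debounced input. The tutorial's input handler calls search() on every keystroke; Pagefind's debouncedSearch() instead waits for a pause in typing (300 ms by default) and resolves to null for calls that were superseded.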
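Debouncing itself is easy to sketch by hand. The `debounce` helper below is a generic, hypothetical utility rather than part of Pagefind, and the 50 ms wait in the demo is only an example value.

```javascript
// Returns a wrapper that delays calling `fn` until `waitMs` milliseconds
// have passed without another call. Earlier pending calls are dropped.
function debounce(fn, waitMs) {
  let timer;
  return (...args) => {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

// Demo: three rapid calls collapse into one, using the last argument.
const logQuery = debounce((query) => console.log('searching for', query), 50);
logQuery('c');
logQuery('cs');
logQuery('css');
```

In the search page, the same wrapper would go around the input handler, so the index is only queried once the user pauses.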

Wrap-up

Pagefind gives a static Astro site fast, scalable search with no infrastructure. Build, index, drop in a small client, and you have a search box that pays for nothing and ships in kilobytes. Mark your content, mind your CSP, and keep the index step in your build pipeline.