Lighthouse CI Without a SaaS

The pitch for a performance-monitoring SaaS is always some version of the same three sentences. It watches your Core Web Vitals. It alerts you when they regress. It shows you a dashboard. Pricing starts around $20/month and climbs into the hundreds once you add seats and URLs.

Strip the dashboard away and look at what you’re actually paying for: something runs Lighthouse on a trigger, stores the numbers, and tells you when they get worse. Two of those three things ship for free in a tool Google maintains, and you can wire all of it into a GitHub Actions workflow this afternoon.

I’ve argued before that your Lighthouse score is a lie — that the headline number you screenshot for a client is noisy, gameable, and mostly theater. That’s still true. But the same tool, run on every pull request with the right guardrails, produces the one performance signal that isn’t theater: did this change make things worse? The absolute score is a lie. The delta is the truth. CI is where Lighthouse stops being a vanity metric and starts being a brake.

What a Perf-Monitoring SaaS Actually Sells You

Break the product into its parts and the picture gets clearer:

1. Triggered Lighthouse runs: on every deploy, PR, or schedule. Free in GitHub Actions.
2. Failing the build / alerting on regression: the actual quality gate. Free, via Lighthouse CI assertions.
3. A comment on the PR with the numbers, so reviewers see the impact inline. Free, with a few lines of github-script.
4. Historical trend dashboards: score-over-time across months. This is the genuine product.
5. Real-user monitoring (RUM): field data from actual visitors. A different category of thing entirely.

Items 1 through 3 are a thin wrapper around tools you can run yourself in an afternoon. Item 4 is the real value-add — and even that is open-source software you can self-host. Item 5 isn’t something most of these tools do well anyway; it’s an upsell, and lab tooling like Lighthouse was never the right instrument for it.

So the question isn’t “is perf monitoring worth it.” It’s “which 5% of this am I actually paying a subscription for.” For most teams shipping a content site, the answer is: the parts you could have had for nothing.

The Reason People Reach for the SaaS: Variance

There’s a real reason “just add Lighthouse to CI” fails on the first try, and it’s worth naming because it’s exactly the pain the SaaS sells a fix for.

Run Lighthouse twice on the identical page and you’ll get a performance score of 92, then 87. The runner shares a CPU with other tenants. Network timing jitters. Garbage collection pauses land in different places. None of your code changed, but the number did.

Gate your build on a single raw score and you’ll get red builds on green code. Within a week someone adds continue-on-error: true, and the check is dead. A flaky gate is worse than no gate, because it trains the team to ignore it.

Lighthouse CI — @lhci/cli, the official tool from the Lighthouse team — solves the variance two ways, and getting both right is the difference between a gate people trust and one they route around.

Median of N runs. Run the audit three to five times and take the median. Variance collapses toward the real number.
Assert on the stable signals, not the headline score. Some metrics barely move between runs — total byte weight, request count, whether images are correctly sized, whether you’re on HTTPS. Others are jittery on shared hardware, like Total Blocking Time. Gate hard on the stable ones. Set the jittery ones to warn, or give them generous thresholds.
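
To see why the median helps, here is a minimal JavaScript sketch of the arithmetic. The five scores are made up for illustration, and LHCI's own choice of a representative run is more involved than this; the sketch only shows how a median shrugs off one bad run where a mean does not.

```javascript
// Median of a list of numbers: sort a copy, take the middle value
// (or the mean of the two middle values for an even count).
function median(values) {
  if (values.length === 0) throw new Error('median of empty list');
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Hypothetical performance scores from five runs of the same, unchanged page.
// The 74 stands in for a run that landed on a busy shared runner.
const runs = [92, 87, 91, 90, 74];

const mean = runs.reduce((sum, n) => sum + n, 0) / runs.length;
console.log('mean:', mean);           // pulled down by the outlier
console.log('median:', median(runs)); // stays near the typical run
```

A gate set at 0.9 would fail on the mean of these runs and pass on the median, even though nothing about the page changed between them.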

Every “add Lighthouse to CI in 5 minutes” tutorial skips this part, and that’s precisely why those setups get torn out a month later. Handle the variance and the gate becomes something a team will actually leave turned on.

The Setup

The audience here is people running static sites, and that makes one feature matter more than any other: staticDistDir.

Lighthouse CI can serve your built output directory itself. No starting a dev server, no start-server-and-test, no juggling ports and readiness checks. You point it at dist/ and it spins up its own static server, runs the audits, and tears it down. For a static site this removes the single most annoying part of CI performance testing.

1. The config: `lighthouserc.json`

Drop this at the repo root. JSON is the most foolproof format — it’s auto-discovered and has no module-system gotchas. (If you want comments or logic, use a lighthouserc.cjs file exporting an object instead.)

{
  "ci": {
    "collect": {
      "staticDistDir": "./dist",
      "numberOfRuns": 5
    },
    "assert": {
      "preset": "lighthouse:recommended",
      "assertions": {
        "categories:performance": ["error", { "minScore": 0.9 }],
        "categories:accessibility": ["error", { "minScore": 1 }],
        "categories:seo": ["error", { "minScore": 1 }],
        "total-byte-weight": ["error", { "maxNumericValue": 512000 }],
        "uses-responsive-images": "error",
        "total-blocking-time": ["warn", { "maxNumericValue": 300 }],
        "unused-javascript": "off"
      }
    },
    "upload": {
      "target": "temporary-public-storage"
    }
  }
}

Reading that top to bottom:

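staticDistDir points at your build output. LHCI serves it and audits the HTML files it discovers there. Autodiscovery is capped at a small number of pages by default, so raise collect.maxAutodiscoverUrls if your site has more pages you care about. To audit only a subset, set collect.url to an explicit list instead.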
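numberOfRuns: 5 is the median trick. Three is the practical floor; five is better if you can spare the minute. One caveat: to my knowledge, LHCI assertions default to the optimistic aggregationMethod, which judges the most favorable run rather than the median, so add "aggregationMethod": "median-run" to an assertion’s options if you want the gate itself to use the median.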
preset: "lighthouse:recommended" gives you a sane baseline of assertions for free. The assertions block then overrides specific ones: hard error on the categories and metrics that don’t flake, a warn on Total Blocking Time because it’s jittery on shared runners, and off for audits that don’t apply to a static content site.
upload.target: "temporary-public-storage" uploads each report to Google-hosted storage and hands back a public URL. The reports expire after a few days — long enough to click through from a PR, not a permanent archive. If you’d rather not put reports on a public URL at all, drop the upload block entirely; assertions still run and still fail the build.

The error-level assertions are the entire point. When one trips, the run exits non-zero, and that’s what puts a red X on the commit.
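
For reference, the lighthouserc.cjs variant mentioned earlier might look like the sketch below. It mirrors a subset of the JSON config and adds the one thing JSON cannot express: logic. Running a single audit outside CI is my own illustrative choice, and it assumes the runner sets the CI environment variable, as GitHub Actions does.

```javascript
// lighthouserc.cjs (a sketch; mirrors part of the JSON config above)
const inCI = Boolean(process.env.CI); // GitHub Actions sets CI=true

const config = {
  ci: {
    collect: {
      staticDistDir: './dist',
      // Assumption for illustration: 5 runs in CI, 1 for quick local checks.
      numberOfRuns: inCI ? 5 : 1,
    },
    assert: {
      preset: 'lighthouse:recommended',
      assertions: {
        // Stable signals: fail the build.
        'categories:performance': ['error', { minScore: 0.9 }],
        'total-byte-weight': ['error', { maxNumericValue: 512000 }],
        // Jittery on shared runners: warn only.
        'total-blocking-time': ['warn', { maxNumericValue: 300 }],
      },
    },
    upload: { target: 'temporary-public-storage' },
  },
};

module.exports = config;
```

Keep only one config file at the repo root, so there is no question about which one LHCI picked up.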

2. The workflow

Two ways to wire it up. Pick based on how you feel about depending on a marketplace action.

Option A — the treosh/lighthouse-ci-action. Least YAML, friendliest outputs:

# .github/workflows/lighthouse.yml
name: Lighthouse CI
on: pull_request

permissions:
  contents: read
  pull-requests: write

jobs:
  lighthouse:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - uses: actions/setup-node@v6
        with:
          node-version: 22
          cache: npm

      - run: npm ci
      - run: npm run build

      - name: Run Lighthouse CI
        id: lhci
        uses: treosh/lighthouse-ci-action@v12
        with:
          configPath: ./lighthouserc.json
          temporaryPublicStorage: true
          uploadArtifacts: true

Build the site first so dist/ exists, then the action runs collect, assert, and upload from your config. If an error assertion fails, the step exits non-zero and the build goes red — that’s the gate. The id: lhci exposes the run’s outputs to the next step.

Option B — raw @lhci/cli. If you’d rather not pull in a third-party action — an instinct I respect — it’s two lines:

npm install --save-dev @lhci/cli

      - run: npm run build
      - run: npx lhci autorun --config=./lighthouserc.json

lhci autorun is collect, assert, and upload in one command. Same gate, one fewer dependency on someone else’s repo.

3. The PR comment

Neither option posts a comment on its own. The treosh action exposes the run’s manifest (scores per run) and links (each URL mapped to its public report) as outputs. Read those and post a sticky comment with actions/github-script:

      - name: Comment Lighthouse results on the PR
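        # always() keeps this step running when the Lighthouse step fails an assertion,
        # which is exactly when reviewers need the numbers.
        if: always() && github.event_name == 'pull_request'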
        uses: actions/github-script@v8
        env:
          MANIFEST: ${{ steps.lhci.outputs.manifest }}
          LINKS: ${{ steps.lhci.outputs.links }}
        with:
          script: |
            const manifest = JSON.parse(process.env.MANIFEST || '[]')
            const links = JSON.parse(process.env.LINKS || '{}')
            const pct = (n) => Math.round(n * 100)
            const marker = '## Lighthouse CI results'

            const rows = manifest
              .filter((run) => run.isRepresentativeRun)
              .map((run) => {
                const s = run.summary
                const report = links[run.url] || '#'
                return `| ${run.url} | ${pct(s.performance)} | ${pct(s.accessibility)} | ${pct(s['best-practices'])} | ${pct(s.seo)} | [report](${report}) |`
              })
              .join('\n')

            const body = [
              marker,
              '',
              '| URL | Perf | A11y | Best Practices | SEO | Report |',
              '| --- | ---: | ---: | ---: | ---: | --- |',
              rows,
              '',
              '_Median of 5 runs. Public reports expire after a few days._',
            ].join('\n')

            const { data: comments } = await github.rest.issues.listComments({
              ...context.repo,
              issue_number: context.issue.number,
            })
            const existing = comments.find((c) => c.body.startsWith(marker))
            const api = github.rest.issues

            if (existing) {
              await api.updateComment({ ...context.repo, comment_id: existing.id, body })
            } else {
              await api.createComment({ ...context.repo, issue_number: context.issue.number, body })
            }

Three things in there are worth calling out, because they’re the difference between a comment that helps and one that annoys:

isRepresentativeRun. The manifest holds one entry per run. The median run is flagged isRepresentativeRun: true. Filter to it so you report the median, not all five.
Pass outputs through env:, never inline them. Splicing ${{ steps.lhci.outputs.manifest }} straight into the script body is a code-injection footgun — a stray quote in the data breaks the script, and worse, the data is now executing as code. Read it from process.env instead.
Make the comment sticky. Find an existing comment by its marker line and update it in place. Otherwise every push leaves another comment and the PR turns into a wall of Lighthouse tables.

The result is one comment per PR that updates on every push, showing the median scores with a link to the full report — the exact thing the SaaS puts behind a login.
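
Because the table-building logic is plain JavaScript, it can be pulled into a function and checked locally before it ever runs in a workflow. The sketch below does that with a made-up manifest. The sample URL, scores, and report link are hypothetical, and the entry shape (url, isRepresentativeRun, summary) is assumed from how the step above reads it.

```javascript
// Build the Markdown table rows from an LHCI manifest and a url -> report map.
function buildRows(manifest, links) {
  const pct = (n) => Math.round(n * 100);
  return manifest
    .filter((run) => run.isRepresentativeRun) // keep only the flagged median run
    .map((run) => {
      const s = run.summary;
      const report = links[run.url] || '#';
      return `| ${run.url} | ${pct(s.performance)} | ${pct(s.accessibility)} | ${pct(s['best-practices'])} | ${pct(s.seo)} | [report](${report}) |`;
    })
    .join('\n');
}

// Hypothetical sample data: three runs of one URL, one flagged as representative.
const url = 'http://localhost:8080/index.html';
const summary = (performance) => ({
  performance,
  accessibility: 1,
  'best-practices': 0.96,
  seo: 1,
});
const sampleManifest = [
  { url, isRepresentativeRun: false, summary: summary(0.87) },
  { url, isRepresentativeRun: true, summary: summary(0.91) },
  { url, isRepresentativeRun: false, summary: summary(0.93) },
];
const sampleLinks = { [url]: 'https://example.com/report.html' };

console.log(buildRows(sampleManifest, sampleLinks));
```

Three runs go in and one row comes out, which is the behavior the filter exists to guarantee.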

What You Give Up (and When the SaaS Is Worth It)

I’m not going to pretend this matches a paid product feature-for-feature. Here’s the honest ledger.

Historical trends. Temporary public storage expires, so you get per-PR snapshots, not a six-month graph. If you genuinely need long-term trend lines, run the Lighthouse CI server — it’s an open-source Node app plus a small database, and it happily lives on the same $0 tier as the rest of your stack. Point upload.target at it instead of public storage. You keep the dashboards; you still pay no subscription.
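
Switching the upload target is a small config change. Here is a sketch of the upload section in lighthouserc.cjs form, assuming LHCI's lhci target with its serverBaseUrl and token options. The server URL is a placeholder, and the build token is whatever your server issued when the project was registered, read here from an environment variable so it stays in a CI secret rather than in the repo.

```javascript
// lighthouserc.cjs (upload section only; a sketch)
const config = {
  ci: {
    upload: {
      target: 'lhci', // send results to a self-hosted LHCI server
      serverBaseUrl: 'https://lhci.example.com', // placeholder: your server's URL
      token: process.env.LHCI_TOKEN, // build token, supplied by a CI secret
    },
  },
};

module.exports = config;
```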

Real-user monitoring. Lighthouse is lab data — one synthetic run in a controlled environment. It catches regressions before they ship, which is exactly what you want from a gate. It does not tell you what someone on a three-year-old Android phone on hotel Wi-Fi actually experiences. For that you want field data, and the free CrUX API or a tiny web-vitals beacon posting to your own endpoint covers it. Different tool, different question, also not a SaaS.
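
A beacon of that kind can be very small. The sketch below assumes the web-vitals package's onCLS, onINP, and onLCP callbacks and a hypothetical /vitals endpoint on your own server; the payload shape is my own choice, not anything the library prescribes. The browser wiring is left in a comment so the serializer can be read and tested on its own.

```javascript
// Reduce a web-vitals metric object to the fields worth storing.
function toPayload(metric, page) {
  return JSON.stringify({
    name: metric.name, // e.g. 'LCP', 'CLS', 'INP'
    value: metric.value,
    rating: metric.rating, // 'good' | 'needs-improvement' | 'poor'
    id: metric.id,
    page,
  });
}

// Send one metric. `send` is injected so this can run outside a browser.
function report(metric, page, send) {
  return send('/vitals', toPayload(metric, page)); // hypothetical endpoint
}

// In the browser, the wiring would look roughly like this:
//   import { onCLS, onINP, onLCP } from 'web-vitals';
//   const send = (url, body) => navigator.sendBeacon(url, body);
//   for (const on of [onCLS, onINP, onLCP]) {
//     on((metric) => report(metric, location.pathname, send));
//   }
```

The receiving endpoint only has to append each payload somewhere you can query later; percentiles over those rows are the field-data counterpart to the lab numbers from CI.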

The dashboard and the Slack integration. These are real conveniences. If a team will genuinely live in a performance dashboard every day, and the monthly fee buys back more than the afternoon of setup plus ongoing maintenance, then buy it — that’s a reasonable trade. The mistake is paying for it reflexively, before you’ve felt the pain it solves.

The point was never that perf SaaS is a scam. It’s that the core loop — audit, assert, comment, block — became commodity infrastructure when Google open-sourced the whole thing. You’re usually paying a subscription for the polish on top, and for a content site or a small team, the polish isn’t where the value is.

The Takeaway

A Lighthouse score on its own is a lie. But wired into CI as a gate (median of five runs, hard assertions on the metrics that don’t flake, a sticky PR comment showing the median scores, the build going red when a change regresses), it becomes the thing that quietly keeps a fast site fast.

That whole setup is one config file, one workflow file, and one github-script step. It runs on infrastructure you already pay for, and it costs nothing on top.

Add the workflow. Set the thresholds where your site sits today, not where you wish it were — a gate you can actually pass is the only kind that stays on. Then the next pull request that tries to ship a 400KB hero image or a render-blocking script gets caught by a robot in CI, instead of by a real person on a phone three weeks after it went live.