Karl Seguin 3c0c75be10 Add XHR finalizer and ArenaPool
Any object we return from Zig to V8 becomes a v8::Global that we track in our
`ctx.identity_map`. V8 will not free such objects. On the flip side, on its own,
our Zig code never knows if the underlying v8::Object of a global can still be
used from JS. Imagine an XHR request where we fire the last readyStateChange
event..we might think we no longer need that XHR instance, but nothing stops
the JavaScript code from holding a reference to it and calling a property on it,
e.g. `xhr.status`.

What we can do is tell v8 that we're done with the global and register a callback.
We make our reference to the global weak. When v8 determines that this object
cannot be reached from JavaScript, it _may_ call our registered callback. We can
then clean things up on our side and free the global (we actually _have_ to
free the global).

v8 makes no guarantee that our callback will ever be called, so we need to track
these finalizable objects and free them ourselves on context shutdown. Furthermore
there appears to be some possible timing issues, especially during context shutdown,
so we need to be defensive and make sure we don't double-free (we can use the
existing identity_map for this).

An type like XMLHttpRequest can be re-used. After a request succeeds or fails,
it can be re-opened and a new request sent. So we also need a way to revert a
"weak" reference back into a "strong" reference. These are simple v8 calls on
the v8::Global, but it highlights how sensitive all this is. We need to mark
it as weak when we're 100% sure we're done with it, and we need to switch it to
strong under any circumstances where we might need it again on our side.

Finally, none of this makes sense if there isn't something to free. Of course,
the finalizer lets us release the v8::Global, and we can free the memory for the
object itself (i.e. the `*XMLHttpRequest`). This PR also adds an ArenaPool. This
allows the XMLHTTPRequest to be self-contained and not need the `page.arena`.
On init, the `XMLHTTPRequest` acquires an arena from the pool. On finalization
it releases it back to the pool. So we now have:

- page.call_arena: short, guaranteed for 1 v8 -> zig -> v8 flow
- page.arena long: lives for the duration of the entire page
- page.arena_pool: ideally lives for as long as needed by its instance (but no
guarantees from v8 about this, or the script might leak a lot of global, so worst
case, same as page.arena)
2026-01-24 07:59:41 +08:00
2026-01-24 07:59:41 +08:00
2026-01-19 07:12:40 -08:00
2025-12-11 12:25:25 -08:00
2025-12-18 20:10:38 +08:00
2026-01-19 07:36:46 +08:00
2024-11-26 09:35:14 +01:00
2024-12-13 11:28:14 +01:00
2025-11-17 07:22:37 -08:00
2025-11-17 07:22:37 -08:00
2024-05-13 12:10:15 +02:00
2025-12-18 20:10:38 +08:00
2026-01-07 17:22:49 +01:00
2023-11-20 14:57:42 +01:00

Logo

Lightpanda Browser

lightpanda.io

License Twitter Follow GitHub stars

Lightpanda is the open-source browser made for headless usage:

  • Javascript execution
  • Support of Web APIs (partial, WIP)
  • Compatible with Playwright1 , Puppeteer, chromedp through CDP

Fast web automation for AI agents, LLM training, scraping and testing:

  • Ultra-low memory footprint (9x less than Chrome)
  • Exceptionally fast execution (11x faster than Chrome)
  • Instant startup

Puppeteer requesting 100 pages from a local website on a AWS EC2 m5.large instance. See benchmark details.

Quick start

Install

Install from the nightly builds

You can download the last binary from the nightly builds for Linux x86_64 and MacOS aarch64.

For Linux

curl -L -o lightpanda https://github.com/lightpanda-io/browser/releases/download/nightly/lightpanda-x86_64-linux && \
chmod a+x ./lightpanda

For MacOS

curl -L -o lightpanda https://github.com/lightpanda-io/browser/releases/download/nightly/lightpanda-aarch64-macos && \
chmod a+x ./lightpanda

For Windows + WSL2

The Lightpanda browser is compatible to run on windows inside WSL. Follow the Linux instruction for installation from a WSL terminal. It is recommended to install clients like Puppeteer on the Windows host.

Install from Docker

Lightpanda provides official Docker images for both Linux amd64 and arm64 architectures. The following command fetches the Docker image and starts a new container exposing Lightpanda's CDP server on port 9222.

docker run -d --name lightpanda -p 9222:9222 lightpanda/browser:nightly

Dump a URL

./lightpanda fetch --dump https://lightpanda.io
info(browser): GET https://lightpanda.io/ http.Status.ok
info(browser): fetch script https://api.website.lightpanda.io/js/script.js: http.Status.ok
info(browser): eval remote https://api.website.lightpanda.io/js/script.js: TypeError: Cannot read properties of undefined (reading 'pushState')
<!DOCTYPE html>

Start a CDP server

./lightpanda serve --host 127.0.0.1 --port 9222
info(websocket): starting blocking worker to listen on 127.0.0.1:9222
info(server): accepting new conn...

Once the CDP server started, you can run a Puppeteer script by configuring the browserWSEndpoint.

'use strict'

import puppeteer from 'puppeteer-core';

// use browserWSEndpoint to pass the Lightpanda's CDP server address.
const browser = await puppeteer.connect({
  browserWSEndpoint: "ws://127.0.0.1:9222",
});

// The rest of your script remains the same.
const context = await browser.createBrowserContext();
const page = await context.newPage();

// Dump all the links from the page.
await page.goto('https://wikipedia.com/');

const links = await page.evaluate(() => {
  return Array.from(document.querySelectorAll('a')).map(row => {
    return row.getAttribute('href');
  });
});

console.log(links);

await page.close();
await context.close();
await browser.disconnect();

Telemetry

By default, Lightpanda collects and sends usage telemetry. This can be disabled by setting an environment variable LIGHTPANDA_DISABLE_TELEMETRY=true. You can read Lightpanda's privacy policy at: https://lightpanda.io/privacy-policy.

Status

Lightpanda is in Beta and currently a work in progress. Stability and coverage are improving and many websites now work. You may still encounter errors or crashes. Please open an issue with specifics if so.

Here are the key features we have implemented:

  • HTTP loader (Libcurl)
  • HTML parser (html5ever)
  • DOM tree
  • Javascript support (v8)
  • DOM APIs
  • Ajax
    • XHR API
    • Fetch API
  • DOM dump
  • CDP/websockets server
  • Click
  • Input form
  • Cookies
  • Custom HTTP headers
  • Proxy support
  • Network interception

NOTE: There are hundreds of Web APIs. Developing a browser (even just for headless mode) is a huge task. Coverage will increase over time.

You can also follow the progress of our Javascript support in our dedicated zig-js-runtime project.

Build from sources

Prerequisites

Lightpanda is written with Zig 0.15.2. You have to install it with the right version in order to build the project.

Lightpanda also depends on zig-js-runtime (with v8), Libcurl and html5ever.

To be able to build the v8 engine for zig-js-runtime, you have to install some libs:

For Debian/Ubuntu based Linux:

sudo apt install xz-utils ca-certificates \
    clang make curl git

You also need to install Rust.

For systems with Nix, you can use the devShell:

nix develop

For MacOS, you need cmake and Rust.

brew install cmake

Install Git submodules

The project uses git submodules for dependencies.

To init or update the submodules in the vendor/ directory:

make install-submodule

This is an alias for git submodule init && git submodule update.

Build and run

You an build the entire browser with make build or make build-dev for debug env.

But you can directly use the zig command: zig build run.

Embed v8 snapshot

Lighpanda uses v8 snapshot. By default, it is created on startup but you can embed it by using the following commands:

Generate the snapshot.

zig build snapshot_creator -- src/snapshot.bin

Build using the snapshot binary.

zig build -Dsnapshot_path=../../snapshot.bin

See #1279 for more details.

Test

Unit Tests

You can test Lightpanda by running make test.

End to end tests

To run end to end tests, you need to clone the demo repository into ../demo dir.

You have to install the demo's node requirements

You also need to install Go > v1.24.

make end2end

Web Platform Tests

Lightpanda is tested against the standardized Web Platform Tests.

The relevant tests cases are committed in a dedicated repository which is fetched by the make install-submodule command.

All the tests cases executed are located in the tests/wpt sub-directory.

For reference, you can easily execute a WPT test case with your browser via wpt.live.

Run WPT test suite

To run all the tests:

make wpt

Or one specific test:

make wpt Node-childNodes.html

Add a new WPT test case

We add new relevant tests cases files when we implemented changes in Lightpanda.

To add a new test, copy the file you want from the WPT repo into the tests/wpt directory.

⚠️ Please keep the original directory tree structure of tests/wpt.

Contributing

Lightpanda accepts pull requests through GitHub.

You have to sign our CLA during the pull request process otherwise we're not able to accept your contributions.

Why?

Javascript execution is mandatory for the modern web

In the good old days, scraping a webpage was as easy as making an HTTP request, cURL-like. Its not possible anymore, because Javascript is everywhere, like it or not:

  • Ajax, Single Page App, infinite loading, “click to display”, instant search, etc.
  • JS web frameworks: React, Vue, Angular & others

Chrome is not the right tool

If we need Javascript, why not use a real web browser? Take a huge desktop application, hack it, and run it on the server. Hundreds or thousands of instances of Chrome if you use it at scale. Are you sure its such a good idea?

  • Heavy on RAM and CPU, expensive to run
  • Hard to package, deploy and maintain at scale
  • Bloated, lots of features are not useful in headless usage

Lightpanda is built for performance

If we want both Javascript and performance in a true headless browser, we need to start from scratch. Not another iteration of Chromium, really from a blank page. Crazy right? But thats what we did:

  • Not based on Chromium, Blink or WebKit
  • Low-level system programming language (Zig) with optimisations in mind
  • Opinionated: without graphical rendering

  1. Playwright support disclaimer: Due to the nature of Playwright, a script that works with the current version of the browser may not function correctly with a future version. Playwright uses an intermediate JavaScript layer that selects an execution strategy based on the browser's available features. If Lightpanda adds a new Web API, Playwright may choose to execute different code for the same script. This new code path could attempt to use features that are not yet implemented. Lightpanda makes an effort to add compatibility tests, but we can't cover all scenarios. If you encounter an issue, please create a GitHub issue and include the last known working version of the script. ↩︎

Description
The open-source browser made for headless usage
Readme AGPL-3.0 19 MiB
Languages
Zig 73.3%
HTML 25.4%
Rust 0.7%
JavaScript 0.4%