This commit involves a number of changes to finalizers, all aimed towards
better consistency and reliability.
A big part of this has to do with v8::Inspector's ability to move objects
across IsolatedWorlds. There has been a few previous efforts on this, the most
significant being https://github.com/lightpanda-io/browser/pull/1901. To recap,
a Zig instance can map to 0-N v8::Objects. Where N is the total number of
IsolatedWorlds. Generally, IsolatedWorlds between origins are...isolated...but
the v8::Inspector isn't bound by this. So a Zig instance cannot be tied to a
Context/Identity/IsolatedWorld...it has to live until all references, possibly
from different IsolatedWorlds, are released (or the page is reset).
Finalizers could previously be managed via reference counting or explicitly
toggling the instance as weak/strong. Now, only reference counting is supported.
weak/strong can essentially be seen as an acquireRef (rc += 1) and
releaseRef (rc -= 1). Explicit setting did make some things easier, like not
having to worry so much about double-releasing (e.g. XHR abort being called
multiple times), but it was only used in a few places AND it simply doesn't work
with objects shared between IsolatedWorlds. It is never a boolean now, as 3
different IsolatedWorlds can each hold a reference.
Temps and Globals are tracked on the Session. Previously, they were tracked on
the Identity, but that makes no sense. If a Zig instance can outlive an Identity,
then any of its Temp references can too. This hasn't been a problem because we've
only seen MutationObserver and IntersectionObserver be used cross-origin,
but the right CDP script can make this crash with a use-after-free (e.g.
`MessageEvent.data` is released when the Identity is done, but `MessageEvent` is
still referenced by a different IsolateWorld).
Rather than deinit with a `comptime shutdown: bool`, there is now an explicit
`releaseRef` and `deinit`.
Bridge registration has been streamlined. Previously, types had to register
their finalizer AND acquireRef/releaseRef/deinit had to be declared on the entire
prototype chain, even if these methods just delegated to their proto. Finalizers
are now automatically enabled if a type has a `acquireRef` function. If a type
has an `acquireRef`, then it must have a `releaseRef` and a `deinit`. So if
there's custom cleanup to do in `deinit`, then you also have to define
`acquireRef` and `releaseRef` which will just delegate to the _proto.
Furthermore these finalizer methods can be defined anywhere on the chain.
Previously:
```zig
const KeywboardEvent = struct {
_proto: *Event,
...
pub fn deinit(self: *KeyboardEvent, session: *Session) void {
self._proto.deinit(session);
}
pub fn releaseRef(self: *KeyboardEvent, session: *Session) void {
self._proto.releaseRef(session);
}
}
```
```zig
const KeyboardEvent = struct {
_proto: *Event,
...
// no deinit, releaseRef, acquireref
}
```
Since the `KeyboardEvent` doesn't participate in finalization directly, it
doesn't have to define anything. The bridge will detect the most specific place
they are defined and call them there.
This fixes a bug in MCP where interactive elements were not assigned
a backendNodeId, preventing agents from clicking or filling them. Also
extracts link collection to a shared browser module.
This is done for a couple reasons. The first is just to have things a little
more self-contained for eventually supporting more advanced "wait" logic, e.g.
waiting for a selector.
The other is to provide callers with more fine-grained controlled. Specifically
the ability to manually "tick", so that they can [presumably] do something
after every tick. This is needed by the test runner to support more advanced
cases (cases that need to test beyond 'load') and it also improves (and fixes
potential use-after-free, the lp.waitForSelector)
Add a detectForms MCP tool and lp.detectForms CDP command that return
structured form metadata from the current page. Each form includes its
action URL, HTTP method, and fields with names, types, required status,
values, select options, and backendNodeIds for use with the fill tool.
This lets AI agents discover and fill forms in a single step instead of
calling interactiveElements, filtering for form fields, and guessing
which fields belong to which form.
New files:
- src/browser/forms.zig: FormInfo/FormField structs, collectForms()
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Small tweaks to https://github.com/lightpanda-io/browser/pull/1896
Improve the wait ergonomics with an Option with default parameter. Revert
page pointer logic to original (don't think that change was necessary).
Add `--wait_until` and `--wait_ms` CLI arguments to configure session wait behavior. Updates `Session.wait` to evaluate specific page load states (`load`, `domcontentloaded`, `networkidle`, `fixed`) before completing the wait loop.
Previously, semantic_tree_text hardcoded prune = false, which bypassed the structural node filters and allowed empty none nodes to pollute the root of the text dump.
Returns a structured list of all interactive elements on a page:
buttons, links, inputs, ARIA widgets, contenteditable regions, and
elements with event listeners. Includes accessible names, roles,
listener types, and key attributes.
Event listener introspection (both addEventListener and inline
handlers) is unique to LP — no other browser exposes this to
automation code.
- Add more structural roles (banner, navigation, main, list, etc.).
- Implement fallback for accessible names (SVG titles, image alt text).
- Skip children for leaf-like semantic nodes to reduce redundancy.
- Disable pruning in the default semantic tree view.
Adds a not-documented "wpt" mode to --dump which outputs a formatted
report.cases.
This is meant to make working on a single WPT test case easier, particularly
with some coding tool. Claude recommended this output for its own use.
Instead of telling claude to start the browser in serve mode, then run the
wptrunner, and merge the two outputs (and then stop the server), you can do:
zig build run -- fetch --dump wpt "http://localhost:8000/dom/nodes/CharacterData-appendChild.html"
(you still need the wpt server up)
Follow up to https://github.com/lightpanda-io/browser/pull/1646
The encodeURL (renamed to ensureEncoded and exposed in this commit) already
handled already-encoded URLs, so this was largely a matter of exposing the
functionality.
The reason this isn't baked directly into Page.navigate is that, in some places
e.g. internal navigation, the URL is already know to be encoded. So it's up
to every caller to make sure they are passing a valid URL to navigate.
When set (defaults to not set/false), --dump will include iframe contents.
I was hoping I could add a mode to strip_mode to this, but since dump is used
extensively (e.g. innerHTML), this is something that has to be off by default
(for correctness).
Adds a new `mcp` run mode to start an MCP server over stdio.
Implements tools for navigation and JS evaluation, along with
resources for HTML and Markdown page content.
It's possible (in fact normal) for client.abort to be called twice on a schedule
navigation. We immediately abort any pending requests once a secondary
navigation is called (is that right?), and then again when the page shuts down.
The first abort will kill the transfer, so the XHR object has to null this value
so that, on context shutdown, when the finalizer is called, we don't try to
free it again.
V8's inspector world is made up of 4 components: Inspector, Client, Channel and
Session. Currently, we treat all 4 components as a single unit which is tied to
the lifetime of CDP BrowserContext - or, loosely speaking, 1 "Inspector Unit"
per page / v8::Context.
According to https://web.archive.org/web/20210622022956/https://hyperandroid.com/2020/02/12/v8-inspector-from-an-embedder-standpoint/
and conversation with Gemini, it's more typical to have 1 inspector per isolate.
The general breakdown is the Inspector is the top-level manager, the Client is
our implementation which control how the Inspector works (its function we expose
that v8 calls into). These should be tied to the Isolate. Channels and Sessions
are more closely tied to Context, where the Channel is v8->zig and the Session
us zig->v8.
This PR does a few things
1 - It creates 1 Inspector and Client per Isolate (Env.js)
2 - It creates 1 Session/Channel per BrowserContext
3 - It merges v8::Session and v8::Channel into Inspector.Session
4 - It moves the Inspector instance directly into the Env
5 - BrowserContext interacts with the Inspector.Session, not the Inspector
4 is arguably unnecessary with respect to the main goal of this commit, but
the end-goal is to tighten the integration. Specifically, rather than CDP having
to inform the inspector that a context was created/destroyed, the Env which
manages Contexts directly (https://github.com/lightpanda-io/browser/pull/1432)
and which now has direct access to the Inspector, is now equipped to keep this
in sync.
This adds a crash handler which reports a crash (if telemetry is enabled). On a
crash, this looks for `curl` (using the PATH env), and forks the process to then
call execve. This relies on a new endpoint to be setup to accept the "report".
Also, we include very little data..I figured just knowing about crashes would
be a good place to start.
A panic handler is provided, which override's Zig default handler and hooks
into the crash handler.
An `assert` function is added and hooks into the crash handler. This is
currently only used in one place (Session.zig) to demonstrate its use. In
addition to reporting a failed assert, the assert aborts execution in
ReleaseFast (as opposed to an undefined behavior with std.debug.assert).
I want to hook this into the v8 global error handler, but only after direct_v8
is merged.
Much of this is inspired by bun's code. They have their own assert (1) and
a [more sophisticated] crashHandler (2).
:
(1) beccd01647/src/bun.zig (L2987)
(2) beccd01647/src/crash_handler.zig (L198)