Commit Graph

438 Commits

Author SHA1 Message Date
Karl Seguin
1a4086c98c de-duplicate context shutdown in isolated worl deinit 2026-02-03 23:04:44 +08:00
Karl Seguin
c07b83335b add a few comments 2026-02-03 15:58:29 +08:00
Karl Seguin
7e575c501a Add a dedicated browser_context and page_arena to CDP.
The BrowserContext currently uses 3 arenas:
1 - Command-specific, which is like the call_arena, but for the processing of a
    single CDP command
2 - Notification-specific, which is similar, but for the processing of a single
    internal notification event
3 - Arena, which is just the session arena and lives for the duration of the
    BrowseContext/Session

This is pretty coarse and can results in significant memory accumulation if a
browser context is re-used for multiple navigations.

This commit introduces 3 changes:
1 - Rather than referencing Session.arena, the BrowerContext.arena is now its
    own arena. This doesn't really change anything, but it does help keep things
    a bit better separated.

2 - Introduces a page_arena (not to be confused with Page.arena). This arena
    exists for the duration of a 1 page, i.e. it's cleared when the
    BrowserContext receives the page_created internal notification. The
    `captured_responses` now uses this arena, which means captures only exist
    for the duration of the current page. This appears to be consistent with
    how chrome behaves (In fact, Chrome seems even more aggressive and doesn't
    appear to make any guarantees around captured responses). CDP refers to this
    lifetime as a "renderer" and has an experimental message, which we don't
    support, `Network.configureDurableMessages` to control this.

3 - Isolated Worlds are now more self contained with an arena from the ArenaPool.

There are currently 2 places where the BrowserContext.arena is still used:
1 - the isolated_world list
2 - the custom headers

Although this could be long lived, I believe the above is ok. We should just
really think twice whenever we want to use it for anything else.
2026-02-03 15:48:27 +08:00
Karl Seguin
181f265de5 Rework Inspector usage
V8's inspector world is made up of 4 components: Inspector, Client, Channel and
Session. Currently, we treat all 4 components as a single unit which is tied to
the lifetime of CDP BrowserContext - or, loosely speaking, 1 "Inspector Unit"
per page / v8::Context.

According to https://web.archive.org/web/20210622022956/https://hyperandroid.com/2020/02/12/v8-inspector-from-an-embedder-standpoint/
and conversation with Gemini, it's more typical to have 1 inspector per isolate.
The general breakdown is the Inspector is the top-level manager, the Client is
our implementation which control how the Inspector works (its function we expose
that v8 calls into). These should be tied to the Isolate. Channels and Sessions
are more closely tied to Context, where the Channel is v8->zig and the Session
us zig->v8.

This PR does a few things
1 - It creates 1 Inspector and Client per Isolate (Env.js)
2 - It creates 1 Session/Channel per BrowserContext
3 - It merges v8::Session and v8::Channel into Inspector.Session
4 - It moves the Inspector instance directly into the Env
5 - BrowserContext interacts with the Inspector.Session, not the Inspector

4 is arguably unnecessary with respect to the main goal of this commit, but
the end-goal is to tighten the integration. Specifically, rather than CDP having
to inform the inspector that a context was created/destroyed, the Env which
manages Contexts directly (https://github.com/lightpanda-io/browser/pull/1432)
and which now has direct access to the Inspector, is now equipped to keep this
in sync.
2026-01-30 15:59:33 +08:00
Karl Seguin
4a1d71b6b8 Merge pull request #1437 from lightpanda-io/remove_unused
Remove unused import
2026-01-30 06:55:11 +08:00
Karl Seguin
a19a125aec Remove unused import
And a few unused functions
2026-01-29 19:44:10 +08:00
Karl Seguin
1a05da9e55 Remove js.ExecutionWorld
The ExecutionWorld doesn't do anything meaningful. It doesn't map to, or
abstract any, v8 concepts. It creates a js.Context, destroys the context and
points to the context. Those all all things the Env can do (and it isn't like
the Env is over-burdened as-is).

Plus the benefit of going through the Env is that we can track/collect all
known Contexts for an isolate in 1 place (the Env), which can facilitate things
like context creation/deletion notifications.
2026-01-29 11:22:01 +08:00
Pierre Tachoire
68fbc0bde3 use inspector.resetContextGroup during cdp deinit
Ensure the inspector is correctly reset from context before deinit it.
It fixes the contextCollected crash in a better way.
2026-01-28 11:11:38 +01:00
Karl Seguin
066069baad Add defensiveness around Parser.appendCallback
We're seeing an assertion in Page.appendNew fail because the node has a parent.
According to html5ever, this shouldn't be possible (appendNew is only called
from the Parser). BUT, it's possible we're mutating the node in a way that
we shouldn't...maybe there's JavaScript executing as we're parsing which is
mutating the node.

In release, this will be more defensive. In debug, this still crashes. It's
possible this is valid (like I said, maybe there's JS interleaved which is
mutating the node), but if so, I'd like to know the exact scenario that produces
this case.
2026-01-28 07:33:04 +08:00
Karl Seguin
3b12240615 remove newString helper in favor of .wrap 2026-01-26 08:00:04 +08:00
Karl Seguin
a3d2dd8366 Convert most Attribute related calls from []const u8 -> String 2026-01-26 07:52:27 +08:00
Karl Seguin
16ef487871 Make "Safe" variants of Attribute work on String 2026-01-26 07:52:27 +08:00
Karl Seguin
54c45a0cfd Make js.Bridge aware of string.String for input parameters
Avoids having to allocate small strings when going from v8 -> Zig. Also
added a discriminatory type, string.Global which uses the arena, rather than
the call_arena, if an allocation _is_ necessary. (This is similar to a feature
we had before, but was lost in zigdom). Strings from v8 that need to be
persisted, can be allocated directly v8 -> arena, rather than v8 -> call_arena
-> arena.

I think there are a lot of places where we should use string.String - where
strings are expected to be short (e.g. attribute names). But started with just
document.querySelector and querySelectorAll.
2026-01-26 07:52:27 +08:00
Karl Seguin
830f759f0b zig fmt, remove unused code 2026-01-24 07:37:30 +08:00
Karl Seguin
9092651b5b Merge branch 'main' into fix_context_lifetime 2026-01-20 08:50:41 +08:00
Karl Seguin
2c53b48e0a add missing handlescope 2026-01-20 08:11:38 +08:00
Karl Seguin
a6e7ecd9e5 Move more asserts to custom asserter.
Deciding what should be an lp.assert, vs an std.debug.assert, vs a debug-only
assert is a little arbitrary.

debug-only asserts, guarded with an `if (comptime IS_DEBUG)` obviously avoid the
check in release and thus have a performance advantage. We also use them at
library boundaries. If libcurl says it will always emit a header line with a
trailing \r\n, is that really a check we need to do in production? I don't think
so. First, that code path is checked _a lot_ in debug. Second, it feels a bit
like we're testing libcurl (in production!)..why? A debug-only assertion should
be good enough to catch any changes in libcurl.
2026-01-19 09:12:16 +08:00
Karl Seguin
62aa564df1 Remove Global v8::Local<V8::Context>
When we create a js.Context, we create the underlying v8.Context and store it
for the duration of the page lifetime. This works because we have a global
HandleScope - the v8.Context (which is really a v8::Local<v8::Context>) is that
to the global HandleScope, effectively making it a global.

If we want to remove our global HandleScope, then we can no longer pin the
v8.Context in our js.Context. Our js.Context now only holds a v8.Global of the
v8.Context (v8::Global<v8::Context).

This PR introduces a new type, js.Local, which takes over a lot of the
functionality previously found in either js.Caller or js.Context. The simplest
way to think about it is:

1 - For v8 -> zig calls, we create a js.Caller (as always)
2 - For zig -> v8 calls, we go through the js.Context (as always)
3 - The shared functionality, which works on a v8.Context, now belongs to js.Local

For #1 (v8 -> zig), creating a js.Local for a js.Caller is really simple and
centralized. v8 largely gives us everything we need from the
FunctionCallbackInfo or PropertyCallbackInfo.  For #2, it's messier, because we
can only create a local v8::Context if we have a HandleScope, which we may or
may not.

Unfortunately, in many cases, what to do becomes the responsibility of the caller
and much of the code has to become aware of this local-ness. What does it means
for our code? The impact is on WebAPIs that store .Global. Because the global
can't do anything. You always need to convert that .Global to a local
(e.g. js.Function.Global -> js.Function).

If you're 100% sure the WebAPI is only being invoked by a v8 callback, you can
use `page.js.local.?.toLocal(some_global).call(...)` to get the local value.

If you're 100% sure the WebAPI is only being invoked by Zig, you need to create
 `js.Local.Scope` to get access to a local:

```zig
var ls: js.Local.Scope = undefined;
page.js.localScope(&ls);
defer ls.deinit();
ls.toLocal(some_global).call(...)
// can also access `&ls.local` for APIs that require a *const js.Local
```
For functions that can be invoked by either V8 or Zig, you should generally push
the responsibility to the caller by accepting a `local: *const js.Local`. If the
caller is a v8 callback, it can pass `page.js.local.?`. If the caller is a Zig
callback, it can create a `Local.Scope`.

As an alternative, it is possible to simply pass the *Page, and check
`if page.js.local == null` and, if so, create a Local.Scope. But this should only
be done for performance reasons. We currently only do this in 1 place, and it's
because the Zig caller doesn't know whether a Local will actually be needed and
it's potentially called on every element creating from the parser.
2026-01-19 07:28:33 +08:00
Karl Seguin
0e4aa38aaa Merge pull request #1312 from lightpanda-io/stagehand-zigdom
Stagehand zigdom
2026-01-16 23:10:19 +00:00
Pierre Tachoire
4325b80d64 axnode: small fixes 2026-01-16 17:30:43 +01:00
Pierre Tachoire
fbe07836f9 cdp: return a valide response for Page.getFrameTree on STARTUP
Stagehand expects a valid response for this specific command.
Add also `Target.activateTarget`
2026-01-16 16:27:55 +01:00
Pierre Tachoire
cbc028b040 cdp: accept multiple attachToTarget calls 2026-01-16 09:10:41 +01:00
Pierre Tachoire
2074c0149f axnode: add aria-labelledby support 2026-01-16 09:01:39 +01:00
Pierre Tachoire
61ed97dd45 axnode: use writeString for content's name 2026-01-16 09:00:57 +01:00
Pierre Tachoire
a358c46b9f axnode: ignore script and style children 2026-01-16 08:28:16 +01:00
Pierre Tachoire
50c1e2472b axnode: encode json string into stripWhitespaces 2026-01-16 08:27:43 +01:00
Pierre Tachoire
d50e056114 axnode: ignore non-html tags 2026-01-15 16:42:40 +01:00
Pierre Tachoire
d7d956d966 axnode: fix invalid enum 2026-01-15 15:40:52 +01:00
Pierre Tachoire
bd3966bf8d axnode: add focus on webroot 2026-01-15 15:37:49 +01:00
Pierre Tachoire
74578ba274 axnode: implement list marker 2026-01-15 15:37:49 +01:00
Pierre Tachoire
cb89742d2f axnode: add li level 2026-01-15 15:37:48 +01:00
Pierre Tachoire
6d0f991c17 axnode: add hr properties 2026-01-15 15:37:48 +01:00
Pierre Tachoire
d126d2a0f9 axnode: ignore hidden input 2026-01-15 15:37:47 +01:00
Pierre Tachoire
b51cca5617 axnode: use select.getValue 2026-01-15 15:37:47 +01:00
Pierre Tachoire
dc54dad290 axnode: add more attributes for input elements 2026-01-15 15:37:47 +01:00
Pierre Tachoire
7d6ab5a708 axnode: force manual formatting in switches
In order to uses less space and improve the readability.

zig fmt allows only 1 switch case per line or all in one line.
When having a lot of conditions, splitting the line is useful.
2026-01-15 15:37:46 +01:00
Pierre Tachoire
07acb9308d axnode: fallback button name to their tagname 2026-01-15 15:37:46 +01:00
Pierre Tachoire
ef315a46bc axnode: don't extract all text content as name
ignore name extraction for more elements
2026-01-15 15:37:45 +01:00
Pierre Tachoire
eb45bd051c axtree: simpler AXValue 2026-01-15 15:37:45 +01:00
Pierre Tachoire
65102edc98 axtree: remove useless error return 2026-01-15 15:37:44 +01:00
Pierre Tachoire
04eda96416 axtree: reverse writeNode return logic 2026-01-15 15:37:44 +01:00
Pierre Tachoire
f5036bdf5e axtree: use a simpler union switch 2026-01-15 15:37:44 +01:00
Pierre Tachoire
b6df85da7a axtree: add improvements 2026-01-15 15:37:43 +01:00
Pierre Tachoire
9775b39a8d axnode: use absolute urls 2026-01-15 15:37:43 +01:00
Pierre Tachoire
d6d74c5024 first version of AXTree 2026-01-15 15:37:42 +01:00
Karl Seguin
223a6170d5 Fix use-after free
On CDP.BrowserContext.deinit, clear the isolated world ExecutionContext before
terminating the session. This is important as the isolated_world list is
allocated from the session.arena.

Also, semi-revert 63f1c85964. Before all this
we were running microtasks on ExecutionWorld.removeContext. That didn't seem
right (and I thought it was the original source of the bug). But, for the "real"
Page context, this is critical, since Microtasks can reference the Page object.
Since microTasks are isolation-level, it's possible for a microtasks for Page1
to execute after Page1 goes away (if we create a new page, Page2). This re-adds
the microtask "draining", but only for the Page (i.e. in Page.deinit).
2026-01-14 09:37:10 +08:00
Karl Seguin
2322cb9b83 remove unused code, remove references to v8::Persistent 2026-01-13 12:58:30 +08:00
Karl Seguin
8438b7d561 remove remaining direct v8 references 2026-01-13 12:58:30 +08:00
Karl Seguin
701de08e8a have our js.Context directly hold a js handle 2026-01-13 12:57:06 +08:00
Karl Seguin
6ecf52cc03 port Platform and Inspector to use v8's C handles/functions directly 2026-01-13 12:56:07 +08:00