First, for macrotasks, it ensures that the correct context is entered.
More importantly, for microtasks, it protects against use-after-free. This is
something we already did, to some degree, in page.deinit: we would run
microtasks before erasing the page, in case any microtasks referenced the page.
The new approach moves the guard to the context (since we have multiple
contexts) and protects against a microtask enqueuing a new task during shutdown
(something the original never did).
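As a rough illustration, this is the shape of the guard; the names (JsContext, Task, enqueue, the microtask list) are illustrative, not the actual implementation:

const std = @import("std");

const Task = struct { run: *const fn () void };

const JsContext = struct {
    shutting_down: bool = false,
    microtasks: std.ArrayListUnmanaged(Task) = .{},

    fn deinit(self: *JsContext, allocator: std.mem.Allocator) void {
        // Drain pending microtasks while the context is still valid, so a
        // microtask that references the context never runs after free.
        self.shutting_down = true;
        for (self.microtasks.items) |t| t.run();
        self.microtasks.deinit(allocator);
    }

    fn enqueue(self: *JsContext, allocator: std.mem.Allocator, task: Task) !void {
        // Guard: a microtask running during shutdown may try to enqueue a new
        // task; drop it rather than touching state that is being torn down.
        if (self.shutting_down) return;
        try self.microtasks.append(allocator, task);
    }
};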
This seems to solve some potential use-after-free issues. Informing the
Inspector that the context is gone appears to ensure that no more messages
related to the context are sent from the inspector.
1 - Embed Page into Session, avoids having to allocate/deallocate the page
2 - Fix false-positive arena pool leak detection
3 - On form submit, pre-check whether navigation will be allowed before building
the contents
4 - Make sure the QueuedNavigation structure isn't used after the page is removed
The WPT test navigation-api/navigation-methods/navigate-history-push-same-url.html
crashed with a @memcpy arguments alias error.
It seems to be due to the reuse of the previous page.url string. Forcing it to
be duplicated fixes the crash.
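The fix amounts to something like this (a sketch; the arena allocator and the helper name are illustrative):

const std = @import("std");

// Duplicating the previous URL guarantees the source and destination of any
// later @memcpy never alias, which is what triggered the crash.
fn dupeUrl(arena: std.mem.Allocator, previous_url: []const u8) ![]const u8 {
    return try arena.dupe(u8, previous_url);
}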
The BrowserContext currently uses 3 arenas:
1 - Command-specific, which is like the call_arena, but for the processing of a
single CDP command
2 - Notification-specific, which is similar, but for the processing of a single
internal notification event
3 - Arena, which is just the session arena and lives for the duration of the
BrowserContext/Session
This is pretty coarse and can result in significant memory accumulation if a
browser context is re-used for multiple navigations.
This commit introduces 3 changes:
1 - Rather than referencing Session.arena, the BrowserContext.arena is now its
own arena. This doesn't really change anything, but it does help keep things
a bit better separated.
2 - Introduces a page_arena (not to be confused with Page.arena). This arena
exists for the duration of a single page, i.e. it's cleared when the
BrowserContext receives the page_created internal notification (see the
sketch after this list). The
`captured_responses` now uses this arena, which means captures only exist
for the duration of the current page. This appears to be consistent with
how chrome behaves (In fact, Chrome seems even more aggressive and doesn't
appear to make any guarantees around captured responses). CDP refers to this
lifetime as a "renderer" and has an experimental message, which we don't
support, `Network.configureDurableMessages` to control this.
3 - Isolated Worlds are now more self contained with an arena from the ArenaPool.
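A rough sketch of the per-page arena lifetime, assuming a page_created hook on the BrowserContext (names are illustrative):

const std = @import("std");

const BrowserContext = struct {
    // Lives for the duration of the BrowserContext/Session.
    arena: std.heap.ArenaAllocator,
    // Lives for the duration of a single page; captured_responses are
    // allocated from this one.
    page_arena: std.heap.ArenaAllocator,

    fn onPageCreated(self: *BrowserContext) void {
        // A new page starts a new "renderer" lifetime: free everything the
        // previous page captured, but keep the buffers for reuse.
        _ = self.page_arena.reset(.retain_capacity);
    }
};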
There are currently 2 places where the BrowserContext.arena is still used:
1 - the isolated_world list
2 - the custom headers
Although these could be long lived, I believe the above is ok. We should just
really think twice whenever we want to use it for anything else.
page.js currently always references the page context. But through the inspector,
JavaScript can be executed in different contexts. When we go from V8->Zig we
correctly capture the current context within the caller's Local and, because of
this, mapping (or anything else that happens against local.ctx) happens in the
right context.
EXCEPT...our code still accesses page.js. So you can have a v8->zig call
happening in Context-2, and our Zig call then tries to do something on Context-1
via page.js.
I'm introducing a change that updates page.js based on the current Caller and
restores it at the end of the Caller. This change is super small, but
potentially has major impact. It's hard to imagine that we haven't run into
problems with this before, and it's hard to imagine what problems this change
might introduce. Certainly, if anyone copies page.js, they'll be in for a rude
surprise, but I don't think we do that anywhere.
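Conceptually, the change looks something like this (a sketch; the real Caller does more than this, and the types here are stand-ins):

const JsContext = struct {};
const Page = struct { js: *JsContext };

const Caller = struct {
    page: *Page,
    previous_js: *JsContext,

    fn init(page: *Page, current: *JsContext) Caller {
        // Point page.js at the context the call actually came from...
        const previous = page.js;
        page.js = current;
        return .{ .page = page, .previous_js = previous };
    }

    fn deinit(self: *Caller) void {
        // ...and restore the original on the way out, so nothing outside the
        // call observes the temporary switch.
        self.page.js = self.previous_js;
    }
};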
==Fix 1==
The problem flow:
1 - The module is dynamically imported. This creates a cache entry with no
module and no module_promise, and starts an async fetch.
2 - Before dynamicModuleSourceCallback fires (from step 1 above), the same
module is imported as a child of a call graph, i.e. via resolveModuleCallback.
Here the module is compiled, but never evaluated (we only evaluate the root
module). This is where things start to go sour. Our cache entry now has a
module, but no module_promise.
3 - The async fetch completes and calls dynamicModuleSourceCallback, which calls
Context.module. This returns early (the module is already cached thanks to
step 2). But it then calls resolveDynamicModule, which has (a) a module and (b)
no module_promise.
Our fix works because, if Context.module finds the cached module (from step 2),
it now also checks for the module_promise. If it doesn't find it, it evaluates
the module (which sets it).
I've since expanded the code to handle more intermediary states.
The original PR had:
if (gop.value_ptr.module_promise == null) {
    const mod = local.toLocal(cache_mod);
    if (mod.getStatus() == .kInstantiated) {
        return self.evaluateModule(want_result, mod, url, true);
    }
}
But now the code is:
if (gop.value_ptr.module_promise == null) {
    const mod = local.toLocal(cache_mod);
    if (mod.getStatus() == .kUninstantiated and try mod.instantiate(resolveModuleCallback) == false) {
        return error.ModuleInstantiationError;
    }
    return self.evaluateModule(want_result, mod, url, true);
}
It seems that v8 handles double-instantiation and double-evaluation safely.
Handle more partial-load states + fix possible dangling pointer.
==Fix 2==
We were using `gop` after potentially writing to the map (via a nested call to
mod.evaluate()). We now re-fetch the map entry so we can safely write to it.
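The pattern looks roughly like this (a sketch; the real cache entry and evaluate call are more involved, and the names are illustrative):

const std = @import("std");

const Entry = struct { module_promise: ?u32 = null };

fn loadModule(cache: *std.StringHashMap(Entry), url: []const u8) !void {
    const gop = try cache.getOrPut(url);
    if (!gop.found_existing) gop.value_ptr.* = .{};

    // This may recursively import (and cache) child modules, growing the map
    // and invalidating gop.value_ptr.
    const promise = try evaluate(cache, url);

    // Re-fetch the entry before writing; the old pointer must not be used.
    const entry = cache.getPtr(url) orelse return error.ModuleNotCached;
    entry.module_promise = promise;
}

fn evaluate(cache: *std.StringHashMap(Entry), url: []const u8) !u32 {
    // Stand-in for mod.evaluate(); in the real code this can re-enter the cache.
    _ = cache;
    _ = url;
    return 0;
}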
Currently, when loading a module, if the module is found in the cache, it's
immediately returned. However, that can result in a timing issue where the
module is cached, but not evaluated, and doesn't have an associated promise.
This commit tries to ensure a module is always evaluated and that the cache
entry has a module promise.
This might fix a crash handler issue. I couldn't reproduce the issue though.
I believe it requires specific timing which is hard to reproduce in a test.
We previously figured that we could release the XHR object as soon as the JS
reference went out of scope. But its callbacks could still exist, and thus the
XHR request should be allowed to proceed.
This commit ensures the XHR instance remains valid so long as we have an active
request.
Might help with https://github.com/lightpanda-io/browser/issues/1448 but I can't
reliably reproduce this, so I'm not 100% sure it resolves the issue. That bug
appears to be caused by some timing interaction between the underlying HTTP
request and the v8 GC.
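The lifetime rule amounts to something like this (a sketch; the field and callback names are illustrative, not the actual XHR implementation):

const std = @import("std");

const Xhr = struct {
    request_active: bool = false,
    js_released: bool = false,

    // Called when the JS wrapper goes out of scope / is garbage collected.
    fn onJsRelease(self: *Xhr, allocator: std.mem.Allocator) void {
        self.js_released = true;
        // Keep the native object alive while its callbacks can still fire.
        if (!self.request_active) allocator.destroy(self);
    }

    // Called when the underlying HTTP request completes (or is aborted).
    fn onRequestComplete(self: *Xhr, allocator: std.mem.Allocator) void {
        self.request_active = false;
        if (self.js_released) allocator.destroy(self);
    }
};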