browser

mirror of https://github.com/lightpanda-io/browser.git synced 2026-03-22 04:34:44 +00:00

Author	SHA1	Message	Date
Pierre Tachoire	56cc881ac0	cdp: fix attachToTarget and attachToBrowserTarget resp	2026-03-03 15:01:53 +01:00
Pierre Tachoire	06ef6d3e6a	cdp: attachToTarget must add the session id	2026-03-03 12:58:00 +01:00
Pierre Tachoire	14b58e8062	add target.attachToBrowserTarget	2026-03-03 12:58:00 +01:00
Pierre Tachoire	eee232c12c	cdp: allow multiple calls to attachToTarget. Playwright, when creating a new CDPSession, sends an attachToBrowserTarget followed by another attachToTarget to re-attach itself to the existing target. See playwright/axtree.js in the demo/ repository.	2026-03-03 12:58:00 +01:00
Karl Seguin	10ad5d763e	Rename page.id to page._frame_id This field was recently added and is used to generate correct frameIds in CDP messages. They remain the same during a navigation event, so calling them page.id might cause surprises since navigation events create new pages, but retain the original id. Hence, frame_id is more accurate and hopefully less surprising. (This is a small cleanup prior to doing some iframe navigation work).	2026-03-02 16:21:29 +08:00
Karl Seguin	21be3db51f	Callers to page.navigate ensure URL is properly encoded. Follow up to https://github.com/lightpanda-io/browser/pull/1646 The encodeURL (renamed to ensureEncoded and exposed in this commit) already handled already-encoded URLs, so this was largely a matter of exposing the functionality. The reason this isn't baked directly into Page.navigate is that, in some places, e.g. internal navigation, the URL is already known to be encoded. So it's up to every caller to make sure they are passing a valid URL to navigate.	2026-02-26 12:22:06 +08:00
Karl Seguin	e2a1ce623c	Rework CDP frameIds (and loaderIds and requestIds and interceptorIds) Our BrowsingContext currently supports 1 target. So we have a per-BC target_id. Previously, our target had 1 "frame" - our page. So we often treated the targetId as the frameId. But to work with frames, we need page-specific frameIds and loaderIds. This tries to clean up our ids (a little). frameIds are now ids derived from a new incrementing page.id. This page.id has to be passed around (via http Requests and through notifications) in order to properly generate messages with a frameId.	2026-02-19 13:01:41 +08:00
Karl Seguin	7e575c501a	Add a dedicated browser_context and page_arena to CDP. The BrowserContext currently uses 3 arenas: 1 - Command-specific, which is like the call_arena, but for the processing of a single CDP command 2 - Notification-specific, which is similar, but for the processing of a single internal notification event 3 - Arena, which is just the session arena and lives for the duration of the BrowserContext/Session This is pretty coarse and can result in significant memory accumulation if a browser context is re-used for multiple navigations. This commit introduces 3 changes: 1 - Rather than referencing Session.arena, the BrowserContext.arena is now its own arena. This doesn't really change anything, but it does help keep things a bit better separated. 2 - Introduces a page_arena (not to be confused with Page.arena). This arena exists for the duration of 1 page, i.e. it's cleared when the BrowserContext receives the page_created internal notification. The `captured_responses` now uses this arena, which means captures only exist for the duration of the current page. This appears to be consistent with how Chrome behaves (in fact, Chrome seems even more aggressive and doesn't appear to make any guarantees around captured responses). CDP refers to this lifetime as a "renderer" and has an experimental message, which we don't support, `Network.configureDurableMessages` to control this. 3 - Isolated Worlds are now more self-contained with an arena from the ArenaPool. There are currently 2 places where the BrowserContext.arena is still used: 1 - the isolated_world list 2 - the custom headers Although this could be long lived, I believe the above is ok. We should just really think twice whenever we want to use it for anything else.	2026-02-03 15:48:27 +08:00
Karl Seguin	181f265de5	Rework Inspector usage V8's inspector world is made up of 4 components: Inspector, Client, Channel and Session. Currently, we treat all 4 components as a single unit which is tied to the lifetime of CDP BrowserContext - or, loosely speaking, 1 "Inspector Unit" per page / v8::Context. According to https://web.archive.org/web/20210622022956/https://hyperandroid.com/2020/02/12/v8-inspector-from-an-embedder-standpoint/ and conversation with Gemini, it's more typical to have 1 inspector per isolate. The general breakdown is the Inspector is the top-level manager, the Client is our implementation which controls how the Inspector works (the functions we expose that v8 calls into). These should be tied to the Isolate. Channels and Sessions are more closely tied to Context, where the Channel is v8->zig and the Session is zig->v8. This PR does a few things: 1 - It creates 1 Inspector and Client per Isolate (Env.js) 2 - It creates 1 Session/Channel per BrowserContext 3 - It merges v8::Session and v8::Channel into Inspector.Session 4 - It moves the Inspector instance directly into the Env 5 - BrowserContext interacts with the Inspector.Session, not the Inspector 4 is arguably unnecessary with respect to the main goal of this commit, but the end-goal is to tighten the integration. Specifically, rather than CDP having to inform the inspector that a context was created/destroyed, the Env which manages Contexts directly (https://github.com/lightpanda-io/browser/pull/1432) and which now has direct access to the Inspector, is now equipped to keep this in sync.	2026-01-30 15:59:33 +08:00
Karl Seguin	9092651b5b	Merge branch 'main' into fix_context_lifetime	2026-01-20 08:50:41 +08:00
Karl Seguin	a6e7ecd9e5	Move more asserts to custom asserter. Deciding what should be an lp.assert, vs an std.debug.assert, vs a debug-only assert is a little arbitrary. debug-only asserts, guarded with an `if (comptime IS_DEBUG)` obviously avoid the check in release and thus have a performance advantage. We also use them at library boundaries. If libcurl says it will always emit a header line with a trailing \r\n, is that really a check we need to do in production? I don't think so. First, that code path is checked _a lot_ in debug. Second, it feels a bit like we're testing libcurl (in production!)..why? A debug-only assertion should be good enough to catch any changes in libcurl.	2026-01-19 09:12:16 +08:00
Karl Seguin	62aa564df1	Remove Global v8::Local<v8::Context> When we create a js.Context, we create the underlying v8.Context and store it for the duration of the page lifetime. This works because we have a global HandleScope - the v8.Context (which is really a v8::Local<v8::Context>) is tied to the global HandleScope, effectively making it a global. If we want to remove our global HandleScope, then we can no longer pin the v8.Context in our js.Context. Our js.Context now only holds a v8.Global of the v8.Context (v8::Global<v8::Context>). This PR introduces a new type, js.Local, which takes over a lot of the functionality previously found in either js.Caller or js.Context. The simplest way to think about it is: 1 - For v8 -> zig calls, we create a js.Caller (as always) 2 - For zig -> v8 calls, we go through the js.Context (as always) 3 - The shared functionality, which works on a v8.Context, now belongs to js.Local For #1 (v8 -> zig), creating a js.Local for a js.Caller is really simple and centralized. v8 largely gives us everything we need from the FunctionCallbackInfo or PropertyCallbackInfo. For #2, it's messier, because we can only create a local v8::Context if we have a HandleScope, which we may or may not. Unfortunately, in many cases, what to do becomes the responsibility of the caller and much of the code has to become aware of this local-ness. What does this mean for our code? The impact is on WebAPIs that store .Global. Because the global can't do anything. You always need to convert that .Global to a local (e.g. js.Function.Global -> js.Function). If you're 100% sure the WebAPI is only being invoked by a v8 callback, you can use `page.js.local.?.toLocal(some_global).call(...)` to get the local value. If you're 100% sure the WebAPI is only being invoked by Zig, you need to create a `js.Local.Scope` to get access to a local: ```zig var ls: js.Local.Scope = undefined; page.js.localScope(&ls); defer ls.deinit(); ls.toLocal(some_global).call(...) 
// can also access `&ls.local` for APIs that require a const js.Local ``` For functions that can be invoked by either V8 or Zig, you should generally push the responsibility to the caller by accepting a `local: const js.Local`. If the caller is a v8 callback, it can pass `page.js.local.?`. If the caller is a Zig callback, it can create a `Local.Scope`. As an alternative, it is possible to simply pass the *Page, and check `if page.js.local == null` and, if so, create a Local.Scope. But this should only be done for performance reasons. We currently only do this in 1 place, and it's because the Zig caller doesn't know whether a Local will actually be needed and it's potentially called on every element created by the parser.	2026-01-19 07:28:33 +08:00
Pierre Tachoire	fbe07836f9	cdp: return a valid response for Page.getFrameTree on STARTUP Stagehand expects a valid response for this specific command. Also add `Target.activateTarget`	2026-01-16 16:27:55 +01:00
Pierre Tachoire	cbc028b040	cdp: accept multiple attachToTarget calls	2026-01-16 09:10:41 +01:00
Karl Seguin	05cb5221d4	Quick-check sameness in Node.isEqualNode Exclusively use the not_implemented log filter.	2025-12-26 09:57:33 +08:00
Karl Seguin	d9c53a3def	Page.scheduleNavigation for location changes	2025-12-22 12:19:08 +08:00
Karl Seguin	bb1ea39c54	backport a variety of smaller CDP changes	2025-12-19 10:31:07 +08:00
Muki Kiboigo	ac85341cab	add NavigationKind to navigate	2025-12-09 17:10:59 -08:00
Pierre Tachoire	0d8dd84df5	support url on createTarget and send lifecycle events Support url parameter on createTarget. We now navigate on createTarget to dispatch events correctly, even in case of about:blank	2025-12-09 11:29:00 +01:00
Karl Seguin	d3973172e8	re-enable minimum viable CDP server	2025-10-28 18:56:03 +08:00
Karl Seguin	2e734fae57	This is the last of the big changes to the js code This PR largely tightens up a lot of the code. 'v8' is no longer imported outside of js. A number of helper functions have been moved to the js.Context. For example, js.Function.getName used to call: ```zig return js.valueToString(allocator, name, self.context.isolate, self.context.v8_context); ``` It now calls: ```zig return self.context.valueToString(name, .{ .allocator = allocator }); ``` Page.main_context has been renamed to `Page.js`. This, in combination with new promise helpers, turns: ```zig const resolver = page.main_context.createPromiseResolver(); try resolver.resolve({}); return resolver.promise(); ``` into: ```zig return page.js.resolvePromise({}); ```	2025-10-03 15:06:16 +08:00
Pierre Tachoire	94fe34bd10	cdp: multiple isolated worlds	2025-09-17 14:42:08 +02:00
Pierre Tachoire	5ea97c4910	cdp: add send error options with session id by default	2025-09-17 14:42:05 +02:00
Karl Seguin	1443f38e5f	Zig 0.15.1 Depends on https://github.com/lightpanda-io/zig-v8-fork/pull/89	2025-08-29 10:42:06 +08:00
Karl Seguin	211012d367	move intercept_state and extra_headers from CDP instance to BrowserContext	2025-08-18 13:23:17 +08:00
Karl Seguin	01223601f2	Reduce allocations made during request interception Stream (to json) the Transfer as a request and response object in the various network interception-related events (e.g. Network.responseReceived). Add a page.request_intercepted boolean flag for CDP to signal the page that requests have been intercepted, allowing Page.wait to prioritize intercept handling (or, at least, not block it).	2025-08-15 14:01:57 +08:00
Karl Seguin	c96fb3c2f2	support CDP proxy override	2025-08-11 21:37:03 +08:00
sjorsdonkers	6f5141d5fb	browser context proxyServer	2025-06-17 18:43:12 +02:00
sjorsdonkers	0c0ddc10ee	rename scope jscontext	2025-06-13 10:30:50 +02:00
Karl Seguin	97c769e805	Rework internal navigation to prevent deadlocking The mix of sync and async HTTP requests requires care to avoid deadlocks. Previously, it was possible for async requests to use up all available HTTP state objects during a navigation flow (either directly, or via an internal redirect (e.g. click, submit, ...)). This would block the navigation, which, because everything is single-threaded, would block the I/O loop, resulting in a deadlock. The correct solution seems to be to remove all synchronous I/O. And I tried to do that, but I ran into a wall with module-loading, which is initiated from V8. V8 says "give me the source for this module", and I don't see a great way to tell it: wait a bit. So I went back to trying to make this work with the hybrid model, despite last week's failures to get it to work. I changed two things: 1 - The http client will only directly initiate an async request if there's at least 2 free state objects available (1 for the request, and leaving 1 free for any synchronous requests) 2 - Delayed navigation retries until there's at least 1 free http state object available. Commits from last week did help with this. First, we're now guaranteed to have a single sync-request at a time (previously, we could have had 2). Secondly, the async connection is now async end-to-end (previously, it could have blocked on an empty state pool). We could probably make this a bit more obvious by reserving 1 state object for synchronous requests. But, since the long term solution is probably having no synchronous requests, I'm happy with anything that lets me move past this issue.	2025-06-12 12:34:51 +08:00
Karl Seguin	305460dedb	Merge pull request #768 from lightpanda-io/setExtraHTTPHeaders setExtraHTTPHeaders	2025-06-06 16:45:07 +08:00
sjorsdonkers	bacef41a3b	extra header feedback	2025-06-06 10:33:15 +02:00
Karl Seguin	fdd1a778f3	Properly drain event loop when navigating between pages	2025-06-06 12:53:45 +08:00
Karl Seguin	2feba3182a	Replace std.log with a structured logger Outputs in logfmt in release and a "pretty" print in debug mode. The format along with the log level will become arguments to the binary at some point in the future.	2025-05-27 19:57:58 +08:00
sjorsdonkers	3f31573bcb	No need to navigate to about:blank	2025-05-21 09:43:15 +02:00
sjorsdonkers	0929bd217d	load aboutblank doc	2025-05-21 09:43:15 +02:00
sjorsdonkers	8930e2f06e	isolated polyfill + create when needed	2025-05-05 08:46:32 +02:00
sjorsdonkers	7dde0be043	share sessionstate and underlying DOM global with the isolated	2025-04-29 23:17:39 +02:00
sjorsdonkers	4db80cb9e7	Adopt state into the isolated world	2025-04-29 18:10:55 +02:00
Karl Seguin	7309fec51d	Fully fake contextCreated emit contextCreated when it's needed, not when it actually happens. I thought we could make this sync-up, but we'd need to create 3 contexts to satisfy both puppeteer and chromedp. So rather than having it partially driven by notifications from Browser, I'd rather just fake it all for now.	2025-04-29 13:29:42 +08:00
Karl Seguin	9044925f32	emit context created on createTarget event for chromedp	2025-04-29 10:58:23 +08:00
Karl Seguin	2d5ff8252c	Reorganize v8 contexts and scope - Pages within the same session have proper isolation - they have their own window - they have their own SessionState - they have their own v8.Context - Move inspector to CDP browser context - Browser now knows nothing about the inspector - Use notification to emit a context-created message - This is still a bit hacky, but again, it decouples browser from CDP	2025-04-29 10:22:08 +08:00
Karl Seguin	1fca035cfe	Make CDP less generic. It's still generic over the client - we need to assert messages written to and be able to send specific commands, but it's no longer generic over Browser/ Session/Page/etc..	2025-04-24 18:06:55 +08:00
Karl Seguin	f38a0d2d67	Remove BrowserContext URL Add BrowserContext.getURL which gets the URL from the session.page.	2025-04-08 22:51:17 +08:00
Karl Seguin	be9e953971	Add CDP Node Registry This expands on the existing CDP node work used in DOM.search. It introduces a node registry to track all nodes returned to the client and provide lookups to get a node from an Id or a *parser.node. Eventually, the goal is to have the Registry emit the DOM.setChildNodes event whenever necessary, as well as support many of the missing DOM actions. Added tests to existing search handlers. Reworked search a little bit to avoid some unnecessary allocations and to hook it into the registry. The generated Node is currently incomplete. The parentId is missing, the children are missing. Also, we still need to associate the v8 ObjectId to the node. Finally, I moved all action handlers into a nested "domain" folder.	2025-03-28 19:00:29 +08:00

45 Commits