Commit Graph

170 Commits

Author SHA1 Message Date
sjorsdonkers
32d9fc0d32 Pass objectGroup as groupName 2025-04-09 13:40:00 +02:00
sjorsdonkers
3a7da6665f unittest scaffolding 2025-04-09 11:33:44 +02:00
sjorsdonkers
2f47e04de7 Use findOrAddValue for precise JsValue 2025-04-09 11:33:41 +02:00
sjorsdonkers
7dc3add5fd resolveNode WIP 2025-04-09 11:32:23 +02:00
Karl Seguin
f38a0d2d67 Remove BrowserContext URL
Add BrowserContext.getURL which gets the URL from the session.page.
2025-04-08 22:51:17 +08:00
Karl Seguin
b76875bf5d use netsurf's mouse event 2025-04-08 22:43:53 +08:00
Karl Seguin
0253de80de Add a dumb renderer to get coordinates
FlatRenderer positions items on a single row, giving each a height and width of
1.

Added getBoundingClientRect to the DOM element which, when requested for the
first time, will place the item with the renderer.

The goal here is to give elements a fixed position and to make it easy to map
x,y coordinates onto an element. This should work, at least with puppeteer,
since it first requests the boundingClientRect before issuing a click.
2025-04-08 22:43:53 +08:00
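The FlatRenderer idea described in 0253de80de can be sketched as follows. This is a minimal Python illustration, not the project's Zig code; the class and method names (`get_rect`, `element_at`) are hypothetical. Each element is given a 1x1 box on a single row the first time its rectangle is requested, so an x,y coordinate maps straight back to an element.

```python
class FlatRenderer:
    """Hypothetical sketch: lay elements out on one row, each 1 wide and 1 high."""

    def __init__(self):
        self._positions = {}  # element -> x index on the row
        self._elements = []   # x index -> element

    def get_rect(self, element):
        # Place the element lazily, the first time its rect is requested.
        if element not in self._positions:
            self._positions[element] = len(self._elements)
            self._elements.append(element)
        x = self._positions[element]
        return {"x": x, "y": 0, "width": 1, "height": 1}

    def element_at(self, x, y):
        # Only row 0 exists; coordinates outside the placed range hit nothing.
        if x < 0 or not (0 <= y < 1):
            return None
        index = int(x)
        if index < len(self._elements):
            return self._elements[index]
        return None
```

A driver that asks for the bounding rect before clicking would then send a click whose coordinates fall inside that element's box, and `element_at` would return it.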
Karl Seguin
0139437c3d Wrap getDocument response in a root object 2025-04-08 10:05:32 +08:00
Karl Seguin
8f4be9b76f break when child node list fails 2025-04-07 10:40:46 +08:00
Karl Seguin
4d075818f6 Lazily load nodes
Node registry now only tracks the node id (which we need to be consistent) and
the underlying parser.Node. All other data is loaded on-demand (i.e. when we
serialize the node). This allows us to serialize node values as they appear
when they are serialized, as opposed to when they are registered.
2025-04-04 11:24:34 +08:00
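A rough sketch of the lazy-loading approach from 4d075818f6, in Python rather than the project's Zig. The `Node` and `NodeRegistry` classes here are stand-ins invented for illustration. The registry stores only the id and a reference to the node, and reads everything else at serialization time.

```python
import itertools


class Node:
    """Stand-in for a parser node; only the fields used below."""

    def __init__(self, name, value=""):
        self.name = name
        self.value = value


class NodeRegistry:
    """Hypothetical registry: keeps a stable id per node and nothing else."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._by_id = {}    # node id -> node
        self._by_node = {}  # node -> node id (nodes hash by identity)

    def register(self, node):
        # Registering the same node twice must return the same id.
        if node not in self._by_node:
            node_id = next(self._next_id)
            self._by_node[node] = node_id
            self._by_id[node_id] = node
        return self._by_node[node]

    def serialize(self, node_id):
        # Values are read now, not at registration time.
        node = self._by_id[node_id]
        return {"nodeId": node_id, "nodeName": node.name, "nodeValue": node.value}
```

If a node's value changes between `register` and `serialize`, the serialized form reflects the newer value, which is the behavior the commit message describes.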
Karl Seguin
68d1be3b94 Add children node to CDP Node representation
Add Node writer. Different CDP messages want different child depths. For now,
only support immediate children, but the new writer should make it easy to
support variable depths.
2025-04-03 21:28:57 +08:00
Karl Seguin
af68b10c5d Better CDP node serialization
Include direct descendants, with hooks for other serialization options.

Don't include parentId if null.
2025-04-03 21:18:18 +08:00
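The two serialization commits above (68d1be3b94 and af68b10c5d) describe a writer with a child depth and an optional parentId. A minimal Python sketch of that shape follows. The function name, the dict-based tree and the `node_id_of` callback are assumptions made for illustration; only the output field names (`nodeId`, `parentId`, `nodeName`, `childNodeCount`, `children`) come from the CDP DOM.Node type.

```python
def write_node(node, node_id_of, depth=1, parent_id=None):
    """Serialize `node` with at most `depth` levels of children.

    `node` is a plain dict with "name" and optional "children";
    `node_id_of` is a hypothetical lookup returning the registered id.
    """
    out = {"nodeId": node_id_of(node), "nodeName": node["name"]}
    # Omit parentId entirely when there is no parent.
    if parent_id is not None:
        out["parentId"] = parent_id
    children = node.get("children", [])
    out["childNodeCount"] = len(children)
    if depth > 0 and children:
        out["children"] = [
            write_node(child, node_id_of, depth - 1, out["nodeId"])
            for child in children
        ]
    return out
```

With the default `depth=1` only immediate children are written; deeper nodes are reported through `childNodeCount` alone.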
Karl Seguin
be9e953971 Add CDP Node Registry
This expands on the existing CDP node work used in DOM.search. It introduces
a node registry to track all nodes returned to the client and provide lookups to
get a node from an id or a *parser.node.

Eventually, the goal is to have the Registry emit the DOM.setChildNodes event
whenever necessary, as well as support many of the missing DOM actions.

Added tests to existing search handlers. Reworked search a little bit to avoid
some unnecessary allocations and to hook it into the registry.

The generated Node is currently incomplete. The parentId is missing, the
children are missing. Also, we still need to associate the v8 ObjectId to the
node.

Finally, I moved all action handlers into a nested "domain" folder.
2025-03-28 19:00:29 +08:00
Pierre Tachoire
82e67b7550 Merge pull request #489 from lightpanda-io/microtasks
run v8 micro tasks
2025-03-27 17:15:50 +01:00
Pierre Tachoire
3f1d0df7f9 cdp: run microtasks after send inspector 2025-03-27 15:49:48 +01:00
Karl Seguin
c6538e1038 Add an insecure_disable_tls_host_verification command line option
When set, this disables the host verification of all HTTP requests. Available
for both the fetch and serve mode.

Also introduced an App.Config, for future command line options which need to
be passed more deeply into the code.
2025-03-27 18:02:30 +08:00
Karl Seguin
22aa126b29 Cleaner merge
Switch to non-blocking sockets.

Fix TLS handshake/receive/send ordering
2025-03-23 19:05:35 +08:00
Karl Seguin
2017d4785b replace zig-async-io and std.http.Client with a custom HTTP client 2025-03-23 19:01:40 +08:00
Pierre Tachoire
7607ab2c84 cdp: target: implement detach from target 2025-03-20 09:36:00 +01:00
Pierre Tachoire
fe7f6bee1c cdp: create a cdp state for target_auto_attach 2025-03-20 09:35:59 +01:00
Pierre Tachoire
b43658eb3f cdp: target: add test for #474
Can't attach to just created target
2025-03-20 09:35:59 +01:00
Karl Seguin
21c9dde858 Zig 0.14 compatibility 2025-03-19 16:28:15 +01:00
Karl Seguin
ba8a0179d5 Share the HTTP client globally 2025-03-19 11:09:58 +08:00
Pierre Tachoire
9fe10747ce Merge pull request #476 from karlseguin/implicit_browser_context
Implicitly create BrowserContext on createTarget if one doesn't exist
2025-03-18 09:21:50 +01:00
Karl Seguin
cd33a089d1 flatten events, include aarch + os, remove eid 2025-03-18 08:26:58 +08:00
Karl Seguin
6b83281539 Add navigate telemetry 2025-03-18 08:25:44 +08:00
Karl Seguin
430779979e Implicitly create BrowserContext on createTarget if one doesn't exist 2025-03-17 20:45:57 +08:00
Pierre Tachoire
aca01d81d6 cdp: use .zig-cache to save js script debug files 2025-03-14 11:41:21 +01:00
Pierre Tachoire
6a0b154d67 cdp: dump runtime js only in debug mode 2025-03-14 11:41:20 +01:00
Karl Seguin
3fe28d5441 Optimize memory usage
The two bigger changes here are:

1- The http_client has been moved from the Session to the Browser, allowing
   its connection pool to be re-used across multiple sessions

2- The browser now has a page_arena which is used for all page-level allocation
   and which can be re-used between pages (currently retains 1MB of memory).
   Previously, pages used an arena that was tied to the lifetime of the page,
   thus it could not be re-used.

Using the Bench allocator for zig-js-runtime, allocated bytes went from
1347037879 to 834932438 (in a RUNS=1000 of puppeteer demo).

Various other changes to try to simplify the API and remove the possibility
of invalid states. For example, session.newPage() now includes the logic for
page.start() so that there should now never be a page that wasn't started.
2025-03-12 13:38:22 +08:00
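For scale, the two allocation figures quoted in 3fe28d5441 work out to roughly a 38% reduction in allocated bytes. The short calculation below only restates the numbers from the commit message.

```python
# Allocated bytes reported in the commit message (RUNS=1000 of the puppeteer demo).
before = 1_347_037_879
after = 834_932_438

saved = before - after
reduction_pct = 100 * saved / before

print(f"{saved} bytes saved ({reduction_pct:.1f}% fewer allocated bytes)")
```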
Karl Seguin
e3409a27e7 fix test 2025-03-11 10:51:40 +08:00
Karl Seguin
5182edce6f Remove CDP FrameId
I don't know if FrameId is related to an <iframe>, and whether each Page has
1 implicit "frame". But, playwright seems to treat frameId and targetId as
interchangeable, and chrome seems to agree (at least to some degree); chrome will
return a targetId and reuse that value for the frameId.

So the simplest solution is just to remove our concept of a frameId and use
targetId exclusively. This doesn't seem to cause any issues with puppeteer.
2025-03-11 10:37:43 +08:00
Pierre Tachoire
6ca1e6c6dd cdp: let the inspector return the response
When a command is forwarded to the inspector, the inspector handles the
response to the message directly.
2025-03-10 14:57:10 +01:00
Pierre Tachoire
f3a1a6a191 cdp: add a Page.getFrameTree unit test 2025-03-10 14:57:10 +01:00
Pierre Tachoire
675932c65b cdp: improve playwright support
The getTargetInfo result must return a `targetInfo` key.

Here is an example returned by Chrome:
```json
{
  "id": 16,
  "result": {
    "targetInfo": {
      "targetId": "d93a1bbc-f906-4bbb-bb4d-a2285234b091",
      "type": "browser",
      "title": "",
      "url": "",
      "attached": true,
      "canAccessOpener": false
    }
  }
}
```
2025-03-10 14:57:05 +01:00
Karl Seguin
9de84aee2e Don't send CDP result when message is forwarded to inspector.
Rely on inspector to send the result, otherwise we'll send 2 responses to the
same message (one ourselves and one from the inspector), which Playwright does
not like.
2025-03-10 14:34:32 +01:00
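The rule behind 9de84aee2e and 6ca1e6c6dd, exactly one response per request id, can be sketched like this in Python. All names are hypothetical, and which domains get forwarded to the inspector is an assumption made for the example.

```python
class FakeInspector:
    """Hypothetical inspector: it replies to forwarded messages itself."""

    def __init__(self, send):
        self._send = send

    def handle(self, msg):
        self._send({"id": msg["id"], "result": {"handledBy": "inspector"}})


def dispatch(msg, send, inspector, forwarded_domains=("Runtime",)):
    """Route one CDP message, making sure only one party responds."""
    domain = msg["method"].split(".", 1)[0]
    if domain in forwarded_domains:
        # The inspector sends the response; sending our own as well
        # would produce two responses for the same id.
        inspector.handle(msg)
        return
    send({"id": msg["id"], "result": {}})
```

Collecting sent messages in a list and dispatching one forwarded and one local command should yield exactly two responses, one per id.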
Karl Seguin
adb8779d00 allow Target.getTargetInfo to be called without parameters 2025-03-10 14:34:32 +01:00
Karl Seguin
fbb0e675f5 send attach events before result 2025-03-10 14:34:32 +01:00
Karl Seguin
a3e2b5246e Make CDP server more authoritative with respect to IDs
The TL;DR is that this commit enforces the use of correct IDs, introduces a
BrowserContext, and adds some CDP tests.

These are the ids we need to be aware of when talking about CDP:
- id
- browserContextId
- targetId
- sessionId
- loaderId
- frameId

The `id` is the only one that _should_ originate from the driver. It's attached
to most messages and it's how we maintain a request -> response flow: when
the server responds to a specific message, it echoes back the id from the
requested message. (As opposed to out-of-band events sent from the server which
won't have an `id`). When I say "id" from this point forward, I mean every id
except for this req->res id.

Every other id is created by the browser.

Prior to this commit, we didn't really check incoming ids from the driver. If
the driver said "attachToTarget" and included a targetId, we just assumed that
this was the current targetId. This was aided by the fact that we only used
hard-coded IDs. If _we_ only "create" a frameId of "FRAME-1", then it's tempting
to think the driver will only ever send a frameId of "FRAME-1".

The issue with this approach is that _if_ the browser and driver fall out of sync
and there's only ever 1 browserContextId, 1 sessionId and 1 frameId, it's not
impossible to imagine cases where we act on the wrong thing.

Imagine this flow:
- Driver asks for a new BrowserContext
- Browser says OK, your browserContextId is 1
- Driver, for whatever reason, says close browserContextId 2
- Browser says, OK, but it doesn't check the id and just closes the only
  BrowserContext it knows about (which is 1)

By both re-using the same hard-coded ids, and not verifying that the ids sent
from the client correspond to the correct ids, any issues are going to be hard
to debug.

Currently LOADER_ID and FRAME_ID are still hard-coded. Baby steps.
2025-03-10 14:34:32 +01:00
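The flow described in a3e2b5246e, where the driver asks to close a browserContextId the browser never issued, can be illustrated with a small Python sketch. The class, method and exception names are hypothetical, and the id format is invented; the point is only that the incoming id is compared with the one the browser created before acting.

```python
import uuid


class UnknownBrowserContext(Exception):
    """Raised when the driver names a browserContextId we never issued."""


class Browser:
    """Hypothetical browser holding at most one BrowserContext."""

    def __init__(self):
        self.browser_context_id = None

    def create_browser_context(self):
        # Generate the id instead of hard-coding it, so a stale or
        # mistaken id from the driver cannot match by accident.
        self.browser_context_id = f"BID-{uuid.uuid4().hex[:8]}"
        return self.browser_context_id

    def dispose_browser_context(self, browser_context_id):
        # Verify the id rather than closing "the only context we know about".
        if self.browser_context_id is None or browser_context_id != self.browser_context_id:
            raise UnknownBrowserContext(browser_context_id)
        self.browser_context_id = None
```

Disposing with a wrong id raises and leaves the existing context alone, which makes an out-of-sync driver visible immediately instead of silently closing the wrong thing.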
Karl Seguin
99fb82e244 Turn CDP into a generic so that mocks can be injected for testing
Add CDP testing helpers (mock Browser, Session, Page and Client). These are
placeholders until tests are added which use them.

Added a couple CDP tests.
2025-02-21 13:17:35 +08:00
Karl Seguin
c4eeef2a86 On CDP process error, let client decide how to close
Fixes issue where CDP closes the client, but client still registers a recv
operation.
2025-02-17 12:05:25 +08:00
Karl Seguin
b60a91f53c fix memory leak 2025-02-17 11:45:19 +08:00
Karl Seguin
1846d0bc21 drats, zig fmt again 2025-02-12 18:32:33 +08:00
Karl Seguin
d282055e10 Merge branch 'main' into cdp_struct 2025-02-12 17:56:47 +08:00
Karl Seguin
6ab64d155b Refactor CDP
CDP is now a struct which contains its own state: a browser and a session.

When a client connection is made and successfully upgrades, the client creates
the CDP instance. There is now a cleaner separation between Server, Client and
CDP.

Removed a number of allocations, especially when writing results/events from
CDP to the client. Improved input message parsing. Tried to remove some usage
of undefined.
2025-02-12 16:47:37 +08:00
Karl Seguin
c0c0694fcc Make TCP server websocket-aware
Adding HTTP & websocket awareness to the TCP server.

HTTP server handles `GET /json/version` and websocket upgrade requests.

Conceptually, websocket handling is the same code as before, but receiving
data will parse the websocket frames and writing data will wrap it in
a websocket frame.

The previous `Ctx` was split into a `Server` and a `Client`. This was
largely done to make it easy to write unit tests, since the `Client` is
a generic, all its dependencies (i.e. the server) can be mocked out. This
also makes it a bit nicer to know if there is or isn't a client (via the
server's client optional).

Added a MemoryPool for the Send object (I thought that was a nice touch!)

Removed MacOS hack on accept/conn completion usage.

Known issues:
- When framing an outgoing message, the entire message has to be duped. This
is no worse than how it was before, but it should be possible to eliminate
this in the future. Probably not part of this PR.

- Websocket parsing will reject continuation frames. I don't know of a single
client that will send a fragmented message (websocket has its own
message fragmentation), but we should probably still support this just in
case.

- I don't think the receive, timeout and close completions can safely be
re-used like we're doing. I believe they need to be associated with a specific
client socket.

- A new connection creates a new browser session. I think this is right (??),
but for the very first connection, we're throwing out a perfectly usable session. I'm
thinking this might be a change to how Browser/Sessions work.

- zig build test won't compile. This branch reproduces the issue with none
of these changes:
https://github.com/karlseguin/browser/tree/broken_test_build

(or, as a diff to main):
https://github.com/lightpanda-io/browser/compare/main...karlseguin:broken_test_build
2025-02-11 11:16:39 +08:00
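The framing behavior described in c0c0694fcc can be sketched in Python using the frame layout from RFC 6455. This is not the project's Zig code and the function names are hypothetical. Server-to-client frames are unmasked, and the length is encoded in 7 bits, 16 bits or 64 bits depending on payload size. Rejecting any frame without the FIN bit, in addition to continuation frames, is an assumption about how "no fragmented messages" would be enforced.

```python
import struct

OP_CONTINUATION = 0x0
OP_TEXT = 0x1


def frame_text(payload: bytes) -> bytes:
    """Wrap payload in a single unmasked, final (FIN=1) text frame."""
    header = bytes([0x80 | OP_TEXT])
    n = len(payload)
    if n < 126:
        header += bytes([n])
    elif n < (1 << 16):
        header += bytes([126]) + struct.pack("!H", n)
    else:
        header += bytes([127]) + struct.pack("!Q", n)
    # Concatenation copies the payload, like the duplication noted above.
    return header + payload


def check_first_byte(b0: int) -> int:
    """Return the opcode of an incoming frame, rejecting fragmentation."""
    fin = b0 & 0x80
    opcode = b0 & 0x0F
    if not fin or opcode == OP_CONTINUATION:
        raise ValueError("fragmented messages are not supported")
    return opcode
```

For example, `frame_text(b"hi")` produces the four bytes `0x81 0x02 'h' 'i'`: a final text frame with a two-byte payload.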
Pierre Tachoire
055530c8c6 cdp: send dom node children 2025-02-10 12:19:35 +01:00
Pierre Tachoire
fb3b38aec7 cdp: implement getSearchResults and discardSearchResults 2025-02-10 09:31:10 +01:00
Pierre Tachoire
4e4a8f1bab cdp: implement DOM.performSearch 2025-02-10 09:31:09 +01:00
Pierre Tachoire
39b3786776 cdp: ctx state has init and deinit now 2025-02-10 09:31:09 +01:00