Commit Graph

361 Commits

Author SHA1 Message Date
Karl Seguin
0959eea677 Remove the loop
Previously, the IO loop was doing three things:
1 - Managing timeouts (either from scripts or for our own needs)
2 - Handling browser IO events (page/script/xhr)
3 - Handling CDP events (accept, read, write, timeout)

With the libcurl merge, 1 was moved to an in-process scheduler and 2 was moved
to libcurl's own event loop. That means the entire loop code, including
the dependency on tigerbeetle-io existed for handling a single TCP client.
Not only is that a lot of code, there was also friction between the two loops
(the libcurl one and our IO loop), which would result in latency - while one
loop is waiting for the events, any events on the other loop go un-processed.

This PR removes our IO loop. To accomplish this:

1 - The main accept loop is blocking. This is simpler and works perfectly well,
given we only allow 1 active connection.
2 - The client socket is passed to libcurl - yes, libcurl's loop can take
arbitrary FDs and poll them along with its own.

In addition to having one less dependency, the CDP code is quite a bit simpler,
especially around shutdowns and writes. This also removes _some_ of the latency
caused by the friction between page process and CDP processing. Specifically,
when CDP now blocks for input, http page events (script loading, xhr, ...) will
still be processed.

There's still friction. For one, the reverse isn't true: when the page is
waiting for events, CDP events aren't going to be processed. But the page.wait
already have some sensitivity to this (e.g. the page.request_intercepted flag).
Also, when CDP waits, while we will process network events, page timeouts are
still not processed. Because of both these remaining issues, we still need to
jump between the two loops - but being able to block on CDP (even for a short
time) WITHOUT stopping the page's network I/O, should reduce some latency.
2025-08-25 17:27:28 +08:00
Karl Seguin
087e42a641 Normalize CDP response headers
chromedb doesn't support duplicate header names. Although servers _will_ send
this (e.g. Cache-Control: public\r\nCache-Control: max-age=60\r\n), Chrome
seems to join them with a "\n". So we do the same.

A note on curl_easy_nextheader, which this code ultimately uses to iterate
and collect the headers. The documentation says:

    Applications must copy the data if they want it to survive subsequent API
    calls or the life-time of the easy handle.

As-is, I'd understand this to mean that a given header name/value is only
valid until any API call, including another call to curl_easy_nextheader. So,
from this comment, we _should_ be duping the name/value. But we don't. Why?
Because, despite the note in the documentation, this doesn't appear to be how
it actually works, nor does it really make sense. If it's just a linked list,
there's no reason curl_easy_nextheader should invalidate previous results. I'm
guessing this is just a general lack of guarantee libcurl is willing to make re
lifetimes.

https://github.com/lightpanda-io/browser/issues/966
2025-08-25 09:25:15 +08:00
Karl Seguin
cd33e9ad0e Implement Network.getResponseBody
Add response_data event, CDP now captures the full body so that it can respond
to the Network.getResponseBody. This isn't memory efficient, but I don't see
another way to do it. At least this way, it's only capturing/storing every
response body when (a) CDP is used and (b) Network.enabled is called. That is,
as opposed to baking this into Http/Client.zig, which would force the memory
consumption for all use-cases.

There's arguably some optimizations we could make for XHR requests, which also
dupe/own the response. As of now, the response is dupe'd separately for CDP
and XHR.
2025-08-21 10:33:53 +08:00
Karl Seguin
7cc9521cbb Merge pull request #958 from lightpanda-io/http_request_done_notification
Emits a http_request_done internal notification.
2025-08-21 09:23:41 +08:00
Karl Seguin
6b001c50a4 Emits a http_request_done internal notification.
With networking enabled, CDP listens to this event and emits a
`Network.loadingFinished` event. This is event is used by puppeteer to know that
details about the response (i.e. the body) can be queries.

Added dummy handling for the Network.getResponseBody message. Returns an
empty body. Needed because we emit the loadingFinished event which signals
to drivers that they can ask for the body.
2025-08-20 19:32:19 +08:00
Karl Seguin
00c11d9bd4 Merge pull request #956 from lightpanda-io/typo-fix
typo fix
2025-08-20 17:06:16 +08:00
Pierre Tachoire
ed99acebfe typo fix 2025-08-20 09:25:47 +02:00
Karl Seguin
16c85c5b8a Use Transfer.arena in a few more places, correctly set is_navigation on redirect
Following up to Request Interception PR (1) and Cookie Redirect PR (2) which
both introduced features that were useful to the other. This PR closes that
loop.

(1) https://github.com/lightpanda-io/browser/pull/946
(2) https://github.com/lightpanda-io/browser/pull/948
2025-08-20 11:39:38 +08:00
Karl Seguin
f5ec74252d Add fulfillRequest and more complete continueRequest 2025-08-18 18:29:10 +08:00
Karl Seguin
211012d367 move intercept_state and extra_headers from CDP instance to BrowserContext 2025-08-18 13:23:17 +08:00
Karl Seguin
c1319d1f27 add proper resourceType 2025-08-18 12:42:18 +08:00
Karl Seguin
01223601f2 Reduce allocations made during request interception
Stream (to json) the Transfer as a request and response object in the various
network interception-related events (e.g. Network.responseReceived).

Add a page.request_intercepted boolean flag for CDP to signal the page that
requests have been intercepted, allowing Page.wait to prioritize intercept
handling (or, at least, not block it).
2025-08-15 14:01:57 +08:00
Karl Seguin
96b10f4b85 Optimize Network.responseReceived
Add a header iterator to the transfer. This removes the need for NetworkState,
duping header name/values, and the http_header_received event.
2025-08-14 15:50:56 +08:00
Karl Seguin
5100e06f38 fix header done callback 2025-08-14 14:51:02 +08:00
sjorsdonkers
7d05712f40 setExtraHTTPHeaders 2025-08-13 14:54:59 +02:00
sjorsdonkers
c0106a238b http_headers_done_receiving 2025-08-13 14:29:23 +02:00
Karl Seguin
ca9e850ac7 Create Client.Transfer earlier.
On client.request(req) we now immediately wrap the request into a Transfer. This
results in less copying of the Request object. It also makes the transfer.uri
available, so CDP no longer needs to std.Uri(request.url) anymore.

The main advantage is that it's easier to manage resources. There was a use-
after free before due to the sensitive nature of the tranfer's lifetime. There
were also corner cases where some resources might not be freed. This is
hopefully fixed with the lifetime of Transfer being extended.
2025-08-13 18:05:00 +08:00
sjorsdonkers
a49154acf4 http_request_fail 2025-08-12 15:20:48 +02:00
sjorsdonkers
03694b54f0 3# This is a combination of 3 commits.
intercept continue and abort

feedback

First version of headers, no cookies yet
2025-08-12 13:49:20 +02:00
Karl Seguin
c96fb3c2f2 support CDP proxy override 2025-08-11 21:37:03 +08:00
Karl Seguin
3555680335 Working navigation events (clicks, form submission) 2025-08-11 21:37:01 +08:00
Karl Seguin
f65a39a3e3 Re-enable telemetry
Start work on supporting navigation events (clicks, form submission).
2025-08-11 21:37:00 +08:00
Karl Seguin
54ab1326e5 Switch XHR to new http client
get puppeteer/cdp.js working again

make test are all passing
2025-08-11 21:37:00 +08:00
Karl Seguin
b0fe5d60ab Initial work on integrating libcurl and making all http nonblocking 2025-08-11 21:36:56 +08:00
Pierre Tachoire
7aff90aec7 cdp: add comment for CDP_USER_AGENT 2025-08-04 14:40:44 +02:00
sjorsdonkers
478f3a5308 simplify statusText
Some checks failed
e2e-test / zig build release (push) Has been cancelled
zig-test / zig build dev (push) Has been cancelled
zig-test / zig test (push) Has been cancelled
e2e-test / demo-scripts (push) Has been cancelled
e2e-test / cdp-and-hyperfine-bench (push) Has been cancelled
e2e-test / perf-fmt (push) Has been cancelled
zig-test / browser fetch (push) Has been cancelled
zig-test / perf-fmt (push) Has been cancelled
nightly build / build-linux-x86_64 (push) Has been cancelled
nightly build / build-linux-aarch64 (push) Has been cancelled
nightly build / build-macos-aarch64 (push) Has been cancelled
nightly build / build-macos-x86_64 (push) Has been cancelled
wpt / web platform tests json output (push) Has been cancelled
wpt / perf-fmt (push) Has been cancelled
2025-07-29 09:53:54 +02:00
sjorsdonkers
b98edf3d76 CDP response statusText 2025-07-29 09:53:54 +02:00
Karl Seguin
eef5f3fec2 support null params to CDP DOM.getDocument 2025-07-17 19:05:17 +08:00
Karl Seguin
f199816fcd support depth parameter for DOM.getDocument 2025-07-17 14:17:33 +08:00
Karl Seguin
5e74e17b41 Merge pull request #888 from lightpanda-io/cdp_dom_requestChildNodes
Some checks failed
e2e-test / zig build release (push) Has been cancelled
e2e-test / demo-scripts (push) Has been cancelled
e2e-test / cdp-and-hyperfine-bench (push) Has been cancelled
e2e-test / perf-fmt (push) Has been cancelled
zig-test / zig build dev (push) Has been cancelled
zig-test / browser fetch (push) Has been cancelled
zig-test / zig test (push) Has been cancelled
zig-test / perf-fmt (push) Has been cancelled
Add support for CDP's DOM.requestChildNodes
2025-07-17 10:48:24 +08:00
Karl Seguin
98b041e84a requestChildNode cannot have a depth of 0 2025-07-17 10:36:20 +08:00
Karl Seguin
1a05fe6ae1 Merge pull request #887 from lightpanda-io/go_rod
Some checks failed
e2e-test / zig build release (push) Has been cancelled
e2e-test / demo-scripts (push) Has been cancelled
e2e-test / cdp-and-hyperfine-bench (push) Has been cancelled
e2e-test / perf-fmt (push) Has been cancelled
zig-test / zig build dev (push) Has been cancelled
zig-test / browser fetch (push) Has been cancelled
zig-test / zig test (push) Has been cancelled
zig-test / perf-fmt (push) Has been cancelled
nightly build / build-linux-x86_64 (push) Has been cancelled
nightly build / build-linux-aarch64 (push) Has been cancelled
nightly build / build-macos-aarch64 (push) Has been cancelled
nightly build / build-macos-x86_64 (push) Has been cancelled
wpt / web platform tests json output (push) Has been cancelled
wpt / perf-fmt (push) Has been cancelled
Noop CDP methods that go-rod requires
2025-07-16 20:01:03 +08:00
sjorsdonkers
16fcbf66ee http_proxy_before ?? comment 2025-07-16 11:20:00 +02:00
Karl Seguin
09ca0e6ef0 Add support for CDP's DOM.requestChildNodes
https://github.com/lightpanda-io/browser/issues/866
2025-07-14 15:13:01 +08:00
Karl Seguin
fae2b5acfa Noop CDP methods that go-rod requires
go-rod appears to stop processing when it receives an error, such as
UnknownMethod. Added placeholder handlers for Network.setUserAgentOverride and
Page.stopLoading.

Setting a custom user agent is something still being discussed, so no-oping it
seems reasonable. And, due to the currently synchronous nature of the initial
page load, no-oping stopLoading also seems reasonable.

https://github.com/lightpanda-io/browser/issues/867
2025-07-14 11:21:02 +08:00
Karl Seguin
1602932d72 Add a "pre" polyfill
This is always run, but only the full webcomponents polyfill, it's very
small and isn't intrusive. This introduces a layer of indirection so that,
if the full polyfill is loaded, its monkeypatched constructor will be called
2025-07-12 19:49:19 +08:00
Pierre Tachoire
2cdc9e9f5f cdp: use a polyfill loader per isolate 2025-07-07 16:31:54 -07:00
Pierre Tachoire
941dace7f9 enable conditionnal loading for polyfill 2025-07-07 16:31:53 -07:00
Karl Seguin
0b846b15b1 Merge pull request #789 from lightpanda-io/browsercontext-proxyServer
browser context proxyServer
2025-06-19 10:22:17 +08:00
sjorsdonkers
60f4eab759 handle no params 2025-06-18 10:07:37 +02:00
sjorsdonkers
d7656ea985 expires dashes and f64 2025-06-18 10:07:37 +02:00
sjorsdonkers
e402998577 JS may not set/get HttpOnly cookies 2025-06-18 10:07:37 +02:00
sjorsdonkers
073f75efa3 CDP Network cookie tests 2025-06-18 10:07:37 +02:00
sjorsdonkers
da414f7eb3 CDP.Storage cookies tests 2025-06-18 10:07:37 +02:00
sjorsdonkers
270b89830a Cleaning up crumbles 2025-06-18 10:07:37 +02:00
sjorsdonkers
74ce7ca416 refactor path / domain parsing 2025-06-18 10:07:37 +02:00
sjorsdonkers
3f4338cb51 wip 2025-06-18 10:07:37 +02:00
sjorsdonkers
30ee41fd0e Network.getCookies 2025-06-18 10:07:37 +02:00
sjorsdonkers
4965fec55c storage cookies 2025-06-18 10:07:37 +02:00
sjorsdonkers
18dff8455c lower case domain 2025-06-18 10:07:37 +02:00