Commit Graph

84 Commits

Author SHA1 Message Date
Pierre Tachoire
fb6fbffe3f Merge pull request #1169 from lightpanda-io/cdp-security-ignore-cert-err
cdp: implement Security.setIgnoreCertificateErrors
2025-10-21 15:15:51 +02:00
Pierre Tachoire
6915738e02 cdp: ensure no inflight conns is running before set TLS verify 2025-10-21 14:07:59 +02:00
Pierre Tachoire
4f62cc833b http: fix VERIFY_HOST value 2025-10-21 13:47:09 +02:00
Pierre Tachoire
d2065f713f cdp: implement Security.setIgnoreCertificateErrors 2025-10-21 13:44:29 +02:00
Karl Seguin
ca3efb3ad9 correct typos (all in comments) 2025-10-21 16:17:38 +08:00
Karl Seguin
76e8506022 Remove potential processing blocking with CDP
When using CDP, we poll the HTTP clients along with the CDP socket. Because this
polling can be long, we first process any pending message. This can end up
processing _all_ messages, in which case the poll will block for a long time.

With this change, when the initial pass handles one or more messages, we do
not poll, but rather return. This allows the page lifecycle to be processed
normally (rather than just blocking on poll, waiting for the CDP client to
send data).
2025-10-09 13:18:47 +08:00
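The control flow described in 76e8506022 can be sketched roughly as below. This is a toy Python model, not the project's Zig code; `wait_once`, `pending`, `handle`, and `poll` are hypothetical names chosen for illustration.

```python
from collections import deque


def wait_once(pending, handle, poll):
    """Drain queued messages; block in poll() only if nothing was processed."""
    processed = 0
    while pending:
        handle(pending.popleft())
        processed += 1
    if processed > 0:
        # Return right away so the caller can advance the page lifecycle.
        return processed
    # Nothing was pending, so blocking on I/O is reasonable here.
    poll()
    return 0
```

The point of the early return is that a long blocking poll never follows a pass that already did useful work.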
Karl Seguin
418dc6fdc2 Start downloading all synchronous imports ASAP
This changes how non-async module loading works. In general, module loading
is triggered by a v8 callback. We ask it to process a module (a
<script type=module>) and then for every module that it depends on, we get a callback. This
callback expects the nested v8.Module instance, so we need to load it then and
there (as opposed to dynamic imports, where we only have to return a promise).

Previously, we solved this by issuing a blocking HTTP get in each callback. The
HTTP loop was able to continue downloading already-queued resources, but if
a module depended on 20 nested modules, we'd issue 20 blocking gets one after
the other.

Once a module is compiled, we can ask v8 for a list of its dependent modules. We
can then immediately start to download all of those modules. We then evaluate
the original module, which will trigger our callback. At this point, we still
need to block and wait for the response, but we've already started the download
and it's much faster. Sure, for the first module, we might need to wait the same
amount of time, but for the other 19, chances are by the time the callback
executes, we already have it downloaded and ready.
2025-09-26 15:38:50 +08:00
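A minimal sketch of the prefetch idea from 418dc6fdc2, written in Python with a thread pool standing in for the HTTP loop. `ModuleLoader`, `prefetch`, `resolve`, and the `fetch` callable are all hypothetical; the real implementation lives in Zig and talks to v8 and libcurl.

```python
from concurrent.futures import ThreadPoolExecutor


class ModuleLoader:
    """Starts downloads early so the synchronous resolve step rarely waits long."""

    def __init__(self, fetch, max_workers=8):
        self._fetch = fetch
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._inflight = {}

    def prefetch(self, urls):
        # Called right after compiling a module, with its dependency list.
        for url in urls:
            if url not in self._inflight:
                self._inflight[url] = self._pool.submit(self._fetch, url)

    def resolve(self, url):
        # Called from the synchronous callback: still blocks, but the
        # download has usually been running (or finished) by this point.
        self.prefetch([url])
        return self._inflight[url].result()

    def close(self):
        self._pool.shutdown(wait=True)
```

Each URL is requested once; a dependency that was not prefetched simply falls back to the old blocking behavior.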
Pierre Tachoire
2d24e3c7f7 Merge pull request #972 from lightpanda-io/fetch
Fetch + ReadableStream
2025-09-18 09:29:05 +02:00
Karl Seguin
26550129ea Add --user_agent_suffix argument
Allows appending a value (separated by a space) to the existing Lightpanda/X.Y
user agent.
2025-09-18 11:28:27 +08:00
Muki Kiboigo
a133a71eb9 proper fetch method and body setting 2025-09-17 08:41:22 -07:00
Pierre Tachoire
e00066466b http: decrement intercepted on auth abortion 2025-09-16 12:18:49 +02:00
Pierre Tachoire
b87a8ba97d http: increment intercepted counter on auth interception 2025-09-16 12:18:49 +02:00
Pierre Tachoire
37fe6a661b Merge pull request #1013 from lightpanda-io/reset_request_method
Reset CURLOPT_CUSTOMREQUEST for each request
2025-09-05 17:43:30 +02:00
Karl Seguin
6600626f4f Reset CURLOPT_CUSTOMREQUEST for each request 2025-09-05 15:45:28 +08:00
Pierre Tachoire
9f040025e7 Merge pull request #1010 from lightpanda-io/update_transfer_uri_on_redirect
Update the transfer.uri on redirect
2025-09-05 08:35:13 +02:00
Karl Seguin
dd22c55d23 migrate to htmlRunne (plus zig fmt) 2025-09-05 13:52:08 +08:00
Karl Seguin
a6efa9e9b2 Update the transfer.uri on redirect
Ensures that cookies set on the redirect page use the correct host and we don't
incorrectly reject cookies.

https://github.com/lightpanda-io/browser/issues/947
2025-09-05 08:55:36 +08:00
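Why the stale URI matters for a6efa9e9b2 can be shown with a simplified Python model. The `Transfer` class and `cookie_allowed` helper are hypothetical, and the domain check is deliberately naive (no public suffix handling).

```python
from urllib.parse import urljoin, urlsplit


class Transfer:
    def __init__(self, url):
        self.uri = url

    def on_redirect(self, location):
        # Resolve relative Location values against the current URL,
        # then remember the new URL for later cookie checks.
        self.uri = urljoin(self.uri, location)


def cookie_allowed(transfer, cookie_domain):
    """Naive check: the cookie domain must match the host of the current URI."""
    host = (urlsplit(transfer.uri).hostname or "").lower()
    domain = cookie_domain.lstrip(".").lower()
    return host == domain or host.endswith("." + domain)
```

If `on_redirect` did not update `uri`, a cookie set by the redirect target would be checked against the original host and rejected.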
Karl Seguin
5dda86bf4a Emit networkIdle and networkAlmostIdle Page.lifecycleEvent
Most CDP drivers have a mechanism to wait for idle network, or an almost idle
network (sometimes called networkIdle2). These are events the browser must emit.

The page will now emit `networkIdle` when we are reasonably sure there's no more
network activity (this requires some slight changes to request interception,
since, I believe, intercepted requests should be considered).

`networkAlmostIdle` is currently _always_ emitted prior to emitting
`networkIdle`. We should tweak this but I can't, at a glance, think of a great
heuristic for when this should be emitted.
2025-09-04 16:36:29 +08:00
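One way to picture the behavior described in 5dda86bf4a is a pair of counters, sketched here in Python. The class, its method names, and the exact accounting are assumptions for illustration; the commit only states that intercepted requests are considered and that `networkAlmostIdle` is always emitted just before `networkIdle`.

```python
class NetworkIdleTracker:
    """Emits both lifecycle events once no request is active or paused."""

    def __init__(self, emit):
        self._emit = emit
        self.active = 0
        self.intercepted = 0
        self._notified = False

    def request_started(self):
        self.active += 1

    def request_done(self):
        self.active -= 1
        self._check()

    def request_intercepted(self):
        # A paused request still counts as outstanding network activity.
        self.intercepted += 1

    def interception_resolved(self):
        self.intercepted -= 1
        self._check()

    def _check(self):
        if self._notified or self.active > 0 or self.intercepted > 0:
            return
        self._notified = True
        self._emit("networkAlmostIdle")
        self._emit("networkIdle")
```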
Karl Seguin
b6137b03cd Rework page wait again
Further reducing bouncing between page and server for loop polling. If there is
a page, the page polls. If there isn't a page, the server polls. Simpler.
2025-09-03 19:38:01 +08:00
Karl Seguin
2ac9b2088a Always monitor the CDP client socket, even on page.wait 2025-09-03 08:17:13 +08:00
Karl Seguin
de533755e5 fix segfault on abort if there are queued transfers 2025-09-02 21:18:02 +08:00
Karl Seguin
57dc303d90 Make getContentLength work on fulfilled responses 2025-09-01 18:40:50 +08:00
Karl Seguin
2a8e51c2d2 Pre-size the destination buffer when we know the response content length 2025-08-31 20:14:55 +08:00
Karl Seguin
1443f38e5f Zig 0.15.1
Depends on https://github.com/lightpanda-io/zig-v8-fork/pull/89
2025-08-29 10:42:06 +08:00
Pierre Tachoire
b80ee3342c http: set content_type len on fulfill request 2025-08-28 16:28:41 +02:00
Pierre Tachoire
7647ce9e6d Merge pull request #960 from lightpanda-io/auth-challenge
auth required interception
2025-08-27 15:34:51 +02:00
Pierre Tachoire
041e014d68 Merge pull request #970 from lightpanda-io/remove_loop
Remove the loop
2025-08-26 18:17:32 +02:00
Pierre Tachoire
5defb5c442 http: build headers when auth challenge fails 2025-08-26 18:05:45 +02:00
Pierre Tachoire
520a572bb4 http: add reset and tries for transfer 2025-08-26 18:05:45 +02:00
Pierre Tachoire
4c602256da http: remove useless field 2025-08-26 18:05:45 +02:00
Pierre Tachoire
a847a1faae http: replace _forbidden with _auth_challenge struct 2025-08-26 18:05:44 +02:00
Pierre Tachoire
bb381e522c http: add creds into request 2025-08-26 18:05:39 +02:00
Pierre Tachoire
7046e18d7e http: simplify header parsing 2025-08-25 14:18:14 +02:00
Pierre Tachoire
a7516061d0 http: move use_proxy from connection to client 2025-08-25 14:18:14 +02:00
Pierre Tachoire
e61d787ff0 http: move header done callback in its own func
And call it only after the headers are parsed, either from data callback
or end of the request.
2025-08-25 14:18:14 +02:00
Pierre Tachoire
25ad420f85 http: adjust header callback according to review 2025-08-25 14:18:14 +02:00
Pierre Tachoire
e2320ebe66 http: handle proxy's request header callback 2025-08-25 14:18:13 +02:00
Pierre Tachoire
5e78a26e3d http: refacto http header parsing 2025-08-25 14:18:13 +02:00
Pierre Tachoire
159bd06a56 http: add use_proxy bool in connection 2025-08-25 14:18:12 +02:00
Pierre Tachoire
bc7e1e07f4 typo fix 2025-08-25 14:18:08 +02:00
Karl Seguin
0959eea677 Remove the loop
Previously, the IO loop was doing three things:
1 - Managing timeouts (either from scripts or for our own needs)
2 - Handling browser IO events (page/script/xhr)
3 - Handling CDP events (accept, read, write, timeout)

With the libcurl merge, 1 was moved to an in-process scheduler and 2 was moved
to libcurl's own event loop. That means the entire loop code, including
the dependency on tigerbeetle-io, existed for handling a single TCP client.
Not only is that a lot of code, there was also friction between the two loops
(the libcurl one and our IO loop), which would result in latency - while one
loop is waiting for the events, any events on the other loop go un-processed.

This PR removes our IO loop. To accomplish this:

1 - The main accept loop is blocking. This is simpler and works perfectly well,
given we only allow 1 active connection.
2 - The client socket is passed to libcurl - yes, libcurl's loop can take
arbitrary FDs and poll them along with its own.

In addition to having one less dependency, the CDP code is quite a bit simpler,
especially around shutdowns and writes. This also removes _some_ of the latency
caused by the friction between page process and CDP processing. Specifically,
when CDP now blocks for input, http page events (script loading, xhr, ...) will
still be processed.

There's still friction. For one, the reverse isn't true: when the page is
waiting for events, CDP events aren't going to be processed. But the page.wait
already has some sensitivity to this (e.g. the page.request_intercepted flag).
Also, when CDP waits, while we will process network events, page timeouts are
still not processed. Because of both these remaining issues, we still need to
jump between the two loops - but being able to block on CDP (even for a short
time) WITHOUT stopping the page's network I/O, should reduce some latency.
2025-08-25 17:27:28 +08:00
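The "one poll over everything" idea in 0959eea677 can be modeled in Python with the standard `selectors` module. This is only an analogy: libcurl's `curl_multi_poll` accepts extra file descriptors, but whether that is the exact call used here is an assumption, and `make_poller` and `poll_once` are hypothetical helpers.

```python
import selectors
import socket


def make_poller(cdp_sock, http_socks):
    """Register the CDP client socket next to the HTTP sockets."""
    sel = selectors.DefaultSelector()
    sel.register(cdp_sock, selectors.EVENT_READ, "cdp")
    for sock in http_socks:
        sel.register(sock, selectors.EVENT_READ, "http")
    return sel


def poll_once(sel, timeout=0.1):
    """Single blocking wait; returns the labels of all readable sockets."""
    return sorted(key.data for key, _ in sel.select(timeout))
```

Because both kinds of socket sit in the same wait, blocking for CDP input no longer stalls HTTP events, which is the latency improvement the commit describes.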
Karl Seguin
cd33e9ad0e Implement Network.getResponseBody
Add a response_data event. CDP now captures the full body so that it can respond
to Network.getResponseBody. This isn't memory efficient, but I don't see
another way to do it. At least this way, it's only capturing/storing every
response body when (a) CDP is used and (b) Network.enabled is called. That is,
as opposed to baking this into Http/Client.zig, which would force the memory
consumption for all use-cases.

There are arguably some optimizations we could make for XHR requests, which also
dupe/own the response. As of now, the response is dupe'd separately for CDP
and XHR.
2025-08-21 10:33:53 +08:00
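The opt-in capture described in cd33e9ad0e might look like the following Python sketch. `ResponseBodyStore` and its methods are hypothetical; only the `body` and `base64Encoded` result fields come from the CDP `Network.getResponseBody` definition, and always base64-encoding is a simplification.

```python
import base64


class ResponseBodyStore:
    """Buffers response bodies only while network tracking is enabled."""

    def __init__(self):
        self.network_enabled = False
        self._bodies = {}

    def on_response_data(self, request_id, chunk):
        if not self.network_enabled:
            return  # nothing is stored, so no memory cost
        self._bodies.setdefault(request_id, bytearray()).extend(chunk)

    def get_response_body(self, request_id):
        # Raises KeyError for unknown ids; a real handler would return a CDP error.
        body = bytes(self._bodies[request_id])
        return {
            "body": base64.b64encode(body).decode("ascii"),
            "base64Encoded": True,
        }
```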
Karl Seguin
7cc9521cbb Merge pull request #958 from lightpanda-io/http_request_done_notification
Emits a http_request_done internal notification.
2025-08-21 09:23:41 +08:00
Karl Seguin
6b001c50a4 Emits a http_request_done internal notification.
With networking enabled, CDP listens to this event and emits a
`Network.loadingFinished` event. This event is used by puppeteer to know that
details about the response (i.e. the body) can be queried.

Added dummy handling for the Network.getResponseBody message. Returns an
empty body. Needed because we emit the loadingFinished event which signals
to drivers that they can ask for the body.
2025-08-20 19:32:19 +08:00
Karl Seguin
5759c88932 Remove the http/Client.zig header_callback.
The callback which was called on a per-header basis is removed. Only XHR was
using this, and it was created before the HeaderIterator existed (because I
didn't know we could iterate through the response headers in curl after the fact).

The header_done_callback remains, but is now called header_callback (a bit
confusing in the short term).

The only difficulty was with fulfilled requests, which do not have an easy
handle for our HeaderIterator. The existing code would segfault if
transfer.responseHeaderIterator() was called on a fulfilled request.
The HeaderIterator is now a tagged union that abstracts whether the source of
the response header is a curl easy, or just an injected list from the fulfilled
request.
2025-08-20 17:49:37 +08:00
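The tagged-union shape described in 5759c88932 can be approximated in Python with a `(tag, payload)` pair. The tags, the `iter_headers` function, and the `next_header(prev)` callable (a stand-in for walking headers on a curl easy handle) are all assumptions made for this sketch.

```python
def iter_headers(source):
    """Yield (name, value) pairs regardless of where the headers come from."""
    tag, payload = source
    if tag == "list":
        # Fulfilled request: headers were injected as a plain list.
        yield from payload
    elif tag == "curl":
        # Real transfer: ask the handle for the header after `prev`.
        prev = None
        while True:
            prev = payload(prev)
            if prev is None:
                return
            yield prev
    else:
        raise ValueError(f"unknown header source: {tag}")
```

Callers iterate the same way in both cases, so a fulfilled request no longer needs a curl handle to expose its headers.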
Karl Seguin
16c85c5b8a Use Transfer.arena in a few more places, correctly set is_navigation on redirect
Following up to Request Interception PR (1) and Cookie Redirect PR (2) which
both introduced features that were useful to the other. This PR closes that
loop.

(1) https://github.com/lightpanda-io/browser/pull/946
(2) https://github.com/lightpanda-io/browser/pull/948
2025-08-20 11:39:38 +08:00
Karl Seguin
7f47692ad4 Fix compilation error
bad auto merge?
2025-08-20 10:04:15 +08:00
Karl Seguin
af4066da87 Merge pull request #946 from lightpanda-io/request_interception
Request Interception
2025-08-20 07:53:08 +08:00
Pierre Tachoire
f7eee0d461 http: add an arena to Transfer 2025-08-19 11:10:52 +02:00
Pierre Tachoire
39178d8d2b http: remove useless Client.arena 2025-08-19 11:10:25 +02:00