Commit Graph

118 Commits

Author SHA1 Message Date
Karl Seguin
e23ef4b0be Remove custom-arenas, use ArenaPool instead
This removes the browser-specific arenas (session, transfer, page, call) in
favor of the arena pool.

This is a bit of a win-lose commit. It exists as (the last?) step before I can
really start working on frames. Frames will require their own "page" and "call"
arenas, so there isn't just 1 per browser now, but rather N, where N is the
number of frames + 1 page. This change was already done for Contexts when
ExecutionWorld was removed, and the idea is the same: making these units more
self contained so to support cases where we break out of the "1" model we
currently have (1 browser, 1 session, 1 page, 1 context, ...).

But it's a bit of a step backwards because the ArenaPool is dumb and just resets
everything to a single hard-coded (for now) value: 16KB. But in my mind, an
arena that's used for 1 thing (e.g. the page or call arenas) is more likely to
be well-sized for that specific role in the future, even on a different
page/navigate.

I think ultimately, we'll move to an ArenaPool that has different levels, e.g.
acquire() and acquireLarge() which can reset to different sizes, so that a page
arena can use acquireLarge() and retain a larger amount of memory between use.
2026-02-13 12:34:27 +08:00
Karl Seguin
0c89dca261 When _not_ in a libcurl callback, deinit the transfer normally 2026-02-12 18:29:35 +08:00
Karl Seguin
ed802c0404 Remove potential recursive abort call in curl
Curl doesn't like recursive calls. For example, you can't call
curl_multi_remove_handle from within a dataCallback.

This specifically means that, as-is, transfer.abort() calls aren't safe to be
called during a libcurl callback. Consider this code:

```
req.open('GET', 'http://127.0.0.1:9582/xhr');
req.onreadystatechange = (e) => {
  req.abort();
}
req.send();
```

onreadystatechange is triggered by network events, i.e. it executes in libcurl
callback. Thus, the above code fails to truly "abort" the request with
`curl_multi_remove_handle` error, saying it's a recursive call.

To solve this, transfer.abort() now sets an `aborted = true` flag. Callbacks can
now use this flag to signal to libcurl to stop the transfer.

A test was added which reproduced this issue, but this comes from:
https://github.com/lightpanda-io/browser/issues/1527  which I wasn't able to
reliably reproduce. I did see it happen regularly, just not always. It seems
like this commit fixes that issue.
2026-02-12 11:29:47 +08:00
Muki Kiboigo
f02a37d3f0 properly handle failed parsing on robots 2026-02-10 20:09:32 -08:00
Karl Seguin
70ae6b8d72 Merge pull request #1407 from lightpanda-io/robots
Support for `robots.txt`
2026-02-10 09:51:32 +08:00
Muki Kiboigo
e1850440b0 shutdown queued req on robots shutdown 2026-02-09 15:24:35 -08:00
Muki Kiboigo
65c9b2a5f7 add robotsShutdownCallback 2026-02-09 05:51:42 -08:00
Muki Kiboigo
46c73a05a9 panic instead of unreachable on robots callbacks 2026-02-09 05:35:32 -08:00
Karl Seguin
a6cd019118 Add http_max_response_size
This adds a --http_max_response_size argument to the serve and fetch command
which is enforced by the HTTP client. This defaults to null, no limit.

As-is, the ScriptManager allocates a buffer based on Content-Length. Without
setting this flag, a server could simply reply with Content-Length: 99999999999
9999999999  to cause an OOM. This new flag is checked both once we have the
header if there's a content-length, and when reading the body.

Also requested in https://github.com/lightpanda-io/browser/issues/415
2026-02-09 13:16:18 +08:00
Karl Seguin
cecdf0d511 Add support for XHR's withCredentials
XHR should only send and receive cookies for same-origin requests or if
withCredentials is true.
2026-02-07 16:16:10 +08:00
Karl Seguin
2eab4b84c9 Rename all ArrayListUnmanaged -> ArrayList
ArrayListAlignedUnmanaged has been deprecated for a while, and I occasionally
replace them, but doing one complete pass gets it done once and for all.
2026-02-05 11:49:15 +08:00
Muki Kiboigo
a7095d7dec pass robot store into Http init 2026-02-04 12:23:42 -08:00
Muki Kiboigo
e620c28a1c stop leaking robots_url when in robot queue 2026-02-04 12:19:18 -08:00
Muki Kiboigo
29ee7d41f5 queue requests to run after robots is fetched 2026-02-04 12:19:17 -08:00
Muki Kiboigo
b6af5884b1 use RobotsRequestContext deinit 2026-02-04 12:16:36 -08:00
Muki Kiboigo
e4f250435d include robots url in debug log 2026-02-04 12:16:36 -08:00
Muki Kiboigo
1a246f2e38 robots in the actual http client 2026-02-04 12:15:49 -08:00
Karl Seguin
017d4e792b Fix [I hope] blocking auth interception
On a blocking request that requires authentication, we now handle the two cases
correctly:
1 - if the request is aborted, we don't continue processing (if we did, that
    would result in (a) transfer.deinit being called twice and (b) the callbacks
    being called twice

2 - if the request is "continue", we queue the transfer to be re-issued, as
    opposed to just processing it as-is. We have to queue it because we're
    currently inside a process loop and it [probaby] isn't safe to re-enter it.
    By using the queue, we wait until the next call to `tick` to re-issue the
    request.
2026-02-04 18:39:23 +08:00
Nikolay Govorov
a72782f91e Eliminates duplication in the creation of HTTP headers 2026-02-04 09:08:57 +00:00
Nikolay Govorov
f71aa1cad2 Centralizes configuration, eliminates unnecessary copying of config 2026-02-04 07:57:59 +00:00
Nikolay Govorov
fd8c488dbd Move Notification from App to BrowserContext 2026-02-04 07:33:45 +00:00
Karl Seguin
a11ae912b4 Add finalizer to Response and use an pooled arena
Unlike XHR, Response is a bit more complicated as it can exist in Zig code
without ever being given to v8. So we need to track this handoff to know who is
responsible for freeing it (zig code, on error/shutdown) or v8 code after
promise resolution.

This also cleansup a bad merge for the XHR finalizer and adds cleaning up the
`XMLHttpRequestEventTarget` callbacks.
2026-01-26 19:18:32 +08:00
Karl Seguin
9a57c2a0d4 fix merge 2026-01-24 08:28:26 +08:00
Karl Seguin
fc64abee8f Add finalizer mode
When a type is finalized by V8, it's because it's fallen out of scope. When a
type is finalized by Zig, it's because the Context is being shutdown.

Those are two different environments and might require distinct cleanup logic.
Specifically, a zig-initiated finalization needs to consider that the page and
context are being shutdown. It isn't necessarily safe to execute JavaScript at
this point, and thus, not safe to execute a callback (on_error, on_abort,
ready_state_change, ...).
2026-01-24 07:59:43 +08:00
Karl Seguin
97f9c2991b on XHR shutdown, use terminate to prevent any client callbacks into the XHR 2026-01-24 07:59:43 +08:00
Karl Seguin
f6397e2731 Handle scripts that don't return a 200 status code
This was already being handled for async scripts, but for sync scripts, we'd
log the error then proceed to try and execute the body (which would be some
error message).

This allows the header_callback to return a boolean to indicate whether or not
the http client should continue to process the request or abort it.
2026-01-22 14:15:00 +08:00
Karl Seguin
a6e7ecd9e5 Move more asserts to custom asserter.
Deciding what should be an lp.assert, vs an std.debug.assert, vs a debug-only
assert is a little arbitrary.

debug-only asserts, guarded with an `if (comptime IS_DEBUG)` obviously avoid the
check in release and thus have a performance advantage. We also use them at
library boundaries. If libcurl says it will always emit a header line with a
trailing \r\n, is that really a check we need to do in production? I don't think
so. First, that code path is checked _a lot_ in debug. Second, it feels a bit
like we're testing libcurl (in production!)..why? A debug-only assertion should
be good enough to catch any changes in libcurl.
2026-01-19 09:12:16 +08:00
Karl Seguin
296fa2a2f4 Update src/http/Client.zig
Co-authored-by: Pierre Tachoire <pierre@lightpanda.io>
2025-12-24 16:37:16 +08:00
Karl Seguin
df4e5d859f Enable blocking auth request interception 2025-12-24 12:19:11 +08:00
Karl Seguin
67875036c5 Rework request interception for Zigdom
Zigdom broke request interception. It isn't zigdom specifically, but in zigdom
we properly block the parser when executing a normal (not async, not defer)
script. This does not work well with request interception, because an
intercepted request isn't blocked on HTTP data, it's blocked on a message from
CDP. Generally, neither our Page nor ScriptManager are CDP-aware. And, even if
they were, it would be hard to break out of our parsing and return control to
the CDP server.

To fix this, we expand on the HTTP Client's basic awareness of CDP (via its
extra_socket field). The HTTP client is now able to block until an intercepted
request is continued/aborted/fulfilled. it does this by being able to ask the
CDP client to read/process data.

This does not yet work for intercepted authentication requests.
2025-12-24 11:49:05 +08:00
Karl Seguin
f475aa09e8 backport https://github.com/lightpanda-io/browser/pull/1265 2025-12-19 16:06:25 +08:00
Karl Seguin
8a2641d213 fetch/request/response improvement (legacy) 2025-12-16 17:54:05 +08:00
Pierre Tachoire
6a098665fa http: remove inflight conn check when enable/disable TLS 2025-12-09 10:47:34 +01:00
Pierre Tachoire
53ccefc15c cdp: implement Security.setIgnoreCertificateErrors
ensure no inflight conns is running before set TLS verify
2025-12-09 08:50:58 +01:00
Karl Seguin
121c49e9c3 Remove std.Uri from cookies
Everything now works on a [:0]const u8, with browser/URL.zig for parsing
2025-12-08 16:23:19 +08:00
Karl Seguin
61a1a2564e Fix typos
Encode unicode nonbreaking space
2025-12-05 17:48:49 +08:00
Karl Seguin
b5eceb52fb try safer http cleanup on page deinit 2025-12-02 16:05:57 +08:00
Karl Seguin
d3973172e8 re-enable minimum viable CDP server 2025-10-28 18:56:03 +08:00
Karl Seguin
b047cb6dc1 remove libdom 2025-10-27 22:14:59 +08:00
Karl Seguin
76e8506022 Remove potential processing blocking with CDP
When using CDP, we poll the HTTP clients along with the CDP socket. Because this
polling can be long, we first process any pending message. This can end up
processing _all_ messages, in which case the poll will block for a long time.

This change makes it so that when the initial processing processes 1+ message,
we do not poll, but rather return. This allows the page lifecycle to be
processed normally (and not just blocking on poll, waiting for the CDP client
to send data).
2025-10-09 13:18:47 +08:00
Karl Seguin
418dc6fdc2 Start downloading all synchronous imports ASAP
This changes how non-async module loading works. In general, module loading
is triggered by a v8 callback. We ask it to process a module (a <script type=
module>) and then for every module that it depends on, we get a callback. This
callback expects the nested v8.Module instance, so we need to load it then and
there (as opposed to dynamic imports, where we only have to return a promise).

Previously, we solved this by issuing a blocking HTTP get in each callback. The
HTTP loop was able to continuing downloading already-queued resources, but if
a module depended on 20 nested modules, we'd issue 20 blocking gets one after
the other.

Once a module is compiled, we can ask v8 for a list of its dependent module. We
can them immediately start to download all of those modules. We then evaluate
the original module, which will trigger our callback. At this point, we still
need to block and wait for the response, but we've already started the download
and it's much faster. Sure, for the first module, we might need to wait the same
amount of time, but for the other 19, chances are by the time the callback
executes, we already have it downloaded and ready.
2025-09-26 15:38:50 +08:00
Pierre Tachoire
2d24e3c7f7 Merge pull request #972 from lightpanda-io/fetch
Fetch + ReadableStream
2025-09-18 09:29:05 +02:00
Karl Seguin
26550129ea Add --user_agent_suffix argument
Allows appending a value (separated by a space) to the existing Lightpanda/X.Y
user agent.
2025-09-18 11:28:27 +08:00
Muki Kiboigo
a133a71eb9 proper fetch method and body setting 2025-09-17 08:41:22 -07:00
Pierre Tachoire
e00066466b http: decrement intercepted on auth abortion 2025-09-16 12:18:49 +02:00
Pierre Tachoire
b87a8ba97d http: increment intercepted counter on auth interception 2025-09-16 12:18:49 +02:00
Pierre Tachoire
37fe6a661b Merge pull request #1013 from lightpanda-io/reset_request_method
Reset CURLOPT_CUSTOMREQUEST for each request
2025-09-05 17:43:30 +02:00
Karl Seguin
6600626f4f Reset CURLOPT_CUSTOMREQUEST for each request 2025-09-05 15:45:28 +08:00
Pierre Tachoire
9f040025e7 Merge pull request #1010 from lightpanda-io/update_transfer_uri_on_redirect
Update the transfer.uri on redirect
2025-09-05 08:35:13 +02:00
Karl Seguin
dd22c55d23 migrate to htmlRunne (plus zig fmt) 2025-09-05 13:52:08 +08:00