Commit Graph

223 Commits

Author SHA1 Message Date
Karl Seguin
523efbd85a Fix a few issues in Client
Most significantly, if removing from the multi fails, the connection
is added to a "dirty" list for the removal to be retried later. Looking at
the curl source code, remove fails on a recursive call, and we've struggled with
recursive calls before, so I _think_ this might be happening (it fails in other
cases, but I suspect if it _is_ happening, it's for this reason). The retry
happens _after_ `perform`, so it cannot fail for due to recursiveness. If it
fails at this point, we @panic. This is harsh, but it isn't easily recoverable
and before putting effort into it, I'd like to know that it's actually happening.

Fix potential use of undefined when a 401-407 request is received, but no
'WWW-Authenticate' or 'Proxy-Authenticate' header is received.

Don't call `curl_multi_remove_handle` on an easy that hasn't been added yet do
to error. Specifically, if `makeRequest` fails during setup, transfer_conn is
nulled so that `transfer.deinit()` doesn't try to remove the connection. And the
conn is removed from the `in_use` queue and made `available` again.

On Abort, if getting the private fails (extremely unlikely), we now still try
to remove the connection from the multi.

Added a few more fields to the famous "ScriptManager.Header recall" assertion.
2026-03-03 18:02:06 +08:00
Karl Seguin
1f81b6ddc4 Allow frame-specific HTTP abort
Needed for frame navigation. Implemented using some ugly comptime to avoid
duplication and avoid an runtime frame check when doing a full abort.
2026-03-02 18:00:55 +08:00
Karl Seguin
20fbfc8544 Merge pull request #1689 from lightpanda-io/protect_xhr_abort_during_callback
Protect against transfer.abort() being called during callback
2026-03-02 17:42:07 +08:00
Karl Seguin
328c681a8f Add transfer-specific "performing" flag
In the previous commits, two separte crash resolution conspired to introduce
1 tick delay in request handling.

When we're in a libcurl perform, we can't re-enter libcurl. So we added a
check before processing new requests to make sure we weren't "performing" and,
if we were, we'd queue the request (hence the 1 tick delay).

But for another issue, we set the same "performing" check when manually
triggering callbacks. This extended the situations where the above check fired
thus causing the 1-tick delay to happen under more (and even common) situation.

This commit improves this - instead of relying on the global "performing" check
when processing 1 transfer explicitly, we now have a per-transfer performing
check. This prevents the transfer from being deinitialized during a callback
but does not block requests from being started immediately.
2026-03-02 17:29:47 +08:00
Karl Seguin
10ad5d763e Rename page.id to page._frame_id
This field was recently added and is used to generate correct frameIds in CDP
messages. They remain the same during a navigation event, so calling them
page.id might cause surprises since navigation events create new pages, but
retain the original id. Hence, frame_id is more accurate and hopefully less
surprising.

(This is a small cleanup prior to doing some iframe navigation work).
2026-03-02 16:21:29 +08:00
Karl Seguin
b104c3bfe8 Don't start request during callback
Fixes a separate but similar issue to
https://github.com/lightpanda-io/browser/pull/1689

Specifically, it prevents starting a request from within a libcurl handler, thus
avoiding an illegal recursive call.

(This commit also removes the failed function call debug logging for
DOMExceptions, as these aren't particularly abnormal / log-worthy)
2026-03-02 12:04:02 +08:00
Karl Seguin
82e3f126ff Protect against transfer.abort() being called during callback
This was already handled in most cases, but not for a body-less response. It's
safe to call transfer.abort() during a callback, so long as the performing flag
is set to true. This was set during the normal libcurl callbacks, but for a
body-less response, we manually invoke the header_done_callback and were not
setting the performing flag.
2026-03-02 11:44:42 +08:00
Nikolay Govorov
4c26161728 Move curl C API to type-safe wrapper 2026-02-25 23:29:54 +00:00
Nikolay Govorov
e1695a0874 Strict visibility for Net functions 2026-02-25 05:58:08 +00:00
Nikolay Govorov
3e2a4d8053 Move curl_multi to Net layer 2026-02-25 05:31:28 +00:00
Nikolay Govorov
29982e2caf Move all curl_easy ops to Connection 2026-02-25 05:31:24 +00:00
Nikolay Govorov
5fea1df42b Move Net staff to clean network module 2026-02-25 05:31:19 +00:00
Karl Seguin
6e6082119f Remove session.transfer_arena
This no longer works with frames. Multiple frames could have a scheduled
navigation, so a single arena no longer has a clear lifecycle. Instead an arena
from the pool is used per navigation event, thus the queued_navigation is self-
contained.

This required having libcurl copy the body. Unfortunate. Currently we free the
arena as soon as the navigation begins. This is clean. But it means the body is
immediately freed (thus we need libcurl to copy it). As an alternative, each
page could maintain an optional transfer_arena, which it could free on
httpDone/Error.
2026-02-19 23:47:33 +08:00
Karl Seguin
e2a1ce623c Rework CDP frameIds (and loaderIds and requestIds and interceptorIds)
Our BrowsingContext currently supports 1 target. So we have a per-BC target_id.
Previously, our target had 1 "frame" - our page. So we often treated the
targetId as the frameId. But to work with frames, we need page-specific
frameIds and loaderIds.

This tries to clean up our ids (a little). frameIds are now ids derived from
a new incrementing page.id. This page.id has to be passed around (via http
Requests and through notifications) in order to properly generate messages with
a frameId.
2026-02-19 13:01:41 +08:00
Nikolay Govorov
9296c10ca4 Use per-cdp connection HttpClient 2026-02-18 09:22:26 +00:00
Karl Seguin
4f6868728d Detach attached nodes on appendBeforeSibling callback
html5ever generally makes guarantees about nodes being parentless when
appending, but we've already seen 1 case where appendCallback receives a
connected node.

We're now seeing something in appendBeforeSiblingCallback, but we have a clearer
picture of how this is happening. In this case, it's via custom element
upgrading and the custom element constructor has already placed the node in
the document.

It's worth pointing, html5ever just has an opaque reference to our node. While
it guarantees that it will give us parent-less nodes, it doesn't actually know
anything about our nodes, or our node._parent. The guarantee is only from its
own point of view. There's nothing stopping us from giving a node a default
parent as soon as html5ever asks us to create a new node, in which case, the
node _will_ have a parent.
2026-02-18 10:52:51 +08:00
Karl Seguin
45eb59a5aa Support HTTP headers which are \n terminated (as opposed to \r\n).
Looks like curl will accept these as valid headers, and won't normalize the
header, so we have to deal with either a 2-byte or 1-byte terminated header
2026-02-17 18:55:50 +08:00
Karl Seguin
4f81cb9333 Add more granular assertions
Trying to see how the "ScriptManager.Header buffer" assertion is failing. Either
`headerCallback` is being called multiple times, or the script is corrupt. By
adding a similar assertion in various places, we can hopefully narrow (a) what's
going on and (b) what code is involved.

Also, switched the BufferPool from DoublyLinkedList to SinglyLinkedList. Was
just reviewing this code (to see if the buffer could possibly become corrupt)
and realized this could be switched.
2026-02-14 20:01:09 +08:00
Karl Seguin
6037521c49 Default behavior for input click (radio / checkbox).
This wasn't 100% intuitive to me. At the start of the event, the input is
immediately toggled. But at any point during dispatching, the default behavior
can be suppressed. So the state of the input's check during dispatching captures
the "intent" of the click. But it's possible for one listener to see that
input.checked == true even though, by the end of dispatching, input.checked ==
false because some other listener called preventDefault().

To support this, we need to capture the "current" state so that, if we need to
rollback, we can. For radio buttons, this "current" state includes capturing
the currently checked radio (if any).
2026-02-13 11:06:46 +08:00
Karl Seguin
0c89dca261 When _not_ in a libcurl callback, deinit the transfer normally 2026-02-12 18:29:35 +08:00
Karl Seguin
ed802c0404 Remove potential recursive abort call in curl
Curl doesn't like recursive calls. For example, you can't call
curl_multi_remove_handle from within a dataCallback.

This specifically means that, as-is, transfer.abort() calls aren't safe to be
called during a libcurl callback. Consider this code:

```
req.open('GET', 'http://127.0.0.1:9582/xhr');
req.onreadystatechange = (e) => {
  req.abort();
}
req.send();
```

onreadystatechange is triggered by network events, i.e. it executes in libcurl
callback. Thus, the above code fails to truly "abort" the request with
`curl_multi_remove_handle` error, saying it's a recursive call.

To solve this, transfer.abort() now sets an `aborted = true` flag. Callbacks can
now use this flag to signal to libcurl to stop the transfer.

A test was added which reproduced this issue, but this comes from:
https://github.com/lightpanda-io/browser/issues/1527  which I wasn't able to
reliably reproduce. I did see it happen regularly, just not always. It seems
like this commit fixes that issue.
2026-02-12 11:29:47 +08:00
Muki Kiboigo
f02a37d3f0 properly handle failed parsing on robots 2026-02-10 20:09:32 -08:00
Karl Seguin
70ae6b8d72 Merge pull request #1407 from lightpanda-io/robots
Support for `robots.txt`
2026-02-10 09:51:32 +08:00
Muki Kiboigo
e1850440b0 shutdown queued req on robots shutdown 2026-02-09 15:24:35 -08:00
Pierre Tachoire
b9f8ce5729 http: allow PROPOFIND http method
PROPOFIND is used by webdav to retrieve resource's properties.
http://www.webdav.org/specs/rfc4918.html#METHOD_PROPFIND

For example, Nextcloud public sharing uses PROPOFIND to list shared resources.
2026-02-09 15:16:28 +01:00
Muki Kiboigo
65c9b2a5f7 add robotsShutdownCallback 2026-02-09 05:51:42 -08:00
Muki Kiboigo
46c73a05a9 panic instead of unreachable on robots callbacks 2026-02-09 05:35:32 -08:00
Karl Seguin
a6cd019118 Add http_max_response_size
This adds a --http_max_response_size argument to the serve and fetch command
which is enforced by the HTTP client. This defaults to null, no limit.

As-is, the ScriptManager allocates a buffer based on Content-Length. Without
setting this flag, a server could simply reply with Content-Length: 99999999999
9999999999  to cause an OOM. This new flag is checked both once we have the
header if there's a content-length, and when reading the body.

Also requested in https://github.com/lightpanda-io/browser/issues/415
2026-02-09 13:16:18 +08:00
Karl Seguin
cecdf0d511 Add support for XHR's withCredentials
XHR should only send and receive cookies for same-origin requests or if
withCredentials is true.
2026-02-07 16:16:10 +08:00
Karl Seguin
2eab4b84c9 Rename all ArrayListUnmanaged -> ArrayList
ArrayListAlignedUnmanaged has been deprecated for a while, and I occasionally
replace them, but doing one complete pass gets it done once and for all.
2026-02-05 11:49:15 +08:00
Muki Kiboigo
a7095d7dec pass robot store into Http init 2026-02-04 12:23:42 -08:00
Muki Kiboigo
e620c28a1c stop leaking robots_url when in robot queue 2026-02-04 12:19:18 -08:00
Muki Kiboigo
29ee7d41f5 queue requests to run after robots is fetched 2026-02-04 12:19:17 -08:00
Muki Kiboigo
b6af5884b1 use RobotsRequestContext deinit 2026-02-04 12:16:36 -08:00
Muki Kiboigo
e4f250435d include robots url in debug log 2026-02-04 12:16:36 -08:00
Muki Kiboigo
1a246f2e38 robots in the actual http client 2026-02-04 12:15:49 -08:00
Karl Seguin
017d4e792b Fix [I hope] blocking auth interception
On a blocking request that requires authentication, we now handle the two cases
correctly:
1 - if the request is aborted, we don't continue processing (if we did, that
    would result in (a) transfer.deinit being called twice and (b) the callbacks
    being called twice

2 - if the request is "continue", we queue the transfer to be re-issued, as
    opposed to just processing it as-is. We have to queue it because we're
    currently inside a process loop and it [probaby] isn't safe to re-enter it.
    By using the queue, we wait until the next call to `tick` to re-issue the
    request.
2026-02-04 18:39:23 +08:00
Nikolay Govorov
a72782f91e Eliminates duplication in the creation of HTTP headers 2026-02-04 09:08:57 +00:00
Nikolay Govorov
f71aa1cad2 Centralizes configuration, eliminates unnecessary copying of config 2026-02-04 07:57:59 +00:00
Nikolay Govorov
fd8c488dbd Move Notification from App to BrowserContext 2026-02-04 07:33:45 +00:00
Karl Seguin
451dd0fd64 Merge pull request #1411 from lightpanda-io/more_SSO
more small strings (string.String)
2026-01-27 13:18:56 +08:00
Karl Seguin
a11ae912b4 Add finalizer to Response and use an pooled arena
Unlike XHR, Response is a bit more complicated as it can exist in Zig code
without ever being given to v8. So we need to track this handoff to know who is
responsible for freeing it (zig code, on error/shutdown) or v8 code after
promise resolution.

This also cleansup a bad merge for the XHR finalizer and adds cleaning up the
`XMLHttpRequestEventTarget` callbacks.
2026-01-26 19:18:32 +08:00
Karl Seguin
54c45a0cfd Make js.Bridge aware of string.String for input parameters
Avoids having to allocate small strings when going from v8 -> Zig. Also
added a discriminatory type, string.Global which uses the arena, rather than
the call_arena, if an allocation _is_ necessary. (This is similar to a feature
we had before, but was lost in zigdom). Strings from v8 that need to be
persisted, can be allocated directly v8 -> arena, rather than v8 -> call_arena
-> arena.

I think there are a lot of places where we should use string.String - where
strings are expected to be short (e.g. attribute names). But started with just
document.querySelector and querySelectorAll.
2026-01-26 07:52:27 +08:00
Karl Seguin
9a57c2a0d4 fix merge 2026-01-24 08:28:26 +08:00
Karl Seguin
fc64abee8f Add finalizer mode
When a type is finalized by V8, it's because it's fallen out of scope. When a
type is finalized by Zig, it's because the Context is being shutdown.

Those are two different environments and might require distinct cleanup logic.
Specifically, a zig-initiated finalization needs to consider that the page and
context are being shutdown. It isn't necessarily safe to execute JavaScript at
this point, and thus, not safe to execute a callback (on_error, on_abort,
ready_state_change, ...).
2026-01-24 07:59:43 +08:00
Karl Seguin
97f9c2991b on XHR shutdown, use terminate to prevent any client callbacks into the XHR 2026-01-24 07:59:43 +08:00
Karl Seguin
f6397e2731 Handle scripts that don't return a 200 status code
This was already being handled for async scripts, but for sync scripts, we'd
log the error then proceed to try and execute the body (which would be some
error message).

This allows the header_callback to return a boolean to indicate whether or not
the http client should continue to process the request or abort it.
2026-01-22 14:15:00 +08:00
Karl Seguin
a6e7ecd9e5 Move more asserts to custom asserter.
Deciding what should be an lp.assert, vs an std.debug.assert, vs a debug-only
assert is a little arbitrary.

debug-only asserts, guarded with an `if (comptime IS_DEBUG)` obviously avoid the
check in release and thus have a performance advantage. We also use them at
library boundaries. If libcurl says it will always emit a header line with a
trailing \r\n, is that really a check we need to do in production? I don't think
so. First, that code path is checked _a lot_ in debug. Second, it feels a bit
like we're testing libcurl (in production!)..why? A debug-only assertion should
be good enough to catch any changes in libcurl.
2026-01-19 09:12:16 +08:00
Karl Seguin
087086c308 remove some unused imports 2025-12-26 12:40:20 +08:00
Karl Seguin
296fa2a2f4 Update src/http/Client.zig
Co-authored-by: Pierre Tachoire <pierre@lightpanda.io>
2025-12-24 16:37:16 +08:00