browser

mirror of https://github.com/lightpanda-io/browser.git synced 2026-03-22 12:44:43 +00:00

Author	SHA1	Message	Date
Pierre Tachoire	37c34351ee	http: handle auth challenge for non-proxy auth	2026-03-09 19:23:36 +01:00
Karl Seguin	523efbd85a	Fix a few issues in Client Most significantly, if removing from the multi fails, the connection is added to a "dirty" list for the removal to be retried later. Looking at the curl source code, remove fails on a recursive call, and we've struggled with recursive calls before, so I _think_ this might be happening (it fails in other cases, but I suspect if it _is_ happening, it's for this reason). The retry happens _after_ `perform`, so it cannot fail for due to recursiveness. If it fails at this point, we @panic. This is harsh, but it isn't easily recoverable and before putting effort into it, I'd like to know that it's actually happening. Fix potential use of undefined when a 401-407 request is received, but no 'WWW-Authenticate' or 'Proxy-Authenticate' header is received. Don't call `curl_multi_remove_handle` on an easy that hasn't been added yet do to error. Specifically, if `makeRequest` fails during setup, transfer_conn is nulled so that `transfer.deinit()` doesn't try to remove the connection. And the conn is removed from the `in_use` queue and made `available` again. On Abort, if getting the private fails (extremely unlikely), we now still try to remove the connection from the multi. Added a few more fields to the famous "ScriptManager.Header recall" assertion.	2026-03-03 18:02:06 +08:00
Karl Seguin	1f81b6ddc4	Allow frame-specific HTTP abort Needed for frame navigation. Implemented using some ugly comptime to avoid duplication and avoid an runtime frame check when doing a full abort.	2026-03-02 18:00:55 +08:00
Karl Seguin	20fbfc8544	Merge pull request #1689 from lightpanda-io/protect_xhr_abort_during_callback Protect against transfer.abort() being called during callback	2026-03-02 17:42:07 +08:00
Karl Seguin	328c681a8f	Add transfer-specific "performing" flag In the previous commits, two separte crash resolution conspired to introduce 1 tick delay in request handling. When we're in a libcurl perform, we can't re-enter libcurl. So we added a check before processing new requests to make sure we weren't "performing" and, if we were, we'd queue the request (hence the 1 tick delay). But for another issue, we set the same "performing" check when manually triggering callbacks. This extended the situations where the above check fired thus causing the 1-tick delay to happen under more (and even common) situation. This commit improves this - instead of relying on the global "performing" check when processing 1 transfer explicitly, we now have a per-transfer performing check. This prevents the transfer from being deinitialized during a callback but does not block requests from being started immediately.	2026-03-02 17:29:47 +08:00
Karl Seguin	10ad5d763e	Rename page.id to page._frame_id This field was recently added and is used to generate correct frameIds in CDP messages. They remain the same during a navigation event, so calling them page.id might cause surprises since navigation events create new pages, but retain the original id. Hence, frame_id is more accurate and hopefully less surprising. (This is a small cleanup prior to doing some iframe navigation work).	2026-03-02 16:21:29 +08:00
Karl Seguin	b104c3bfe8	Don't start request during callback Fixes a separate but similar issue to https://github.com/lightpanda-io/browser/pull/1689 Specifically, it prevents starting a request from within a libcurl handler, thus avoiding an illegal recursive call. (This commit also removes the failed function call debug logging for DOMExceptions, as these aren't particularly abnormal / log-worthy)	2026-03-02 12:04:02 +08:00
Karl Seguin	82e3f126ff	Protect against transfer.abort() being called during callback This was already handled in most cases, but not for a body-less response. It's safe to call transfer.abort() during a callback, so long as the performing flag is set to true. This was set during the normal libcurl callbacks, but for a body-less response, we manually invoke the header_done_callback and were not setting the performing flag.	2026-03-02 11:44:42 +08:00
Nikolay Govorov	4c26161728	Move curl C API to type-safe wrapper	2026-02-25 23:29:54 +00:00
Nikolay Govorov	e1695a0874	Strict visibility for Net functions	2026-02-25 05:58:08 +00:00
Nikolay Govorov	3e2a4d8053	Move curl_multi to Net layer	2026-02-25 05:31:28 +00:00
Nikolay Govorov	29982e2caf	Move all curl_easy ops to Connection	2026-02-25 05:31:24 +00:00
Nikolay Govorov	5fea1df42b	Move Net staff to clean network module	2026-02-25 05:31:19 +00:00
Karl Seguin	6e6082119f	Remove session.transfer_arena This no longer works with frames. Multiple frames could have a scheduled navigation, so a single arena no longer has a clear lifecycle. Instead an arena from the pool is used per navigation event, thus the queued_navigation is self- contained. This required having libcurl copy the body. Unfortunate. Currently we free the arena as soon as the navigation begins. This is clean. But it means the body is immediately freed (thus we need libcurl to copy it). As an alternative, each page could maintain an optional transfer_arena, which it could free on httpDone/Error.	2026-02-19 23:47:33 +08:00
Karl Seguin	e2a1ce623c	Rework CDP frameIds (and loaderIds and requestIds and interceptorIds) Our BrowsingContext currently supports 1 target. So we have a per-BC target_id. Previously, our target had 1 "frame" - our page. So we often treated the targetId as the frameId. But to work with frames, we need page-specific frameIds and loaderIds. This tries to clean up our ids (a little). frameIds are now ids derived from a new incrementing page.id. This page.id has to be passed around (via http Requests and through notifications) in order to properly generate messages with a frameId.	2026-02-19 13:01:41 +08:00
Nikolay Govorov	9296c10ca4	Use per-cdp connection HttpClient	2026-02-18 09:22:26 +00:00
Karl Seguin	4f6868728d	Detach attached nodes on appendBeforeSibling callback html5ever generally makes guarantees about nodes being parentless when appending, but we've already seen 1 case where appendCallback receives a connected node. We're now seeing something in appendBeforeSiblingCallback, but we have a clearer picture of how this is happening. In this case, it's via custom element upgrading and the custom element constructor has already placed the node in the document. It's worth pointing, html5ever just has an opaque reference to our node. While it guarantees that it will give us parent-less nodes, it doesn't actually know anything about our nodes, or our node._parent. The guarantee is only from its own point of view. There's nothing stopping us from giving a node a default parent as soon as html5ever asks us to create a new node, in which case, the node _will_ have a parent.	2026-02-18 10:52:51 +08:00
Karl Seguin	45eb59a5aa	Support HTTP headers which are \n terminated (as opposed to \r\n). Looks like curl will accept these as valid headers, and won't normalize the header, so we have to deal with either a 2-byte or 1-byte terminated header	2026-02-17 18:55:50 +08:00
Karl Seguin	4f81cb9333	Add more granular assertions Trying to see how the "ScriptManager.Header buffer" assertion is failing. Either `headerCallback` is being called multiple times, or the script is corrupt. By adding a similar assertion in various places, we can hopefully narrow (a) what's going on and (b) what code is involved. Also, switched the BufferPool from DoublyLinkedList to SinglyLinkedList. Was just reviewing this code (to see if the buffer could possibly become corrupt) and realized this could be switched.	2026-02-14 20:01:09 +08:00
Karl Seguin	6037521c49	Default behavior for input click (radio / checkbox). This wasn't 100% intuitive to me. At the start of the event, the input is immediately toggled. But at any point during dispatching, the default behavior can be suppressed. So the state of the input's check during dispatching captures the "intent" of the click. But it's possible for one listener to see that input.checked == true even though, by the end of dispatching, input.checked == false because some other listener called preventDefault(). To support this, we need to capture the "current" state so that, if we need to rollback, we can. For radio buttons, this "current" state includes capturing the currently checked radio (if any).	2026-02-13 11:06:46 +08:00
Karl Seguin	0c89dca261	When _not_ in a libcurl callback, deinit the transfer normally	2026-02-12 18:29:35 +08:00
Karl Seguin	ed802c0404	Remove potential recursive abort call in curl Curl doesn't like recursive calls. For example, you can't call curl_multi_remove_handle from within a dataCallback. This specifically means that, as-is, transfer.abort() calls aren't safe to be called during a libcurl callback. Consider this code: ``` req.open('GET', 'http://127.0.0.1:9582/xhr'); req.onreadystatechange = (e) => { req.abort(); } req.send(); ``` onreadystatechange is triggered by network events, i.e. it executes in libcurl callback. Thus, the above code fails to truly "abort" the request with `curl_multi_remove_handle` error, saying it's a recursive call. To solve this, transfer.abort() now sets an `aborted = true` flag. Callbacks can now use this flag to signal to libcurl to stop the transfer. A test was added which reproduced this issue, but this comes from: https://github.com/lightpanda-io/browser/issues/1527 which I wasn't able to reliably reproduce. I did see it happen regularly, just not always. It seems like this commit fixes that issue.	2026-02-12 11:29:47 +08:00
Muki Kiboigo	f02a37d3f0	properly handle failed parsing on robots	2026-02-10 20:09:32 -08:00
Karl Seguin	70ae6b8d72	Merge pull request #1407 from lightpanda-io/robots Support for `robots.txt`	2026-02-10 09:51:32 +08:00
Muki Kiboigo	e1850440b0	shutdown queued req on robots shutdown	2026-02-09 15:24:35 -08:00
Pierre Tachoire	b9f8ce5729	http: allow PROPOFIND http method PROPOFIND is used by webdav to retrieve resource's properties. http://www.webdav.org/specs/rfc4918.html#METHOD_PROPFIND For example, Nextcloud public sharing uses PROPOFIND to list shared resources.	2026-02-09 15:16:28 +01:00
Muki Kiboigo	65c9b2a5f7	add robotsShutdownCallback	2026-02-09 05:51:42 -08:00
Muki Kiboigo	46c73a05a9	panic instead of unreachable on robots callbacks	2026-02-09 05:35:32 -08:00
Karl Seguin	a6cd019118	Add http_max_response_size This adds a --http_max_response_size argument to the serve and fetch command which is enforced by the HTTP client. This defaults to null, no limit. As-is, the ScriptManager allocates a buffer based on Content-Length. Without setting this flag, a server could simply reply with Content-Length: 99999999999 9999999999 to cause an OOM. This new flag is checked both once we have the header if there's a content-length, and when reading the body. Also requested in https://github.com/lightpanda-io/browser/issues/415	2026-02-09 13:16:18 +08:00
Karl Seguin	cecdf0d511	Add support for XHR's withCredentials XHR should only send and receive cookies for same-origin requests or if withCredentials is true.	2026-02-07 16:16:10 +08:00
Karl Seguin	2eab4b84c9	Rename all ArrayListUnmanaged -> ArrayList ArrayListAlignedUnmanaged has been deprecated for a while, and I occasionally replace them, but doing one complete pass gets it done once and for all.	2026-02-05 11:49:15 +08:00
Muki Kiboigo	a7095d7dec	pass robot store into Http init	2026-02-04 12:23:42 -08:00
Muki Kiboigo	e620c28a1c	stop leaking robots_url when in robot queue	2026-02-04 12:19:18 -08:00
Muki Kiboigo	29ee7d41f5	queue requests to run after robots is fetched	2026-02-04 12:19:17 -08:00
Muki Kiboigo	b6af5884b1	use RobotsRequestContext deinit	2026-02-04 12:16:36 -08:00
Muki Kiboigo	e4f250435d	include robots url in debug log	2026-02-04 12:16:36 -08:00
Muki Kiboigo	1a246f2e38	robots in the actual http client	2026-02-04 12:15:49 -08:00
Karl Seguin	017d4e792b	Fix [I hope] blocking auth interception On a blocking request that requires authentication, we now handle the two cases correctly: 1 - if the request is aborted, we don't continue processing (if we did, that would result in (a) transfer.deinit being called twice and (b) the callbacks being called twice 2 - if the request is "continue", we queue the transfer to be re-issued, as opposed to just processing it as-is. We have to queue it because we're currently inside a process loop and it [probaby] isn't safe to re-enter it. By using the queue, we wait until the next call to `tick` to re-issue the request.	2026-02-04 18:39:23 +08:00
Nikolay Govorov	a72782f91e	Eliminates duplication in the creation of HTTP headers	2026-02-04 09:08:57 +00:00
Nikolay Govorov	f71aa1cad2	Centralizes configuration, eliminates unnecessary copying of config	2026-02-04 07:57:59 +00:00
Nikolay Govorov	fd8c488dbd	Move Notification from App to BrowserContext	2026-02-04 07:33:45 +00:00
Karl Seguin	451dd0fd64	Merge pull request #1411 from lightpanda-io/more_SSO more small strings (string.String)	2026-01-27 13:18:56 +08:00
Karl Seguin	a11ae912b4	Add finalizer to Response and use an pooled arena Unlike XHR, Response is a bit more complicated as it can exist in Zig code without ever being given to v8. So we need to track this handoff to know who is responsible for freeing it (zig code, on error/shutdown) or v8 code after promise resolution. This also cleansup a bad merge for the XHR finalizer and adds cleaning up the `XMLHttpRequestEventTarget` callbacks.	2026-01-26 19:18:32 +08:00
Karl Seguin	54c45a0cfd	Make js.Bridge aware of string.String for input parameters Avoids having to allocate small strings when going from v8 -> Zig. Also added a discriminatory type, string.Global which uses the arena, rather than the call_arena, if an allocation _is_ necessary. (This is similar to a feature we had before, but was lost in zigdom). Strings from v8 that need to be persisted, can be allocated directly v8 -> arena, rather than v8 -> call_arena -> arena. I think there are a lot of places where we should use string.String - where strings are expected to be short (e.g. attribute names). But started with just document.querySelector and querySelectorAll.	2026-01-26 07:52:27 +08:00
Karl Seguin	9a57c2a0d4	fix merge	2026-01-24 08:28:26 +08:00
Karl Seguin	fc64abee8f	Add finalizer mode When a type is finalized by V8, it's because it's fallen out of scope. When a type is finalized by Zig, it's because the Context is being shutdown. Those are two different environments and might require distinct cleanup logic. Specifically, a zig-initiated finalization needs to consider that the page and context are being shutdown. It isn't necessarily safe to execute JavaScript at this point, and thus, not safe to execute a callback (on_error, on_abort, ready_state_change, ...).	2026-01-24 07:59:43 +08:00
Karl Seguin	97f9c2991b	on XHR shutdown, use terminate to prevent any client callbacks into the XHR	2026-01-24 07:59:43 +08:00
Karl Seguin	f6397e2731	Handle scripts that don't return a 200 status code This was already being handled for async scripts, but for sync scripts, we'd log the error then proceed to try and execute the body (which would be some error message). This allows the header_callback to return a boolean to indicate whether or not the http client should continue to process the request or abort it.	2026-01-22 14:15:00 +08:00
Karl Seguin	a6e7ecd9e5	Move more asserts to custom asserter. Deciding what should be an lp.assert, vs an std.debug.assert, vs a debug-only assert is a little arbitrary. debug-only asserts, guarded with an `if (comptime IS_DEBUG)` obviously avoid the check in release and thus have a performance advantage. We also use them at library boundaries. If libcurl says it will always emit a header line with a trailing \r\n, is that really a check we need to do in production? I don't think so. First, that code path is checked _a lot_ in debug. Second, it feels a bit like we're testing libcurl (in production!)..why? A debug-only assertion should be good enough to catch any changes in libcurl.	2026-01-19 09:12:16 +08:00
Karl Seguin	087086c308	remove some unused imports	2025-12-26 12:40:20 +08:00

1 2 3 4 5

224 Commits