Commit Graph

3888 Commits

Author SHA1 Message Date
Karl Seguin
4863b3df6e Merge pull request #1721 from lightpanda-io/fix_mcp_unintialized_memory
Ensure that mcp.Server is correctly initialized
2026-03-05 17:11:57 +08:00
Karl Seguin
768c3a533b Simplify navigation logic.
Most of the complexity in the previous commit had to do with the fact that
about:blank is processed synchronously, meaning that we could process a
scheduled navigation -> page.navigate -> scheduled navigation:

```
let iframe = document.createElement('iframe');
iframe.addEventListener('load', () => {
  iframe.src = "about:blank";
});
```

This is an infinite loop which is going to be a problem no matter what, but there
are different degrees of problems this can cause, e.g. looping forever vs use-
after-free or other undefined behavior.

The new approach does 2 passes through scheduled navigations, first processing
"asynchronous" navigation (anything not "about:blank"), then processing
synchronous navigation ("about:blank"). The main advantage is that if the
synchronous navigation causes more synchronous navigation, it won't be
processed until the next tick. PLUS, we can detect about:blank that loads
about:blank and stop it (which might not be to spec, but seems right to do
nonetheless). This 2-pass approach removes the need for a couple of checks and
makes everything else simpler.
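
The two-pass idea can be modelled in a few lines of JavaScript. This is only a sketch under assumptions: the queue shape, the `currentUrl` field and the function names are hypothetical and are not taken from the actual implementation.

```javascript
// Hypothetical model of a scheduled-navigation queue; all names are
// illustrative and not taken from the Lightpanda source.
const ABOUT_BLANK = "about:blank";

function processScheduledNavigations(queue, navigate) {
  // Snapshot and clear the queue: anything scheduled while we run lands
  // in `queue` again and waits for the next tick.
  const pending = queue.splice(0, queue.length);

  // Pass 1: "asynchronous" navigations (anything that is not about:blank).
  for (const nav of pending) {
    if (nav.url !== ABOUT_BLANK) navigate(nav);
  }

  // Pass 2: synchronous navigations (about:blank).
  for (const nav of pending) {
    if (nav.url !== ABOUT_BLANK) continue;
    // about:blank loading about:blank would loop forever, so drop it.
    if (nav.currentUrl === ABOUT_BLANK) continue;
    navigate(nav);
  }
}

const queue = [
  { frame: "f1", url: ABOUT_BLANK, currentUrl: "https://example.com/" },
  { frame: "f2", url: "https://example.com/a", currentUrl: ABOUT_BLANK },
];
const order = [];
function navigate(nav) {
  order.push(`${nav.frame} -> ${nav.url}`);
  // Simulate a load handler that immediately re-schedules about:blank.
  if (nav.url === ABOUT_BLANK) {
    queue.push({ frame: nav.frame, url: ABOUT_BLANK, currentUrl: ABOUT_BLANK });
  }
}

processScheduledNavigations(queue, navigate); // tick 1
processScheduledNavigations(queue, navigate); // tick 2: the re-scheduled entry is dropped
console.log(order);
```

In this model the navigation re-scheduled during tick 1 is not seen until tick 2, where it is recognised as about:blank loading about:blank and discarded.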
2026-03-05 17:06:23 +08:00
Karl Seguin
3dea554e9e Ensure that mcp.Server is correctly initialized
It relies on default field values, e.g. for mutex: std.Thread.Mutex = .{}, but
doesn't initialize the structure, just the pointer on the heap resulting in a
crash.
2026-03-05 16:32:25 +08:00
Karl Seguin
9c7ecf221e Improve frame sub-navigation
This makes frame sub-navigation "work" for all page navigations (click, form
submit, location.top...) as well as setting the iframe.src.

Fixes at least 2 WPT crashes.

BUT, the implementation still isn't 100% correct, with two known issues:

1. Navigation currently happens in the context where it's called, not the
   context of the frame. So if Page1 accesses Frame1 and causes it to navigate,
   e.g. f1.contentDocument.querySelector('#link').click(), it's Page1 that will
   be navigated, since the JS is being executed in the Page1 context.
   This should be relatively easy to fix.

2. There are particularly complicated cases in WPT where a frame is navigated
   inside of its own load, creating an endless loop. There's some partial
   support for this as-is, but it doesn't work correctly and it currently is
   defensive and likely will not continue to navigate. This is particularly true
   when sub-navigation is done to about:blank within the frame's on load event.
   (Which is probably not a real concern, but an issue for some WPT tests)

Although it shares a lot with the original navigation code, there are many more
edge cases here, possibly due to being developed alongside WPT tests. The
source of most of the complexity is the synchronous handling of "about:blank"
in page.navigate, which can result in a scheduled navigation synchronously
causing more scheduled navigation. (Specifically because
`self.documentIsComplete();` is called from page.navigate in that case). It
might be worth seeing if something can be done about that, to simplify this new
code (removing the double queue, removing the flag, simplifying pre-existing
schedule checks, ...)
2026-03-05 15:09:39 +08:00
Adrià Arrufat
26db481d46 markdown: refactor content discovery to use TreeWalker 2026-03-05 14:36:15 +09:00
Adrià Arrufat
3256a57230 TreeWalker: add sibling navigation and skipChildren 2026-03-05 14:29:42 +09:00
Adrià Arrufat
a27de38c03 markdown: encode resolved URLs in links and images 2026-03-05 13:57:42 +09:00
Adrià Arrufat
e2f1609116 markdown: use aria-label or title for empty links 2026-03-05 11:27:51 +09:00
Adrià Arrufat
ea66a91a95 markdown: resolve absolute URLs and skip empty links 2026-03-05 10:48:18 +09:00
Karl Seguin
6c5efe6ce0 Merge pull request #1715 from lightpanda-io/cdp-frame-navigate
cdp: don't dispatch executionContextsCleared on frame navigation
2026-03-04 22:02:30 +08:00
Karl Seguin
f0be6675e7 Merge pull request #1714 from lightpanda-io/fix-req-id
cdp: fix req id resolver, they are REQ- not RID-
2026-03-04 21:59:04 +08:00
Pierre Tachoire
6a8174a15c cdp: don't dispatch executionContextsCleared on frame navigation 2026-03-04 14:45:21 +01:00
Pierre Tachoire
40c3f1b618 cdp: fix req id resolver, they are REQ- not RID- 2026-03-04 13:00:16 +01:00
Pierre Tachoire
6dd2dac049 Merge pull request #1704 from lightpanda-io/non-ascii-css-key
css: fix crash in consumeName() on UTF-8 multibyte sequences
2026-03-04 12:35:14 +01:00
Karl Seguin
b39bbb557f Merge pull request #1713 from lightpanda-io/dynamic_module_instantiation
Force dynamic module instantiation if not already instantiated
2026-03-04 16:27:06 +08:00
Karl Seguin
f7682cba67 Force dynamic module instantiation if not already instantiated
I couldn't come up with a reproducible case where this was needed, but some
crash reports indicate that this is happening.
2026-03-04 16:12:11 +08:00
Pierre Tachoire
f94c07160a Merge pull request #1712 from lightpanda-io/css-selector-quote
Handle commas inside quoted attributes
2026-03-04 09:00:01 +01:00
Karl Seguin
bbe6692580 Merge pull request #1711 from lightpanda-io/iframe_about_blank
iframe handling for src = "about:blank"
2026-03-04 15:56:26 +08:00
Karl Seguin
9266a1c4d9 Merge pull request #1709 from lightpanda-io/expand_event_dispatch_handle_scope
Use a single HandleScope for event dispatch
2026-03-04 15:56:13 +08:00
Pierre Tachoire
220d80f05f Handle commas inside quoted attributes
In a CSS selector, commas inside a quoted attribute value are not selector
separators, but part of the attribute value.
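
A minimal sketch of that rule, assuming a simplified splitter (the function name is hypothetical, and this is not the parser used in the browser): track whether the scanner is inside a quoted string and treat a comma as a separator only outside of one.

```javascript
// Split a selector list on top-level commas, ignoring commas inside quotes.
// Simplified sketch: real CSS parsing also handles comments and nested
// functions such as :is(a, b), which this does not.
function splitSelectorList(input) {
  const parts = [];
  let current = "";
  let quote = null; // the active quote character, or null when outside quotes

  for (let i = 0; i < input.length; i++) {
    const ch = input[i];
    if (quote !== null) {
      current += ch;
      if (ch === "\\" && i + 1 < input.length) {
        current += input[++i]; // keep an escaped character verbatim
      } else if (ch === quote) {
        quote = null; // closing quote
      }
    } else if (ch === '"' || ch === "'") {
      quote = ch; // opening quote
      current += ch;
    } else if (ch === ",") {
      parts.push(current.trim()); // top-level comma: selector separator
      current = "";
    } else {
      current += ch;
    }
  }
  parts.push(current.trim());
  return parts;
}

console.log(splitSelectorList('a[title="x, y"], b'));
```

Here the comma in `"x, y"` stays inside the first selector, so the input yields two selectors rather than three.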
2026-03-04 08:49:33 +01:00
Karl Seguin
9144c909dd Merge pull request #1710 from lightpanda-io/custom_element_clone
Support for clone custom elements that attach them self in their cons…
2026-03-04 15:47:39 +08:00
Karl Seguin
7981fcec84 iframe handling for src = "about:blank"
Don't try to resolve an iframe's source if it's about:blank

Extend the page's handling of about:blank to render an empty document
2026-03-04 15:43:07 +08:00
Pierre Tachoire
71264c56fc Merge pull request #1696 from lightpanda-io/textencoder-stream
Add TextEncoderStream and TextDecoderStream implementation
2026-03-04 07:58:56 +01:00
Karl Seguin
ca0f77bdee Support for cloning custom elements that attach themselves in their constructor
When we createElement, we assume the element is detached. This is usually true
except for Custom Elements where the constructor can do anything, including
connecting the element. This broken assumption results in cloneNode crashing.
2026-03-04 14:54:34 +08:00
Karl Seguin
fc8b1b8549 Use a single HandleScope for event dispatch
https://github.com/lightpanda-io/browser/pull/1690 narrowed the lifetime of
HandleScopes to once per listener. I think that was just an accident of
refactoring, and not some intentional choice.

The narrower HandleScope lifetime makes it so that when we do run the microtask
queue at the end of event dispatching, some locals in the queue may no longer
be valid.

HS1
  HS2
    queueMicrotask(func)
  runMicrotask

In the above flow, `func` is only valid while HS2 is alive, so when we run
the microtask queue in HS1, it is no longer valid.
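
The lifetime problem can be imitated with a toy model. This is a sketch only: the `HandleScope` class below is a hypothetical stand-in that mimics the rule "a local is valid only while its scope is open", and it is not the V8 API.

```javascript
// Toy model of scoped handles; it only imitates the lifetime rule.
class HandleScope {
  constructor() {
    this.handles = [];
  }
  // Create a handle that stays valid until this scope is closed.
  local(value) {
    const handle = { value, valid: true };
    this.handles.push(handle);
    return handle;
  }
  close() {
    for (const h of this.handles) h.valid = false;
  }
}

function dispatch(listeners, scopePerListener) {
  const microtasks = [];
  const outer = new HandleScope(); // HS1
  for (const listener of listeners) {
    const scope = scopePerListener ? new HandleScope() : outer; // HS2 or HS1
    listener(scope, microtasks);
    if (scopePerListener) scope.close(); // HS2 ends before microtasks run
  }
  // Run the microtask queue at the end of dispatch, still inside HS1.
  const results = microtasks.map((h) => (h.valid ? h.value() : "invalid handle"));
  outer.close();
  return results;
}

// A listener that queues a microtask whose function lives in `scope`.
const listener = (scope, microtasks) => {
  microtasks.push(scope.local(() => "ran"));
};

console.log(dispatch([listener], true));  // one scope per listener
console.log(dispatch([listener], false)); // a single scope for the whole dispatch
```

With one scope per listener the queued function is already invalid when the queue runs; with a single scope it is still valid.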
2026-03-04 11:43:09 +08:00
Karl Seguin
bc8c44f62f Merge pull request #1707 from lightpanda-io/nikneym/details
Add `HTMLDetailsElement`
2026-03-04 07:44:11 +08:00
Karl Seguin
01fab5c92a Merge pull request #1706 from lightpanda-io/cdp-attach-to-browser
cdp: fix send CDP raw command with Playwright
2026-03-04 07:40:05 +08:00
Karl Seguin
1c07d786a0 Merge pull request #1705 from lightpanda-io/nikneym/track
`Track`: implement kind and constants
2026-03-04 07:34:12 +08:00
Karl Seguin
6f0cd87d1c Merge pull request #1703 from lightpanda-io/client_and_script_manager
Fix a few issues in Client
2026-03-04 07:32:14 +08:00
Karl Seguin
e44308cba2 Merge pull request #1695 from lightpanda-io/iframe_src_nav
Iframe src nav
2026-03-04 07:27:23 +08:00
Karl Seguin
50245c5157 Merge pull request #1667 from lightpanda-io/terminate_isolate
On Client.stop, terminate the isolate
2026-03-04 07:27:10 +08:00
Pierre Tachoire
9ca5188e12 cdp: set consistent target's default
with about:blank for url and empty title.
2026-03-03 17:24:08 +01:00
Pierre Tachoire
e25c33eaa6 Merge pull request #1673 from arrufat/mcp
Add Model Context Protocol (MCP) server support
2026-03-03 15:18:34 +01:00
Pierre Tachoire
56cc881ac0 cdp: fix attachToTarget and attachToBrowserTarget resp 2026-03-03 15:01:53 +01:00
Adrià Arrufat
7bddc0a89c mcp: remove search and over tools 2026-03-03 22:50:06 +09:00
Halil Durak
50896bdc9d HTMLDetailsElement: add tests 2026-03-03 15:12:12 +03:00
Halil Durak
8dd4567828 HTMLDetailsElement: implement HTMLDetailsElement 2026-03-03 15:12:02 +03:00
Pierre Tachoire
06ef6d3e6a cdp: attachToTarget must add the session id 2026-03-03 12:58:00 +01:00
Pierre Tachoire
14b58e8062 add target.attachToBrowserTarget 2026-03-03 12:58:00 +01:00
Pierre Tachoire
eee232c12c cdp: allow multiple calls to attachToTarget
Playwright, when creating a new CDPSession, sends an
attachToBrowserTarget followed by another attachToTarget to re-attach
itself to the existing target.

see playwright/axtree.js from demo/ repository.
2026-03-03 12:58:00 +01:00
Halil Durak
febe321aef Track: add tests 2026-03-03 14:41:05 +03:00
Halil Durak
28777ac717 Track: implement kind and constants 2026-03-03 14:40:53 +03:00
Pierre Tachoire
13b008b56c css: fix crash in consumeName() on UTF-8 multibyte sequences
advance() asserts that each byte it steps over is either an ASCII byte
or a UTF-8 sequence leader, never a continuation byte (0x80–0xBF).
consumeName() was calling advance(1) for all non-ASCII bytes
('\x80'...'\xFF'), processing multi-byte sequences one byte at a time.
For a two-byte sequence like é (0xC3 0xA9), the second iteration landed
on the continuation byte 0xA9 and triggered the assertion, crashing the
browser in Debug mode.

Fix: replace advance(1) with consumeChar() for all non-ASCII bytes.
consumeChar() reads the lead byte, derives the sequence length via
utf8ByteSequenceLength, and advances the full code point in one step,
so the position never rests on a continuation byte.

Observed on saintcyrlecole.caliceo.com, whose root element carries an
inline style with custom property names containing French accented
characters (--color-store-bulles-été-fg, etc.). The crash aborted JS
execution before the Angular app could render any dynamic content.
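
The difference between the two stepping strategies can be reproduced in a small JavaScript sketch. The function names here are hypothetical stand-ins, not the Zig functions named above; the lead-byte rules are those of UTF-8 itself.

```javascript
// Length of a UTF-8 sequence derived from its lead byte. Throws on a
// continuation byte (0x80-0xBF), mirroring the assertion described above.
function utf8SequenceLength(lead) {
  if (lead < 0x80) return 1;            // 0xxxxxxx: ASCII
  if ((lead & 0xe0) === 0xc0) return 2; // 110xxxxx
  if ((lead & 0xf0) === 0xe0) return 3; // 1110xxxx
  if ((lead & 0xf8) === 0xf0) return 4; // 11110xxx
  throw new Error(`0x${lead.toString(16)} is not a lead byte`);
}

// Hypothetical stand-in for the scanning loop: returns the byte positions
// it rests on. Every position is checked to be an ASCII or lead byte.
function scanPositions(bytes, stepWholeCodePoint) {
  const positions = [];
  let pos = 0;
  while (pos < bytes.length) {
    positions.push(pos);
    const len = utf8SequenceLength(bytes[pos]);
    pos += stepWholeCodePoint ? len : 1;
  }
  return positions;
}

// "été" encodes to C3 A9 74 C3 A9 (5 bytes).
const bytes = new TextEncoder().encode("\u00e9t\u00e9");

console.log(scanPositions(bytes, true)); // steps over whole code points

let failure = null;
try {
  scanPositions(bytes, false); // one byte at a time
} catch (err) {
  failure = err.message;
}
console.log(failure);
```

Stepping by whole code points only ever rests on positions 0, 2 and 3, while stepping one byte at a time lands on the continuation byte 0xA9 at position 1 and fails.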
2026-03-03 11:13:30 +01:00
Karl Seguin
523efbd85a Fix a few issues in Client
Most significantly, if removing from the multi fails, the connection
is added to a "dirty" list for the removal to be retried later. Looking at
the curl source code, remove fails on a recursive call, and we've struggled with
recursive calls before, so I _think_ this might be happening (it fails in other
cases, but I suspect if it _is_ happening, it's for this reason). The retry
happens _after_ `perform`, so it cannot fail due to recursion. If it
fails at this point, we @panic. This is harsh, but it isn't easily recoverable
and before putting effort into it, I'd like to know that it's actually happening.

Fix potential use of undefined when a 401 or 407 response is received, but no
'WWW-Authenticate' or 'Proxy-Authenticate' header is received.

Don't call `curl_multi_remove_handle` on an easy that hasn't been added yet due
to an error. Specifically, if `makeRequest` fails during setup, transfer_conn is
nulled so that `transfer.deinit()` doesn't try to remove the connection. And the
conn is removed from the `in_use` queue and made `available` again.

On Abort, if getting the private fails (extremely unlikely), we now still try
to remove the connection from the multi.

Added a few more fields to the famous "ScriptManager.Header recall" assertion.
2026-03-03 18:02:06 +08:00
Pierre Tachoire
fcacc8bfc6 remove the isString type check in TransformStream write 2026-03-03 09:40:32 +01:00
Adrià Arrufat
b2e301418f cdp.lp: use page.document instead of window._document 2026-03-03 17:11:16 +09:00
Adrià Arrufat
334a2e44a1 lp: simplify dom_node resolution in getMarkdown 2026-03-03 17:08:43 +09:00
Pierre Tachoire
252b3c3bf6 Ignore BOM only when the option is set on TextDecoderStream 2026-03-03 09:04:41 +01:00
Adrià Arrufat
c9121a03d2 cdp: move LP.getMarkdown test to lp domain 2026-03-03 16:39:31 +09:00
Adrià Arrufat
cc93180d57 cdp: add LP domain and getMarkdown method
This PR introduces a custom CDP domain 'LP' (Lightpanda) to expose browser-specific tools. The first method, 'LP.getMarkdown', allows retrieving a Markdown representation of the DOM or a specific node by its 'nodeId'. This is optimized for AI agents and LLM-based scraping tasks.
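
As a rough illustration, a client could build the command as shown below. The `id`/`method`/`params` envelope is the usual CDP message shape; the helper name is hypothetical, and treating a missing `nodeId` as "the whole document" is an assumption. The response format is not described here, so it is not shown.

```javascript
// Build an LP.getMarkdown command as a JSON string ready to send over the
// CDP WebSocket. `getMarkdownCommand` is an illustrative helper only.
function getMarkdownCommand(id, nodeId) {
  const message = { id, method: "LP.getMarkdown" };
  // Assumption: omitting params requests the Markdown of the whole DOM.
  if (nodeId !== undefined) message.params = { nodeId };
  return JSON.stringify(message);
}

console.log(getMarkdownCommand(1));    // whole document
console.log(getMarkdownCommand(2, 5)); // a specific node; 5 is a placeholder nodeId
```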
2026-03-03 16:35:48 +09:00