Commit Graph

34 Commits

Author SHA1 Message Date
Adrià Arrufat
85ebbe8759 SemanticTree: improve accessibility tree and name calculation
- Add more structural roles (banner, navigation, main, list, etc.).
- Implement fallback for accessible names (SVG titles, image alt text).
- Skip children for leaf-like semantic nodes to reduce redundancy.
- Disable pruning in the default semantic tree view.
2026-03-09 21:04:47 +09:00
Adrià Arrufat
3c97332fd8 feat(dump): add semantic_tree and semantic_tree_text formats
Adds support for dumping the semantic tree in JSON or text format
via the --dump option. Updates the Config enum and usage help.
2026-03-09 18:23:52 +09:00
Adrià Arrufat
248851701f Refactor: move SemanticTree to core and expose via MCP tools 2026-03-06 15:44:03 +09:00
Adrià Arrufat
0f46277b1f CDP: implement LP.getSemanticTree for native semantic DOM extraction 2026-03-06 15:29:32 +09:00
Adrià Arrufat
982b8e2d72 mcp: remove redundant mcp from test references 2026-03-02 22:24:17 +09:00
Adrià Arrufat
da51cdd11d Merge branch 'main' into mcp 2026-03-02 11:55:36 +09:00
Adrià Arrufat
aae9a505e0 mcp: promot Server.zig to file struct 2026-02-28 21:02:49 +09:00
Karl Seguin
45196e022b Add a "wpt" dump mode
Adds a not-documented "wpt" mode to --dump which outputs a formatted
report.cases.

This is meant to make working on a single WPT test case easier, particularly
with some coding tool. Claude recommended this output for its own use.

Instead of telling claude to start the browser in serve mode, then run the
wptrunner, and merge the two outputs (and then stop the server), you can do:

zig build run -- fetch --dump wpt "http://localhost:8000/dom/nodes/CharacterData-appendChild.html"

(you still need the wpt server up)
2026-02-28 19:08:58 +08:00
Karl Seguin
21be3db51f Callers to page.navigate ensure URL is properly encoded.
Follow up to https://github.com/lightpanda-io/browser/pull/1646

The encodeURL (renamed to ensureEncoded and exposed in this commit) already
handled already-encoded URLs, so this was largely a matter of exposing the
functionality.

The reason this isn't baked directly into Page.navigate is that, in some places
e.g. internal navigation, the URL is already know to be encoded. So it's up
to every caller to make sure they are passing a valid URL to navigate.
2026-02-26 12:22:06 +08:00
Adrià Arrufat
8c8a05b8c1 mcp: consolidate tests and cleanup imports 2026-02-26 00:02:49 +09:00
Karl Seguin
a818560344 Add a --with_frames argument to fetch
When set (defaults to not set/false), --dump will include iframe contents.

I was hoping I could add a mode to strip_mode to this, but since dump is used
extensively (e.g. innerHTML), this is something that has to be off by default
(for correctness).
2026-02-25 15:29:27 +08:00
Adrià Arrufat
5fea4cf760 mcp: add protocol and router unit tests 2026-02-22 23:15:45 +09:00
Adrià Arrufat
a27339b954 mcp: add Model Context Protocol server support
Adds a new `mcp` run mode to start an MCP server over stdio.
Implements tools for navigation and JS evaluation, along with
resources for HTML and Markdown page content.
2026-02-22 22:32:14 +09:00
Pierre Tachoire
e15295bdac Merge pull request #1560 from arrufat/dump-markdown
Add support for dumping output to markdown
2026-02-19 10:32:57 +01:00
Nikolay Govorov
9296c10ca4 Use per-cdp connection HttpClient 2026-02-18 09:22:26 +00:00
Adrià Arrufat
dea492fd64 Unify dump flags into --dump <format> 2026-02-17 00:42:06 +09:00
Adrià Arrufat
748b37f1d6 Rename --dump-markdown to --markdown 2026-02-17 00:21:10 +09:00
Adrià Arrufat
1b5efea6eb Add --dump-markdown flag
Add a new module to handle HTML-to-Markdown conversion and
integrate it into the fetch command via a new CLI flag.
2026-02-15 23:18:01 +09:00
Karl Seguin
a6ba801738 Fix double-free of XHR on double client abort
It's possible (in fact normal) for client.abort to be called twice on a schedule
navigation. We immediately abort any pending requests once a secondary
navigation is called (is that right?), and then again when the page shuts down.

The first abort will kill the transfer, so the XHR object has to null this value
so that, on context shutdown, when the finalizer is called, we don't try to
free it again.
2026-02-10 11:30:10 +08:00
Nikolay Govorov
f71aa1cad2 Centralizes configuration, eliminates unnecessary copying of config 2026-02-04 07:57:59 +00:00
Nikolay Govorov
fd8c488dbd Move Notification from App to BrowserContext 2026-02-04 07:33:45 +00:00
Karl Seguin
181f265de5 Rework Inspector usage
V8's inspector world is made up of 4 components: Inspector, Client, Channel and
Session. Currently, we treat all 4 components as a single unit which is tied to
the lifetime of CDP BrowserContext - or, loosely speaking, 1 "Inspector Unit"
per page / v8::Context.

According to https://web.archive.org/web/20210622022956/https://hyperandroid.com/2020/02/12/v8-inspector-from-an-embedder-standpoint/
and conversation with Gemini, it's more typical to have 1 inspector per isolate.
The general breakdown is the Inspector is the top-level manager, the Client is
our implementation which control how the Inspector works (its function we expose
that v8 calls into). These should be tied to the Isolate. Channels and Sessions
are more closely tied to Context, where the Channel is v8->zig and the Session
us zig->v8.

This PR does a few things
1 - It creates 1 Inspector and Client per Isolate (Env.js)
2 - It creates 1 Session/Channel per BrowserContext
3 - It merges v8::Session and v8::Channel into Inspector.Session
4 - It moves the Inspector instance directly into the Env
5 - BrowserContext interacts with the Inspector.Session, not the Inspector

4 is arguably unnecessary with respect to the main goal of this commit, but
the end-goal is to tighten the integration. Specifically, rather than CDP having
to inform the inspector that a context was created/destroyed, the Env which
manages Contexts directly (https://github.com/lightpanda-io/browser/pull/1432)
and which now has direct access to the Inspector, is now equipped to keep this
in sync.
2026-01-30 15:59:33 +08:00
Karl Seguin
b4759ae261 Ability to capture a V8 heap profile and a heap snapshot 2026-01-22 10:27:58 +08:00
Karl Seguin
0f9c9e2089 Improve crash handling
This adds a crash handler which reports a crash (if telemetry is enabled). On a
crash, this looks for `curl` (using the PATH env), and forks the process to then
call execve. This relies on a new endpoint to be setup to accept the "report".
Also, we include very little data..I figured just knowing about crashes would
be a good place to start.

A panic handler is provided, which override's Zig default handler and hooks
into the crash handler.

An `assert` function is added and hooks into the crash handler. This is
currently only used in one place (Session.zig) to demonstrate its use. In
addition to reporting a failed assert, the assert aborts execution in
ReleaseFast (as opposed to an undefined behavior with std.debug.assert).

I want to hook this into the v8 global error handler, but only after direct_v8
is merged.

Much of this is inspired by bun's code. They have their own assert (1) and
a [more sophisticated] crashHandler (2).
:

(1) beccd01647/src/bun.zig (L2987)
(2) beccd01647/src/crash_handler.zig (L198)
2026-01-19 07:36:46 +08:00
Karl Seguin
087086c308 remove some unused imports 2025-12-26 12:40:20 +08:00
Karl Seguin
d9c53a3def Page.scheduleNavigation for location changes 2025-12-22 12:19:08 +08:00
Muki Kiboigo
ac85341cab add NavigationKind to navigate 2025-12-09 17:10:59 -08:00
Karl Seguin
47b4b68e60 add parsed DocType to document (and handle dumping it) 2025-12-09 16:38:56 +08:00
Karl Seguin
e336c67857 various small api fixes/tweaks 2025-11-24 20:12:43 +08:00
Karl Seguin
7ab88e9a71 add legacy tests, optimize empty types 2025-11-14 15:55:02 +08:00
Karl Seguin
1164da5e7a copyright notices 2025-11-14 10:52:43 +08:00
Karl Seguin
d3973172e8 re-enable minimum viable CDP server 2025-10-28 18:56:03 +08:00
Karl Seguin
cdd31353c5 get fetch campire working 2025-10-28 11:24:29 +08:00
Karl Seguin
b047cb6dc1 remove libdom 2025-10-27 22:14:59 +08:00