Previously, the IO loop was doing three things:
1 - Managing timeouts (either from scripts or for our own needs)
2 - Handling browser IO events (page/script/xhr)
3 - Handling CDP events (accept, read, write, timeout)
With the libcurl merge, 1 was moved to an in-process scheduler and 2 was moved
to libcurl's own event loop. That means the entire loop code, including
the dependency on tigerbeetle-io existed for handling a single TCP client.
Not only is that a lot of code, there was also friction between the two loops
(the libcurl one and our IO loop), which would result in latency - while one
loop is waiting for the events, any events on the other loop go un-processed.
This PR removes our IO loop. To accomplish this:
1 - The main accept loop is blocking. This is simpler and works perfectly well,
given we only allow 1 active connection.
2 - The client socket is passed to libcurl - yes, libcurl's loop can take
arbitrary FDs and poll them along with its own.
In addition to having one less dependency, the CDP code is quite a bit simpler,
especially around shutdowns and writes. This also removes _some_ of the latency
caused by the friction between page process and CDP processing. Specifically,
when CDP now blocks for input, http page events (script loading, xhr, ...) will
still be processed.
There's still friction. For one, the reverse isn't true: when the page is
waiting for events, CDP events aren't going to be processed. But the page.wait
already have some sensitivity to this (e.g. the page.request_intercepted flag).
Also, when CDP waits, while we will process network events, page timeouts are
still not processed. Because of both these remaining issues, we still need to
jump between the two loops - but being able to block on CDP (even for a short
time) WITHOUT stopping the page's network I/O, should reduce some latency.
Aims to improve compatibility for the lit framework (e.g. what Reddit is using).
1 - Adds support for adoptedStyleSheets to the Document and ShadowRoot
2 - Adds mock support for replace and replaceSync to the CSSStyleSheet
3 - Optionally include shadowroot in dump
4 - Special-case setting innerHTML on a TemplateElement
Currently, fetch --dump includes <script> tag (either inline or with src). I
don't know what use-case this is the desired behavior. Excluding them, via the
new --noscript option has benefit that if you --dump --noscript and open the
resulting page in the browser, you don't re-execute JavaScript, which is
likely to break the page.
For example, opening a --dump of github makes it look like the page is broken
because it re-executes JavaScript that isn't meant to be re-executed.
Similarly, opening a --dump in a browser might execute JavaScript that
lightpanda browser failed to execute, making it looks like it worked better
than it did.
The wait_for_network_idle demo often times out for me. I don't see any reason
to have the default so low. More likely to cause user scripts to unnecessarily
fail.
1 - Make log_level a runtime option (not a build-time)
2 - Make log_format a runtime option
3 - In Debug mode, allow for log scope filtering
Improve the general usability of scopes. Previously, the scope was more or less
based on the file that the log was in. Now they are more logically grouped.
Consider the case where you want to silence HTTP request information, previously
you'd have to filter out the `page`, `xhr` and `http_client` scopes, but that
would also elimiate other page, xhr and http_client logs. Now, you can just
filter out the `http` scope.
Outputs in logfmt in release and a "pretty" print in debug mode. The format
along with the log level will become arguments to the binary at some point in
the future.
1 - Add a log_level build option to control the default log level from
the build (e.g. -Dlog_level=debug). Defaults to info
2 - Add a new boolean log_unknown_properties build option to enable
logging unknown properties. Defautls to false.
3 - Remove the log debug for script eval - this can be a huge value
(i.e. hundreds of KB), which makes the debug log unusable IMO.
Replaces the existing, very specialized Notification with something more
general.
Currently, the existing page_navigate and page_navigated have been migrated.
Telemetry's page navigation event now also hooks into these events to generate
the telemetry record.
- Pages within the same session have proper isolation
- they have their own window
- they have their own SessionState
- they have their own v8.Context
- Move inspector to CDP browser context
- Browser now knows nothing about the inspector
- Use notification to emit a context-created message
- This is still a bit hacky, but again, it decouples browser from CDP
In order to support click handling on anchors from JavaScript, we need some hook
from the page/session to the CDP instance. This first phase adds notifications
in page.navigate, as well as a primitive notification hook to the session.
CDP's existing Page.navigate uses this new notifiation system.
When set, this disables the host verification of all HTTP requests. Available
for both the fetch and serve mode.
Also introduced an App.Config, for future command line options which need to
be passed more deeply into the code.