The mix of sync and async HTTP requests requires care to avoid deadlocks.
Previously, it was possible for async requests to use up all available HTTP
state objects duration a navigation flow (either directly, or via an internal
redirect (e.g. click, submit, ...)). This would block the navigation, which,
because everything is single thread, would block the I/O loop, resulting in a
deadlock.
The correct solution seems to be to remove all synchronous I/O. And I tried to
do that, but I ran into a wall with module-loading, which is initiated from V8.
V8 says "give me the source for this module", and I don't see a great way to
tell it: wait a bit.
So I went back to trying to make this work with the hybrid model, despite last
weeks failures to get it to work. I changed two things:
1 - The http client will only directly initiate an async request if there's
at least 2 free state objects available (1 for the request, and leaving 1
free for any synchronous requests)
2 - Delayed navigation retries until there's at least 1 free http state object
available.
Commits from last week did help with this. First, we're now guaranteed to have
a single sync-request at a time (previously, we could have had 2). Secondly,
the async connection is now async end-to-end (previously, it could have blocked
on an empty state pool).
We could probably make this a bit more obviously by reserving 1 state object
for synchronous requests. But, since the long term solution is probably having
no synchronous requests, I'm happy with anything that lets me move past this
issue.
1 - Make log_level a runtime option (not a build-time)
2 - Make log_format a runtime option
3 - In Debug mode, allow for log scope filtering
Improve the general usability of scopes. Previously, the scope was more or less
based on the file that the log was in. Now they are more logically grouped.
Consider the case where you want to silence HTTP request information, previously
you'd have to filter out the `page`, `xhr` and `http_client` scopes, but that
would also elimiate other page, xhr and http_client logs. Now, you can just
filter out the `http` scope.
We normally expect a navigation event to happen at some point after the page
loads, like a puppeteer script clicking on a link. But, it's also possible for
the main navigation event to result in a delayed navigation. For example, an
html page with this JS:
<script>top.location = '/';</script>
Would result in a delayed navigation being called from the main navigate
function. In these cases, we cannot clear the transfer_arena when navigate is
completed, as its memory is needed by the new "sub" delayed navigation.
Outputs in logfmt in release and a "pretty" print in debug mode. The format
along with the log level will become arguments to the binary at some point in
the future.
Some data has to exist specifically for the navigation of one page to another.
For example, if a hyperlink is clicked, the URL begins its life with the
original page, but is transferred to the new page. The page_arena cannot be used
for such data.
It's possible to use the session_arena, but it's lifetime is much longer and,
given enough navigation, could accumulate a lot of memory.
The new transfer_arena exists within the session, but only exists until the
next navigation.
While currently only used for the navigation URL, the main goal here is to have
a place to put the request body on form submission, which has a lifetime similar
to a click url.
While I'm at it, I promoted the existing session arena and the new transfer
arena to the browser, allowing better memory re-use between sessions.