css: fix crash in consumeName() on UTF-8 multibyte sequences

advance() asserts that each byte it steps over is either an ASCII byte
or a UTF-8 sequence leader, never a continuation byte (0x80–0xBF).
consumeName() was calling advance(1) for all non-ASCII bytes
('\x80'...'\xFF'), processing multi-byte sequences one byte at a time.
For a two-byte sequence like é (0xC3 0xA9), the second iteration landed
on the continuation byte 0xA9 and triggered the assertion, crashing the
browser in Debug mode.

Fix: replace advance(1) with consumeChar() for all non-ASCII bytes.
consumeChar() reads the lead byte, derives the sequence length via
utf8ByteSequenceLength, and advances past the whole code point in one
step, so the position never rests on a continuation byte.
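
The difference between the two stepping strategies can be modelled outside
the tokenizer. The sketch below is an illustrative JavaScript model, not the
Zig code from this commit: seqLen, isContinuation and scanName are
hypothetical helpers, and the input is a shortened property name. It records
every position the cursor rests on and checks whether any of them is a
continuation byte.

```javascript
// Illustrative model only: a JavaScript stand-in for the Zig tokenizer.
// seqLen, isContinuation and scanName are hypothetical helpers.

// Length of a UTF-8 sequence, derived from its lead byte.
function seqLen(lead) {
  if (lead < 0x80) return 1;            // 0xxxxxxx: ASCII
  if ((lead & 0xe0) === 0xc0) return 2; // 110xxxxx
  if ((lead & 0xf0) === 0xe0) return 3; // 1110xxxx
  if ((lead & 0xf8) === 0xf0) return 4; // 11110xxx
  throw new RangeError('not a UTF-8 lead byte: 0x' + lead.toString(16));
}

// Continuation bytes have the form 10xxxxxx, i.e. 0x80-0xBF.
function isContinuation(byte) {
  return (byte & 0xc0) === 0x80;
}

// Walk `bytes` and return every position the cursor rests on.
// wholeCodePoint=false models the old advance(1) stepping;
// wholeCodePoint=true models stepping by the full sequence length.
function scanName(bytes, wholeCodePoint) {
  const positions = [];
  let pos = 0;
  while (pos < bytes.length) {
    positions.push(pos);
    pos += wholeCodePoint ? seqLen(bytes[pos]) : 1;
  }
  return positions;
}

// '--été-fg' encodes as 2D 2D C3 A9 74 C3 A9 2D 66 67.
const bytes = new TextEncoder().encode('--\u00e9t\u00e9-fg');
const restsOnContinuation = (positions) =>
  positions.some((p) => isContinuation(bytes[p]));

const byteWise = restsOnContinuation(scanName(bytes, false));
const codePointWise = restsOnContinuation(scanName(bytes, true));
console.log({ byteWise, codePointWise });
```

Stepping one byte at a time rests on 0xA9, the case advance() asserts
against; stepping by sequence length only ever rests on ASCII or lead bytes.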

Observed on saintcyrlecole.caliceo.com, whose root element carries an
inline style with custom property names containing French accented
characters (--color-store-bulles-été-fg, etc.). The crash aborted JS
execution before the Angular app could render any dynamic content.
Pierre Tachoire
2026-03-03 11:13:30 +01:00
parent 4062a425cb
commit 13b008b56c
2 changed files with 24 additions and 4 deletions


@@ -480,10 +480,11 @@ fn consumeName(self: *Tokenizer) []const u8 {
self.consumeEscape();
},
0x0 => self.advance(1),
-'\x80'...'\xBF', '\xC0'...'\xEF', '\xF0'...'\xFF' => {
-// This byte *is* part of a multi-byte code point,
-// well end up copying the whole code point before this loop does something else.
-self.advance(1);
+'\x80'...'\xFF' => {
+// Non-ASCII: advance over the complete UTF-8 code point in one step.
+// Using consumeChar() instead of advance(1) ensures we never land on
+// a continuation byte, which advance() asserts against.
+self.consumeChar();
},
else => {
if (self.hasNonAsciiAt(0)) {


@@ -256,3 +256,22 @@
testing.expectTrue(!html.includes('opacity:0'));
}
</script>
+<script id="CSSStyleDeclaration_non_ascii_custom_property">
+{
+// Regression test: accessing element.style must not crash when the inline
+// style attribute contains CSS custom properties with non-ASCII (UTF-8
+// multibyte) names, such as French accented characters.
+// The CSS Tokenizer's consumeName() must advance over whole UTF-8 sequences
+// rather than byte-by-byte to avoid landing on a continuation byte.
+const div = document.createElement('div');
+div.setAttribute('style',
+'--color-store-bulles-\u00e9t\u00e9-fg: #6a818f;' +
+'--color-store-soir\u00e9es-odl-fg: #56b3b3;' +
+'color: red;'
+);
+// Must not crash, and ASCII properties that follow non-ASCII ones must be readable.
+testing.expectEqual('red', div.style.getPropertyValue('color'));
+}
+</script>