Early errors may result from parsing the source text of a test file, but
they may also result from parsing some other source text as referenced
through the ES2015 module syntax. The latter form of early error is not
necessarily detectable by ECMAScript parsers, however. Because of this,
the label "early" is not sufficiently precise for all Test262 consumers
to correctly interpret all tests.
Update the "phase" name of "early" to "parse" for all those negative
tests that describe errors resulting from parsing of the file's source
text directly. A forthcoming commit will update the remaining tests to
use a "phase" name that is more specific to module resolution.
Previously, test consumers were encouraged to insert a `throw` statement
as the first statement of tests for early errors. This recommendation
made tests harder to consume, and because the transformation was
optional, consumers may have ignored it or simply been unaware of it.
Explicitly including such a `throw` statement makes the tests more
literal, easier to consume, and more transparent in their expectations.
Document the expectation that all tests for early errors include an
explicit `throw` statement. Extend the linting script to enforce this
expectation, so that contributors are automatically notified of
violations and future contributions satisfy it.
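
The linting rule described above might look roughly like the following
sketch. The function name, the regular expressions, and the simplified
frontmatter handling are assumptions made for illustration; they are not
taken from the project's linting script.

```python
import re

# Hypothetical helpers; the real linter may parse frontmatter differently.
FRONTMATTER = re.compile(r"/\*---(.*?)---\*/", re.DOTALL)
EARLY_PHASE = re.compile(r"^\s*phase:\s*(early|parse)\s*$", re.MULTILINE)
THROW_STATEMENT = re.compile(r"throw\b")


def check_explicit_throw(source):
    """Return an error message, or None if the rule is satisfied."""
    match = FRONTMATTER.search(source)
    if match is None or not EARLY_PHASE.search(match.group(1)):
        return None  # Not a test for an early error; rule does not apply.

    # Find the first non-blank line that follows the frontmatter.
    lines = [line.strip() for line in source[match.end():].splitlines()]
    statements = [line for line in lines if line]
    if not statements or not THROW_STATEMENT.match(statements[0]):
        return "Test for an early error lacks a leading `throw` statement"
    return None
```

This sketch only inspects the first non-blank line after the
frontmatter, so a test that opens with a comment would be reported.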
A recent commit introduced a document that enumerated acceptable values
for the test "features" metadata tag. However, this list was incomplete,
and maintaining it placed extra burden on the project owners.
Restructure the document into a machine-readable format. Add entries for
all previously-omitted values. Add in-line documentation with
recommendations for maintenance of the file. Extend the project's
linting tool to validate tests according to the document's contents.
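
As an illustration of validating tests against a machine-readable list,
the sketch below assumes one feature name per line with `#` starting a
comment. That file format and both function names are hypothetical.

```python
def load_features(text):
    """Parse a feature list: one name per line, `#` starts a comment."""
    features = set()
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name:
            features.add(name)
    return features


def unknown_features(test_features, allowed):
    """Return the declared features that are absent from the list."""
    return sorted(set(test_features) - allowed)
```

A linter built this way would report every name returned by
`unknown_features` as a violation.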
This script is intended to identify common test file formatting errors
prior to their acceptance into the project. It is designed to support
future extensions for additional validation rules.
In order to promote readability of the generated test material, the test
generation tool may insert whitespace if the context of a given expanded
variable calls for it. Avoid inserting such whitespace within literal
values that span multiple lines.
Since the argument is required, we mark it as such. Using this approach
gives the user a much nicer error message, as compared to just the "not
enough args" message.
When inspecting previously-generated files, a new `Test` instance should
be used. This avoids over-writing the in-memory representation of the
latest test, and allows previously-existing test files to be partially
updated according to subsequent changes in their respective source/case
files.
Because it expects "case directories" to contain a sub-directory named
"default", the test generation tool is unable to generate tests for
features where a directory named "default" is not appropriate.
Modify the heuristic that identifies "case directories" to use a more
fundamental aspect (i.e. the existence of at least one "case" file).
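
A minimal sketch of such a heuristic follows. It assumes that "case"
files can be recognized by a `.case` extension, and the function name is
hypothetical.

```python
import os


def find_case_dirs(root):
    """Yield each directory under `root` that directly holds a case file."""
    for dirpath, _dirnames, filenames in os.walk(root):
        # Assumption: case files are identified by their extension.
        if any(name.endswith(".case") for name in filenames):
            yield dirpath
```

With this approach, whether a sub-directory named "default" exists has
no bearing on the result.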
For asynchronous tests, the contract between test file and test runner
is implicit: runners are expected to inspect the source code for
references to a global `$DONE` identifier.
Promote a more explicit contract between test file and test runner by
introducing a new frontmatter "tag", `async`. This brings asynchronous
test configuration in-line with other configuration mechanisms and also
provides a more natural means of test filtering.
The modifications to test files were made programmatically using the
`grep` and `sed` utilities:

    $ grep "\$DONE" test/ -r --files-with-match --null | \
        xargs -0 sed -i 's/^\(flags:\s*\)\[/\1[async, /g'

    $ grep "\$DONE" test/ -rl --null | \
        xargs -0 grep -E '^flags:' --files-without-match --null | \
        xargs -0 sed -i 's/^---\*\//flags: [async]\n---*\//'

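
With the tag in place, a runner could decide how to execute a test from
its metadata alone. The sketch below is one way to do that; the function
name is hypothetical, and the regular expression only handles the
single-line `flags: [...]` form.

```python
import re

# Assumption: flags are written on one line as `flags: [a, b]`.
FLAGS = re.compile(r"^flags:\s*\[(.*?)\]", re.MULTILINE)


def is_async(source):
    """Report whether a test declares the `async` flag in its frontmatter."""
    match = FLAGS.search(source)
    if match is None:
        return False
    flags = [flag.strip() for flag in match.group(1).split(",")]
    return "async" in flags
```

Unlike a search for `$DONE`, this check is not confused by tests that
merely mention the identifier in a description or comment.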
When executing multiple tests in parallel, each "child" thread would
write to the process's standard output buffer immediately upon test
completion. Because thread execution order and instruction interleaving
are non-deterministic, this made it possible for characters to be
emitted out-of-order.
When extended to support multiple concurrent threads, the runner was
outfitted with a "log lock" dedicated to sharing access to the output
file (when applicable). Re-use this lock when writing to standard out,
ensuring proper ordering of test result messages.
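
The idea can be sketched as follows. The names `log_lock`, `report`,
and `run_all` are illustrative and do not reflect the runner's actual
code.

```python
import sys
import threading

# One lock guards every output destination (file or standard out).
log_lock = threading.Lock()


def report(message, stream=None):
    """Write one complete result line while holding the shared lock."""
    if stream is None:
        stream = sys.stdout
    with log_lock:
        stream.write(message + "\n")
        stream.flush()


def run_all(names, stream=None):
    """Report a result for each name from its own thread."""
    threads = [
        threading.Thread(target=report, args=("PASS " + name, stream))
        for name in names
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```

The order of lines still depends on scheduling, but each line is written
in full before another thread may write.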
A recent extension to the test runner introduced support for running
tests in parallel using multi-threading. Following this, the runner
would incorrectly emit the "final report" before all individual test
results.
In order to emit the "final report" at the end of the output stream, the
parent thread would initialize all children and wait for availability of
a "log lock" shared by all children.
According to the documentation on the "threading" module's Lock object
[1]:
> When more than one thread is blocked in acquire() waiting for the state
> to turn to unlocked, only one thread proceeds when a release() call
> resets the state to unlocked; which one of the waiting threads proceeds
> is not defined, and may vary across implementations.
This means the primitive cannot be used by the parent thread to reliably
detect completion of all child threads.
Update the parent to maintain a reference for each child thread, and to
explicitly wait for every child thread to complete before emitting the
final result.
[1] https://docs.python.org/2/library/threading.html#lock-objects
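
A minimal sketch of the "keep a reference and join" approach is shown
below. The function and parameter names are hypothetical, and the work
queue is simplified to a list guarded by a lock.

```python
import threading


def run_in_parallel(tests, run_test, workers_count):
    """Run `tests` on `workers_count` threads; return after all finish."""
    pending = list(tests)
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            with lock:
                if not pending:
                    return
                test = pending.pop()
            outcome = run_test(test)
            with lock:
                results.append((test, outcome))

    children = [threading.Thread(target=worker) for _ in range(workers_count)]
    for child in children:
        child.start()
    # Joining every child guarantees completion. Acquiring a shared lock
    # would not, because the order in which waiters proceed is undefined.
    for child in children:
        child.join()
    return results
```

Only after `run_in_parallel` returns is it safe to emit the final
report.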
Adds a `-j`/`--workers-count` parameter to `tools/packaging/test262.py`, defaulting to `[number of cores] - 1`.
Speeds up running the test suite by about 3x on my 4-core machine, with the SpiderMonkey shell. This could certainly be optimized more by just appending test results to per-thread lists and merging them at the end, but it's better than nothing.
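
For illustration, an option with that default could be declared as in
the sketch below. It uses `argparse`, which may differ from the option
parser in `tools/packaging/test262.py`. The lower bound of one worker is
an added assumption for single-core machines.

```python
import argparse
import multiprocessing


def build_parser():
    """Build a parser exposing the `-j`/`--workers-count` option."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-j", "--workers-count",
        type=int,
        # Assumption: never drop below one worker on a single-core machine.
        default=max(1, multiprocessing.cpu_count() - 1),
        help="number of tests to run in parallel",
    )
    return parser
```

Calling `build_parser().parse_args(["-j", "4"])` yields a namespace whose
`workers_count` attribute is 4.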