From 7e190c786d321a987c7ef266a5d6ebc664f03b53 Mon Sep 17 00:00:00 2001 From: Mike Pennisi Date: Fri, 11 Feb 2022 19:05:23 -0500 Subject: [PATCH] Document rationale for some maintenance practices --- README.md | 3 + docs/rationale.md | 144 ++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 147 insertions(+) create mode 100644 docs/rationale.md diff --git a/README.md b/README.md index 60bbb32366..fe8937df89 100644 --- a/README.md +++ b/README.md @@ -34,6 +34,9 @@ Guidance for contributing to Test262 can be found in [CONTRIBUTING.md](./CONTRIB Guidance for running Test262 and explanations of how a test file must be interpreted by a test runner is in [INTERPRETING](./INTERPRETING.md) +### Rationale + +This project offers an explanation for many of its design decisions and maintenance practices--see [rationale.md](./docs/rationale.md). ### Test262 Runners diff --git a/docs/rationale.md b/docs/rationale.md new file mode 100644 index 0000000000..66055afa4e --- /dev/null +++ b/docs/rationale.md @@ -0,0 +1,144 @@ +# Test262 maintenance rationale + +Explanations behind the practices promoted by the project maintainers. + +## Vestigial tests + +Test262 has been maintained for many years, and the practices used to write +tests have evolved alongside the needs of its consumers. When conventions +change, old tests are typically updated to accommodate the new practices. That +doesn't always happen, though, and one doesn't have to look very far to find +examples of tests which contradict the preferred patterns. + +For instance: + +- tests which expression expectations with `throw` statements inside of + conditional statements rather than the assertion API implemented by the + harness files (though this explicitness will always been desirable when + asserting the semantics of conditional statements and `throw` statements + themselves) +- tests with file names derived from section numbers in the 5th edition of + ECMA262, e.g. `built-ins/Array/15.4.5-1.js` +- tests which validate multiple behaviors, using elaborate comment blocks to + designate sections +- tests which use deprecated harness functions, e.g. `verifyEnumerable`, + `verifyConfigurable`, and `verifyWritable` + +Since existing tests do not necessarily reflect the project's current +best-practices, it's especially important for test authors to familiarize +themselves with the contribution guidelines. + +## Test generation + +The project includes a software tool for generating test material from abstract +templates. The generator was designed to promote uniformity of coverage, +particularly for parts of the grammar that are used in many productions +(specifically: the destructuring assignment patterns introduced in the 6th +edition of ECMA262). + +This tool makes it easy to introduce enormous numbers of tests. Introducing +more tests is not a goal unto itself, though. Test262 prioritizes coherence in +its coverage of the specification, and it recognizes that the value of tests, +as measured by their likelihood to identify defects, varies. + +For these reasons, the maintainers urge restraint in the application of the +test generation tool. + +## File structure + +For practical reasons, tests are organized in a tree structure according to the +conventions of modern file systems. Unfortunately, this structure is not +expressive enough to model the semantics of a rich and evolving programming +language like ECMAScript. This means the common and crucial task of coverage +assessment will likely always be a challenge, but strong conventions around +file organization can help. + +Tests for syntax-derived operations are organized according to the language +grammar, with directories used to describe non-terminals. Tests for built-in +APIs are organized according to the identifiers by which they can be accessed, +with directories used to describe the sequence of properties that can be used +from the global scope. + +Directories are not generally applied beyond these limits; further +differentiation is instead achieved through structured file names which follow +ad-hoc conventions. This organization balances the need to group tests +logically with the need to discover tests. + +Many consumers use file names as a way to compare test results across revisions +and between implementations. For this reason, tests files are rarely +re-organized after being accepted. + +## Regression tests + +It is possible to write tests for semantics which, while not explicitly +specified by ECMA262, are nonetheless valid according to the normative text. +Such tests are welcome in Test262, but their fitness is not a given. Consumers +from many constituencies value the coherence and consistency of the test suite, +and tests which disallow arbitrary extraneous behavior can degrade those +qualities. Because Test262 is not maintained as a repository of regression +tests, contributions which include these kinds of tests will be weighed against +their likelihood of identifying error in a plurality of implementations. + +## Large tests + +Test262 tests are typically very focused. The vast majority exercise just one +algorithm step/grammar production, and some are even more granular that that! + +Some test contributors are uncomfortable splitting their work across files like +this. It's certainly unlike the practices that are common in modern application +development. In those settings, many tests are often grouped into the same file +and separated by function boundaries. + +Test262 doesn't use the same approach as a typical application test suite in +order to limit complexity. The guidance of "one test per file" means that +consuming Test262 is relatively easy; there is no "test runner" API for +consumers to implement, and interpreting results is likewise straightforward. +It also lowers the barrier to entry for new contributors since there is no API +to learn. + +## Syntax tests + +When testing a syntactic feature of the language, it can be tempting to write +tests which verify that some bit of source text does *not* produce a syntax +error. Contributors should try to push beyond verifying only the lack of a +syntax error because almost all such tests also have observable semantics. It's +often better for a test to assert that the expected semantics are followed, +even when they may already be covered elsewhere. + +However, this is not always desirable because verifying semantics invariably +requires inserting still more code, and that additional code may degrade the +tests' precision for verifying syntax. + +When considering this tension, be aware that TC39 maintains [a separate project +called test262-parser-tests](https://github.com/tc39/test262-parser-tests). +This project was partially motivated by a desire to offer test material that +isn't (and perhaps cannot be) related to any specific grammar production. The +availability of that project may inform decisions about if, where, and how to +include tests for syntax in Test262. + +## Avoiding abstraction + +Contributors will occasionally suggest introducing new abstractions to reduce +duplication in tests. The maintainers set a relatively high bar for such +enhancements, both due to their many drawbacks and due to the aspects of +standards testing which limit their benefit. + +The drawbacks to abstraction include: + +- it degrades the tests by introducing unrelated semantics +- it discourages contributors by requiring them to learn more +- it frustrates implementers by making it harder to understand what's being + tested and what has failed + +One of abstraction's common motivations is its tendency to reduce maintenance +costs by limiting duplication. TC39 has a very high standard for compatibility +between revisions of ECMA262. This gives us a certain assurance in Test262 that +maintainers of other test suites do not enjoy: Test262's tests are very rarely +invalidated. The project takes advantage of this by using a more declarative, +readable, and verbose style. + +Abstraction has other motivations, so there will always be room for it to some +extent. When the benefits of a specific proposal outweigh the drawbacks, then +it should be well documented and also well-tested. Test262 maintains tests for +its "harness" abstractions in [a dedicated directory within the test suite +itself](https://github.com/tc39/test262/tree/main/test/harness).