Project introduction
The Growth team is currently working on the Community configuration 2.0 project, which allows wiki administrators to configure MediaWiki on-wiki, without having to make a Phabricator request. It continues and improves on the Community configuration 1.0 project, which is implemented as an integral part of the GrowthExperiments extension.
Detailed information about the project, issues identified in the CC1.0 implementation, as well as a high-level overview of its architecture, can be found in the Community configuration 2.0 Product Requirements Document. Feel free to review that document as well (feedback can be submitted as comments within the linked PRD document).
If possible, all feedback should be provided by the end of November 08, 2023.
Introduction of the problem
When allowing on-wiki administrators to configure MediaWiki, one needs to ensure that the configuration is of an expected format. Otherwise, it would be possible to take down the site via an on-wiki change, which is not desirable.
The first (Growth-specific) version, Community configuration 1.0, has basic data type validation, which was engineered by the Growth team. It does not scale well as more and more kinds of data are stored, and it is only capable of catching the most basic errors.
Task scope
The scope of this task is to determine the validation approach that should be used within the Community configuration 2.0 project. Namely, the following questions should be answered:
- What should be the interface of a validator?
- How should extensions/skins register their configuration providers?
- Which JSON Schema validation library should be used to enforce the validation rules?
- How to ensure backwards compatibility when the format of a configuration value changes?
- What other considerations should be kept in mind when implementing validation?
Proposed solution
(1) What should be the interface of a validator?
The Community configuration 2.0 backend extension introduces the concept of a configuration provider (a class with the ability to read/write on-wiki config). Each provider has an assigned validator, which is responsible for (a) validating the configuration and (b) determining where the underlying schema lies.
Validators need to be implemented within the Community configuration extension, and conform to the following interface (source link):
    interface IValidator {
        /**
         * Validate passed config
         *
         * This is executed by WikiPageConfigWriter _before_ writing a config (for edits made
         * via GrowthExperiments-provided interface), by ConfigHooks for manual edits and
         * by WikiPageConfigLoader before returning the config (this is to ensure invalid config
         * is never used).
         *
         * @param array $config Associative array representing config that's going to be validated
         * @return StatusValue
         */
        public function validate( array $config ): StatusValue;

        /**
         * Return list of supported top level keys
         *
         * This is useful for IConfigurationProvider implementations; this information can be used to
         * decide whether a certain configuration request asks for an information that can be
         * present (but is missing from the store at the moment), or whether the information cannot
         * exist at all (and thus the request is invalid). Example is deciding whether Config::get
         * should throw, or return a default value.
         *
         * @return array List of top level keys names
         */
        public function getSupportedTopLevelKeys(): array;
    }
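To illustrate the second method, here is a minimal sketch of how a provider might use the list of supported top level keys to choose between returning a default and rejecting the request. The function name `getConfigValue`, the `$defaults` map and the choice of `InvalidArgumentException` are assumptions made for this example; they are not part of the proposed interface or the PoC.

```php
<?php
declare( strict_types=1 );

/**
 * Hypothetical lookup helper (not part of the proposed interface).
 *
 * @param array $stored Config currently present in the store
 * @param string[] $supportedKeys Return value of IValidator::getSupportedTopLevelKeys()
 * @param array $defaults Hypothetical default values, keyed by config key
 * @param string $key Requested top level key
 * @return mixed
 */
function getConfigValue( array $stored, array $supportedKeys, array $defaults, string $key ) {
	if ( array_key_exists( $key, $stored ) ) {
		// The value is present in the store: return it as-is.
		return $stored[$key];
	}
	if ( in_array( $key, $supportedKeys, true ) ) {
		// The key may exist but is missing from the store right now.
		return $defaults[$key] ?? null;
	}
	// The key cannot exist at all, so the request itself is invalid.
	throw new InvalidArgumentException( "Unsupported configuration key: $key" );
}
```

With supported keys `ExampleLimit` and `ExampleList` (hypothetical names), asking for a missing `ExampleList` would yield its default, while asking for `Unknown` would throw.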
(2) How should extensions/skins register their configuration providers?
Community configuration 2.0 introduces the concept of a “configuration provider”, which represents an entity providing configuration for a certain feature or feature set (for example, the variables needed for Growth’s Suggested edits to work might form one provider). Each provider is represented as a pair of a storage backend and a validation backend, which define where the configuration is stored and how it is validated. All supported storage/validation backends need to be implemented within Community configuration 2.0 itself.
In the initial version, jsonschema would be the only provided validator. It would take a path to the JSON Schema file describing the configuration. That schema describes each of the supported configuration variables, so that Community configuration 2.0 can ensure that the configuration it returns is valid and in the format the extension expects.
In the initial version of Community configuration 2.0, we focus on configuration providers that are outside of MediaWiki Core. Extensions or skins can register their own providers via extension.json or skin.json respectively:
    {
        "CommunityConfigurationProviders": {
            "example": {
                "storage": {
                    "type": "wikipage",
                    "args": [ "MediaWiki:Example.json" ]
                },
                "validator": {
                    "type": "jsonschema",
                    "args": [ "relative/path/to/json/schema.json" ]
                }
            }
        }
    }
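As a rough illustration of the shape of such a registration, the sketch below checks that one provider definition names a storage backend and a validation backend, each with a `type` and a list of `args`. The function `checkProviderSpec` and the `SUPPORTED_TYPES` constant are hypothetical; the only backend types listed are the two that appear in this proposal, and the actual extension may handle registration differently.

```php
<?php
declare( strict_types=1 );

// Backend types named in this proposal; the real extension may support more.
const SUPPORTED_TYPES = [
	'storage' => [ 'wikipage' ],
	'validator' => [ 'jsonschema' ],
];

/**
 * Hypothetical sanity check for one entry of CommunityConfigurationProviders.
 *
 * @param array $spec Decoded provider definition
 * @return string[] Human-readable problems; empty when the definition looks usable
 */
function checkProviderSpec( array $spec ): array {
	$problems = [];
	foreach ( SUPPORTED_TYPES as $part => $types ) {
		if ( !in_array( $spec[$part]['type'] ?? null, $types, true ) ) {
			$problems[] = "$part: unsupported or missing type";
		}
		if ( !is_array( $spec[$part]['args'] ?? null ) ) {
			$problems[] = "$part: args must be a list";
		}
	}
	return $problems;
}

// Usage with the "example" provider from the registration above.
$example = json_decode(
	'{ "storage": { "type": "wikipage", "args": [ "MediaWiki:Example.json" ] },'
	. ' "validator": { "type": "jsonschema", "args": [ "relative/path/to/json/schema.json" ] } }',
	true
);
$problems = checkProviderSpec( $example );
```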
(3) Which JSON Schema validation library should be used to enforce the validation rules?
A fair number of MW components already do JSON validation for different use cases. The goal would be to select a library that can cover this project's use cases and also be reused by existing components. To that end, three libraries have been evaluated, and the current proposal is to use Opis/json-schema, because of its repository code health, its support for schema versions, and because it is already in use in MW. A summary of the libraries' trade-offs can be found in T332847#9228639.
A remaining challenge is to provide localization support for the validation error messages. None of the researched libraries has a built-in i18n feature. However, it is possible to build a localization layer around the library, or to collaborate with the maintainers to make it part of the library.
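One possible shape for such a localization layer is sketched below: each validation error is reduced to the JSON Schema keyword that failed, and that keyword is mapped to a message key that could then be translated through MediaWiki's usual i18n system. The error array shape, the function `toMessageSpec` and all message keys are assumptions made for this example; they do not reflect the actual error objects of any of the evaluated libraries.

```php
<?php
declare( strict_types=1 );

// Hypothetical message keys; in MediaWiki these would be defined in i18n/*.json.
const KEYWORD_TO_MESSAGE_KEY = [
	'type' => 'communityconfiguration-error-type',
	'required' => 'communityconfiguration-error-required',
];
const FALLBACK_MESSAGE_KEY = 'communityconfiguration-error-generic';

/**
 * Translate one library-agnostic validation error into a message key plus parameters.
 *
 * @param array $error Hypothetical shape:
 *   [ 'keyword' => string, 'path' => string, 'args' => string[] ]
 * @return array [ message key, message parameters ]
 */
function toMessageSpec( array $error ): array {
	// Unknown keywords fall back to a generic message rather than failing.
	$key = KEYWORD_TO_MESSAGE_KEY[$error['keyword']] ?? FALLBACK_MESSAGE_KEY;
	// The path to the offending value is always the first parameter.
	return [ $key, array_merge( [ $error['path'] ], $error['args'] ?? [] ) ];
}
```

The benefit of a layer like this is that the translatable strings stay independent of the chosen library, at the cost of maintaining one message per supported keyword.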
(4) How to ensure backwards compatibility when the format of a configuration value changes?
Occasionally, there is a need to make a breaking change within the configuration schema. Those situations should be resolvable via a migration interface, which would allow developers to convert their data from an older format to a newer one. This is done by introducing migration classes, each registered as a step from one schema version to the directly following one. Backwards compatibility is accomplished as follows:
- Each wiki page validated by a jsonschema validator has a $schema special variable, which includes the name of the schema (including a version identifier) that the page adheres to.
- The extension that registered the configuration page also registers a list of migration classes that are capable of converting a configuration from a directly preceding schema to a new one.
- Upon invocation of a maintenance script, Community configuration 2.0 converts the content of the wiki page.
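The steps above can be sketched as a small migration runner that keeps applying the migration registered for the page's current $schema until the target schema is reached. This is only an illustration under stated assumptions: the proposal speaks of migration classes, while the sketch uses plain callables for brevity, and the function `migrateConfig`, the schema names (`example/1.0.0` and so on) and the config keys are all hypothetical.

```php
<?php
declare( strict_types=1 );

/**
 * Hypothetical migration runner.
 *
 * @param array $config Page content, including its "$schema" identifier
 * @param callable[] $migrations Map of source schema name => callable that returns the
 *   config converted to the directly newer schema (with "$schema" updated)
 * @param string $target Schema name the config should end up in
 * @return array Converted config
 */
function migrateConfig( array $config, array $migrations, string $target ): array {
	$seen = [];
	while ( ( $config['$schema'] ?? null ) !== $target ) {
		$current = $config['$schema'] ?? '(none)';
		// Stop if no step is registered, or if the steps loop back on themselves.
		if ( !isset( $migrations[$current] ) || isset( $seen[$current] ) ) {
			throw new RuntimeException( "No migration path from $current to $target" );
		}
		$seen[$current] = true;
		$config = $migrations[$current]( $config );
	}
	return $config;
}

// Two hypothetical steps: 1.0.0 -> 1.1.0 renames a key, 1.1.0 -> 2.0.0 changes its shape.
$migrations = [
	'example/1.0.0' => static function ( array $c ): array {
		$c['maxItems'] = $c['limit'] ?? 10;
		unset( $c['limit'] );
		$c['$schema'] = 'example/1.1.0';
		return $c;
	},
	'example/1.1.0' => static function ( array $c ): array {
		$c['maxItems'] = [ 'default' => $c['maxItems'] ];
		$c['$schema'] = 'example/2.0.0';
		return $c;
	},
];

$migrated = migrateConfig(
	[ '$schema' => 'example/1.0.0', 'limit' => 3 ],
	$migrations,
	'example/2.0.0'
);
```

Because each step only knows about the directly preceding schema, a page that is several versions behind is brought up to date by chaining the steps in order.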
(5) What other considerations should be kept in mind when implementing validation?
If there is anything we didn’t consider and we should know about, please tell us in the comments in this task. If you have any questions that are not related to the validation, feel free to ask them on the MediaWiki.org project talk page.
Proof of concept implementation
To make the proposed solution easier to understand, we have prepared a proof of concept implementation, which is hosted on the Wikimedia GitLab. Currently, it consists of three different parts:
- Community configuration backend extension
- Community configuration example user extension
- Required changes in MediaWiki Core: the sandbox/urbanecm/community-configuration branch in Gerrit
Instructions on how to set up the PoC version of Community configuration 2.0 locally are available in the backend repository’s README.