diff --git a/common/dtd/messageFormat/message.json b/common/dtd/messageFormat/message.json index 6d4036887a4..611107ff783 100644 --- a/common/dtd/messageFormat/message.json +++ b/common/dtd/messageFormat/message.json @@ -1,6 +1,6 @@ { "$schema": "http://json-schema.org/draft-07/schema", - "$id": "https://github.com/unicode-org/cldr/blob/maint/maint-45/common/dtd/messageFormat/message.json", +"$id": "https://github.com/unicode-org/cldr/blob/maint/maint-46/common/dtd/messageFormat/message.json", "oneOf": [{ "$ref": "#/$defs/message" }, { "$ref": "#/$defs/select" }], diff --git a/common/testData/messageFormat/README.md b/common/testData/messageFormat/README.md index 6fa17197bc4..fb645271806 100644 --- a/common/testData/messageFormat/README.md +++ b/common/testData/messageFormat/README.md @@ -2,40 +2,176 @@ For information about MessageFormat 2.0, see [Unicode Locale Data Markup Language (LDML): Part 9: Message Format](../../../docs/ldml/tr35-messageFormat.md) -The files in this directory were originally copied from the [messageformat project](https://github.com/messageformat/messageformat/tree/11c95dab2b25db8454e49ff4daadb817e1d5b770/packages/mf2-messageformat/src/__fixtures) +The tests in the `./tests/` directory were originally copied from the [messageformat project](https://github.com/messageformat/messageformat/tree/11c95dab2b25db8454e49ff4daadb817e1d5b770/packages/mf2-messageformat/src/__fixtures) and are here relicensed by their original author (Eemeli Aro) under the Unicode License. These test files are intended to be useful for testing multiple different message processors in different ways: -- `syntax-errors.json` — An array of strings that should produce a Syntax Error when parsed. - -- `data-model-errors.json` - An object with string keys and arrays of strings as values, - where each key is the name of an error and its value is an array of strings that - should produce `error` when processed. 
- Error names are defined in ["MessageFormat 2.0 Errors"](../../../docs/ldml/tr35-messageFormat.md#errors) in the spec. - -- `test-core.json` — An array of test cases that do not depend on any registry definitions. - Each test may include some of the following fields: - - `src: string` (required) — The MF2 syntax source. - - `exp: string` (required) — The expected result of formatting the message to a string. - - `locale: string` — The locale to use for formatting. Defaults to 'en-US'. - - `params: Record` — Parameters to pass in to the formatter for resolving external variables. - - `parts: object[]` — The expected result of formatting the message to parts. - - `cleanSrc: string` — A normalixed form of `src`, for testing stringifiers. - - `errors: { type: string }[]` — The runtime errors expected to be emitted when formatting the message. - If `errors` is either absent or empty, the message must be formatted without errors. - - `only: boolean` — Normally not set. A flag to use during development to only run one or more specific tests. - -- `test-function.json` — An object with string keys and arrays of test cases as values, - using the same definition as for `test-core.json`. - The keys each correspond to a function that is used in the tests. - Since the behavior of built-in formatters is implementation-specific, - the `exp` field should generally be omitted, - except for error cases. - -TypeScript `.d.ts` files are included for `test-core.json` and `test-function.json` with the above definition. +- `syntax.json` — Test cases that do not depend on any registry definitions. + +- `syntax-errors.json` — Strings that should produce a Syntax Error when parsed. + +- `data-model-errors.json` - Strings that should produce a Data Model Error when processed. + Error names are defined in ["MessageFormat 2.0 Errors"](../spec/errors.md) in the spec. + +- `functions/` — Test cases that correspond to built-in functions. 
+ The behaviour of the built-in formatters is implementation-specific so the `exp` field is often + omitted and assertions are made on error cases. Some examples of test harnesses using these tests, from the source repository: + - [CST parse/stringify tests](https://github.com/messageformat/messageformat/blob/11c95dab2b25db8454e49ff4daadb817e1d5b770/packages/mf2-messageformat/src/cst/cst.test.ts) - [Data model stringify tests](https://github.com/messageformat/messageformat/blob/11c95dab2b25db8454e49ff4daadb817e1d5b770/packages/mf2-messageformat/src/data-model/stringify.test.ts) - [Formatting tests](https://github.com/messageformat/messageformat/blob/11c95dab2b25db8454e49ff4daadb817e1d5b770/packages/mf2-messageformat/src/messageformat.test.ts) + +A [JSON schema](./schemas/) is included for the test files in this repository. +## Error Codes + +The following table relates the error names used in the [JSON schema](./schemas/) +to the error names used in ["MessageFormat 2.0 Errors"](../spec/errors.md) in the spec. + +| Spec | Schema | +| --------------------------- | --------------------------- | +| Bad Operand | bad-operand | +| Bad Option | bad-option | +| Bad Selector | bad-selector | +| Bad Variant Key | bad-variant-key | +| Duplicate Declaration | duplicate-declaration | +| Duplicate Option Name | duplicate-option-name | +| Duplicate Variant | duplicate-variant | +| Missing Fallback Variant | missing-fallback-variant | +| Missing Selector Annotation | missing-selector-annotation | +| Syntax Error | syntax-error | +| Unknown Function | unknown-function | +| Unresolved Variable | unresolved-variable | +| Variant Key Mismatch | variant-key-mismatch | + +The "Message Function Error" error name used in the spec +is not included in the schema, +as it is intended to be an umbrella category +for implementation-specific errors. 
+ +## Test Functions + +As the behaviour of some of the default registry _functions_ +such as `:number` and `:datetime` +is dependent on locale-specific data and may vary between implementations, +the following _functions_ are defined for **test use only**: + +### `:test:function` + +This function is valid both as a _selector_ and as a _formatter_. + +#### Operands + +The function `:test:function` requires a [Number Operand](/spec/registry.md#number-operands) as its _operand_. + +#### Options + +The following _options_ are available on `:test:function`: +- `decimalPlaces`, a _digit size option_ for which only `0` and `1` are valid values. + - `0` + - `1` +- `fails` + - `never` (default) + - `select` + - `format` + - `always` + +All other _options_ and their values are ignored. + +#### Behavior + +When resolving a `:test:function` expression, +its `Input`, `DecimalPlaces`, `FailsFormat`, and `FailsSelect` values are determined as follows: + +1. Let `DecimalPlaces` be 0. +1. Let `FailsFormat` be `false`. +1. Let `FailsSelect` be `false`. +1. Let `arg` be the resolved value of the _expression_ _operand_. +1. If `arg` is the resolved value of an _expression_ + with a `:test:function`, `:test:select`, or `:test:format` _annotation_ + for which resolution has succeeded, then + 1. Let `Input` be the `Input` value of `arg`. + 1. Set `DecimalPlaces` to be `DecimalPlaces` value of `arg`. + 1. Set `FailsFormat` to be `FailsFormat` value of `arg`. + 1. Set `FailsSelect` to be `FailsSelect` value of `arg`. +1. Else if `arg` is a numerical value + or a string matching the `number-literal` production, then + 1. Let `Input` be the numerical value of `arg`. +1. Else, + 1. Emit "bad-input" _Resolution Error_. + 1. Use a _fallback value_ as the resolved value of the _expression_. + Further steps of this algorithm are not followed. +1. If the `decimalPlaces` _option_ is set, then + 1. 
If its value resolves to a numerical integer value 0 or 1 + or their corresponding string representations `'0'` or `'1'`, then + 1. Set `DecimalPlaces` to be the numerical value of the _option_. + 1. Else if its value is not an unresolved value set by _option resolution_, + 1. Emit "bad-option" _Resolution Error_. + 1. Use a _fallback value_ as the resolved value of the _expression_. +1. If the `fails` _option_ is set, then + 1. If its value resolves to the string `'always'`, then + 1. Set `FailsFormat` to be `true`. + 1. Set `FailsSelect` to be `true`. + 1. Else if its value resolves to the string `'format'`, then + 1. Set `FailsFormat` to be `true`. + 1. Else if its value resolves to the string `'select'`, then + 1. Set `FailsSelect` to be `true`. + 1. Else if its value does not resolve to the string `'never'`, then + 1. Emit "bad-option" _Resolution Error_. + +When `:test:function` is used as a _selector_, +the behaviour of calling it as the `rv` value of MatchSelectorKeys(`rv`, `keys`) +(see [Resolve Preferences](/spec/formatting.md#resolve-preferences) for more information) +depends on its `Input`, `DecimalPlaces` and `FailsSelect` values. + +- If `FailsSelect` is `true`, + calling the method will fail and not return any value. +- If the `Input` is 1 and `DecimalPlaces` is 1, + the method will return some slice of the list « `'1.0'`, `'1'` », + depending on whether those values are included in `keys`. +- If the `Input` is 1 and `DecimalPlaces` is 0, + the method will return the list « `'1'` » if `keys` includes `'1'`, or an empty list otherwise. +- If the `Input` is any other value, the method will return an empty list. + +When an _expression_ with a `:test:function` _annotation_ is assigned to a _variable_ by a _declaration_ +and that _variable_ is used as an _option_ value, +its resolved value is the `Input` value. 
+ +When `:test:function` is used as a _formatter_, +a _placeholder_ resolving to a value with a `:test:function` _expression_ +is formatted as a concatenation of the following parts: + +1. If `Input` is less than 0, the character `-` U+002D Hyphen-Minus. +1. The truncated absolute integer value of `Input`, i.e. floor(abs(`Input`)), + formatted as a sequence of decimal digit characters (U+0030...U+0039). +1. If `DecimalPlaces` is 1, then + 1. The character `.` U+002E Full Stop. + 1. The single decimal digit character representing the value floor((abs(`Input`) - floor(abs(`Input`))) \* 10) + +If the formatting target is a sequence of parts, +each of the above parts will be emitted separately +rather than being concatenated into a single string. + +If `FailsFormat` is `true`, +attempting to format the _placeholder_ to any formatting target will fail. + +### `:test:select` + +This _function_ accepts the same _operands_ and _options_, +and behaves exactly the same as `:test:function`, +except that it cannot be used for formatting. + +When `:test:select` is used as a _formatter_, +a "not-formattable" error is emitted and the _placeholder_ is formatted with +a _fallback value_. + +### `:test:format` + +This _function_ accepts the same _operands_ and _options_, +and behaves exactly the same as `:test:function`, +except that it cannot be used for selection. + +When `:test:format` is used as a _selector_, +the steps under 2.iii. of [Resolve Selectors](/spec/formatting.md#resolve-selectors) are followed. 
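Review note on the `:test:function` section added to the README above: the formatting and selection rules can be sketched as follows. This is a minimal illustration, assuming the `Input`, `DecimalPlaces`, `FailsFormat`, and `FailsSelect` values have already been resolved as the README describes. The names `TestFunctionState`, `formatTestFunctionParts`, `formatTestFunction`, and `matchSelectorKeys` are hypothetical and are not part of CLDR or the messageformat project. It also assumes inputs small enough that JavaScript prints them without exponent notation.

```typescript
// Hypothetical sketch of the README's :test:function behaviour.
// None of these names come from CLDR or the messageformat project.
interface TestFunctionState {
  input: number; // the resolved `Input` value
  decimalPlaces: 0 | 1; // the resolved `DecimalPlaces` value
  failsFormat: boolean; // the resolved `FailsFormat` value
  failsSelect: boolean; // the resolved `FailsSelect` value
}

// Formats to a sequence of parts, following the three numbered
// formatting steps in the README.
function formatTestFunctionParts(state: TestFunctionState): string[] {
  if (state.failsFormat) {
    // The README says formatting "will fail"; modelled here as a throw.
    throw new Error(":test:function failed to format");
  }
  const abs = Math.abs(state.input);
  const whole = Math.floor(abs);
  const parts: string[] = [];
  if (state.input < 0) {
    parts.push("-"); // U+002D Hyphen-Minus
  }
  // Assumption: `whole` is small enough to print as plain decimal digits.
  parts.push(String(whole));
  if (state.decimalPlaces === 1) {
    parts.push("."); // U+002E Full Stop
    parts.push(String(Math.floor((abs - whole) * 10)));
  }
  return parts;
}

// Formats to a single string by concatenating the parts.
function formatTestFunction(state: TestFunctionState): string {
  return formatTestFunctionParts(state).join("");
}

// Returns the matching keys in preference order, following the
// selection bullet list in the README.
function matchSelectorKeys(state: TestFunctionState, keys: string[]): string[] {
  if (state.failsSelect) {
    // The README says the call "will fail and not return any value".
    throw new Error(":test:function failed to select");
  }
  if (state.input !== 1) {
    return [];
  }
  const candidates = state.decimalPlaces === 1 ? ["1.0", "1"] : ["1"];
  return candidates.filter((candidate) => keys.indexOf(candidate) !== -1);
}
```

Under these assumptions, an `Input` of 4.2 with `DecimalPlaces` 1 yields the parts `4`, `.`, `2`, and an `Input` of 1 with `DecimalPlaces` 1 prefers the key `1.0` over `1` when both are present. Modelling failure as a thrown error is one design choice among several; an implementation could equally return an error value alongside a fallback.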
diff --git a/common/testData/messageFormat/data-model-errors.json b/common/testData/messageFormat/data-model-errors.json deleted file mode 100644 index 0a6bd67641b..00000000000 --- a/common/testData/messageFormat/data-model-errors.json +++ /dev/null @@ -1,32 +0,0 @@ -{ - "Variant Key Mismatch": [ - ".match {$foo :x} * * {{foo}}", - ".match {$foo :x} {$bar :x} * {{foo}}" - ], - "Missing Fallback Variant": [ - ".match {:foo} 1 {{_}}", - ".match {:foo} other {{_}}", - ".match {:foo} {:bar} * 1 {{_}} 1 * {{_}}" - ], - "Missing Selector Annotation": [ - ".match {$foo} one {{one}} * {{other}}", - ".input {$foo} .match {$foo} one {{one}} * {{other}}", - ".local $foo = {$bar} .match {$foo} one {{one}} * {{other}}" - ], - "Duplicate Declaration": [ - ".input {$foo} .input {$foo} {{_}}", - ".input {$foo} .local $foo = {42} {{_}}", - ".local $foo = {42} .input {$foo} {{_}}", - ".local $foo = {:unknown} .local $foo = {42} {{_}}", - ".local $foo = {$bar} .local $bar = {42} {{_}}", - ".local $foo = {$foo} {{_}}", - ".local $foo = {$bar} .local $bar = {$baz} {{_}}", - ".local $foo = {$bar :func} .local $bar = {$baz} {{_}}", - ".local $foo = {42 :func opt=$foo} {{_}}", - ".local $foo = {42 :func opt=$bar} .local $bar = {42} {{_}}" - ], - "Duplicate Option Name": [ - "bad {:placeholder option=x option=x}", - "bad {:placeholder ns:option=x ns:option=y}" - ] -} diff --git a/common/testData/messageFormat/schemas/v0/tests.schema.json b/common/testData/messageFormat/schemas/v0/tests.schema.json new file mode 100644 index 00000000000..a0dd0a56e11 --- /dev/null +++ b/common/testData/messageFormat/schemas/v0/tests.schema.json @@ -0,0 +1,377 @@ +{ + "$schema": "https://json-schema.org/draft/2020-12/schema", + "$id": "https://raw.githubusercontent.com/unicode-org/message-format-wg/main/test/schemas/v0/tests.schema.json", + "title": "MessageFormat 2 data-driven tests", + "description": "The main schema for MessageFormat 2 test data.", + "type": "object", + "additionalProperties": false, + 
"required": [ + "tests" + ], + "properties": { + "$schema": { + "type": "string", + "description": "Identifier for the test file JSON schema." + }, + "scenario": { + "type": "string", + "description": "Identifier for the tests in the file." + }, + "description": { + "type": "string", + "description": "Information about the test scenario." + }, + "defaultTestProperties": { + "$ref": "#/$defs/defaultTestProperties" + }, + "tests": { + "type": "array", + "items": { + "$ref": "#/$defs/test" + }, + "minItems": 1 + } + }, + "$comment": "This allOf specifies required test properties that allow a default. A value will be required in 'defaultTestProperties' if one is not provided for every individual test.", + "allOf": [ + { + "anyOf": [ + { + "properties": { + "defaultTestProperties": { + "required": [ + "locale" + ] + } + } + }, + { + "properties": { + "tests": { + "type": "array", + "items": { + "required": [ + "locale" + ] + } + } + } + } + ] + }, + { + "anyOf": [ + { + "properties": { + "defaultTestProperties": { + "required": [ + "src" + ] + } + } + }, + { + "properties": { + "tests": { + "type": "array", + "items": { + "required": [ + "src" + ] + } + } + } + } + ] + }, + { + "$comment": "Only one assertion is required. 
It doesn't matter which type.", + "anyOf": [ + { + "properties": { + "defaultTestProperties": { + "$ref": "#/$defs/anyExp" + } + } + }, + { + "properties": { + "tests": { + "type": "array", + "items": { + "$ref": "#/$defs/anyExp" + } + } + } + } + ] + } + ], + "$defs": { + "defaultTestProperties": { + "type": "object", + "additionalProperties": false, + "properties": { + "locale": { + "$ref": "#/$defs/locale" + }, + "src": { + "$ref": "#/$defs/src" + }, + "params": { + "$ref": "#/$defs/params" + }, + "exp": { + "$ref": "#/$defs/exp" + }, + "expParts": { + "$ref": "#/$defs/expParts" + }, + "expErrors": { + "$ref": "#/$defs/expErrors" + } + } + }, + "test": { + "type": "object", + "additionalProperties": false, + "properties": { + "description": { + "type": "string", + "description": "Information about the test." + }, + "locale": { + "$ref": "#/$defs/locale" + }, + "src": { + "$ref": "#/$defs/src" + }, + "params": { + "$ref": "#/$defs/params" + }, + "exp": { + "$ref": "#/$defs/exp" + }, + "expParts": { + "$ref": "#/$defs/expParts" + }, + "expErrors": { + "$ref": "#/$defs/expErrors" + }, + "only": { + "type": "boolean", + "description": "Normally not set. A flag to use during development to only run one or more specific tests." 
+ } + } + }, + "locale": { + "description": "The locale to use for formatting.", + "type": "string" + }, + "src": { + "description": "The MF2 syntax source.", + "type": "string" + }, + "params": { + "description": "Parameters to pass in to the formatter for resolving external variables.", + "type": "array", + "items": { + "$ref": "#/$defs/var" + } + }, + "var": { + "type": "object", + "oneOf": [ + { + "additionalProperties": false, + "required": [ + "name", + "value" + ], + "properties": { + "name": { + "type": "string" + }, + "value": {} + } + }, + { + "additionalProperties": false, + "required": [ + "name", + "type", + "value" + ], + "properties": { + "name": { + "type": "string" + }, + "type": { + "const": "datetime" + }, + "value": { + "$comment": "Should be converted to a datetime.", + "type": "string" + } + } + } + ] + }, + "exp": { + "description": "The expected result of formatting the message to a string.", + "type": "string" + }, + "expParts": { + "description": "The expected result of formatting the message to parts.", + "type": "array", + "items": { + "oneOf": [ + { + "description": "Message literal part.", + "type": "object", + "additionalProperties": false, + "required": [ + "type", + "value" + ], + "properties": { + "type": { + "const": "literal" + }, + "value": { + "type": "string" + } + } + }, + { + "description": "Message markup part.", + "type": "object", + "additionalProperties": false, + "required": [ + "type", + "kind", + "name" + ], + "properties": { + "type": { + "const": "markup" + }, + "kind": { + "enum": [ + "open", + "standalone", + "close" + ] + }, + "source": { + "type": "string" + }, + "name": { + "type": "string" + }, + "options": { + "type": "object" + } + } + }, + { + "description": "Message expression part.", + "type": "object", + "required": [ + "type", + "source" + ], + "not": { + "required": [ + "parts", + "value" + ] + }, + "properties": { + "type": { + "type": "string" + }, + "source": { + "type": "string" + }, + "locale": { 
+ "type": "string" + }, + "parts": { + "type": "array", + "items": { + "type": "object", + "properties": { + "type": { + "type": "string" + }, + "source": { + "type": "string" + }, + "value": {} + }, + "required": [ + "type" + ] + } + }, + "value": {} + } + } + ] + } + }, + "expErrors": { + "description": "The runtime errors expected to be emitted when formatting the message. If expErrors is either absent or empty, the message must be formatted without errors.", + "type": [ + "array", + "boolean" + ], + "items": { + "type": "object", + "additionalProperties": false, + "required": [ + "type" + ], + "properties": { + "type": { + "enum": [ + "syntax-error", + "variant-key-mismatch", + "missing-fallback-variant", + "missing-selector-annotation", + "duplicate-declaration", + "duplicate-option-name", + "duplicate-variant", + "unresolved-variable", + "unknown-function", + "bad-selector", + "bad-operand", + "bad-option", + "bad-variant-key" + ] + } + } + } + }, + "anyExp": { + "anyOf": [ + { + "required": [ + "exp" + ] + }, + { + "required": [ + "expParts" + ] + }, + { + "required": [ + "expErrors" + ] + } + ] + } + } +} diff --git a/common/testData/messageFormat/syntax-errors.json b/common/testData/messageFormat/syntax-errors.json deleted file mode 100644 index fc4537131c8..00000000000 --- a/common/testData/messageFormat/syntax-errors.json +++ /dev/null @@ -1,56 +0,0 @@ -[ - ".", - "{", - "}", - "{}", - "{{", - "{{}", - "{{}}}", - "{|foo| #markup}", - "{{missing end brace}", - "{{missing end braces", - "{{missing end {$braces", - "{{extra}} content", - "empty { } placeholder", - "missing space {42:func}", - "missing space {|foo|:func}", - "missing space {|foo|@bar}", - "missing space {:func@bar}", - "missing space {:func @bar@baz}", - "missing space {:func @bar=42@baz}", - "missing space {+reserved@bar}", - "missing space {&private@bar}", - "bad {:} placeholder", - "bad {\\u0000placeholder}", - "no-equal {|42| :number minimumFractionDigits 2}", - "bad {:placeholder 
option=}", - "bad {:placeholder option value}", - "bad {:placeholder option:value}", - "bad {:placeholder option}", - "bad {:placeholder:}", - "bad {::placeholder}", - "bad {:placeholder::foo}", - "bad {:placeholder option:=x}", - "bad {:placeholder :option=x}", - "bad {:placeholder option::x=y}", - "bad {$placeholder option}", - "bad {:placeholder @attribute=}", - "bad {:placeholder @attribute=@foo}", - "no {placeholder end", - "no {$placeholder end", - "no {:placeholder end", - "no {|placeholder| end", - "no {|literal} end", - "no {|literal or placeholder end", - ".local bar = {|foo|} {{_}}", - ".local #bar = {|foo|} {{_}}", - ".local $bar {|foo|} {{_}}", - ".local $bar = |foo| {{_}}", - ".match {#foo} * {{foo}}", - ".match {} * {{foo}}", - ".match {|foo| :x} {|bar| :x} ** {{foo}}", - ".match * {{foo}}", - ".match {|x| :x} * foo", - ".match {|x| :x} * {{foo}} extra", - ".match |x| * {{foo}}" -] diff --git a/common/testData/messageFormat/test-core.json b/common/testData/messageFormat/test-core.json deleted file mode 100644 index 0e7049fdb1a..00000000000 --- a/common/testData/messageFormat/test-core.json +++ /dev/null @@ -1,212 +0,0 @@ -[ - { "src": "hello", "exp": "hello" }, - { "src": "hello {world}", "exp": "hello world" }, - { - "src": "hello { world\t\n}", - "exp": "hello world", - "cleanSrc": "hello {world}" - }, - { - "src": "hello {\u3000world\r}", - "exp": "hello world", - "cleanSrc": "hello {world}" - }, - { "src": "hello {|world|}", "exp": "hello world" }, - { "src": "hello {||}", "exp": "hello " }, - { - "src": "hello {$place}", - "params": { "place": "world" }, - "exp": "hello world" - }, - { - "src": "hello {$place-.}", - "params": { "place-.": "world" }, - "exp": "hello world" - }, - { - "src": "hello {$place}", - "errors": [{ "type": "unresolved-var" }], - "exp": "hello {$place}" - }, - { - "src": "{$one} and {$two}", - "params": { "one": 1.3, "two": 4.2 }, - "exp": "1.3 and 4.2" - }, - { - "src": "{$one} et {$two}", - "locale": "fr", - "params": { 
"one": 1.3, "two": 4.2 }, - "exp": "1,3 et 4,2" - }, - { "src": ".local $foo = {bar} {{bar {$foo}}}", "exp": "bar bar" }, - { "src": ".local $foo = {|bar|} {{bar {$foo}}}", "exp": "bar bar" }, - { - "src": ".local $foo = {|bar|} {{bar {$foo}}}", - "params": { "foo": "foo" }, - "exp": "bar bar" - }, - { - "src": ".local $foo = {$bar} {{bar {$foo}}}", - "params": { "bar": "foo" }, - "exp": "bar foo" - }, - { - "src": ".local $foo = {$baz} .local $bar = {$foo} {{bar {$bar}}}", - "params": { "baz": "foo" }, - "exp": "bar foo" - }, - { - "src": ".input {$foo} {{bar {$foo}}}", - "params": { "foo": "foo" }, - "exp": "bar foo" - }, - { - "src": ".input {$foo} .local $bar = {$foo} {{bar {$bar}}}", - "params": { "foo": "foo" }, - "exp": "bar foo" - }, - { - "src": ".local $foo = {$baz} .local $bar = {$foo} {{bar {$bar}}}", - "params": { "baz": "foo" }, - "exp": "bar foo" - }, - { "src": ".local $x = {42} .local $y = {$x} {{{$x} {$y}}}", "exp": "42 42" }, - { - "src": "{#tag}", - "exp": "", - "parts": [{ "type": "markup", "kind": "open", "name": "tag" }] - }, - { - "src": "{#tag}content", - "exp": "content", - "parts": [ - { "type": "markup", "kind": "open", "name": "tag" }, - { "type": "literal", "value": "content" } - ] - }, - { - "src": "{#ns:tag}content{/ns:tag}", - "exp": "content", - "parts": [ - { "type": "markup", "kind": "open", "name": "ns:tag" }, - { "type": "literal", "value": "content" }, - { "type": "markup", "kind": "close", "name": "ns:tag" } - ] - }, - { - "src": "{/tag}content", - "exp": "content", - "parts": [ - { "type": "markup", "kind": "close", "name": "tag" }, - { "type": "literal", "value": "content" } - ] - }, - { - "src": "{#tag foo=bar}", - "exp": "", - "parts": [ - { - "type": "markup", - "kind": "open", - "name": "tag", - "options": { "foo": "bar" } - } - ] - }, - { - "src": "{#tag foo=bar/}", - "cleanSrc": "{#tag foo=bar /}", - "exp": "", - "parts": [ - { - "type": "markup", - "kind": "standalone", - "name": "tag", - "options": { "foo": "bar" } 
- } - ] - }, - { - "src": "{#tag a:foo=|foo| b:bar=$bar}", - "params": { "bar": "b a r" }, - "exp": "", - "parts": [ - { - "type": "markup", - "kind": "open", - "name": "tag", - "options": { "a:foo": "foo", "b:bar": "b a r" } - } - ] - }, - { - "src": "{/tag foo=bar}", - "exp": "", - "parts": [ - { - "type": "markup", - "kind": "close", - "name": "tag", - "options": { "foo": "bar" } - } - ] - }, - { - "src": "{42 @foo @bar=13}", - "exp": "42", - "parts": [{ "type": "string", "value": "42" }] - }, - { - "src": "{42 @foo=$bar}", - "exp": "42", - "parts": [{ "type": "string", "value": "42" }] - }, - { - "src": "foo {+reserved}", - "exp": "foo {+}", - "parts": [ - { "type": "literal", "value": "foo " }, - { "type": "fallback", "source": "+" } - ], - "errors": [{ "type": "unsupported-annotation" }] - }, - { - "src": "foo {&private}", - "exp": "foo {&}", - "parts": [ - { "type": "literal", "value": "foo " }, - { "type": "fallback", "source": "&" } - ], - "errors": [{ "type": "unsupported-annotation" }] - }, - { - "src": "foo {?reserved @a @b=$c}", - "exp": "foo {?}", - "parts": [ - { "type": "literal", "value": "foo " }, - { "type": "fallback", "source": "?" 
} - ], - "errors": [{ "type": "unsupported-annotation" }] - }, - { - "src": ".foo {42} {{bar}}", - "exp": "bar", - "parts": [{ "type": "literal", "value": "bar" }], - "errors": [{ "type": "unsupported-statement" }] - }, - { - "src": ".foo{42}{{bar}}", - "cleanSrc": ".foo {42} {{bar}}", - "exp": "bar", - "parts": [{ "type": "literal", "value": "bar" }], - "errors": [{ "type": "unsupported-statement" }] - }, - { - "src": ".foo |}lit{| {42}{{bar}}", - "cleanSrc": ".foo |}lit{| {42} {{bar}}", - "exp": "bar", - "parts": [{ "type": "literal", "value": "bar" }], - "errors": [{ "type": "unsupported-statement" }] - } -] diff --git a/common/testData/messageFormat/test-core.json.d.ts b/common/testData/messageFormat/test-core.json.d.ts deleted file mode 100644 index 495fbf7b4ba..00000000000 --- a/common/testData/messageFormat/test-core.json.d.ts +++ /dev/null @@ -1,25 +0,0 @@ -// Copyright © 1991-2024 Unicode, Inc. -// For terms of use, see http://www.unicode.org/copyright.html -// SPDX-License-Identifier: Unicode-3.0 - -export type TestMessage = { - /** The MF2 message to be tested. */ - src: string; - /** The locale to use for formatting. Defaults to 'en-US'. */ - locale?: string; - /** Parameters to pass in to the formatter for resolving external variables. */ - params?: Record; - /** The expected result of formatting the message to a string. */ - exp: string; - /** The expected result of formatting the message to parts. */ - parts?: Array; - /** A normalixed form of `src`, for testing stringifiers. */ - cleanSrc?: string; - /** The runtime errors expected to be emitted when formatting the message. */ - errors?: Array<{ type: string }>; - /** Normally not set. A flag to use during development to only run one or more specific tests. 
*/ - only?: boolean; -}; - -declare const data: TestMessage[]; -export default data; diff --git a/common/testData/messageFormat/test-functions.json b/common/testData/messageFormat/test-functions.json deleted file mode 100644 index 03080a2b6cd..00000000000 --- a/common/testData/messageFormat/test-functions.json +++ /dev/null @@ -1,322 +0,0 @@ -{ - "date": [ - { "src": "{:date}", "exp": "{:date}", "errors": [{ "type": "bad-input" }] }, - { - "src": "{horse :date}", - "exp": "{|horse|}", - "errors": [{ "type": "bad-input" }] - }, - { "src": "{|2006-01-02| :date}" }, - { "src": "{|2006-01-02T15:04:06| :date}" }, - { "src": "{|2006-01-02| :date style=long}" }, - { - "src": ".local $d = {|2006-01-02| :date style=long} {{{$d :date}}}" - }, - { - "src": ".local $t = {|2006-01-02T15:04:06| :time} {{{$t :date}}}" - } - ], - "time": [ - { "src": "{:time}", "exp": "{:time}", "errors": [{ "type": "bad-input" }] }, - { - "src": "{horse :time}", - "exp": "{|horse|}", - "errors": [{ "type": "bad-input" }] - }, - { "src": "{|2006-01-02T15:04:06| :time}" }, - { - "src": "{|2006-01-02T15:04:06| :time style=medium}" - }, - { - "src": ".local $t = {|2006-01-02T15:04:06| :time style=medium} {{{$t :time}}}" - }, - { - "src": ".local $d = {|2006-01-02T15:04:06| :date} {{{$d :time}}}" - } - ], - "datetime": [ - { - "src": "{:datetime}", - "exp": "{:datetime}", - "errors": [{ "type": "bad-input" }] - }, - { - "src": "{$x :datetime}", - "exp": "{$x}", - "params": { "x": true }, - "errors": [{ "type": "bad-input" }] - }, - { - "src": "{horse :datetime}", - "exp": "{|horse|}", - "errors": [{ "name": "RangeError" }] - }, - { "src": "{|2006-01-02T15:04:06| :datetime}" }, - { - "src": "{|2006-01-02T15:04:06| :datetime year=numeric month=|2-digit|}" - }, - { - "src": "{|2006-01-02T15:04:06| :datetime dateStyle=long}" - }, - { - "src": "{|2006-01-02T15:04:06| :datetime timeStyle=medium}" - }, - { - "src": "{$dt :datetime}", - "params": { "dt": "2006-01-02T15:04:06" } - } - ], - "integer": [ - { 
"src": "hello {4.2 :integer}", "exp": "hello 4" }, - { "src": "hello {-4.20 :integer}", "exp": "hello -4" }, - { "src": "hello {0.42e+1 :integer}", "exp": "hello 4" }, - { - "src": ".match {$foo :integer} one {{one}} * {{other}}", - "params": { "foo": 1.2 }, - "exp": "one" - } - ], - "number": [ - { "src": "hello {4.2 :number}", "exp": "hello 4.2" }, - { "src": "hello {-4.20 :number}", "exp": "hello -4.2" }, - { "src": "hello {0.42e+1 :number}", "exp": "hello 4.2" }, - { - "src": "hello {foo :number}", - "exp": "hello {|foo|}", - "errors": [{ "type": "bad-input" }] - }, - { - "src": "invalid number literal {.1 :number}", - "exp": "invalid number literal {|.1|}", - "errors": [{ "type": "bad-input" }] - }, - { - "src": "invalid number literal {1. :number}", - "exp": "invalid number literal {|1.|}", - "errors": [{ "type": "bad-input" }] - }, - { - "src": "invalid number literal {01 :number}", - "exp": "invalid number literal {|01|}", - "errors": [{ "type": "bad-input" }] - }, - { - "src": "invalid number literal {|+1| :number}", - "exp": "invalid number literal {|+1|}", - "errors": [{ "type": "bad-input" }] - }, - { - "src": "invalid number literal {0x1 :number}", - "exp": "invalid number literal {|0x1|}", - "errors": [{ "type": "bad-input" }] - }, - { - "src": "hello {:number}", - "exp": "hello {:number}", - "errors": [{ "type": "bad-input" }] - }, - { - "src": "hello {4.2 :number minimumFractionDigits=2}", - "exp": "hello 4.20" - }, - { - "src": "hello {|4.2| :number minimumFractionDigits=|2|}", - "exp": "hello 4.20" - }, - { - "src": "hello {4.2 :number minimumFractionDigits=$foo}", - "params": { "foo": 2 }, - "exp": "hello 4.20" - }, - { - "src": "hello {|4.2| :number minimumFractionDigits=$foo}", - "params": { "foo": "2" }, - "exp": "hello 4.20" - }, - { - "src": ".local $foo = {$bar :number} {{bar {$foo}}}", - "params": { "bar": 4.2 }, - "exp": "bar 4.2" - }, - { - "src": ".local $foo = {$bar :number minimumFractionDigits=2} {{bar {$foo}}}", - "params": { "bar": 
4.2 }, - "exp": "bar 4.20" - }, - { - "src": ".local $foo = {$bar :number minimumFractionDigits=foo} {{bar {$foo}}}", - "params": { "bar": 4.2 }, - "exp": "bar {$bar}", - "errors": [{ "type": "bad-option" }] - }, - { - "src": ".local $foo = {$bar :number} {{bar {$foo}}}", - "params": { "bar": "foo" }, - "exp": "bar {$bar}", - "errors": [{ "type": "bad-input" }] - }, - { - "src": ".input {$foo :number} {{bar {$foo}}}", - "params": { "foo": 4.2 }, - "exp": "bar 4.2" - }, - { - "src": ".input {$foo :number minimumFractionDigits=2} {{bar {$foo}}}", - "params": { "foo": 4.2 }, - "exp": "bar 4.20" - }, - { - "src": ".input {$foo :number minimumFractionDigits=foo} {{bar {$foo}}}", - "params": { "foo": 4.2 }, - "exp": "bar {$foo}", - "errors": [{ "type": "bad-option" }] - }, - { - "src": ".input {$foo :number} {{bar {$foo}}}", - "params": { "foo": "foo" }, - "exp": "bar {$foo}", - "errors": [{ "type": "bad-input" }] - }, - { - "src": ".match {$foo :number} one {{one}} * {{other}}", - "params": { "foo": 1 }, - "exp": "one" - }, - { - "src": ".match {$foo :number} 1 {{=1}} one {{one}} * {{other}}", - "params": { "foo": 1 }, - "exp": "=1" - }, - { - "src": ".match {$foo :number} one {{one}} 1 {{=1}} * {{other}}", - "params": { "foo": 1 }, - "exp": "=1" - }, - { - "src": ".match {$foo :number} {$bar :number} one one {{one one}} one * {{one other}} * * {{other}}", - "params": { "foo": 1, "bar": 1 }, - "exp": "one one" - }, - { - "src": ".match {$foo :number} {$bar :number} one one {{one one}} one * {{one other}} * * {{other}}", - "params": { "foo": 1, "bar": 2 }, - "exp": "one other" - }, - { - "src": ".match {$foo :number} {$bar :number} one one {{one one}} one * {{one other}} * * {{other}}", - "params": { "foo": 2, "bar": 2 }, - "exp": "other" - }, - { - "src": ".input {$foo :number} .match {$foo} one {{one}} * {{other}}", - "params": { "foo": 1 }, - "exp": "one" - }, - { - "src": ".local $foo = {$bar :number} .match {$foo} one {{one}} * {{other}}", - "params": { "bar": 1 }, 
- "exp": "one" - }, - { - "src": ".input {$foo :number} .local $bar = {$foo} .match {$bar} one {{one}} * {{other}}", - "params": { "foo": 1 }, - "exp": "one" - }, - { - "src": ".input {$bar :number} .match {$bar} one {{one}} * {{other}}", - "params": { "bar": 2 }, - "exp": "other" - }, - { - "src": ".input {$bar} .match {$bar :number} one {{one}} * {{other}}", - "params": { "bar": 1 }, - "exp": "one" - }, - { - "src": ".input {$bar} .match {$bar :number} one {{one}} * {{other}}", - "params": { "bar": 2 }, - "exp": "other" - }, - { - "src": ".input {$bar} .match {$bar :number} one {{one}} * {{other}}", - "params": { "bar": 1 }, - "exp": "one" - }, - { - "src": ".input {$bar} .match {$bar :number} one {{one}} * {{other}}", - "params": { "bar": 2 }, - "exp": "other" - }, - { - "src": ".input {$none} .match {$foo :number} one {{one}} * {{{$none}}}", - "params": { "foo": 1 }, - "exp": "one" - }, - { - "src": ".local $bar = {$none} .match {$foo :number} one {{one}} * {{{$bar}}}", - "params": { "foo": 1 }, - "exp": "one" - }, - { - "src": ".local $bar = {$none} .match {$foo :number} one {{one}} * {{{$bar}}}", - "params": { "foo": 2 }, - "exp": "{$none}", - "errors": [{ "type": "unresolved-var" }] - }, - { - "src": "{42 :number @foo @bar=13}", - "exp": "42", - "parts": [ - { "type": "number", "parts": [{ "type": "integer", "value": "42" }] } - ] - } - ], - "ordinal": [ - { - "src": ".match {$foo :ordinal} one {{st}} two {{nd}} few {{rd}} * {{th}}", - "params": { "foo": 1 }, - "exp": "th", - "errors": [{ "type": "missing-func" }, { "type": "not-selectable" }] - }, - { - "src": "hello {42 :ordinal}", - "exp": "hello {|42|}", - "errors": [{ "type": "missing-func" }] - } - ], - "plural": [ - { - "src": ".match {$foo :plural} one {{one}} * {{other}}", - "params": { "foo": 1 }, - "exp": "other", - "errors": [{ "type": "missing-func" }, { "type": "not-selectable" }] - }, - { - "src": "hello {42 :plural}", - "exp": "hello {|42|}", - "errors": [{ "type": "missing-func" }] - } - ], 
- "string": [ - { - "src": ".match {$foo :string} |1| {{one}} * {{other}}", - "params": { "foo": "1" }, - "exp": "one" - }, - { - "src": ".match {$foo :string} 1 {{one}} * {{other}}", - "params": { "foo": 1 }, - "exp": "one" - }, - { - "src": ".match {$foo :string} 1 {{one}} * {{other}}", - "params": { "foo": null }, - "exp": "other" - }, - { - "src": ".match {$foo :string} 1 {{one}} * {{other}}", - "exp": "other", - "errors": [{ "type": "unresolved-var" }] - } - ] -} diff --git a/common/testData/messageFormat/test-functions.json.d.ts b/common/testData/messageFormat/test-functions.json.d.ts deleted file mode 100644 index 360c76a4196..00000000000 --- a/common/testData/messageFormat/test-functions.json.d.ts +++ /dev/null @@ -1,8 +0,0 @@ -// Copyright © 1991-2024 Unicode, Inc. -// For terms of use, see http://www.unicode.org/copyright.html -// SPDX-License-Identifier: Unicode-3.0 - -import type { TestMessage } from './test-core.json'; - -declare const data: Record; -export default data; diff --git a/common/testData/messageFormat/tests/data-model-errors.json b/common/testData/messageFormat/tests/data-model-errors.json new file mode 100644 index 00000000000..f1f54cabe7c --- /dev/null +++ b/common/testData/messageFormat/tests/data-model-errors.json @@ -0,0 +1,189 @@ +{ + "$schema": "https://raw.githubusercontent.com/unicode-org/message-format-wg/main/test/schemas/v0/tests.schema.json", + "scenario": "Data model errors", + "defaultTestProperties": { + "locale": "en-US" + }, + "tests": [ + { + "src": ".input {$foo :x} .match $foo * * {{foo}}", + "expErrors": [ + { + "type": "variant-key-mismatch" + } + ] + }, + { + "src": ".input {$foo :x} .input {$bar :x} .match $foo $bar * {{foo}}", + "expErrors": [ + { + "type": "variant-key-mismatch" + } + ] + }, + { + "src": ".input {$foo :x} .match $foo 1 {{_}}", + "expErrors": [ + { + "type": "missing-fallback-variant" + } + ] + }, + { + "src": ".input {$foo :x} .match $foo other {{_}}", + "expErrors": [ + { + "type": 
"missing-fallback-variant" + } + ] + }, + { + "src": ".input {$foo :x} .input {$bar :x} .match $foo $bar * 1 {{_}} 1 * {{_}}", + "expErrors": [ + { + "type": "missing-fallback-variant" + } + ] + }, + { + "src": ".input {$foo} .match $foo one {{one}} * {{other}}", + "expErrors": [ + { + "type": "missing-selector-annotation" + } + ] + }, + { + "src": ".local $foo = {$bar} .match $foo one {{one}} * {{other}}", + "expErrors": [ + { + "type": "missing-selector-annotation" + } + ] + }, + { + "src": ".input {$bar} .local $foo = {$bar} .match $foo one {{one}} * {{other}}", + "expErrors": [ + { + "type": "missing-selector-annotation" + } + ] + }, + { + "src": ".input {$foo} .input {$foo} {{_}}", + "expErrors": [ + { + "type": "duplicate-declaration" + } + ] + }, + { + "src": ".input {$foo} .local $foo = {42} {{_}}", + "expErrors": [ + { + "type": "duplicate-declaration" + } + ] + }, + { + "src": ".local $foo = {42} .input {$foo} {{_}}", + "expErrors": [ + { + "type": "duplicate-declaration" + } + ] + }, + { + "src": ".local $foo = {:unknown} .local $foo = {42} {{_}}", + "expErrors": [ + { + "type": "duplicate-declaration" + } + ] + }, + { + "src": ".local $foo = {$bar} .local $bar = {42} {{_}}", + "expErrors": [ + { + "type": "duplicate-declaration" + } + ] + }, + { + "src": ".local $foo = {$foo} {{_}}", + "expErrors": [ + { + "type": "duplicate-declaration" + } + ] + }, + { + "src": ".local $foo = {$bar} .local $bar = {$baz} {{_}}", + "expErrors": [ + { + "type": "duplicate-declaration" + } + ] + }, + { + "src": ".local $foo = {$bar :func} .local $bar = {$baz} {{_}}", + "expErrors": [ + { + "type": "duplicate-declaration" + } + ] + }, + { + "src": ".local $foo = {42 :func opt=$foo} {{_}}", + "expErrors": [ + { + "type": "duplicate-declaration" + } + ] + }, + { + "src": ".local $foo = {42 :func opt=$bar} .local $bar = {42} {{_}}", + "expErrors": [ + { + "type": "duplicate-declaration" + } + ] + }, + { + "src": "bad {:placeholder option=x option=x}", + "expErrors": [ + { + 
"type": "duplicate-option-name" + } + ] + }, + { + "src": "bad {:placeholder ns:option=x ns:option=y}", + "expErrors": [ + { + "type": "duplicate-option-name" + } + ] + }, + { + "src": ".input {$var :string} .match $var * {{The first default}} * {{The second default}}", + "expErrors": [ + { + "type": "duplicate-variant" + } + ] + }, + { + "src": ".input {$x :string} .input {$y :string} .match $x $y * foo {{The first foo variant}} bar * {{The bar variant}} * |foo| {{The second foo variant}} * * {{The default variant}}", + "expErrors": [ + { + "type": "duplicate-variant" + } + ] + }, + { + "src": ".local $star = {star :string} .match $star |*| {{Literal star}} * {{The default}}", + "exp": "The default" + } + ] +} diff --git a/common/testData/messageFormat/tests/functions/date.json b/common/testData/messageFormat/tests/functions/date.json new file mode 100644 index 00000000000..494ca8d2345 --- /dev/null +++ b/common/testData/messageFormat/tests/functions/date.json @@ -0,0 +1,44 @@ +{ + "$schema": "https://raw.githubusercontent.com/unicode-org/message-format-wg/main/test/schemas/v0/tests.schema.json", + "scenario": "Date function", + "description": "The built-in formatter for dates.", + "defaultTestProperties": { + "locale": "en-US", + "expErrors": false + }, + "tests": [ + { + "src": "{:date}", + "exp": "{:date}", + "expErrors": [ + { + "type": "bad-operand" + } + ] + }, + { + "src": "{horse :date}", + "exp": "{|horse|}", + "expErrors": [ + { + "type": "bad-operand" + } + ] + }, + { + "src": "{|2006-01-02| :date}" + }, + { + "src": "{|2006-01-02T15:04:06| :date}" + }, + { + "src": "{|2006-01-02| :date style=long}" + }, + { + "src": ".local $d = {|2006-01-02| :date style=long} {{{$d :date}}}" + }, + { + "src": ".local $t = {|2006-01-02T15:04:06| :time} {{{$t :date}}}" + } + ] +} diff --git a/common/testData/messageFormat/tests/functions/datetime.json b/common/testData/messageFormat/tests/functions/datetime.json new file mode 100644 index 00000000000..758a8bbaa00 --- 
/dev/null +++ b/common/testData/messageFormat/tests/functions/datetime.json @@ -0,0 +1,66 @@ +{ + "$schema": "https://raw.githubusercontent.com/unicode-org/message-format-wg/main/test/schemas/v0/tests.schema.json", + "scenario": "Datetime function", + "description": "The built-in formatter for datetimes.", + "defaultTestProperties": { + "locale": "en-US", + "expErrors": false + }, + "tests": [ + { + "src": "{:datetime}", + "exp": "{:datetime}", + "expErrors": [ + { + "type": "bad-operand" + } + ] + }, + { + "src": "{$x :datetime}", + "exp": "{$x}", + "params": [ + { + "name": "x", + "value": true + } + ], + "expErrors": [ + { + "type": "bad-operand" + } + ] + }, + { + "src": "{horse :datetime}", + "exp": "{|horse|}", + "expErrors": [ + { + "type": "bad-operand" + } + ] + }, + { + "src": "{|2006-01-02T15:04:06| :datetime}" + }, + { + "src": "{|2006-01-02T15:04:06| :datetime year=numeric month=|2-digit|}" + }, + { + "src": "{|2006-01-02T15:04:06| :datetime dateStyle=long}" + }, + { + "src": "{|2006-01-02T15:04:06| :datetime timeStyle=medium}" + }, + { + "src": "{$dt :datetime}", + "params": [ + { + "type": "datetime", + "name": "dt", + "value": "2006-01-02T15:04:06" + } + ] + } + ] +} diff --git a/common/testData/messageFormat/tests/functions/integer.json b/common/testData/messageFormat/tests/functions/integer.json new file mode 100644 index 00000000000..4ea96941e17 --- /dev/null +++ b/common/testData/messageFormat/tests/functions/integer.json @@ -0,0 +1,32 @@ +{ + "$schema": "https://raw.githubusercontent.com/unicode-org/message-format-wg/main/test/schemas/v0/tests.schema.json", + "scenario": "Integer function", + "description": "The built-in formatter for integers.", + "defaultTestProperties": { + "locale": "en-US" + }, + "tests": [ + { + "src": "hello {4.2 :integer}", + "exp": "hello 4" + }, + { + "src": "hello {-4.20 :integer}", + "exp": "hello -4" + }, + { + "src": "hello {0.42e+1 :integer}", + "exp": "hello 4" + }, + { + "src": ".input {$foo :integer} .match 
$foo 1 {{one}} * {{other}}", + "params": [ + { + "name": "foo", + "value": 1.2 + } + ], + "exp": "one" + } + ] +} diff --git a/common/testData/messageFormat/tests/functions/number.json b/common/testData/messageFormat/tests/functions/number.json new file mode 100644 index 00000000000..2b00d83e495 --- /dev/null +++ b/common/testData/messageFormat/tests/functions/number.json @@ -0,0 +1,229 @@ +{ + "$schema": "https://raw.githubusercontent.com/unicode-org/message-format-wg/main/test/schemas/v0/tests.schema.json", + "scenario": "Number function", + "description": "The built-in formatter for numbers.", + "defaultTestProperties": { + "locale": "en-US" + }, + "tests": [ + { + "src": "hello {4.2 :number}", + "exp": "hello 4.2" + }, + { + "src": "hello {-4.20 :number}", + "exp": "hello -4.2" + }, + { + "src": "hello {0.42e+1 :number}", + "exp": "hello 4.2" + }, + { + "src": "hello {foo :number}", + "exp": "hello {|foo|}", + "expErrors": [ + { + "type": "bad-operand" + } + ] + }, + { + "src": "invalid number literal {|.1| :number}", + "exp": "invalid number literal {|.1|}", + "expErrors": [ + { + "type": "bad-operand" + } + ] + }, + { + "src": "invalid number literal {|1.| :number}", + "exp": "invalid number literal {|1.|}", + "expErrors": [ + { + "type": "bad-operand" + } + ] + }, + { + "src": "invalid number literal {|01| :number}", + "exp": "invalid number literal {|01|}", + "expErrors": [ + { + "type": "bad-operand" + } + ] + }, + { + "src": "invalid number literal {|+1| :number}", + "exp": "invalid number literal {|+1|}", + "expErrors": [ + { + "type": "bad-operand" + } + ] + }, + { + "src": "invalid number literal {|0x1| :number}", + "exp": "invalid number literal {|0x1|}", + "expErrors": [ + { + "type": "bad-operand" + } + ] + }, + { + "src": "hello {:number}", + "exp": "hello {:number}", + "expErrors": [ + { + "type": "bad-operand" + } + ] + }, + { + "src": "hello {4.2 :number minimumFractionDigits=2}", + "exp": "hello 4.20" + }, + { + "src": "hello {|4.2| :number 
minimumFractionDigits=|2|}", + "exp": "hello 4.20" + }, + { + "src": "hello {4.2 :number minimumFractionDigits=$foo}", + "params": [ + { + "name": "foo", + "value": 2 + } + ], + "exp": "hello 4.20" + }, + { + "src": "hello {|4.2| :number minimumFractionDigits=$foo}", + "params": [ + { + "name": "foo", + "value": "2" + } + ], + "exp": "hello 4.20" + }, + { + "src": ".local $foo = {$bar :number} {{bar {$foo}}}", + "params": [ + { + "name": "bar", + "value": 4.2 + } + ], + "exp": "bar 4.2" + }, + { + "src": ".local $foo = {$bar :number minimumFractionDigits=2} {{bar {$foo}}}", + "params": [ + { + "name": "bar", + "value": 4.2 + } + ], + "exp": "bar 4.20" + }, + { + "src": ".local $foo = {$bar :number minimumFractionDigits=foo} {{bar {$foo}}}", + "params": [ + { + "name": "bar", + "value": 4.2 + } + ], + "exp": "bar {$bar}", + "expErrors": [ + { + "type": "bad-option" + } + ] + }, + { + "src": ".local $foo = {$bar :number} {{bar {$foo}}}", + "params": [ + { + "name": "bar", + "value": "foo" + } + ], + "exp": "bar {$bar}", + "expErrors": [ + { + "type": "bad-operand" + } + ] + }, + { + "src": ".input {$foo :number} {{bar {$foo}}}", + "params": [ + { + "name": "foo", + "value": 4.2 + } + ], + "exp": "bar 4.2" + }, + { + "src": ".input {$foo :number minimumFractionDigits=2} {{bar {$foo}}}", + "params": [ + { + "name": "foo", + "value": 4.2 + } + ], + "exp": "bar 4.20" + }, + { + "src": ".input {$foo :number minimumFractionDigits=foo} {{bar {$foo}}}", + "params": [ + { + "name": "foo", + "value": 4.2 + } + ], + "exp": "bar {$foo}", + "expErrors": [ + { + "type": "bad-option" + } + ] + }, + { + "src": ".input {$foo :number} {{bar {$foo}}}", + "params": [ + { + "name": "foo", + "value": "foo" + } + ], + "exp": "bar {$foo}", + "expErrors": [ + { + "type": "bad-operand" + } + ] + }, + { + "src": "{42 :number @foo @bar=13}", + "exp": "42", + "expParts": [ + { + "type": "number", + "source": "|42|", + "parts": [ + { + "type": "integer", + "value": "42" + } + ] + } + ] + } + ] +} 
diff --git a/common/testData/messageFormat/tests/functions/string.json b/common/testData/messageFormat/tests/functions/string.json new file mode 100644 index 00000000000..3543e7844a3 --- /dev/null +++ b/common/testData/messageFormat/tests/functions/string.json @@ -0,0 +1,49 @@ +{ + "$schema": "https://raw.githubusercontent.com/unicode-org/message-format-wg/main/test/schemas/v0/tests.schema.json", + "scenario": "String function", + "description": "The built-in formatter for strings.", + "defaultTestProperties": { + "locale": "en-US" + }, + "tests": [ + { + "src": ".input {$foo :string} .match $foo |1| {{one}} * {{other}}", + "params": [ + { + "name": "foo", + "value": "1" + } + ], + "exp": "one" + }, + { + "src": ".input {$foo :string} .match $foo 1 {{one}} * {{other}}", + "params": [ + { + "name": "foo", + "value": 1 + } + ], + "exp": "one" + }, + { + "src": ".input {$foo :string} .match $foo 1 {{one}} * {{other}}", + "params": [ + { + "name": "foo", + "value": null + } + ], + "exp": "other" + }, + { + "src": ".input {$foo :string} .match $foo 1 {{one}} * {{other}}", + "exp": "other", + "expErrors": [ + { + "type": "unresolved-variable" + } + ] + } + ] +} diff --git a/common/testData/messageFormat/tests/functions/time.json b/common/testData/messageFormat/tests/functions/time.json new file mode 100644 index 00000000000..416d18a3efe --- /dev/null +++ b/common/testData/messageFormat/tests/functions/time.json @@ -0,0 +1,41 @@ +{ + "$schema": "https://raw.githubusercontent.com/unicode-org/message-format-wg/main/test/schemas/v0/tests.schema.json", + "scenario": "Time function", + "description": "The built-in formatter for times.", + "defaultTestProperties": { + "locale": "en-US", + "expErrors": false + }, + "tests": [ + { + "src": "{:time}", + "exp": "{:time}", + "expErrors": [ + { + "type": "bad-operand" + } + ] + }, + { + "src": "{horse :time}", + "exp": "{|horse|}", + "expErrors": [ + { + "type": "bad-operand" + } + ] + }, + { + "src": "{|2006-01-02T15:04:06| :time}" 
+ }, + { + "src": "{|2006-01-02T15:04:06| :time style=medium}" + }, + { + "src": ".local $t = {|2006-01-02T15:04:06| :time style=medium} {{{$t :time}}}" + }, + { + "src": ".local $d = {|2006-01-02T15:04:06| :date} {{{$d :time}}}" + } + ] +} diff --git a/common/testData/messageFormat/tests/pattern-selection.json b/common/testData/messageFormat/tests/pattern-selection.json new file mode 100644 index 00000000000..29dc146c190 --- /dev/null +++ b/common/testData/messageFormat/tests/pattern-selection.json @@ -0,0 +1,120 @@ +{ + "$schema": "https://raw.githubusercontent.com/unicode-org/message-format-wg/main/test/schemas/v0/tests.schema.json", + "scenario": "Pattern selection", + "description": "Tests for pattern selection", + "defaultTestProperties": { + "locale": "und" + }, + "tests": [ + { + "src": ".local $x = {1 :test:select} .match $x 1.0 {{1.0}} 1 {{1}} * {{other}}", + "exp": "1" + }, + { + "src": ".local $x = {0 :test:select} .match $x 1.0 {{1.0}} 1 {{1}} * {{other}}", + "exp": "other" + }, + { + "src": ".input {$x :test:select} .match $x 1.0 {{1.0}} 1 {{1}} * {{other}}", + "params": [{ "name": "x", "value": 1 }], + "exp": "1" + }, + { + "src": ".input {$x :test:select} .match $x 1.0 {{1.0}} 1 {{1}} * {{other}}", + "params": [{ "name": "x", "value": 2 }], + "exp": "other" + }, + { + "src": ".input {$x :test:select} .local $y = {$x} .match $y 1.0 {{1.0}} 1 {{1}} * {{other}}", + "params": [{ "name": "x", "value": 1 }], + "exp": "1" + }, + { + "src": ".input {$x :test:select} .local $y = {$x} .match $y 1.0 {{1.0}} 1 {{1}} * {{other}}", + "params": [{ "name": "x", "value": 2 }], + "exp": "other" + }, + { + "src": ".local $x = {1 :test:select decimalPlaces=1} .match $x 1.0 {{1.0}} 1 {{1}} * {{other}}", + "exp": "1.0" + }, + { + "src": ".local $x = {1 :test:select decimalPlaces=1} .match $x 1 {{1}} 1.0 {{1.0}} * {{other}}", + "exp": "1.0" + }, + { + "src": ".local $x = {1 :test:select decimalPlaces=9} .match $x 1.0 {{1.0}} 1 {{1}} * {{bad-option-value}}", + "exp": 
"bad-option-value", + "expErrors": [{ "type": "bad-option" }, { "type": "bad-selector" }] + }, + { + "src": ".input {$x :test:select} .local $y = {$x :test:select decimalPlaces=1} .match $y 1.0 {{1.0}} 1 {{1}} * {{other}}", + "params": [{ "name": "x", "value": 1 }], + "exp": "1.0" + }, + { + "src": ".input {$x :test:select decimalPlaces=1} .local $y = {$x :test:select} .match $y 1.0 {{1.0}} 1 {{1}} * {{other}}", + "params": [{ "name": "x", "value": 1 }], + "exp": "1.0" + }, + { + "src": ".input {$x :test:select decimalPlaces=9} .local $y = {$x :test:select decimalPlaces=1} .match $y 1.0 {{1.0}} 1 {{1}} * {{bad-option-value}}", + "params": [{ "name": "x", "value": 1 }], + "exp": "bad-option-value", + "expErrors": [ + { "type": "bad-option" }, + { "type": "bad-operand" }, + { "type": "bad-selector" } + ] + }, + { + "src": ".local $x = {1 :test:select fails=select} .match $x 1.0 {{1.0}} 1 {{1}} * {{other}}", + "exp": "other", + "expErrors": [{ "type": "bad-selector" }] + }, + { + "src": ".local $x = {1 :test:select fails=format} .match $x 1.0 {{1.0}} 1 {{1}} * {{other}}", + "exp": "1" + }, + { + "src": ".local $x = {1 :test:format} .match $x 1.0 {{1.0}} 1 {{1}} * {{other}}", + "exp": "other", + "expErrors": [{ "type": "bad-selector" }] + }, + { + "src": ".input {$x :test:select} .match $x 1.0 {{1.0}} 1 {{1}} * {{other}}", + "exp": "other", + "expErrors": [ + { "type": "unresolved-variable" }, + { "type": "bad-operand" }, + { "type": "bad-selector" } + ] + }, + { + "src": ".local $x = {1 :test:select} .local $y = {1 :test:select} .match $x $y 1 1 {{1,1}} 1 * {{1,*}} * 1 {{*,1}} * * {{*,*}}", + "exp": "1,1" + }, + { + "src": ".local $x = {1 :test:select} .local $y = {0 :test:select} .match $x $y 1 1 {{1,1}} 1 * {{1,*}} * 1 {{*,1}} * * {{*,*}}", + "exp": "1,*" + }, + { + "src": ".local $x = {0 :test:select} .local $y = {1 :test:select} .match $x $y 1 1 {{1,1}} 1 * {{1,*}} * 1 {{*,1}} * * {{*,*}}", + "exp": "*,1" + }, + { + "src": ".local $x = {0 :test:select} .local $y = 
{0 :test:select} .match $x $y 1 1 {{1,1}} 1 * {{1,*}} * 1 {{*,1}} * * {{*,*}}", + "exp": "*,*" + }, + { + "src": ".local $x = {1 :test:select fails=select} .local $y = {1 :test:select} .match $x $y 1 1 {{1,1}} 1 * {{1,*}} * 1 {{*,1}} * * {{*,*}}", + "exp": "*,1", + "expErrors": [{ "type": "bad-selector" }] + }, + { + "src": ".local $x = {1 :test:select} .local $y = {1 :test:format} .match $x $y 1 1 {{1,1}} 1 * {{1,*}} * 1 {{*,1}} * * {{*,*}}", + "exp": "1,*", + "expErrors": [{ "type": "bad-selector" }] + } + ] +} diff --git a/common/testData/messageFormat/tests/syntax-errors.json b/common/testData/messageFormat/tests/syntax-errors.json new file mode 100644 index 00000000000..00d0420f46f --- /dev/null +++ b/common/testData/messageFormat/tests/syntax-errors.json @@ -0,0 +1,247 @@ +{ + "$schema": "https://raw.githubusercontent.com/unicode-org/message-format-wg/main/test/schemas/v0/tests.schema.json", + "scenario": "Syntax errors", + "description": "Strings that produce syntax errors when parsed.", + "defaultTestProperties": { + "locale": "en-US", + "expErrors": [ + { + "type": "syntax-error" + } + ] + }, + "tests": [ + { + "src": "." 
+ }, + { + "src": "{" + }, + { + "src": "}" + }, + { + "src": "{}" + }, + { + "src": "{{" + }, + { + "src": "{{}" + }, + { + "src": "{{}}}" + }, + { + "src": "{|foo| #markup}" + }, + { + "src": "{{missing end brace}" + }, + { + "src": "{{missing end braces" + }, + { + "src": "{{missing end {$braces" + }, + { + "src": "{{extra}} content" + }, + { + "src": "empty { } placeholder" + }, + { + "src": "missing space {42:func}" + }, + { + "src": "missing space {|foo|:func}" + }, + { + "src": "missing space {|foo|@bar}" + }, + { + "src": "missing space {:func@bar}" + }, + { + "src": "missing space {:func @bar@baz}" + }, + { + "src": "missing space {:func @bar=42@baz}" + }, + { + "src": "missing space {+reserved@bar}" + }, + { + "src": "missing space {&private@bar}" + }, + { + "src": "bad {:} placeholder" + }, + { + "src": "bad {\\u0000placeholder}" + }, + { + "src": "no-equal {|42| :number minimumFractionDigits 2}" + }, + { + "src": "bad {:placeholder option=}" + }, + { + "src": "bad {:placeholder option value}" + }, + { + "src": "bad {:placeholder option:value}" + }, + { + "src": "bad {:placeholder option}" + }, + { + "src": "bad {:placeholder:}" + }, + { + "src": "bad {::placeholder}" + }, + { + "src": "bad {:placeholder::foo}" + }, + { + "src": "bad {:placeholder option:=x}" + }, + { + "src": "bad {:placeholder :option=x}" + }, + { + "src": "bad {:placeholder option::x=y}" + }, + { + "src": "bad {$placeholder option}" + }, + { + "src": "bad {:placeholder @attribute=}" + }, + { + "src": "bad {:placeholder @attribute=@foo}" + }, + { + "src": "bad {:placeholder @attribute=$foo}" + }, + { + "src": "{ @misplaced = attribute }" + }, + { + "src": "no {placeholder end" + }, + { + "src": "no {$placeholder end" + }, + { + "src": "no {:placeholder end" + }, + { + "src": "no {|placeholder| end" + }, + { + "src": "no {|literal} end" + }, + { + "src": "no {|literal or placeholder end" + }, + { + "src": ".local bar = {|foo|} {{_}}" + }, + { + "src": ".local #bar = {|foo|} {{_}}" + }, 
+ { + "src": ".local $bar {|foo|} {{_}}" + }, + { + "src": ".local $bar = |foo| {{_}}" + }, + { "src": ".match {{foo}}" }, + { "src": ".match * {{foo}}" }, + { "src": ".match x * {{foo}}" }, + { "src": ".match |x| * {{foo}}" }, + { "src": ".match :x * {{foo}}" }, + { "src": ".match {$foo} * {{foo}}" }, + { "src": ".match {#foo} * {{foo}}" }, + { "src": ".input {$x :x} .match {$x} * {{foo}}" }, + { "src": ".input {$x :x} .match$x * {{foo}}" }, + { "src": ".input {$x :x} .match $x* {{foo}}" }, + { "src": ".input {$x :x} .match $x|x| {{foo}} * {{foo}}" }, + { "src": ".input {$x :x} .local $y = {y :y} .match $x$y * * {{foo}}" }, + { "src": ".input {$x :x} .local $y = {y :y} .match $x $y ** {{foo}}" }, + { "src": ".input {$x :x} .match $x" }, + { "src": ".input {$x :x} .match $x *" }, + { "src": ".input {$x :x} .match $x * foo" }, + { "src": ".input {$x :x} .match $x * {{foo}} extra" }, + { "src": ".n{a}{{}}" }, + { "src": "{^}" }, + { "src": "{!}" }, + { "src": ".n .{a}{{}}" }, + { "src": ".n. {a}{{}}" }, + { "src": ".n.{a}{b}{{}}" }, + { "src": "{!.}" }, + { "src": "{! .}" }, + { "src": "{%}" }, + { "src": "{*}" }, + { "src": "{+}" }, + { "src": "{<}" }, + { "src": "{>}" }, + { "src": "{?}" }, + { "src": "{~}" }, + { "src": "{^.}" }, + { "src": "{^ .}" }, + { "src": "{&}" }, + { "src": "{!.\\{}" }, + { "src": "{!. 
\\{}" }, + { "src": "{!|a|}" }, + { "src": "foo {+reserved}" }, + { "src": "foo {&private}" }, + { "src": "foo {?reserved @a @b=c}" }, + { "src": ".foo {42} {{bar}}" }, + { "src": ".foo{42}{{bar}}" }, + { "src": ".foo |}lit{| {42}{{bar}}" }, + { "src": ".i {1} {{}}" }, + { "src": ".l $y = {|bar|} {{}}" }, + { "src": ".l $x.y = {|bar|} {{}}" }, + { "src": "hello {|4.2| %number}" }, + { "src": "hello {|4.2| %n|um|ber}" }, + { "src": "{+42}" }, + { "src": "hello {|4.2| &num|be|r}" }, + { "src": "hello {|4.2| ^num|be|r}" }, + { "src": "hello {|4.2| +num|be|r}" }, + { "src": "hello {|4.2| ?num|be||r|s}" }, + { "src": "hello {|foo| !number}" }, + { "src": "hello {|foo| *number}" }, + { "src": "hello {?number}" }, + { "src": "{xyzz }" }, + { "src": "hello {$foo ~xyzz }" }, + { "src": "hello {$x xyzz }" }, + { "src": "{ !xyzz }" }, + { "src": "{~xyzz }" }, + { "src": "{ num x \\\\ abcde |aaa||3.14||42| r }" }, + { "src": "hello {$foo >num x \\\\ abcde |aaa||3.14| |42| r }" }, + { "src" : ".input{ $n ~ }{{{$n}}}" } + ] +} diff --git a/common/testData/messageFormat/tests/syntax.json b/common/testData/messageFormat/tests/syntax.json new file mode 100644 index 00000000000..27b74b2f302 --- /dev/null +++ b/common/testData/messageFormat/tests/syntax.json @@ -0,0 +1,702 @@ +{ + "$schema": "https://raw.githubusercontent.com/unicode-org/message-format-wg/main/test/schemas/v0/tests.schema.json", + "scenario": "Syntax", + "description": "Test cases that do not depend on any registry definitions.", + "defaultTestProperties": { + "locale": "en-US" + }, + "tests": [ + { + "description": "message -> simple-message -> ", + "src": "", + "exp": "" + }, + { + "description": "message -> simple-message -> simple-start pattern -> simple-start-char", + "src": "a", + "exp": "a" + }, + { + "description": "message -> simple-message -> simple-start pattern -> simple-start-char pattern -> ...", + "src": "hello", + "exp": "hello" + }, + { + "description": "message -> simple-message -> simple-start 
pattern -> escaped-char", + "src": "\\\\", + "exp": "\\" + }, + { + "description": "message -> simple-message -> simple-start pattern -> 1*escaped-char", + "src": "\\\\\\{\\|\\}", + "exp": "\\{|}" + }, + { + "description": "message -> simple-message -> simple-start pattern -> simple-start-char pattern -> ... -> simple-start-char *text-char placeholder", + "src": "hello {world}", + "exp": "hello world" + }, + { + "description": "message -> simple-message -> simple-start pattern -> simple-start-char pattern -> ... -> simple-start-char *text-char placeholder", + "src": "hello {|world|}", + "exp": "hello world" + }, + { + "description": "message -> simple-message -> s simple-start pattern -> s simple-start-char pattern -> ...", + "src": "\n hello\t", + "exp": "\n hello\t" + }, + { + "src": "hello {$place}", + "params": [ + { + "name": "place", + "value": "world" + } + ], + "exp": "hello world" + }, + { + "src": "hello {$place-.}", + "params": [ + { + "name": "place-.", + "value": "world" + } + ], + "exp": "hello world" + }, + { + "src": "hello {$place}", + "expErrors": [ + { + "type": "unresolved-variable" + } + ], + "exp": "hello {$place}" + }, + { + "description": "message -> simple-message -> simple-start pattern -> placeholder -> expression -> literal-expression -> \"{\" literal \"}\"", + "src": "{a}", + "exp": "a" + }, + { + "description": "... -> literal-expression -> \"{\" literal s annotation \"}\" -> \"{\" literal s function \"}\" -> \"{\" literal s \":\" identifier \"}\" -> \"{\" literal s \":\" name \"}\"", + "src": "{a :f}", + "exp": "{|a|}", + "expErrors": [{ "type": "unknown-function" }] + }, + { + "description": "... 
-> \"{\" literal s \":\" namespace \":\" name \"}\"", + "src": "{a :u:f}", + "exp": "{|a|}", + "expErrors": [{ "type": "unknown-function" }] + }, + { + "description": "message -> simple-message -> simple-start pattern -> placeholder -> expression -> variable-expression -> \"{\" variable \"}\"", + "src": "{$x}", + "exp": "{$x}", + "expErrors": [{ "type": "unresolved-variable" }] + }, + { + "description": "... -> variable-expression -> \"{\" variable s annotation \"}\" -> \"{\" variable s function \"}\" -> \"{\" variable s \":\" identifier \"}\" -> \"{\" variable s \":\" name \"}\"", + "src": "{$x :f}", + "exp": "{$x}", + "expErrors": [{ "type": "unresolved-variable" }, { "type": "unknown-function" }] + }, + { + "description": "... -> \"{\" variable s \":\" namespace \":\" name \"}\"", + "src": "{$x :u:f}", + "exp": "{$x}", + "expErrors": [{ "type": "unresolved-variable" }, { "type": "unknown-function" }] + }, + { + "description": "... -> annotation-expression -> function -> \"{\" \":\" namespace \":\" name \"}\"", + "src": "{:u:f}", + "exp": "{:u:f}", + "expErrors": [{ "type": "unknown-function" }] + }, + { + "description": "... 
-> annotation-expression -> function -> \"{\" \":\" name \"}\"", + "src": "{:f}", + "exp": "{:f}", + "expErrors": [{ "type": "unknown-function" }] + }, + { + "description": "message -> complex-message -> complex-body -> quoted-pattern -> \"{{\" pattern \"}}\" -> \"{{\"\"}}\"", + "src": "{{}}", + "exp": "" + }, + { + "description": "message -> simple-message -> simple-start pattern -> placeholder -> markup -> \"{\" \"#\" identifier \"}\"", + "src": "{#tag}", + "exp": "", + "expParts": [ + { + "type": "markup", + "kind": "open", + "name": "tag" + } + ] + }, + { + "description": "message -> complex-message -> *(declaration [s]) complex-body -> declaration complex-body -> input-declaration complex-body -> input variable-expression complex-body", + "src": ".input{$x}{{}}", + "exp": "" + }, + { + "description": "message -> complex-message -> s *(declaration [s]) complex-body s -> s declaration complex-body s -> s input-declaration complex-body s -> s input variable-expression complex-body s", + "src": "\t.input{$x}{{}}\n", + "exp": "" + }, + { + "description": "message -> complex-message -> *(declaration [s]) complex-body -> declaration declaration complex-body -> input-declaration input-declaration complex-body -> input variable-expression input variable-expression complex-body", + "src": ".input{$x}.input{$y}{{}}", + "exp": "" + }, + { + "description": "message -> complex-message -> *(declaration [s]) complex-body -> declaration s declaration complex-body -> input-declaration s input-declaration complex-body -> input variable-expression s input variable-expression complex-body", + "src": ".input{$x} .input{$y}{{}}", + "exp": "" + }, + { + "description": "message -> complex-message -> s *(declaration [s]) complex-body s -> s complex-body s", + "src": " {{}} ", + "exp": "" + }, + { + "description": "message -> complex-message -> *(declaration [s]) complex-body -> declaration declaration complex-body -> local-declaration input-declaration complex-body -> local s variable 
[s] \"=\" [s] expression input variable-expression complex-body", + "src": ".local $x ={a}.input{$y}{{}}", + "exp": "" + }, + { + "description": "message -> complex-message -> complex-body -> ... -> matcher -> match-statement variant -> match selector key quoted-pattern -> \".match\" variable literal quoted-pattern", + "src": ".local $a={a :f}.match $a a{{}}*{{}}", + "exp": "", + "expErrors": [{ "type": "unknown-function" }, { "type": "bad-selector" }] + }, + { + "description": "... input-declaration -> input s variable-expression ...", + "src": ".input {$x}{{}}", + "exp": "" + }, + { + "description": "... local-declaration -> local s variable s \"=\" expression ...", + "src": ".local $x ={a}{{}}", + "exp": "" + }, + { + "description": "... local-declaration -> local s variable \"=\" s expression ...", + "src": ".local $x= {a}{{}}", + "exp": "" + }, + { + "description": "... local-declaration -> local s variable s \"=\" expression ...", + "src": ".local $x = {a}{{}}", + "exp": "" + }, + { + "description": "input-declaration-like content in complex-message", + "src": "{{.input {$x}}}", + "params": [{ "name": "x", "value": "X" }], + "exp": ".input X" + }, + { + "description": "local-declaration-like content in complex-message with leading whitespace", + "src": "{{ .local $x = {$y}}}", + "params": [{ "name": "y", "value": "Y" }], + "exp": " .local $x = Y" + }, + { + "description": "... matcher -> match-statement [s] variant -> match 1*([s] selector) variant -> match selector selector variant -> match selector selector variant key s key quoted-pattern", + "src": ".local $a={a :f}.local $b={b :f}.match $a $b a b{{}}* *{{}}", + "exp": "", + "expErrors": [ + { "type": "unknown-function" }, + { "type": "bad-selector" }, + { "type": "unknown-function" }, + { "type": "bad-selector" } + ] + }, + { + "description": "... 
matcher -> match-statement [s] variant -> match 1*([s] selector) variant -> match selector variant variant ...", + "src": ".local $a={a :f}.match $a a{{}}b{{}}*{{}}", + "exp": "", + "expErrors": [{ "type": "unknown-function" }, { "type": "bad-selector" }] + }, + { + "description": "... variant -> key s quoted-pattern -> ...", + "src": ".local $a={a :f}.match $a a {{}}*{{}}", + "exp": "", + "expErrors": [{ "type": "unknown-function" }, { "type": "bad-selector" }] + }, + { + "description": "... variant -> key s key s quoted-pattern -> ...", + "src": ".local $a={a :f}.local $b={b :f}.match $a $b a b {{}}* *{{}}", + "exp": "", + "expErrors": [ + { "type": "unknown-function" }, + { "type": "bad-selector" }, + { "type": "unknown-function" }, + { "type": "bad-selector" } + ] + }, + { + "description": "... key -> \"*\" ...", + "src": ".local $a={a :f}.match $a *{{}}", + "exp": "", + "expErrors": [{ "type": "unknown-function" }, { "type": "bad-selector" }] + }, + { + "description": "simple-message -> simple-start pattern -> placeholder -> expression -> literal-expression -> \"{\" s literal \"}\"", + "src": "{ a}", + "exp": "a" + }, + { + "description": "... literal-expression -> \"{\" literal s attribute \"}\" -> \"{\" literal s \"@\" identifier \"}\"", + "src": "{a @c}", + "exp": "a" + }, + { + "description": "... -> literal-expression -> \"{\" literal s \"}\"", + "src": "{a }", + "exp": "a" + }, + { + "description": "simple-message -> simple-start pattern -> placeholder -> expression -> variable-expression -> \"{\" s variable \"}\"", + "src": "{ $x}", + "exp": "{$x}", + "expErrors": [{ "type": "unresolved-variable" }] + }, + { + "description": "... variable-expression -> \"{\" variable s attribute \"}\" -> \"{\" variable s \"@\" identifier \"}\"", + "src": "{$x @c}", + "exp": "{$x}", + "expErrors": [{ "type": "unresolved-variable" }] + }, + { + "description": "... 
-> variable-expression -> \"{\" variable s \"}\"", + "src": "{$x }", + "exp": "{$x}", + "expErrors": [{ "type": "unresolved-variable" }] + }, + { + "description": "simple-message -> simple-start pattern -> placeholder -> expression -> annotation-expression -> \"{\" s annotation \"}\"", + "src": "{ :f}", + "exp": "{:f}", + "expErrors": [{ "type": "unknown-function" }] + }, + { + "description": "... annotation-expression -> \"{\" annotation s attribute \"}\" -> \"{\" annotation s \"@\" identifier \"}\"", + "src": "{:f @c}", + "exp": "{:f}", + "expErrors": [{ "type": "unknown-function" }] + }, + { + "description": "... -> annotation-expression -> \"{\" annotation s \"}\"", + "src": "{:f }", + "exp": "{:f}", + "expErrors": [{ "type": "unknown-function" }] + }, + { + "description": "message -> simple-message -> simple-start pattern -> placeholder -> markup -> \"{\" s \"#\" identifier \"}\"", + "src": "{ #a}", + "exp": "" + }, + { + "description": "message -> simple-message -> simple-start pattern -> placeholder -> markup -> \"{\" \"#\" identifier option \"}\" -> \"{\" \"#\" identifier identifier \"=\" literal \"}\"", + "src": "{#tag foo=bar}", + "exp": "", + "expParts": [ + { + "type": "markup", + "kind": "open", + "name": "tag", + "options": { + "foo": "bar" + } + } + ] + }, + { + "description": "message -> simple-message -> simple-start pattern -> placeholder -> markup -> \"{\" \"#\" identifier attribute \"}\" -> \"{\" \"#\" identifier identifier \"=\" literal \"}\"", + "src": "{#a @c}", + "exp": "" + }, + { + "description": "message -> simple-message -> simple-start pattern -> placeholder -> markup -> \"{\" \"#\" identifier s \"}\" -> \"{\" \"#\" identifier identifier \"=\" literal \"}\"", + "src": "{#a }", + "exp": "" + }, + { + "description": "message -> simple-message -> simple-start pattern -> placeholder -> markup -> \"{\" \"#\" identifier \"/\" \"}\" -> \"{\" \"#\" identifier identifier \"=\" literal \"}\"", + "src": "{#a/}", + "exp": "" + }, + { + 
"description": "message -> simple-message -> simple-start pattern -> placeholder -> markup -> \"{\" \"/\" identifier \"}\"", + "src": "{/a}", + "exp": "" + }, + { + "description": "message -> simple-message -> simple-start pattern -> placeholder -> markup -> \"{\" s \"/\" identifier \"}\"", + "src": "{ /a}", + "exp": "" + }, + { + "description": "message -> simple-message -> simple-start pattern -> placeholder -> markup -> \"{\" \"/\" identifier option \"}\"", + "src": "{/tag foo=bar}", + "exp": "", + "expParts": [ + { + "type": "markup", + "kind": "close", + "name": "tag", + "options": { + "foo": "bar" + } + } + ] + }, + { + "description": "message -> simple-message -> simple-start pattern -> placeholder -> markup -> \"{\" \"/\" identifier s \"}\"", + "src": "{/a }", + "exp": "" + }, + { + "description": "... annotation-expression -> function -> \":\" identifier option", + "src": "{:f k=v}", + "exp": "{:f}", + "expErrors": [{ "type": "unknown-function" }] + }, + { + "description": "... option -> identifier s \"=\" literal", + "src": "{:f k =v}", + "exp": "{:f}", + "expErrors": [{ "type": "unknown-function" }] + }, + { + "description": "... option -> identifier \"=\" s literal", + "src": "{:f k= v}", + "exp": "{:f}", + "expErrors": [{ "type": "unknown-function" }] + }, + { + "description": "... option -> identifier s \"=\" s literal", + "src": "{:f k = v}", + "exp": "{:f}", + "expErrors": [{ "type": "unknown-function" }] + }, + { + "description": "... attribute -> \"@\" identifier \"=\" literal ...", + "src": "{a @c=d}", + "exp": "a" + }, + { + "description": "... attribute -> \"@\" identifier s \"=\" literal ...", + "src": "{a @c =d}", + "exp": "a" + }, + { + "description": "... attribute -> \"@\" identifier \"=\" s literal ...", + "src": "{a @c= d}", + "exp": "a" + }, + { + "description": "... attribute -> \"@\" identifier s \"=\" s literal ...", + "src": "{a @c = d}", + "exp": "a" + }, + { + "description": "... 
attribute -> \"@\" identifier s \"=\" s quoted-literal ...", + "src": "{42 @foo=|bar|}", + "exp": "42", + "expParts": [ + { + "type": "string", + "source": "|42|", + "value": "42" + } + ] + }, + { + "description": "... literal -> quoted-literal -> \"|\" \"|\" ...", + "src": "{||}", + "exp": "" + }, + { + "description": "... quoted-literal -> \"|\" quoted-char \"|\"", + "src": "{|a|}", + "exp": "a" + }, + { + "description": "... quoted-literal -> \"|\" escaped-char \"|\"", + "src": "{|\\\\|}", + "exp": "\\" + }, + { + "description": "... quoted-literal -> \"|\" quoted-char 1*escaped-char \"|\"", + "src": "{|a\\\\\\{\\|\\}|}", + "exp": "a\\{|}" + }, + { + "description": "... unquoted-literal -> number-literal -> %x30", + "src": "{0}", + "exp": "0" + }, + { + "description": "... unquoted-literal -> number-literal -> \"-\" %x30", + "src": "{-0}", + "exp": "-0" + }, + { + "description": "... unquoted-literal -> number-literal -> (%x31-39 *DIGIT) -> %x31", + "src": "{1}", + "exp": "1" + }, + { + "description": "... unquoted-literal -> number-literal -> (%x31-39 *DIGIT) -> %x31 DIGIT -> 11", + "src": "{11}", + "exp": "11" + }, + { + "description": "... unquoted-literal -> number-literal -> %x30 \".\" 1*DIGIT -> 0 \".\" 1", + "src": "{0.1}", + "exp": "0.1" + }, + { + "description": "... unquoted-literal -> number-literal -> %x30 \".\" 1*DIGIT -> %x30 \".\" DIGIT DIGIT -> 0 \".\" 1 2", + "src": "{0.12}", + "exp": "0.12" + }, + { + "description": "... unquoted-literal -> number-literal -> %x30 %i\"e\" 1*DIGIT -> %x30 \"e\" DIGIT", + "src": "{0e1}", + "exp": "0e1" + }, + { + "description": "... unquoted-literal -> number-literal -> %x30 %i\"e\" 1*DIGIT -> %x30 \"E\" DIGIT", + "src": "{0E1}", + "exp": "0E1" + }, + { + "description": "... unquoted-literal -> number-literal -> %x30 %i\"e\" \"-\" 1*DIGIT ...", + "src": "{0E-1}", + "exp": "0E-1" + }, + { + "description": "... 
unquoted-literal -> number-literal -> %x30 %i\"e\" \"+\" 1*DIGIT ...", + "src": "{0E-1}", + "exp": "0E-1" + }, + { + "src": "hello { world\t\n}", + "exp": "hello world" + }, + { + "src": "hello {\u3000world\r}", + "exp": "hello world" + }, + { + "src": "{$one} and {$two}", + "params": [ + { + "name": "one", + "value": 1.3 + }, + { + "name": "two", + "value": 4.2 + } + ], + "exp": "1.3 and 4.2" + }, + { + "src": "{$one} et {$two}", + "locale": "fr", + "params": [ + { + "name": "one", + "value": 1.3 + }, + { + "name": "two", + "value": 4.2 + } + ], + "exp": "1,3 et 4,2" + }, + { + "src": ".local $foo = {bar} {{bar {$foo}}}", + "exp": "bar bar" + }, + { + "src": ".local $foo = {|bar|} {{bar {$foo}}}", + "exp": "bar bar" + }, + { + "src": ".local $foo = {|bar|} {{bar {$foo}}}", + "params": [ + { + "name": "foo", + "value": "foo" + } + ], + "exp": "bar bar" + }, + { + "src": ".local $foo = {$bar} {{bar {$foo}}}", + "params": [ + { + "name": "bar", + "value": "foo" + } + ], + "exp": "bar foo" + }, + { + "src": ".local $foo = {$baz} .local $bar = {$foo} {{bar {$bar}}}", + "params": [ + { + "name": "baz", + "value": "foo" + } + ], + "exp": "bar foo" + }, + { + "src": ".input {$foo} {{bar {$foo}}}", + "params": [ + { + "name": "foo", + "value": "foo" + } + ], + "exp": "bar foo" + }, + { + "src": ".input {$foo} .local $bar = {$foo} {{bar {$bar}}}", + "params": [ + { + "name": "foo", + "value": "foo" + } + ], + "exp": "bar foo" + }, + { + "src": ".local $foo = {$baz} .local $bar = {$foo} {{bar {$bar}}}", + "params": [ + { + "name": "baz", + "value": "foo" + } + ], + "exp": "bar foo" + }, + { + "src": ".local $x = {42} .local $y = {$x} {{{$x} {$y}}}", + "exp": "42 42" + }, + { + "src": "{#tag}content", + "exp": "content", + "expParts": [ + { + "type": "markup", + "kind": "open", + "name": "tag" + }, + { + "type": "literal", + "value": "content" + } + ] + }, + { + "src": "{#ns:tag}content{/ns:tag}", + "exp": "content", + "expParts": [ + { + "type": "markup", + "kind": "open", + 
"name": "ns:tag" + }, + { + "type": "literal", + "value": "content" + }, + { + "type": "markup", + "kind": "close", + "name": "ns:tag" + } + ] + }, + { + "src": "{/tag}content", + "exp": "content", + "expParts": [ + { + "type": "markup", + "kind": "close", + "name": "tag" + }, + { + "type": "literal", + "value": "content" + } + ] + }, + { + "src": "{#tag foo=bar/}", + "exp": "", + "expParts": [ + { + "type": "markup", + "kind": "standalone", + "name": "tag", + "options": { + "foo": "bar" + } + } + ] + }, + { + "src": "{#tag a:foo=|foo| b:bar=$bar}", + "params": [ + { + "name": "bar", + "value": "b a r" + } + ], + "exp": "", + "expParts": [ + { + "type": "markup", + "kind": "open", + "name": "tag", + "options": { + "a:foo": "foo", + "b:bar": "b a r" + } + } + ] + }, + { + "src": "{42 @foo @bar=13}", + "exp": "42", + "expParts": [ + { + "type": "string", + "source": "|42|", + "value": "42" + } + ] + }, + { + "src": "{{trailing whitespace}} \n", + "exp": "trailing whitespace" + } + ] +} diff --git a/docs/ldml/tr35-messageFormat.md b/docs/ldml/tr35-messageFormat.md index 21a0580b7fd..409a1774810 100644 --- a/docs/ldml/tr35-messageFormat.md +++ b/docs/ldml/tr35-messageFormat.md @@ -173,7 +173,7 @@ The LDML specification is divided into the following parts: ## Introduction One of the challenges in adapting software to work for -users with different languages and cultures is the need for **_dynamic messages_**. +users with different languages and cultures is the need for **_dynamic messages_**. Whenever a user interface needs to present data as part of a larger string, that data needs to be formatted (and the message may need to be altered) to make it culturally accepted and grammatically correct. @@ -229,41 +229,33 @@ A reference to a _term_ looks like this. > The provisions of the stability policy are not in effect until > the conclusion of the technical preview and adoption of this specification. 
-Updates to this specification will not change -the syntactical meaning, the runtime output, or other behaviour -of valid messages written for earlier versions of this specification -that only use functions defined in this specification. +Updates to this specification will not make any valid _message_ invalid. + Updates to this specification will not remove any syntax provided in this version. -Future versions MAY add additional structure or meaning to existing syntax. -Updates to this specification will not remove any reserved keywords or sigils. +Updates to this specification MUST NOT specify an error for any message +that previously did not specify an error. -> [!NOTE] -> Future versions may define new keywords. +Updates to this specification MUST NOT specify the use of a fallback value for any message +that previously did not specify a fallback value. + +Updates to this specification will not change the syntactical meaning +of any syntax defined in this specification. -Updates to this specification will not reserve or assign meaning to -any character "sigils" except for those in the `reserved` production. +Updates to this specification will not remove any functions defined in the default registry. -Updates to this specification -will not remove any functions defined in the default registry nor -will they remove any options or option values. -Additional options or option values MAY be defined. +Updates to this specification will not remove any options or option values +defined in the default registry. > [!NOTE] -> This does not guarantee that the results of formatting will never change. -> Even when the specification doesn't change, +> The foregoing policies are _not_ a guarantee that the results of formatting will never change. +> Even when this specification or its implementation do not change, > the functions for date formatting, number formatting and so on -> will change their results over time. 
+> can change their results over time or behave differently due to local runtime +> differences in implementation or changes to locale data +> (such as due to the release of new CLDR versions). -Later specification versions MAY make previously invalid messages valid. - -Updates to this specification will not introduce message syntax that, -when parsed according to earlier versions of this specification, -would produce syntax or data model errors. -Such messages MAY produce errors when formatted -according to an earlier version of this specification. - -From version 2.0, MessageFormat will only reserve, define, or require +Updates to this specification will only reserve, define, or require function names or function option names consisting of characters in the ranges a-z, A-Z, and 0-9. All other names in these categories are reserved for the use of implementations or users. @@ -271,30 +263,35 @@ All other names in these categories are reserved for the use of implementations > [!NOTE] > Users defining custom names SHOULD include at least one character outside these ranges > to ensure that they will be compatible with future versions of this specification. +> They SHOULD also use the namespace feature to avoid collisions with other implementations. -Later versions of this specification will not introduce changes +Future versions of this specification will not introduce changes to the data model that would result in a data model representation based on this version being invalid. > For example, existing interfaces or fields will not be removed. -Later versions of this specification MAY introduce changes -to the data model that would result in future data model representations -not being valid for implementations of this version of the data model. - -> For example, a future version could introduce a new keyword, -> whose data model representation would be a new interface -> that is not recognized by this version's data model. 
+> [!IMPORTANT] +> This stability policy allows any of the following, non-exhaustive list, of changes +> in future versions of this specification: +> - Future versions may define new syntax and structures +> that would not be supported by this version of the specification. +> - Future versions may add additional structure or meaning to existing syntax. +> - Future versions may define new keywords. +> - Future versions may make previously invalid messages valid. +> - Future versions may define additional functions in the default registry +> or may reserve the names of functions for the purposes of interoperability. +> - Future versions may define additional options to existing functions. +> - Future versions may define additional option values for existing options. +> - Future versions may deprecate (but not remove) keywords, functions, options, or option values. +> - Future versions of this specification may introduce changes +> to the data model that would result in future data model representations +> not being valid for implementations of this version of the data model. +> - For example, a future version could introduce a new keyword, +> whose data model representation would be a new interface +> that is not recognized by this version's data model. -Later specification versions will not introduce syntax that cannot be -represented by this version of the data model. -> For example, a future version could introduce a new keyword. -> The future version's data model would provide an interface for that keyword -> while this version of the data model would parse the value into -> the interface `UnsupportedStatement`. -> Both data models would be "valid" in their context, -> but this version's would be missing any functionality for the new statement type. ## Syntax @@ -309,7 +306,7 @@ The design goals of the syntax specification are as follows: 1. 
The syntax should leverage the familiarity with ICU MessageFormat 1.0 in order to lower the barrier to entry and increase the chance of adoption. At the same time, - the syntax should fix the [pain points of ICU MessageFormat 1.0](https://github.com/unicode-org/message-format-wg/blob/main/docs/why_mf_next.md). +the syntax should fix the [pain points of ICU MessageFormat 1.0](https://github.com/unicode-org/message-format-wg/blob/main/docs/why_mf_next.md). - _Non-Goal_: Be backwards-compatible with the ICU MessageFormat 1.0 syntax. @@ -354,7 +351,7 @@ The syntax specification takes into account the following design restrictions: private-use code points (U+E000 through U+F8FF, U+F0000 through U+FFFFD, and U+100000 through U+10FFFD), unassigned code points, and other potentially confusing content. -### Messages and their Syntax +## Messages and their Syntax The purpose of MessageFormat is to allow content to vary at runtime. This variation might be due to placing a value into the content @@ -372,9 +369,9 @@ This part of the MessageFormat specification defines the syntax for a _message_, along with the concepts and terminology needed when processing a _message_ during the [formatting](#formatting) of a _message_ at runtime. -The complete formal syntax of a _message_ is described by the [ABNF](#complete-abnf). +The complete formal syntax of a _message_ is described by the [ABNF](#complete.abnf). -#### Well-formed vs. Valid Messages +### Well-formed vs. Valid Messages A _message_ is **_well-formed_** if it satisfies all the rules of the grammar. Attempting to parse a _message_ that is not _well-formed_ will result in a _Syntax Error_. @@ -382,13 +379,22 @@ Attempting to parse a _message_ that is not _well-formed_ will result in a _Synt A _message_ is **_valid_** if it is _well-formed_ and **also** meets the additional content restrictions and semantic requirements about its structure defined below for -_declarations_, _matcher_ and _options_. 
+_declarations_, _matcher_, and _options_. Attempting to parse a _message_ that is not _valid_ will result in a _Data Model Error_. -### The Message +## The Message A **_message_** is the complete template for a specific message formatting request. +A **_variable_** is a _name_ associated to a resolved value. + +An **_external variable_** is a _variable_ +whose _name_ and initial value are supplied by the caller +to MessageFormat or available in the _formatting context_. +Only an _external variable_ can appear as an _operand_ in an _input declaration_. + +A **_local variable_** is a _variable_ created as the result of a _local declaration_. + > [!NOTE] > This syntax is designed to be embeddable into many different programming languages and formats. > As such, it avoids constructs, such as character escapes, that are specific to any given file @@ -417,17 +423,23 @@ A **_message_** is the complete template for a specific message forma > > An exception to this is: whitespace inside a _pattern_ is **always** significant. > [!NOTE] -> The syntax assumes that each _message_ will be displayed with a left-to-right display order +> The MessageFormat 2 syntax assumes that each _message_ will be displayed +> with a left-to-right display order > and be processed in the logical character order. -> The syntax also permits the use of right-to-left characters in _identifiers_, +> The syntax permits the use of right-to-left characters in _identifiers_, > _literals_, and other values. -> This can result in confusion when viewing the _message_. -> -> Additional restrictions or requirements, -> such as permitting the use of certain bidirectional control characters in the syntax, -> might be added during the Tech Preview to better manage bidirectional text. -> Feedback on the creation and management of _messages_ -> containing bidirectional tokens is strongly desired. 
+> This can result in confusion when viewing the message +> or users might incorrectly insert bidi controls or marks that negatively affect the output +> of the message. +> +> To assist with this, the syntax permits the use of various controls and +> strongly-directional markers in both optional and required _whitespace_ +> in a _message_, as well as encouraging the use of isolating controls +> with _expressions_ and _quoted patterns_. +> See: [whitespace](#whitespace) (below) for more information. +> +> Additional restrictions or requirements might be added during the +> Tech Preview to better manage bidirectional text. A _message_ can be a _simple message_ or it can be a _complex message_. @@ -436,12 +448,15 @@ message = simple-message / complex-message ``` A **_simple message_** contains a single _pattern_, -with restrictions on its first character. -An empty string is a valid _simple message_. +with restrictions on its first non-whitespace character. +An empty string is a _valid_ _simple message_. + +Whitespace at the start or end of a _simple message_ is significant, +and a part of the _text_ of the _message_. ```abnf -simple-message = [simple-start pattern] -simple-start = simple-start-char / text-escape / placeholder +simple-message = o [simple-start pattern] +simple-start = simple-start-char / escaped-char / placeholder ``` A **_complex message_** is any _message_ that contains _declarations_, @@ -452,11 +467,14 @@ and consists of: 1. an optional list of _declarations_, followed by 2. a _complex body_ +Whitespace at the start or end of a _complex message_ is not significant, +and does not affect the processing of the _message_. + ```abnf -complex-message = *(declaration [s]) complex-body +complex-message = o *(declaration o) complex-body o ``` -#### Declarations +### Declarations A **_declaration_** binds a _variable_ identifier to a value within the scope of a _message_. This _variable_ can then be used in other _expressions_ within the same _message_. 
@@ -464,26 +482,23 @@ _Declarations_ are optional: many messages will not contain any _declarations_. An **_input-declaration_** binds a _variable_ to an external input value. The _variable-expression_ of an _input-declaration_ -MAY include an _annotation_ that is applied to the external value. +MAY include a _function_ that is applied to the external value. A **_local-declaration_** binds a _variable_ to the resolved value of an _expression_. -For compatibility with later MessageFormat 2 specification versions, -_declarations_ MAY also include _reserved statements_. - ```abnf -declaration = input-declaration / local-declaration / reserved-statement -input-declaration = input [s] variable-expression -local-declaration = local s variable [s] "=" [s] expression +declaration = input-declaration / local-declaration +input-declaration = input o variable-expression +local-declaration = local s variable o "=" o expression ``` -_Variables_, once declared, MUST NOT be redeclared. -A _message_ that does any of the following is not _valid_ and will produce a +_Variables_, once declared, MUST NOT be redeclared. +A _message_ that does any of the following is not _valid_ and will produce a _Duplicate Declaration_ error during processing: - A _declaration_ MUST NOT bind a _variable_ that appears as a _variable_ anywhere within a previous _declaration_. - An _input-declaration_ MUST NOT bind a _variable_ - that appears anywhere within the _annotation_ of its _variable-expression_. + that appears anywhere within the _function_ of its _variable-expression_. - A _local-declaration_ MUST NOT bind a _variable_ that appears in its _expression_. A _local-declaration_ MAY overwrite an external input value as long as the @@ -491,47 +506,19 @@ external input value does not appear in a previous _declaration_. > [!NOTE] > These restrictions only apply to _declarations_. 
-> A _placeholder_ or _selector_ can apply a different annotation to a _variable_ +> A _placeholder_ can apply a different _function_ to a _variable_ > than one applied to the same _variable_ named in a _declaration_. > For example, this message is _valid_: > ``` > .input {$var :number maximumFractionDigits=0} -> .match {$var :number maximumFractionDigits=2} -> 0 {{The selector can apply a different annotation to {$var} for the purposes of selection}} -> * {{A placeholder in a pattern can apply a different annotation to {$var :number maximumFractionDigits=3}}} +> .local $var2 = {$var :number maximumFractionDigits=2} +> .match $var2 +> 0 {{The selector can apply a different function to {$var} for the purposes of selection}} +> * {{A placeholder in a pattern can apply a different function to {$var :number maximumFractionDigits=3}}} > ``` > (See the [Errors](#errors) section for examples of invalid messages) -##### Reserved Statements - -A **_reserved statement_** reserves additional `.keywords` -for use by future versions of this specification. -Any such future keyword must start with `.`, -followed by two or more lower-case ASCII characters. - -The rest of the statement supports -a similarly wide range of content as _reserved annotations_, -but it MUST end with one or more _expressions_. - -```abnf -reserved-statement = reserved-keyword [s reserved-body] 1*([s] expression) -reserved-keyword = "." name -``` - -> [!NOTE] -> The `reserved-keyword` ABNF rule is a simplification, -> as it MUST NOT be considered to match any of the existing keywords -> `.input`, `.local`, or `.match`. - -This allows flexibility in future standardization, -as future definitions MAY define additional semantics and constraints -on the contents of these _reserved statements_. - -Implementations MUST NOT assign meaning or semantics to a _reserved statement_: -these are reserved for future standardization. -Implementations MUST NOT remove or alter the contents of a _reserved statement_. 
- -#### Complex Body +### Complex Body The **_complex body_** of a _complex message_ is the part that will be formatted. The _complex body_ consists of either a _quoted pattern_ or a _matcher_. @@ -540,29 +527,29 @@ The _complex body_ consists of either a _quoted pattern_ or a _matcher_. complex-body = quoted-pattern / matcher ``` -### Pattern +## Pattern A **_pattern_** contains a sequence of _text_ and _placeholders_ to be formatted as a unit. Unless there is an error, resolving a _message_ always results in the formatting of a single _pattern_. ```abnf -pattern = *(text-char / text-escape / placeholder) +pattern = *(text-char / escaped-char / placeholder) ``` A _pattern_ MAY be empty. A _pattern_ MAY contain an arbitrary number of _placeholders_ to be evaluated during the formatting process. -#### Quoted Pattern +### Quoted Pattern -A **_quoted pattern_** is a _pattern_ that is "quoted" to prevent -interference with other parts of the _message_. -A _quoted pattern_ starts with a sequence of two U+007B LEFT CURLY BRACKET `{{` +A **_quoted pattern_** is a _pattern_ that is "quoted" to prevent +interference with other parts of the _message_. +A _quoted pattern_ starts with a sequence of two U+007B LEFT CURLY BRACKET `{{` and ends with a sequence of two U+007D RIGHT CURLY BRACKET `}}`. ```abnf -quoted-pattern = "{{" pattern "}}" +quoted-pattern = o "{{" pattern "}}" ``` A _quoted pattern_ MAY be empty. @@ -573,7 +560,7 @@ A _quoted pattern_ MAY be empty. > {{}} > ``` -#### Text +### Text **_text_** is the translateable content of a _pattern_. Any Unicode code point is allowed, except for U+0000 NULL @@ -583,8 +570,8 @@ U+007B LEFT CURLY BRACKET `{`, and U+007D RIGHT CURLY BRACKET `}` MUST be escaped as `\\`, `\{`, and `\}` respectively. In the ABNF, _text_ is represented by non-empty sequences of -`simple-start-char`, `text-char`, and `text-escape`. 
-The first of these is used at the start of a _simple message_, +`simple-start-char`, `text-char`, `escaped-char`, and `s`. +The production `simple-start-char` represents the first non-whitespace in a _simple message_ and matches `text-char` except for not allowing U+002E FULL STOP `.`. The ABNF uses `content-char` as a shared base for _text_ and _quoted literal_ characters. @@ -592,10 +579,9 @@ Whitespace in _text_, including tabs, spaces, and newlines is significant and MU be preserved during formatting. ```abnf -simple-start-char = content-char / s / "@" / "|" -text-char = content-char / s / "." / "@" / "|" -quoted-char = content-char / s / "." / "@" / "{" / "}" -reserved-char = content-char / "." +simple-start-char = content-char / "@" / "|" +text-char = content-char / ws / "." / "@" / "|" +quoted-char = content-char / ws / "." / "@" / "{" / "}" content-char = %x01-08 ; omit NULL (%x00), HTAB (%x09) and LF (%x0A) / %x0B-0C ; omit CR (%x0D) / %x0E-1F ; omit SP (%x20) @@ -620,10 +606,10 @@ Otherwise, care must be taken to ensure that pattern-significant whitespace is p > > ```properties > hello = {{ Hello }} -> hello2=\ Hello \ +> hello2=\ Hello \ > ``` -#### Placeholder +### Placeholder A **_placeholder_** is an _expression_ or _markup_ that appears inside of a _pattern_ and which will be replaced during the formatting of a _message_. @@ -632,7 +618,7 @@ and which will be replaced during the formatting of a _message_. placeholder = expression / markup ``` -### Matcher +## Matcher A **_matcher_** is the _complex body_ of a _message_ that allows runtime selection of the _pattern_ to use for formatting. @@ -645,24 +631,31 @@ and at least one _variant_. When the _matcher_ is processed, the result will be a single _pattern_ that serves as the template for the formatting process. -A _message_ can only be considered _valid_ if the following requirements are -satisfied: - -- The number of _keys_ on each _variant_ MUST be equal to the number of _selectors_. 
-- At least one _variant_ MUST exist whose _keys_ are all equal to the "catch-all" key `*`. -- Each _selector_ MUST have an _annotation_, - or contain a _variable_ that directly or indirectly references a _declaration_ with an _annotation_. +A _message_ can only be considered _valid_ if the following requirements are satisfied; +otherwise, a corresponding _Data Model Error_ will be produced during processing: + +- _Variant Key Mismatch_: + The number of _keys_ on each _variant_ MUST be equal to the number of _selectors_. +- _Missing Fallback Variant_: + At least one _variant_ MUST exist whose _keys_ are all equal to the "catch-all" key `*`. +- _Missing Selector Annotation_: + Each _selector_ MUST be a _variable_ that + directly or indirectly references a _declaration_ with a _function_. +- _Duplicate Variant_: + Each _variant_ MUST use a list of _keys_ that is unique from that + of all other _variants_ in the _message_. + _Literal_ _keys_ are compared by their contents, not their syntactical appearance. ```abnf -matcher = match-statement 1*([s] variant) -match-statement = match 1*([s] selector) +matcher = match-statement s variant *(o variant) +match-statement = match 1*(s selector) ``` > A _message_ with a _matcher_: > > ``` > .input {$count :number} -> .match {$count} +> .match $count > one {{You have {$count} notification.}} > * {{You have {$count} notifications.}} > ``` @@ -670,18 +663,18 @@ match-statement = match 1*([s] selector) > A _message_ containing a _matcher_ formatted on a single line: > > ``` -> .match {:platform} windows {{Settings}} * {{Preferences}} +> .local $os = {:platform} .match $os windows {{Settings}} * {{Preferences}} > ``` -#### Selector +### Selector -A **_selector_** is an _expression_ that ranks or excludes the +A **_selector_** is a _variable_ whose resolved value ranks or excludes the _variants_ based on the value of the corresponding _key_ in each _variant_. 
The combination of _selectors_ in a _matcher_ thus determines which _pattern_ will be used during formatting. ```abnf -selector = expression +selector = variable ``` There MUST be at least one _selector_ in a _matcher_. @@ -692,7 +685,8 @@ There MAY be any number of additional _selectors_. > based on grammatical case: > > ``` -> .match {$userName :hasCase} +> .local $hasCase = {$userName :hasCase} +> .match $hasCase > vocative {{Hello, {$userName :person case=vocative}!}} > accusative {{Please welcome {$userName :person case=accusative}!}} > * {{Hello!}} @@ -703,7 +697,7 @@ There MAY be any number of additional _selectors_. > ``` > .input {$numLikes :integer} > .input {$numShares :integer} -> .match {$numLikes} {$numShares} +> .match $numLikes $numShares > 0 0 {{Your item has no likes and has not been shared.}} > 0 one {{Your item has no likes and has been shared {$numShares} time.}} > 0 * {{Your item has no likes and has been shared {$numShares} times.}} @@ -715,22 +709,22 @@ There MAY be any number of additional _selectors_. > * * {{Your item has {$numLikes} likes and has been shared {$numShares} times.}} > ``` -#### Variant +### Variant -A **_variant_** is a _quoted pattern_ associated with a set of _keys_ in a _matcher_. +A **_variant_** is a _quoted pattern_ associated with a list of _keys_ in a _matcher_. Each _variant_ MUST begin with a sequence of _keys_, -and terminate with a valid _quoted pattern_. +and terminate with a _valid_ _quoted pattern_. The number of _keys_ in each _variant_ MUST match the number of _selectors_ in the _matcher_. Each _key_ is separated from each other by whitespace. Whitespace is permitted but not required between the last _key_ and the _quoted pattern_. ```abnf -variant = key *(s key) [s] quoted-pattern +variant = key *(s key) quoted-pattern key = literal / "*" ``` -##### Key +#### Key A **_key_** is a value in a _variant_ for use by a _selector_ when ranking or excluding _variants_ during the _matcher_ process. 
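The _variant_ and _key_ requirements listed for a _matcher_ (_Variant Key Mismatch_, _Missing Fallback Variant_, and _Duplicate Variant_) can be sketched as a small validation pass. This is an illustrative sketch only, not part of the specification or of any implementation: the function name `check_matcher` and the representation of the catch-all key `*` as `None` are assumptions made for the example, and _Missing Selector Annotation_ is not covered. _Literal_ keys are compared by content after applying NFC.

```python
import unicodedata
from typing import List, Optional, Sequence

# Hypothetical representation: None stands for the catch-all key `*`,
# a str holds the content of a literal key (quotes already removed).
Key = Optional[str]


def check_matcher(selector_count: int, variants: Sequence[Sequence[Key]]) -> List[str]:
    """Return the names of the Data Model Errors found in the variant keys."""
    errors = []
    seen = set()
    has_fallback = False
    for keys in variants:
        # Each variant needs exactly one key per selector.
        if len(keys) != selector_count:
            errors.append("Variant Key Mismatch")
        # Compare literal keys by content, as if they were in NFC.
        normalized = tuple(
            None if key is None else unicodedata.normalize("NFC", key)
            for key in keys
        )
        if normalized in seen:
            errors.append("Duplicate Variant")
        seen.add(normalized)
        # A fallback variant consists only of catch-all keys.
        if keys and all(key is None for key in keys):
            has_fallback = True
    if not has_fallback:
        errors.append("Missing Fallback Variant")
    return errors
```

Because a literal key whose content is `*` (written `|*|`) is kept as a string here, it is not confused with the catch-all key.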
@@ -739,7 +733,13 @@ A _key_ can be either a _literal_ value or the "catch-all" key `*`. The **_catch-all key_** is a special key, represented by `*`, that matches all values for a given _selector_. -### Expressions +The value of each _key_ MUST be treated as if it were in +[Unicode Normalization Form C](https://unicode.org/reports/tr15/) ("NFC"). +Two _keys_ are considered equal if they are canonically equivalent strings, +that is, if they consist of the same sequence of Unicode code points after +Unicode Normalization Form C has been applied to both. + +## Expressions An **_expression_** is a part of a _message_ that will be determined during the _message_'s formatting. @@ -751,28 +751,27 @@ An _expression_ cannot contain another _expression_. An _expression_ MAY contain one more _attributes_. A **_literal-expression_** contains a _literal_, -optionally followed by an _annotation_. +optionally followed by a _function_. A **_variable-expression_** contains a _variable_, -optionally followed by an _annotation_. +optionally followed by a _function_. -An **_annotation-expression_** contains an _annotation_ without an _operand_. +A **_function-expression_** contains a _function_ without an _operand_. ```abnf -expression = literal-expression - / variable-expression - / annotation-expression -literal-expression = "{" [s] literal [s annotation] *(s attribute) [s] "}" -variable-expression = "{" [s] variable [s annotation] *(s attribute) [s] "}" -annotation-expression = "{" [s] annotation *(s attribute) [s] "}" +expression = literal-expression + / variable-expression + / function-expression +literal-expression = "{" o literal [s function] *(s attribute) o "}" +variable-expression = "{" o variable [s function] *(s attribute) o "}" +function-expression = "{" o function *(s attribute) o "}" ``` There are several types of _expression_ that can appear in a _message_. All _expressions_ share a common syntax. The types of _expression_ are: 1. 
The value of a _local-declaration_ -2. A _selector_ -3. A kind of _placeholder_ in a _pattern_ +2. A kind of _placeholder_ in a _pattern_ Additionally, an _input-declaration_ can contain a _variable-expression_. @@ -785,12 +784,6 @@ Additionally, an _input-declaration_ can contain a _variable-expression_. > .local $y = {|This is an expression|} > ``` > -> Selectors: -> -> ``` -> .match {$selector :functionRequired} -> ``` -> > Placeholders: > > ``` @@ -800,38 +793,28 @@ Additionally, an _input-declaration_ can contain a _variable-expression_. > This placeholder contains a function expression with a variable-valued option: {:function option=$variable} > ``` -#### Annotation - -An **_annotation_** is part of an _expression_ containing either -a _function_ together with its associated _options_, or -a _private-use annotation_ or a _reserved annotation_. - -```abnf -annotation = function - / private-use-annotation - / reserved-annotation -``` +### Operand An **_operand_** is the _literal_ of a _literal-expression_ or the _variable_ of a _variable-expression_. -An _annotation_ can appear in an _expression_ by itself or following a single _operand_. -When following an _operand_, the _operand_ serves as input to the _annotation_. +#### Function -##### Function - -A **_function_** is named functionality in an _annotation_. +A **_function_** is named functionality in an _expression_. _Functions_ are used to evaluate, format, select, or otherwise process data values during formatting. +A _function_ can appear in an _expression_ by itself or following a single _operand_. +When following an _operand_, the _operand_ serves as input to the _function_. + Each _function_ is defined by the runtime's _function registry_. 
A _function_'s entry in the _function registry_ will define whether the _function_ is a _selector_ or formatter (or both), whether an _operand_ is required, what form the values of an _operand_ can take, -what _options_ and _option_ values are valid, +what _options_ and _option_ values are acceptable, and what outputs might result. -See [function registry](#function-registry) for more information. +See [function registry](#registry) for more information. A _function_ starts with a prefix sigil `:` followed by an _identifier_. The _identifier_ MAY be followed by one or more _options_. @@ -847,7 +830,7 @@ function = ":" identifier *(s option) > It is now {$now :datetime}. > ``` -###### Options +##### Options An **_option_** is a key-value pair containing a named argument that is passed to a _function_. @@ -857,16 +840,17 @@ The _identifier_ is separated from the _value_ by an U+003D EQUALS SIGN `=` alon optional whitespace. The value of an _option_ can be either a _literal_ or a _variable_. -Multiple _options_ are permitted in an _annotation_. +Multiple _options_ are permitted in a _function_. _Options_ are separated from the preceding _function_ _identifier_ and from each other by whitespace. -Each _option_'s _identifier_ MUST be unique within the _annotation_: -an _annotation_ with duplicate _option_ _identifiers_ is not valid. +Each _option_'s _identifier_ MUST be unique within the _function_: +a _function_ with duplicate _option_ _identifiers_ is not _valid_ +and will produce a _Duplicate Option Name_ error during processing. The order of _options_ is not significant. ```abnf -option = identifier [s] "=" [s] (literal / variable) +option = identifier o "=" o (literal / variable) ``` > Examples of _functions_ with _options_ @@ -885,83 +869,7 @@ option = identifier [s] "=" [s] (literal / variable) > Today is {$date :datetime weekday=$dateStyle}! 
> ``` -##### Private-Use Annotations - -A **_private-use annotation_** is an _annotation_ whose syntax is reserved -for use by a specific implementation or by private agreement between multiple implementations. -Implementations MAY define their own meaning and semantics for _private-use annotations_. - -A _private-use annotation_ starts with either U+0026 AMPERSAND `&` or U+005E CIRCUMFLEX ACCENT `^`. - -Characters, including whitespace, are assigned meaning by the implementation. -The definition of escapes in the `reserved-body` production, used for the body of -a _private-use annotation_ is an affordance to implementations that -wish to use a syntax exactly like other functions. Specifically: - -- The characters `\`, `{`, and `}` MUST be escaped as `\\`, `\{`, and `\}` respectively - when they appear in the body of a _private-use annotation_. -- The character `|` is special: it SHOULD be escaped as `\|` in a _private-use annotation_, - but can appear unescaped as long as it is paired with another `|`. - This is an affordance to allow _literals_ to appear in the private use syntax. - -A _private-use annotation_ MAY be empty after its introducing sigil. - -```abnf -private-use-annotation = private-start [[s] reserved-body] -private-start = "^" / "&" -``` - -> [!NOTE] -> Users are cautioned that _private-use annotations_ cannot be reliably exchanged -> and can result in errors during formatting. -> It is generally a better idea to use the function registry -> to define additional formatting or annotation options. 
- -> Here are some examples of what _private-use_ sequences might look like: -> -> ``` -> Here's private use with an operand: {$foo &bar} -> Here's a placeholder that is entirely private-use: {&anything here} -> Here's a private-use function that uses normal function syntax: {$operand ^foo option=|literal|} -> The character \| has to be paired or escaped: {&private || |something between| or isolated: \| } -> Stop {& "translate 'stop' as a verb" might be a translator instruction or comment } -> Protect stuff in {^ph}{^/ph}private use{^ph}{^/ph} -> ``` - -##### Reserved Annotations - -A **_reserved annotation_** is an _annotation_ whose syntax is reserved -for future standardization. - -A _reserved annotation_ starts with a reserved character. -The remaining part of a _reserved annotation_, called a _reserved body_, -MAY be empty or contain arbitrary text that starts and ends with -a non-whitespace character. - -This allows maximum flexibility in future standardization, -as future definitions MAY define additional semantics and constraints -on the contents of these _annotations_. - -Implementations MUST NOT assign meaning or semantics to -an _annotation_ starting with `reserved-annotation-start`: -these are reserved for future standardization. -Whitespace before or after a _reserved body_ is not part of the _reserved body_. -Implementations MUST NOT remove or alter the contents of a _reserved body_, -including any interior whitespace, -but MAY remove or alter whitespace before or after the _reserved body_. - -While a reserved sequence is technically "well-formed", -unrecognized _reserved-annotations_ or _private-use-annotations_ have no meaning. - -```abnf -reserved-annotation = reserved-annotation-start [[s] reserved-body] -reserved-annotation-start = "!" / "%" / "*" / "+" / "<" / ">" / "?" 
/ "~" - -reserved-body = reserved-body-part *([s] reserved-body-part) -reserved-body-part = reserved-char / reserved-escape / quoted -``` - -### Markup +## Markup **_Markup_** _placeholders_ are _pattern_ parts that can be used to represent non-language parts of a _message_, @@ -987,8 +895,8 @@ It MAY include _options_. is a _pattern_ part ending a span. ```abnf -markup = "{" [s] "#" identifier *(s option) *(s attribute) [s] ["/"] "}" ; open and standalone - / "{" [s] "/" identifier *(s option) *(s attribute) [s] "}" ; close +markup = "{" o "#" identifier *(s option) *(s attribute) o ["/"] "}" ; open and standalone + / "{" o "/" identifier *(s option) *(s attribute) o "}" ; close ``` > A _message_ with one `button` markup span and a standalone `img` markup element: @@ -997,7 +905,8 @@ markup = "{" [s] "#" identifier *(s option) *(s attribute) [s] ["/"] "}" ; open > {#button}Submit{/button} or {#img alt=|Cancel| /}. > ``` -> A _message_ with attributes in the closing tag: +> A _message_ containing _markup_ that uses _options_ to pair +> two closing markup _placeholders_ to the one open markup _placeholder_: > > ``` > {#ansi attr=|bold,italic|}Bold and italic{/ansi attr=|bold|} italic only {/ansi attr=|italic|} no formatting.} @@ -1009,68 +918,27 @@ _Markup_ _placeholders_ can appear in any order without making the _message_ inv However, specifications or implementations defining _markup_ might impose requirements on the pairing, ordering, or contents of _markup_ during _formatting_. -### Attributes - -**_Attributes_ are reserved for standardization by future versions of this specification.** -Examples in this section are meant to be illustrative and -might not match future requirements or usage. - -> [!NOTE] -> The Tech Preview does not provide a built-in mechanism for overriding -> values in the _formatting context_ (most notably the locale) -> Nor does it provide a mechanism for identifying specific expressions -> such as by assigning a name or id. 
-> The utility of these types of mechanisms has been debated. -> There are at least two proposed mechanisms for implementing support for -> these. -> Specifically, one mechanism would be to reserve specifically-named options, -> possibly using a Unicode namespace (i.e. `locale=xxx` or `u:locale=xxx`). -> Such options would be reserved for use in any and all functions or markup. -> The other mechanism would be to use the reserved "expression attribute" syntax -> for this purpose (i.e. `@locale=xxx` or `@id=foo`) -> Neither mechanism was included in this Tech Preview. -> Feedback on the preferred mechanism for managing these features -> is strongly desired. -> -> In the meantime, function authors and other implementers are cautioned to avoid creating -> function-specific or implementation-specific option values for this purpose. -> One workaround would be to use the implementation's namespace for these -> features to insure later interoperability when such a mechanism is finalized -> during the Tech Preview period. -> Specifically: -> - Avoid specifying an option for setting the locale of an expression as different from -> that of the overall _message_ locale, or use a namespace that later maps to the final -> mechanism. -> - Avoid specifying options for the purpose of linking placeholders -> (such as to pair opening markup to closing markup). -> If such an option is created, the implementer should use an -> implementation-specific namespace. -> Users and implementers are cautioned that such options might be -> replaced with a standard mechanism in a future version. -> - Avoid specifying generic options to communicate with translators and -> translation tooling (i.e. implementation-specific options that apply to all -> functions. -> The above are all desirable features. -> We welcome contributions to and proposals for such features during the -> Technical Preview. 
+## Attributes An **_attribute_** is an _identifier_ with an optional value that appears in an _expression_ or in _markup_. +During formatting, _attributes_ have no effect, +and they can be treated as code comments. _Attributes_ are prefixed by a U+0040 COMMERCIAL AT `@` sign, followed by an _identifier_. -An _attribute_ MAY have a _value_ which is separated from the _identifier_ +An _attribute_ MAY have a _literal_ _value_ which is separated from the _identifier_ by an U+003D EQUALS SIGN `=` along with optional whitespace. -The _value_ of an _attribute_ can be either a _literal_ or a _variable_. Multiple _attributes_ are permitted in an _expression_ or _markup_. Each _attribute_ is separated by whitespace. -The order of _attributes_ is not significant. - +Each _attribute_'s _identifier_ SHOULD be unique within the _expression_ or _markup_: +all but the last _attribute_ with the same _identifier_ are ignored. +The order of _attributes_ is not otherwise significant. ```abnf -attribute = "@" identifier [[s] "=" [s] (literal / variable)] +attribute = "@" identifier [o "=" o literal] ``` > Examples of _expressions_ and _markup_ with _attributes_: @@ -1087,11 +955,11 @@ attribute = "@" identifier [[s] "=" [s] (literal / variable)] > Have a {#span @can-copy}great and wonderful{/span @can-copy} birthday! > ``` -### Other Syntax Elements +## Other Syntax Elements This section defines common elements used to construct _messages_. -#### Keywords +### Keywords A **_keyword_** is a reserved token that has a unique meaning in the _message_ syntax. @@ -1104,7 +972,7 @@ local = %s".local" match = %s".match" ``` -#### Literals +### Literals A **_literal_** is a character sequence that appears outside of _text_ in various parts of a _message_. @@ -1117,53 +985,70 @@ except for U+0000 NULL or the surrogate code points U+D800 through U+DFFF. All code points are preserved. -A **_quoted_** literal begins and ends with U+005E VERTICAL BAR `|`. 
-The characters `\` and `|` within a _quoted_ literal MUST be +> [!IMPORTANT] +> Most text, including that produced by common keyboards and input methods, +> is already encoded in the canonical form known as +> [Unicode Normalization Form C](https://unicode.org/reports/tr15) ("NFC"). +> A few languages, legacy character encoding conversions, or operating environments +> can result in _literal_ values that are not in this form. +> Some uses of _literals_ in MessageFormat, +> notably as the value of _keys_, +> apply NFC to the _literal_ value during processing or comparison. +> While there is no requirement that the _literal_ value actually be entered +> in a normalized form, +> users are cautioned to employ the same character sequences +> for equivalent values and, whenever possible, ensure _literals_ are in NFC. + +A **_quoted literal_** begins and ends with U+007C VERTICAL LINE `|`. +The characters `\` and `|` within a _quoted literal_ MUST be escaped as `\\` and `\|`. -An **_unquoted_** literal is a _literal_ that does not require the `|` +An **_unquoted literal_** is a _literal_ that does not require the `|` quotes around it to be distinct from the rest of the _message_ syntax. -An _unquoted_ MAY be used when the content of the _literal_ +An _unquoted literal_ MAY be used when the content of the _literal_ contains no whitespace and otherwise matches the `unquoted` production. -Any _unquoted_ literal MAY be _quoted_. -Implementations MUST NOT distinguish between _quoted_ and _unquoted_ literals +Implementations MUST NOT distinguish between _quoted literals_ and _unquoted literals_ that have the same sequence of code points. -_Unquoted_ literals can contain a _name_ or consist of a _number-literal_. +_Unquoted literals_ can contain a _name_ or consist of a _number-literal_.
+A _number-literal_ uses the same syntax as JSON and is intended for the encoding of number values in _operands_ or _options_, or as _keys_ for _variants_. ```abnf -literal = quoted / unquoted -quoted = "|" *(quoted-char / quoted-escape) "|" -unquoted = name / number-literal -number-literal = ["-"] (%x30 / (%x31-39 *DIGIT)) ["." 1*DIGIT] [%i"e" ["-" / "+"] 1*DIGIT] +literal = quoted-literal / unquoted-literal +quoted-literal = "|" *(quoted-char / escaped-char) "|" +unquoted-literal = name / number-literal +number-literal = ["-"] (%x30 / (%x31-39 *DIGIT)) ["." 1*DIGIT] [%i"e" ["-" / "+"] 1*DIGIT] ``` -#### Names and Identifiers - -An **_identifier_** is a character sequence that -identifies a _function_, _markup_, or _option_. -Each _identifier_ consists of a _name_ optionally preceeded by -a _namespace_. -When present, the _namespace_ is separated from the _name_ by a -U+003A COLON `:`. -Built-in _functions_ and their _options_ do not have a _namespace_ identifier. +### Names and Identifiers -The _namespace_ `u` (U+0075 LATIN SMALL LETTER U) -is reserved for future standardization. +A **_name_** is a character sequence used in an _identifier_ +or as the name for a _variable_ +or the value of an _unquoted literal_. -_Function_ _identifiers_ are prefixed with `:`. -_Markup_ _identifiers_ are prefixed with `#` or `/`. -_Option_ _identifiers_ have no prefix. +A _name_ can be preceded or followed by bidirectional marks or isolating controls +to aid in presenting names that contain right-to-left or neutral characters. +These characters are **not** part of the value of the _name_ and MUST be treated as if they were not present +when matching _name_ or _identifier_ strings or _unquoted literal_ values. -A **_name_** is a character sequence used in an _identifier_ -or as the name for a _variable_ -or the value of an _unquoted_ _literal_. +_Variable_ _names_ are prefixed with `$`. -_Variable_ names are prefixed with `$`. 
+Two _names_ are considered equal if they are canonically equivalent strings, +that is, if they consist of the same sequence of Unicode code points after +[Unicode Normalization Form C](https://unicode.org/reports/tr15/) ("NFC") +has been applied to both. -Valid content for _names_ is based on Namespaces in XML 1.0's +> [!NOTE] +> Implementations are not required to normalize all _names_. +> Comparisons of _name_ values only need be done "as-if" normalization +> has occurred. +> Since most text in the wild is already in NFC +> and since checking for NFC is fast and efficient, +> implementations can often substitute checking for actually applying normalization +> to _name_ values. + +Valid content for _names_ is based on Namespaces in XML 1.0's [NCName](https://www.w3.org/TR/xml-names/#NT-NCName). This is different from XML's [Name](https://www.w3.org/TR/xml/#NT-Name) in that it MUST NOT contain a U+003A COLON `:`. @@ -1174,6 +1059,21 @@ Otherwise, the set of characters allowed in a _name_ is large. > Such variables cannot be referenced in a _message_, > but are not otherwise errors. +An **_identifier_** is a character sequence that +identifies a _function_, _markup_, or _option_. +Each _identifier_ consists of a _name_ optionally preceded by +a _namespace_. +When present, the _namespace_ is separated from the _name_ by a +U+003A COLON `:`. +Built-in _functions_ and their _options_ do not have a _namespace_ identifier. + +The _namespace_ `u` (U+0075 LATIN SMALL LETTER U) +is reserved for future standardization. + +_Function_ _identifiers_ are prefixed with `:`. +_Markup_ _identifiers_ are prefixed with `#` or `/`. +_Option_ _identifiers_ have no prefix. + Examples: > A variable: >``` @@ -1197,66 +1097,163 @@ in this release.
```abnf variable = "$" name -option = identifier [s] "=" [s] (literal / variable) +option = identifier o "=" o (literal / variable) identifier = [namespace ":"] name namespace = name -name = name-start *name-char +name = [bidi] name-start *name-char [bidi] name-start = ALPHA / "_" / %xC0-D6 / %xD8-F6 / %xF8-2FF - / %x370-37D / %x37F-1FFF / %x200C-200D + / %x370-37D / %x37F-61B / %x61D-1FFF / %x200C-200D / %x2070-218F / %x2C00-2FEF / %x3001-D7FF / %xF900-FDCF / %xFDF0-FFFC / %x10000-EFFFF name-char = name-start / DIGIT / "-" / "." / %xB7 / %x300-36F / %x203F-2040 ``` -#### Escape Sequences +### Escape Sequences An **_escape sequence_** is a two-character sequence starting with U+005C REVERSE SOLIDUS `\`. An _escape sequence_ allows the appearance of lexically meaningful characters -in the body of _text_, _quoted_, or _reserved_ (which includes, in this case, -_private-use_) sequences respectively: +in the body of _text_ or _quoted literal_ sequences. +Each _escape sequence_ represents the literal character immediately following the initial `\`. ```abnf -text-escape = backslash ( backslash / "{" / "}" ) -quoted-escape = backslash ( backslash / "|" ) -reserved-escape = backslash ( backslash / "{" / "|" / "}" ) -backslash = %x5C ; U+005C REVERSE SOLIDUS "\" +escaped-char = backslash ( backslash / "{" / "|" / "}" ) +backslash = %x5C ; U+005C REVERSE SOLIDUS "\" ``` -#### Whitespace +> [!NOTE] +> The `escaped-char` rule allows escaping some characters in places where +> they do not need to be escaped, such as braces in a _quoted literal_. +> For example, `|foo {bar}|` and `|foo \{bar\}|` are synonymous. + +When writing or generating a _message_, escape sequences SHOULD NOT be used +unless required by the syntax. +That is, inside _literals_ only escape `|` +and inside _patterns_ only escape `{` and `}`. 
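The escaping guidance above (escape only what the syntax requires) can be illustrated with a short sketch. The helper names `escape_text`, `quote_literal`, and `unescape` are hypothetical and chosen for this example only; the sketch assumes that a backslash is always escaped, that _patterns_ additionally escape `{` and `}`, and that _quoted literals_ additionally escape `|`.

```python
# Characters that may follow a backslash according to the `escaped-char` rule.
ESCAPABLE = {"\\", "{", "|", "}"}


def escape_text(text: str) -> str:
    """Escape pattern text: only the backslash and the two braces."""
    return text.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}")


def quote_literal(value: str) -> str:
    """Serialize a quoted literal: only the backslash and `|` are escaped."""
    return "|" + value.replace("\\", "\\\\").replace("|", "\\|") + "|"


def unescape(body: str) -> str:
    """Resolve escape sequences in the body of text or of a quoted literal."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            # A backslash must be followed by one of the escapable characters.
            if i + 1 >= len(body) or body[i + 1] not in ESCAPABLE:
                raise ValueError(f"invalid escape sequence at index {i}")
            out.append(body[i + 1])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

The `unescape` helper accepts every sequence allowed by `escaped-char`, so escaped braces inside a _quoted literal_ resolve to the same value as unescaped ones, while the two serializers emit only the escapes that the surrounding context needs.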
+ +### Whitespace -**_Whitespace_** is defined as one or more of -U+0009 CHARACTER TABULATION (tab), -U+000A LINE FEED (new line), -U+000D CARRIAGE RETURN, -U+3000 IDEOGRAPHIC SPACE, -or U+0020 SPACE. +The syntax limits whitespace characters outside of a _pattern_ to the following: +`U+0009 CHARACTER TABULATION` (tab), +`U+000A LINE FEED` (new line), +`U+000D CARRIAGE RETURN`, +`U+3000 IDEOGRAPHIC SPACE`, +or `U+0020 SPACE`. Inside _patterns_ and _quoted literals_, whitespace is part of the content and is recorded and stored verbatim. Whitespace is not significant outside translatable text, except where required by the syntax. +There are two whitespace productions in the syntax. +**_Optional whitespace_** is whitespace that is not required by the syntax, +but which users might want to include to increase the readability of a _message_. +**_Required whitespace_** is whitespace that is required by the syntax. + +Both types of whitespace optionally permit the use of the bidirectional isolate controls +and certain strongly directional marks. +These can assist users in presenting _messages_ that contain right-to-left +text, _literals_, or _names_ (including those for _functions_, _options_, +_option values_, and _keys_) + +_Messages_ that contain right-to-left (aka RTL) characters SHOULD use one of the +following mechanisms to make messages display intelligibly in plain-text editors: + +1. Use paired isolating bidi controls `U+2066 LEFT-TO-RIGHT ISOLATE` ("LRI") + and `U+2069 POP DIRECTIONAL ISOLATE` ("PDI") as permitted by the ABNF around + parts of any _message_ containing RTL characters: + - _inside_ of _placeholder_ markers `{` and `}` + - _outside_ _quoted-pattern_ markers `{{` and `}}` + - _outside_ of _variable_, _function_, _markup_, or _attribute_, + including the identifying sigil (e.g. `$var` or `:ns:name`) +2. 
Use the 'local-effect' bidi marks + `U+061C ARABIC LETTER MARK`, `U+200E LEFT-TO-RIGHT MARK` or + `U+200F RIGHT-TO-LEFT MARK` as permitted by the ABNF before or after _identifiers_, + _names_, unquoted _literals_, or _option_ values, + especially when the values contain a mix of neutral, weakly directional, and + strongly directional characters. + +> [!IMPORTANT] +> Always take care **not** to add bidirectional controls or marks +> where they would be semantically significant +> or where they would unintentionally become part of the _message_'s output: +> - do not put them inside of a _literal_ except when they are part of the value, +> (instead put them outside of _literal_ quotes, such as `|...|`) +> - do not put them inside quoted _patterns_ except when they are part of the text, +> (instead put them outside of quoted _patterns_, such as `{{...}}`) +> - do not put them outside _placeholders_, +> (instead put them inside the _placeholder_, such as `{$foo :number}`) +> +> Controls placed inside _literal_ quotes or quoted _patterns_ are part of the _literal_ +> or _pattern_. +> Controls in a _pattern_ will appear in the output of the message. +> Controls inside _literal_ quotes are part of the _literal_ and +> will be considered in operations such as matching a _key_ to a _selector_. + +> [!NOTE] +> Users cannot be expected to create or manage bidirectional controls or +> marks in _messages_, since the characters are invisible and can be difficult +> to manage. +> Tools (such as resource editors or translation editors) +> and other implementations of MessageFormat 2 serialization are strongly +> encouraged to provide paired isolates around any right-to-left +> syntax as described above so that _messages_ display appropriately as plain text. + +These definitions of _whitespace_ implement +[UAX#31 Requirement R3a-2](https://www.unicode.org/reports/tr31/#R3a-2). 
+It is a profile of R3a-1 in that specification because: +- The following pattern whitespace characters are not allowed: + `U+000B VERTICAL TABULATION`, + `U+000C FORM FEED`, + `U+0085 NEXT LINE`, + `U+2028 LINE SEPARATOR` and + `U+2029 PARAGRAPH SEPARATOR`. +- The character `U+3000 IDEOGRAPHIC SPACE` + _is_ interpreted as whitespace. + - The following directional marks and isolates + are treated as ignorable format controls: + `U+061C ARABIC LETTER MARK`, + `U+200E LEFT-TO-RIGHT MARK`, + `U+200F RIGHT-TO-LEFT MARK`, + `U+2066 LEFT-TO-RIGHT ISOLATE`, + `U+2067 RIGHT-TO-LEFT ISOLATE`, + `U+2068 FIRST STRONG ISOLATE`, + and `U+2069 POP DIRECTIONAL ISOLATE`. + (The character `U+061C` is an addition according to R3a.) + + > [!NOTE] > The character U+3000 IDEOGRAPHIC SPACE is included in whitespace for > compatibility with certain East Asian keyboards and input methods, > in which users might accidentally create these characters in a _message_. ```abnf -s = 1*( SP / HTAB / CR / LF / %x3000 ) +; Required whitespace +s = *bidi ws o + +; Optional whitespace +o = *(ws / bidi) + +; Bidirectional marks and isolates +; ALM / LRM / RLM / LRI, RLI, FSI & PDI +bidi = %x061C / %x200E / %x200F / %x2066-2069 + +; Whitespace characters +ws = SP / HTAB / CR / LF / %x3000 ``` ## Complete ABNF -The grammar below uses the ABNF notation [[STD68](https://www.rfc-editor.org/info/std68)], +The grammar is formally defined below +using the ABNF notation [[STD68](https://www.rfc-editor.org/info/std68)], including the modifications found in [RFC 7405](https://www.rfc-editor.org/rfc/rfc7405). RFC7405 defines a variation of ABNF that is case-sensitive. Some ABNF tools are only compatible with the specification found in -[RFC 5234](https://www.rfc-editor.org/rfc/rfc5234). +[RFC 5234](https://www.rfc-editor.org/rfc/rfc5234).
To make `message.abnf` compatible with that version of ABNF, replace the rules of the same name with this block: @@ -1271,95 +1268,74 @@ match = %x2E.6D.61.74.63.68 ; ".match" ```abnf message = simple-message / complex-message -simple-message = [simple-start pattern] -simple-start = simple-start-char / text-escape / placeholder -pattern = *(text-char / text-escape / placeholder) +simple-message = o [simple-start pattern] +simple-start = simple-start-char / escaped-char / placeholder +pattern = *(text-char / escaped-char / placeholder) placeholder = expression / markup -complex-message = *(declaration [s]) complex-body -declaration = input-declaration / local-declaration / reserved-statement +complex-message = o *(declaration o) complex-body o +declaration = input-declaration / local-declaration complex-body = quoted-pattern / matcher -input-declaration = input [s] variable-expression -local-declaration = local s variable [s] "=" [s] expression +input-declaration = input o variable-expression +local-declaration = local s variable o "=" o expression -quoted-pattern = "{{" pattern "}}" +quoted-pattern = o "{{" pattern "}}" -matcher = match-statement 1*([s] variant) -match-statement = match 1*([s] selector) -selector = expression -variant = key *(s key) [s] quoted-pattern +matcher = match-statement s variant *(o variant) +match-statement = match 1*(s selector) +selector = variable +variant = key *(s key) quoted-pattern key = literal / "*" ; Expressions -expression = literal-expression - / variable-expression - / annotation-expression -literal-expression = "{" [s] literal [s annotation] *(s attribute) [s] "}" -variable-expression = "{" [s] variable [s annotation] *(s attribute) [s] "}" -annotation-expression = "{" [s] annotation *(s attribute) [s] "}" +expression = literal-expression + / variable-expression + / function-expression +literal-expression = "{" o literal [s function] *(s attribute) o "}" +variable-expression = "{" o variable [s function] *(s attribute) o "}" 
+function-expression = "{" o function *(s attribute) o "}" -annotation = function - / private-use-annotation - / reserved-annotation - -markup = "{" [s] "#" identifier *(s option) *(s attribute) [s] ["/"] "}" ; open and standalone - / "{" [s] "/" identifier *(s option) *(s attribute) [s] "}" ; close +markup = "{" o "#" identifier *(s option) *(s attribute) o ["/"] "}" ; open and standalone + / "{" o "/" identifier *(s option) *(s attribute) o "}" ; close ; Expression and literal parts function = ":" identifier *(s option) -option = identifier [s] "=" [s] (literal / variable) -; Attributes are reserved for future standardization -attribute = "@" identifier [[s] "=" [s] (literal / variable)] +option = identifier o "=" o (literal / variable) + +attribute = "@" identifier [o "=" o literal] variable = "$" name -literal = quoted / unquoted -quoted = "|" *(quoted-char / quoted-escape) "|" -unquoted = name / number-literal + +literal = quoted-literal / unquoted-literal +quoted-literal = "|" *(quoted-char / escaped-char) "|" +unquoted-literal = name / number-literal ; number-literal matches JSON number (https://www.rfc-editor.org/rfc/rfc8259#section-6) -number-literal = ["-"] (%x30 / (%x31-39 *DIGIT)) ["." 1*DIGIT] [%i"e" ["-" / "+"] 1*DIGIT] +number-literal = ["-"] (%x30 / (%x31-39 *DIGIT)) ["." 1*DIGIT] [%i"e" ["-" / "+"] 1*DIGIT] ; Keywords; Note that these are case-sensitive input = %s".input" local = %s".local" match = %s".match" -; Reserve additional .keywords for use by future versions of this specification. -reserved-statement = reserved-keyword [s reserved-body] 1*([s] expression) -; Note that the following production is a simplification, -; as this rule MUST NOT be considered to match existing keywords -; (`.input`, `.local`, and `.match`). -reserved-keyword = "." name - -; Reserve additional sigils for use by future versions of this specification. -reserved-annotation = reserved-annotation-start [[s] reserved-body] -reserved-annotation-start = "!" 
/ "%" / "*" / "+" / "<" / ">" / "?" / "~" - -; Reserve sigils for private-use by implementations. -private-use-annotation = private-start [[s] reserved-body] -private-start = "^" / "&" -reserved-body = reserved-body-part *([s] reserved-body-part) -reserved-body-part = reserved-char / reserved-escape / quoted - ; Names and identifiers ; identifier matches https://www.w3.org/TR/REC-xml-names/#NT-QName -; name matches https://www.w3.org/TR/REC-xml-names/#NT-NCName +; name matches https://www.w3.org/TR/REC-xml-names/#NT-NCName but excludes U+FFFD and U+061C identifier = [namespace ":"] name namespace = name -name = name-start *name-char +name = [bidi] name-start *name-char [bidi] name-start = ALPHA / "_" / %xC0-D6 / %xD8-F6 / %xF8-2FF - / %x370-37D / %x37F-1FFF / %x200C-200D + / %x370-37D / %x37F-61B / %x61D-1FFF / %x200C-200D / %x2070-218F / %x2C00-2FEF / %x3001-D7FF / %xF900-FDCF / %xFDF0-FFFC / %x10000-EFFFF name-char = name-start / DIGIT / "-" / "." / %xB7 / %x300-36F / %x203F-2040 ; Restrictions on characters in various contexts -simple-start-char = content-char / s / "@" / "|" -text-char = content-char / s / "." / "@" / "|" -quoted-char = content-char / s / "." / "@" / "{" / "}" -reserved-char = content-char / "." +simple-start-char = content-char / "@" / "|" +text-char = content-char / ws / "." / "@" / "|" +quoted-char = content-char / ws / "." 
/ "@" / "{" / "}" content-char = %x01-08 ; omit NULL (%x00), HTAB (%x09) and LF (%x0A) / %x0B-0C ; omit CR (%x0D) / %x0E-1F ; omit SP (%x20) @@ -1372,53 +1348,79 @@ content-char = %x01-08 ; omit NULL (%x00), HTAB (%x09) and LF (%x0A) / %xE000-10FFFF ; Character escapes -text-escape = backslash ( backslash / "{" / "}" ) -quoted-escape = backslash ( backslash / "|" ) -reserved-escape = backslash ( backslash / "{" / "|" / "}" ) -backslash = %x5C ; U+005C REVERSE SOLIDUS "\" +escaped-char = backslash ( backslash / "{" / "|" / "}" ) +backslash = %x5C ; U+005C REVERSE SOLIDUS "\" + +; Required whitespace +s = *bidi ws o + +; Optional whitespace +o = *(ws / bidi) -; Whitespace -s = 1*( SP / HTAB / CR / LF / %x3000 ) +; Bidirectional marks and isolates +; ALM / LRM / RLM / LRI, RLI, FSI & PDI +bidi = %x061C / %x200E / %x200F / %x2066-2069 + +; Whitespace characters +ws = SP / HTAB / CR / LF / %x3000 ``` ## Errors -Errors in messages and their formatting MAY occur and be detected -at different stages of processing. -Where available, -the use of validation tools is recommended, +Errors can occur during the processing of a _message_. +Some errors can be detected statically, +such as those due to problems with _message_ syntax, +violations of requirements in the data model, +or requirements defined by a _function_. +Other errors might be detected during selection or formatting of a given _message_. +Where available, the use of validation tools is recommended, as early detection of errors makes their correction easier. -### Error Handling +## Error Handling _Syntax Errors_ and _Data Model Errors_ apply to all message processors, and MUST be emitted as soon as possible. The other error categories are only emitted during formatting, but it might be possible to detect them with validation tools. -During selection, an _expression_ handler MUST only emit _Resolution Errors_ and _Selection Errors_. 
-During formatting, an _expression_ handler MUST only emit _Resolution Errors_ and _Formatting Errors_. +During selection and formatting, +_expression_ handlers MUST only emit _Message Function Errors_. + +Implementations do not have to check for or emit _Resolution Errors_ +or _Message Function Errors_ in _expressions_ that are not otherwise used by the _message_, +such as _placeholders_ in unselected _patterns_ +or _declarations_ that are never referenced during _formatting_. -_Resolution Errors_ and _Formatting Errors_ in _expressions_ that are not used -in _pattern selection_ or _formatting_ MAY be ignored, -as they do not affect the output of the formatter. +When formatting a _message_ with one or more errors, +an implementation MUST provide a mechanism to discover and identify +at least one of the errors. +The exact form of error signaling is implementation defined. +Some examples include throwing an exception, +returning an error code, +or providing a function or method for enumerating any errors. -In all cases, when encountering a runtime error, -a message formatter MUST provide some representation of the message. -An informative error or errors MUST also be separately provided. +For all _valid_ _messages_, +an implementation MUST enable a user to get a formatted result. +The formatted result might include _fallback values_ +such as when a _placeholder_'s _expression_ produced an error +during formatting. + +The two above requirements MAY be fulfilled by a single formatting method, +or separately by more than one such method. When a message contains more than one error, or contains some error which leads to further errors, an implementation which does not emit all of the errors SHOULD prioritise _Syntax Errors_ and _Data Model Errors_ over others. 
-When an error occurs within a _selector_, +When an error occurs while resolving a _selector_ +or calling MatchSelectorKeys with its resolved value, the _selector_ MUST NOT match any _variant_ _key_ other than the catch-all `*` -and a _Resolution Error_ or a _Selection Error_ MUST be emitted. +and a _Bad Selector_ error MUST be emitted. -### Syntax Errors +## Syntax Errors -**_Syntax Errors_** occur when the syntax representation of a message is not well-formed. +**_Syntax Errors_** occur when the syntax representation of a message is not _well-formed_. > Example invalid messages resulting in a _Syntax Error_: > @@ -1438,12 +1440,12 @@ and a _Resolution Error_ or a _Selection Error_ MUST be emitted. > .local $var = {|no message body|} > ``` -### Data Model Errors +## Data Model Errors -**_Data Model Errors_** occur when a message is invalid due to +**_Data Model Errors_** occur when a message is not _valid_ due to violating one of the semantic requirements on its structure. -#### Variant Key Mismatch +### Variant Key Mismatch A **_Variant Key Mismatch_** occurs when the number of keys on a _variant_ does not equal the number of _selectors_. @@ -1451,19 +1453,22 @@ does not equal the number of _selectors_. > Example invalid messages resulting in a _Variant Key Mismatch_ error: > > ``` -> .match {$one :func} +> .input {$one :func} +> .match $one > 1 2 {{Too many}} > * {{Otherwise}} > ``` > > ``` -> .match {$one :func} {$two :func} +> .input {$one :func} +> .input {$two :func} +> .match $one $two > 1 2 {{Two keys}} > * {{Missing a key}} > * * {{Otherwise}} > ``` -#### Missing Fallback Variant +### Missing Fallback Variant A **_Missing Fallback Variant_** error occurs when the message does not include a _variant_ with only catch-all keys. @@ -1471,46 +1476,49 @@ does not include a _variant_ with only catch-all keys. 
> Example invalid messages resulting in a _Missing Fallback Variant_ error: > > ``` -> .match {$one :func} +> .input {$one :func} +> .match $one > 1 {{Value is one}} > 2 {{Value is two}} > ``` > > ``` -> .match {$one :func} {$two :func} +> .input {$one :func} +> .input {$two :func} +> .match $one $two > 1 * {{First is one}} > * 1 {{Second is one}} > ``` -#### Missing Selector Annotation +### Missing Selector Annotation A **_Missing Selector Annotation_** error occurs when the _message_ -contains a _selector_ that does not have an _annotation_, -or contains a _variable_ that does not directly or indirectly reference a _declaration_ with an _annotation_. +contains a _selector_ that does not +directly or indirectly reference a _declaration_ with a _function_. > Examples of invalid messages resulting in a _Missing Selector Annotation_ error: > > ``` -> .match {$one} +> .match $one > 1 {{Value is one}} > * {{Value is not one}} > ``` > > ``` > .local $one = {|The one|} -> .match {$one} +> .match $one > 1 {{Value is one}} > * {{Value is not one}} > ``` > > ``` > .input {$one} -> .match {$one} +> .match $one > 1 {{Value is one}} > * {{Value is not one}} > ``` -#### Duplicate Declaration +### Duplicate Declaration A **_Duplicate Declaration_** error occurs when a _variable_ is declared more than once. Note that an input _variable_ is implicitly declared when it is first used, @@ -1541,7 +1549,7 @@ so explicitly declaring it after such use is also an error. > {{{$var} cannot be redefined. {$var2} cannot refer to itself}} > ``` -#### Duplicate Option Name +### Duplicate Option Name A **_Duplicate Option Name_** error occurs when the same _identifier_ appears on the left-hand side of more than one _option_ in the same _expression_. 
@@ -1557,12 +1565,36 @@ appears on the left-hand side of more than one _option_ in the same _expression_ > {{This is {$foo}}} > ``` -### Resolution Errors +### Duplicate Variant + +A **_Duplicate Variant_** error occurs when the +same list of _keys_ is used for more than one _variant_. + +> Examples of invalid messages resulting in a _Duplicate Variant_ error: +> +> ``` +> .input {$var :string} +> .match $var +> * {{The first default}} +> * {{The second default}} +> ``` +> +> ``` +> .input {$x :string} +> .input {$y :string} +> .match $x $y +> * foo {{The first "foo" variant}} +> bar * {{The "bar" variant}} +> * |foo| {{The second "foo" variant}} +> * * {{The default variant}} +> ``` + +## Resolution Errors **_Resolution Errors_** occur when the runtime value of a part of a message cannot be determined. -#### Unresolved Variable +### Unresolved Variable An **_Unresolved Variable_** error occurs when a variable reference cannot be resolved. @@ -1575,12 +1607,13 @@ An **_Unresolved Variable_** error occurs when a variable reference c > ``` > > ``` -> .match {$var :func} +> .input {$var :func} +> .match $var > 1 {{The value is one.}} > * {{The value is not one.}} > ``` -#### Unknown Function +### Unknown Function An **_Unknown Function_** error occurs when an _expression_ includes a reference to a function which cannot be resolved. @@ -1594,109 +1627,47 @@ a reference to a function which cannot be resolved. > ``` > > ``` -> .match {|horse| :func} -> 1 {{The value is one.}} -> * {{The value is not one.}} -> ``` - -#### Unsupported Expression - -An **_Unsupported Expression_** error occurs when an expression uses -syntax reserved for future standardization, -or for private implementation use that is not supported by the current implementation. - -> For example, attempting to format this message -> would always result in an _Unsupported Expression_ error: -> -> ``` -> The value is {!horse}. 
-> ``` -> -> Attempting to format this message would result in an _Unsupported Expression_ error -> if done within a context that does not support the `^` private use sigil: -> -> ``` -> .match {|horse| ^private} +> .local $horse = {|horse| :func} +> .match $horse > 1 {{The value is one.}} > * {{The value is not one.}} > ``` -#### Invalid Expression - -An **_Invalid Expression_** error occurs when a _message_ includes an _expression_ -whose implementation-defined internal requirements produce an error during _function resolution_ -or when a _function_ returns a value (such as `null`) that the implementation does not support. - -An **_Operand Mismatch Error_** is an _Invalid Expression_ error that occurs when -an _operand_ provided to a _function_ during _function resolution_ does not match one of the -expected implementation-defined types for that function; -or in which a literal _operand_ value does not have the required format -and thus cannot be processed into one of the expected implementation-defined types -for that specific _function_. - -> For example, the following _message_ produces an _Operand Mismatch Error_ -> (a type of _Invalid Expression_ error) -> because the literal `|horse|` does not match the production `number-literal`, -> which is a requirement of the function `:number` for its operand: -> ``` -> .local $horse = {horse :number} -> {{You have a {$horse}.}} -> ``` -> The following _message_ might produce an _Invalid Expression_ error if the -> the function `:function` threw an exception or otherwise emitted an error -> rather than returning a valid value: ->``` -> {{This has an invalid expression {$var :function} because it has a bug in it.}} ->``` - -#### Unsupported Statement +### Bad Selector -An **_Unsupported Statement_** error occurs when a message includes a _reserved statement_. +A **_Bad Selector_** error occurs when a message includes a _selector_ +with a resolved value which does not support selection. 
> For example, attempting to format this message -> would always result in an _Unsupported Statement_ error: +> would result in a _Bad Selector_ error: > > ``` -> .some {|horse|} -> {{The message body}} +> .local $day = {|2024-05-01| :date} +> .match $day +> * {{The due date is {$day}}} > ``` -### Selection Errors - -**_Selection Errors_** occur when message selection fails. - -> For example, attempting to format either of the following messages -> might result in a _Selection Error_ if done within a context that -> uses a `:number` selector function which requires its input to be numeric: -> -> ``` -> .match {|horse| :number} -> 1 {{The value is one.}} -> * {{The value is not one.}} -> ``` -> -> ``` -> .local $sel = {|horse| :number} -> .match {$sel} -> 1 {{The value is one.}} -> * {{The value is not one.}} -> ``` +## Message Function Errors -### Formatting Errors +A **_Message Function Error_** is any error that occurs +when calling a message function implementation +or which depends on validation associated with a specific function. -**_Formatting Errors_** occur during the formatting of a resolved value, -for example when encountering a value with an unsupported type -or an internally inconsistent set of options. +Implementations SHOULD provide a way for _functions_ to emit +(or cause to be emitted) any of the types of error defined in this section. +Implementations MAY also provide implementation-defined _Message Function Error_ types. > For example, attempting to format any of the following messages -> might result in a _Formatting Error_ if done within a context that +> might result in a _Message Function Error_ if done within a context that > -> 1. provides for the variable reference `$user` to resolve to +> 1. Provides for the variable reference `$user` to resolve to > an object `{ name: 'Kat', id: 1234 }`, -> 2. provides for the variable reference `$field` to resolve to +> 2. 
Provides for the variable reference `$field` to resolve to > a string `'address'`, and -> 3. uses a `:get` formatting function which requires its argument to be an object and -> an option `field` to be provided with a string value, +> 3. Uses a `:get` message function which requires its argument to be an object and +> an option `field` to be provided with a string value. +> +> The exact type of _Message Function Error_ is determined by the message function implementation. > > ``` > Hello, {horse :get field=name}! @@ -1715,296 +1686,101 @@ or an internally inconsistent set of options. > Your {$field} is {$id :get field=$field} > ``` -## Function Registry - -Implementations and tooling can greatly benefit from a -structured definition of formatting and matching functions available to messages at runtime. -This specification is intended to provide a mechanism for storing such declarations in a portable manner. - -### Goals - -_This section is non-normative._ +### Bad Operand -The registry provides a machine-readable description of MessageFormat 2 extensions (custom functions), -in order to support the following goals and use-cases: - -- Validate semantic properties of messages. For example: - - Type-check values passed into functions. - - Validate that matching functions are only called in selectors. - - Validate that formatting functions are only called in placeholders. - - Verify the exhaustiveness of variant keys given a selector. -- Support the localization roundtrip. For example: - - Generate variant keys for a given locale during XLIFF extraction. -- Improve the authoring experience. For example: - - Forbid edits to certain function options (e.g. currency options). - - Autocomplete function and option names. - - Display on-hover tooltips for function signatures with documentation. - - Display/edit known message metadata. - - Restrict input in GUI by providing a dropdown with all viable option values. 
- -### Conformance and Use - -_This section is normative._ - -To be conformant with MessageFormat 2.0, an implementation MUST implement -the _functions_, _options_ and _option_ values, _operands_ and outputs -described in the section [Default Registry](#default-registry) below. - -Implementations MAY implement additional _functions_ or additional _options_. -In particular, implementations are encouraged to provide feedback on proposed -_options_ and their values. - -> [!IMPORTANT] -> In the Tech Preview, the [registry data model](#registry-data-model) should -> be regarded as experimental. -> Changes to the format are expected during this period. -> Feedback on the registry's format and implementation is encouraged! - -Implementations are not required to provide a machine-readable registry -nor to read or interpret the registry data model in order to be conformant. - -The MessageFormat 2.0 Registry was created to describe -the core set of formatting and selection _functions_, -including _operands_, _options_, and _option_ values. -This is the minimum set of functionality needed for conformance. -By using the same names and values, _messages_ can be used interchangeably -by different implementations, -regardless of programming language or runtime environment. -This ensures that developers do not have to relearn core MessageFormat syntax -and functionality when moving between platforms -and that translators do not need to know about the runtime environment for most -selection or formatting operations. - -The registry provides a machine-readable description of _functions_ -suitable for tools, such as those used in translation automation, so that -variant expansion and information about available _options_ and their effects -are available in the translation ecosystem. 
-To that end, implementations are strongly encouraged to provide appropriately -tailored versions of the registry for consumption by tools -(even if not included in software distributions) -and to encourage any add-on or plug-in functionality to provide -a registry to support localization tooling. - -### Registry Data Model +A **_Bad Operand_** error is any error that occurs due to the content or format of the _operand_, +such as when the _operand_ provided to a _function_ during _function resolution_ does not match one of the +expected implementation-defined types for that function; +or in which a literal _operand_ value does not have the required format +and thus cannot be processed into one of the expected implementation-defined types +for that specific _function_. -_This section is non-normative._ +> For example, the following _messages_ each produce a _Bad Operand_ error +> because the literal `|horse|` does not match the `number-literal` production, +> which is a requirement of the function `:number` for its operand: +> +> ``` +> .local $horse = {|horse| :number} +> {{You have a {$horse}.}} +> ``` +> +> ``` +> .local $horse = {|horse| :number} +> .match $horse +> 1 {{The value is one.}} +> * {{The value is not one.}} +> ``` -> [!IMPORTANT] -> This part of the specification is not part of the Tech Preview. - -The registry contains descriptions of function signatures. - -The main building block of the registry is the `<function>` element. -It represents an implementation of a custom function available to translation at runtime. -A function defines a human-readable `<description>` of its behavior -and one or more machine-readable _signatures_ of how to call it. -Named `<validationRule>` elements can optionally define regex validation rules for -literals, option values, and variant keys.
- -MessageFormat 2 functions can be invoked in two contexts: - -- inside placeholders, to produce a part of the message's formatted output; - for example, a raw value of `|1.5|` may be formatted to `1,5` in a language which uses commas as decimal separators, -- inside selectors, to contribute to selecting the appropriate variant among all given variants. - -A single _function name_ may be used in both contexts, -regardless of whether it's implemented as one or multiple functions. - -A _signature_ defines one particular set of at most one argument and any number of named options -that can be used together in a single call to the function. -`<formatSignature>` corresponds to a function call inside a placeholder inside translatable text. -`<matchSignature>` corresponds to a function call inside a selector. - -A signature may define the positional argument of the function with the `<input>` element. -If the `<input>` element is not present, the function is defined as a nullary function. -A signature may also define one or more `