This is a proposal for better integration of MessageFormat 2 (Unicode proposal, TC39 proposal) with the rest of the web platform.
TLDR
import message from "./message.mf2" with { type: "messageformat" };
export function Notifications({ count }) {
return html`<p>${message.format({ count })}</p>`;
}
While I am strongly convinced that MessageFormat belongs to Intl, as it is an internationalization API, it differs significantly from all the other Intl.* APIs: it needs a lot of developer-provided data (all the translations!), rather than relying mostly on data from CLDR.
The TC39 proposal glosses over how developers are meant to retrieve the translations, and instead only shows examples with inline strings. In practice, applications will look similar to this:
const locale = getUserLocale();
const message = await fetch("/messages/notifications-count.mf2?lang=" + locale)
.then(response => response.text())
.then(raw => new Intl.MessageFormat(locale, raw));
export function Notifications({ count }) {
return html`
<p>${message.format({ count })}</p>
`;
}
or, if the developer is pre-bundling their MessageFormat messages in a JSON file, it could look like this:
const locale = getUserLocale();
// The fetch would actually be in its own module to be deduplicated
// among all the components that need it
const message = await fetch("/messages.json?lang=" + locale)
.then(response => response.json())
.then(raw => new Intl.MessageFormat(locale, raw.notifications));
export function Notifications({ count }) {
return html`
<p>${message.format({ count })}</p>
`;
}
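The shared module mentioned in the comment above could look roughly like this. It is only a sketch of the boilerplate applications write today: `createMessageLoader` and its `fetchJson` parameter are hypothetical names, and the URL shape is copied from the example above.

```javascript
// Hypothetical shared loader: issues at most one request per locale, no
// matter how many components ask for the same messages.
function createMessageLoader(fetchJson) {
  const cache = new Map();
  return function loadMessages(locale) {
    if (!cache.has(locale)) {
      // Cache the promise itself, so concurrent callers share one request.
      const url = "/messages.json?lang=" + encodeURIComponent(locale);
      cache.set(locale, fetchJson(url));
    }
    return cache.get(locale);
  };
}

// In a browser this could be wired up as:
// const loadMessages = createMessageLoader((url) =>
//   fetch(url).then((response) => response.json())
// );
```

Every application ends up carrying some variant of this module, which is the duplication the rest of this proposal tries to remove.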
Given that MessageFormat messages are a data resource used to render the app, the loading boilerplate could be abstracted away, similar to how it has been done for JSON and CSS. We could introduce a new module type specifically for MessageFormat, so that its usage would become as follows:
import message from "/messages/notifications.mf2" with { type: "messageformat" };
export function Notifications({ count }) {
return html`
<p>${message.format({ count })}</p>
`;
}
When importing a type: "messageformat" module, the following happens:
- as part of module loading, the browser fetches the imported file from the server
- the server chooses which language to provide to the client, through one of:
  - the Accept-Language header in the HTTP request
  - whatever preference they have stored in their database for the user
  - the referrer URL (for websites using, for example, en.example.com/my-page or example.com/en/my-page)
- the server responds to the HTTP request with the message, together with an indication of the message language, through one of:
  - the Content-Language HTTP header
  - some in-band annotation stored in the MessageFormat file (such as using a .lang keyword)
  - maybe with a fallback to navigator.language
- the browser parses the MessageFormat contents, and creates an Intl.MessageFormat object with the language defined by the server
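The server-side half of the steps above could be sketched as follows, for the Accept-Language / Content-Language option. This is an illustration rather than part of the proposal: `pickLanguage`, `respondWithMessage`, and the `messages` map are hypothetical names, the "en" default is an assumption, and the header parsing is deliberately simplified (exact and primary-subtag matches only, no wildcard support).

```javascript
// Hypothetical helper: choose a language from an Accept-Language header
// value, given the languages the server has translations for.
function pickLanguage(acceptLanguage, available, fallback) {
  const ranked = (acceptLanguage || "")
    .split(",")
    .map((part) => {
      const [tag, ...params] = part.trim().split(";");
      const q = params.map((p) => p.trim()).find((p) => p.startsWith("q="));
      return { tag: tag.trim().toLowerCase(), q: q ? Number(q.slice(2)) : 1 };
    })
    .filter((entry) => entry.tag && entry.tag !== "*" && entry.q > 0)
    .sort((a, b) => b.q - a.q);

  for (const { tag } of ranked) {
    // Exact match first, then the primary language subtag ("it-CH" -> "it").
    const exact = available.find((lang) => lang.toLowerCase() === tag);
    if (exact) return exact;
    const primary = tag.split("-")[0];
    const loose = available.find((lang) => lang.toLowerCase() === primary);
    if (loose) return loose;
  }
  return fallback;
}

// Hypothetical response builder: returns the message source together with
// the Content-Language header that tells the client which language it got.
// Assumes `messages` always contains an "en" entry to fall back to.
function respondWithMessage(acceptLanguage, messages) {
  const lang = pickLanguage(acceptLanguage, Object.keys(messages), "en");
  return {
    headers: { "Content-Language": lang, Vary: "Accept-Language" },
    body: messages[lang],
  };
}
```

The `Vary: Accept-Language` header matters here because the same URL now yields different bodies depending on the request headers, so caches need to key on it.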
This can work both for standalone messages and for hypothetical "message bundles" (https://github.com/eemeli/message-resource-wg/). A message bundle could have an annotation setting the language for all the messages in the file (e.g. @lang it at the beginning), and the messages could be exposed as named exports of the module.
While in many cases the language would be defined by the server, it's possible that it is client-controlled (for example, with an EN/CH switch at the top of the page that re-renders the page without reloading). In this case, applications could still pass it as a query parameter computed at runtime, using dynamic import:
const { default: message } = await import(
"/messages/notifications.mf2?lang=" + lang,
{ with: { type: "messageformat" } }
);
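Since the language now ends up in a URL, it may be worth validating it before building the specifier. A minimal sketch, assuming the path has no query string of its own; `messageSpecifier` is a hypothetical helper name, while `Intl.getCanonicalLocales` and `URLSearchParams` are existing platform APIs.

```javascript
// Hypothetical helper: build the module specifier for a message file in a
// given language, passing the language tag as a query parameter.
function messageSpecifier(path, lang) {
  // Intl.getCanonicalLocales throws a RangeError on malformed tags, so bad
  // input never reaches the URL. It also normalizes casing ("it-ch" -> "it-CH").
  const [canonical] = Intl.getCanonicalLocales(lang);
  const params = new URLSearchParams({ lang: canonical });
  return `${path}?${params}`;
}

// Usage with the proposed module type (not supported by any runtime today):
// const { default: message } = await import(
//   messageSpecifier("/messages/notifications.mf2", lang),
//   { with: { type: "messageformat" } }
// );
```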
I am proposing this feature for multiple reasons:
- Static analyzability: Imports are much easier for tools to analyze than fetch calls, so with an import-based syntax it would be possible to have:
  - bundlers that automatically bundle and tree-shake messages based on how they are used in the app
  - linters or type-checkers that check that you are passing the correct values to message.format
- Ergonomics: The logic to fetch messages and construct the Intl.MessageFormat objects is always the same, and this would abstract it away to a one-liner. It is the same reason we had for adding JSON modules to the platform.
- Syntax ownership: TC39 expressed that we are not sure whether we want to own the parsing logic for the syntax defined by MessageFormat, or whether we want to just provide the formatting/stringifying logic. I am convinced that a feature that does only half of the job, and thus requires loading a third-party library, is unfortunate, but also that having an Intl-related API in its own spec rather than together with all the other Intl APIs is unfortunate. Developing this feature in a well-integrated way, splitting responsibility between the JavaScript standard and other web standards, would avoid having to choose one of the two unfortunate directions.
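To illustrate the static analyzability point: because the specifier and the module type are literal strings in the import statement, a tool can find every message file an app depends on without running it. The sketch below uses a regular expression only to keep the example short; `findMessageImports` is a hypothetical name, and a real bundler or linter would work on a parsed syntax tree instead.

```javascript
// Hypothetical tool helper: list the specifiers of all static imports that
// use the proposed `type: "messageformat"` attribute.
function findMessageImports(source) {
  const pattern =
    /import\s+[^;]*?\s+from\s+(["'])([^"']+)\1\s+with\s*\{\s*type\s*:\s*(["'])messageformat\3\s*\}/g;
  // match[2] is the module specifier between the quotes.
  return Array.from(source.matchAll(pattern), (match) => match[2]);
}
```

The equivalent fetch-based code builds its URL at runtime, so the same information is generally not recoverable from the source text alone.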
Module integration is an interesting approach that I had not considered. Some initial thoughts:
The main reason why the Intl.MessageFormat proposal glosses over where the strings are coming from is that it can; there are multiple potentially valid ways to solve the problem, but all of them do need a formatter that it's providing. Also, the parseResource() part of it was split off into its own proposal. That might be an appropriate space for us to continue some parts of this conversation?
When considering a scope greater than what's happening with the formatting of one specific message, we almost always want to do something with multiple messages that need to correlate with each other: if a dialog prompts you to "Click OK to continue", the button label better be "OK". In other words, something like an imported module should always resolve as a bundle of messages.
The locale dependency makes resource loading a challenging problem to solve, as it means that a static identifier like /messages/notifications is not enough; we also need to ask for a specific locale or even a locale fallback chain (e.g. es-MX, es, fr, en). If we are to implement this in JS, we probably need to provide a solution that works really well in browsers as well as server-side, where module loading doesn't otherwise consider HTTP headers or user preferences. This raises the question of whether TC39 is the right place to solve the problem, rather than e.g. WhatWG?
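To make the fallback chain concrete, here is a minimal sketch of resolving a list of user preferences against the locales a resource is actually available in. `expandLocale` and `resolveLocale` are hypothetical names; the truncation is a simplification that compares tags case-sensitively and does not special-case extension or private-use subtags.

```javascript
// Expand one locale tag into progressively less specific tags,
// e.g. "es-MX" becomes ["es-MX", "es"].
function expandLocale(tag) {
  const subtags = tag.split("-");
  return subtags.map((_, i) => subtags.slice(0, subtags.length - i).join("-"));
}

// Hypothetical resolver: walk the user's preferences in order and return the
// first tag for which a message resource is available.
function resolveLocale(preferences, available, defaultLocale) {
  const have = new Set(available);
  for (const preferred of preferences) {
    const hit = expandLocale(preferred).find((tag) => have.has(tag));
    if (hit) return hit;
  }
  return defaultLocale;
}
```

With preferences of es-MX and fr and a default of en, this walks the same chain as the example above: es-MX, es, fr, en. Whatever does the resolving (the server, the module loader, or user code) needs both the preference list and the set of available locales, which is what a static specifier alone cannot express.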