@replaid
Last active October 29, 2022 13:36
Federated Wiki i18n proposal

OldSlug vs. Unislug comparison

| Input | OldSlug | Unislug | Unislug HTML link | Unislug appearance in location bar |
| --- | --- | --- | --- | --- |
| [[Peña]] | pea | peña | /pe%C3%B1a.html | /peña.html |
| [[Pea]] | pea | pea | /pea.html | /pea.html |
| [[RAMN]] | ramn | ramn | /ramn.html | /ramn.html |
| [[Ramén]] | ramn | ramén | /ram%C3%A9n.html | /ramén.html |
| [[Ramón]] | ramn | ramón | /ram%C3%B3n.html | /ramón.html |
| [[Гильдии]] | (empty string) | гильдии | /%D0%B3%D0%B8%D0%BB%D1%8C%D0%B4%D0%B8%D0%B8.html | /гильдии.html |
| [[По-русски]] | - | по-русски | /%D0%BF%D0%BE-%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%B8.html | /по-русски.html |
| [[Что-то]] | - | что-то | /%D1%87%D1%82%D0%BE-%D1%82%D0%BE.html | /что-то.html |
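
The columns above can be reproduced with a short sketch. The `old_slug` function follows the rule the OldSlug column implies (whitespace becomes a hyphen, anything outside ASCII letters, digits, and hyphens is dropped, and the result is lowercased). The `unislug` rule is an assumption, since this proposal does not spell it out: it is taken here to be the same procedure, except that non-ASCII letters and digits are kept. The function names are illustrative only.

```python
import re
from urllib.parse import quote


def old_slug(title: str) -> str:
    """OldSlug as implied by the table: ASCII letters, digits and hyphens only."""
    hyphenated = re.sub(r"\s", "-", title)
    return re.sub(r"[^A-Za-z0-9-]", "", hyphenated).lower()


def unislug(title: str) -> str:
    """Assumed Unislug rule: like OldSlug, but keep any Unicode letter or digit."""
    hyphenated = re.sub(r"\s", "-", title)
    kept = "".join(ch for ch in hyphenated if ch.isalnum() or ch == "-")
    return kept.lower()


def html_link(slug: str) -> str:
    """Percent-encode the slug's UTF-8 bytes for use in an href."""
    return "/" + quote(slug) + ".html"


for title in ["Peña", "Pea", "Ramón", "Гильдии", "По-русски"]:
    slug = unislug(title)
    print(repr(old_slug(title)), slug, html_link(slug))
```

Under these assumptions, `Peña` and `Pea` collide on the OldSlug `pea` but receive distinct Unislugs, and `Гильдии` no longer collapses to the empty string.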

Resolving conflicts

  1. At startup, the new wiki renames every page file to the Unislug derived from its page title. So
    • the Peña page, currently stored under the filename pea, is renamed to peña,
    • Ramón is renamed from ramn to ramón, and
    • По-русски is renamed from - to по-русски.
    The file name can be the literal Unislug string, its URL-encoded equivalent, its Punycode equivalent, or anything else that technically works and maps unambiguously to and from the Unislug.
  2. The new wiki builds a hashmap that translates OldSlugs to Unislugs, for example
    {
      "pea": "pea",
      "ramn": "ramón",
      "-": "по-русски"
    }
    Where several pages share an OldSlug, the map points to the file with the oldest creation time among the colliding files, which emulates how the existing wiki already behaves for such collisions.
  3. The new wiki renders each link using the URL-encoded Unislug as shown in the table above, which results in the browser displaying the URL in the correct script.
  4. When a link is clicked, wiki-server first looks for a matching Unislug page; if there is none, it falls back to the OldSlug mappings from step 2.
  5. The sitemap.json and site-index.json endpoints continue to present the OldSlug. New endpoints named something like sitemap-i18n.json and site-index-i18n.json provide the same data but with Unislugs.
  6. New wiki clients request sitemap-i18n.json first for a given wiki entering the neighborhood, then fall back to sitemap.json if that request returns a 404. The site-index request then uses whichever variant the sitemap request resolved to.
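
Steps 2, 4, and 6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the wiki-server implementation: the `unislug` rule is the same assumed one (keep Unicode letters, digits, and hyphens), creation times are plain comparable numbers standing in for file creation time, and `fetch` is a hypothetical callable that returns a `(status, body)` pair.

```python
import re


def old_slug(title):
    """OldSlug rule implied by the table: ASCII letters, digits and hyphens only."""
    return re.sub(r"[^A-Za-z0-9-]", "", re.sub(r"\s", "-", title)).lower()


def unislug(title):
    """Assumed Unislug rule: also keep non-ASCII letters and digits."""
    hyphenated = re.sub(r"\s", "-", title)
    return "".join(ch for ch in hyphenated if ch.isalnum() or ch == "-").lower()


def build_old_slug_map(pages):
    """Step 2: map each OldSlug to the Unislug of the oldest colliding page.

    `pages` is an iterable of (title, created) pairs.
    """
    oldest = {}
    for title, created in pages:
        key = old_slug(title)
        if key not in oldest or created < oldest[key][0]:
            oldest[key] = (created, unislug(title))
    return {key: slug for key, (_, slug) in oldest.items()}


def resolve(requested, unislugs, old_map):
    """Step 4: exact Unislug match first, then the OldSlug mapping, else None."""
    if requested in unislugs:
        return requested
    return old_map.get(requested)


def fetch_sitemap(fetch):
    """Step 6: try sitemap-i18n.json, fall back to sitemap.json on a 404."""
    status, body = fetch("/sitemap-i18n.json")
    if status == 404:
        status, body = fetch("/sitemap.json")
        return "sitemap.json", body
    return "sitemap-i18n.json", body


pages = [("Pea", 1), ("Peña", 2), ("Ramón", 3), ("По-русски", 4), ("Что-то", 5)]
old_map = build_old_slug_map(pages)
unislugs = {unislug(title) for title, _ in pages}
```

With this sample data, `old_map` matches the hashmap in step 2: the older Pea page keeps the OldSlug `pea`, and По-русски keeps `-` because it was created before Что-то. A request for `peña` resolves directly, while a legacy link to `ramn` falls through to the mapping and lands on `ramón`.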

I am not aware of anything that would break in this process. The collisions discussed already exist in the current wiki, and the character set usable in slugs remains strictly limited.
