@arterm-sedov
Last active December 12, 2024 12:07
PHP code to slugify strings into neat URLs
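<?php
// Requires the intl extension, which provides transliterator_transliterate().
function slugify($string) {
    // Transliterate to Latin, strip diacritics and punctuation, then lowercase.
    $string = transliterator_transliterate("Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC; [:Punctuation:] Remove; Lower();", $string);
    // Collapse each run of whitespace or hyphens into a single hyphen.
    $string = preg_replace('/[-\s]+/', '-', $string);
    // Drop leading and trailing hyphens.
    return trim($string, '-');
}
echo slugify("Hello Wörld! Καλημέρα. Привет, 1-й ёжик!. 富士山. 國語");
?>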

arterm-sedov commented Dec 12, 2024

PHP code to slugify strings into neat URLs.

Input: Hello Wörld! Καλημέρα. Привет, 1-й ёжик!. 富士山. 國語.

Output: hello-world-kalemera-privet-1j-ezik-fu-shi-shan-guo-yu

The string Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC; [:Punctuation:] Remove; Lower(); is a compound transform ID: a semicolon-separated chain of Unicode (ICU) transforms that PHP's transliterator_transliterate function applies from left to right.
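If many strings need slugifying, the same chain can be compiled once and reused. The sketch below is not part of the gist: make_slugifier is a hypothetical helper name, and it assumes the intl extension is loaded. It uses Transliterator::create, which returns null when the ID cannot be parsed, and it checks for the false that transliterate returns on failure.

```php
<?php
// Hypothetical helper (not from the gist): builds the transliterator once
// and returns a closure that reuses it. Assumes the intl extension is loaded.
function make_slugifier(): callable {
    $rules = 'Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC; [:Punctuation:] Remove; Lower();';

    // Transliterator::create() returns null if the ID cannot be parsed.
    $transliterator = Transliterator::create($rules);
    if ($transliterator === null) {
        throw new RuntimeException('Could not create transliterator: ' . intl_get_error_message());
    }

    return function (string $string) use ($transliterator): string {
        // transliterate() returns false on failure.
        $latin = $transliterator->transliterate($string);
        if ($latin === false) {
            throw new RuntimeException('Transliteration failed: ' . $transliterator->getErrorMessage());
        }
        // Same post-processing as the gist: collapse separators, trim hyphens.
        $slug = preg_replace('/[-\s]+/', '-', $latin);
        return trim($slug, '-');
    };
}

$slugify = make_slugifier();
echo $slugify("Hello Wörld!"), PHP_EOL; // prints "hello-world"
```

The closure keeps the gist's behavior unchanged and only adds explicit failure handling, which the one-shot function leaves implicit.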

Code Explanation:

  1. transliterator_transliterate Function:

    • This function applies Unicode transformations to a string using a set of transformation rules.
    • The rule "Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC; [:Punctuation:] Remove; Lower();" performs the following steps:
      • Any-Latin: Transliterates characters from any script into their Latin equivalents (e.g., Cyrillic Привет becomes Privet).
      • NFD (Normalization Form Decomposition): Decomposes characters into their base form plus separate combining marks (e.g., é becomes e + a combining acute accent).
      • [:Nonspacing Mark:] Remove: Removes those combining diacritical marks (e.g., e + combining acute accent becomes e).
      • NFC (Normalization Form Composition): Recomposes whatever decomposed sequences remain into their composed form; the diacritical marks removed in the previous step do not come back.
      • [:Punctuation:] Remove: Removes punctuation characters (e.g., ! and ,). The hyphen counts as punctuation too, which is why 1-й becomes 1j in the output above.
      • Lower(): Converts the resulting string to lowercase.
  2. preg_replace Function:

    • preg_replace('/[-\s]+/', '-', $string) replaces each run of whitespace (spaces, tabs, newlines) and/or hyphens with a single hyphen, so the words that survive transliteration are joined by exactly one separator.
  3. trim Function:

    • trim($string, '-') removes any leading or trailing hyphens from the resulting string.
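
To see what each transform contributes, the chain can be applied one prefix at a time. This is only an illustrative sketch that assumes the intl extension is loaded; the $steps, $chain and $stages variables are made up for the demonstration, and the sample input is a shortened piece of the gist's input.

```php
<?php
// Illustrative only: applies growing prefixes of the gist's rule chain so the
// effect of each stage is visible. Assumes the intl extension is loaded.
$steps = [
    'Any-Latin',
    'NFD',
    '[:Nonspacing Mark:] Remove',
    'NFC',
    '[:Punctuation:] Remove',
    'Lower()',
];

$input = "Hello Wörld! 1-й";
$chain = [];   // transforms applied so far
$stages = [];  // result after each added transform

foreach ($steps as $step) {
    $chain[] = $step;
    $id = implode('; ', $chain);
    $result = transliterator_transliterate($id, $input);
    if ($result === false) {
        throw new RuntimeException("Transliteration failed for: $id");
    }
    $stages[$step] = $result;
    printf("%-28s %s\n", $step, $result);
}

// Final clean-up, as in the gist: collapse separators and trim hyphens.
$slug = trim(preg_replace('/[-\s]+/', '-', end($stages)), '-');
echo $slug, PHP_EOL;
```

Going by the gist's own output, the final line should read hello-world-1j: the hyphen in 1-й disappears at the punctuation step, before preg_replace runs.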
