Last active
December 12, 2024 12:07
PHP code to slugify strings into neat URLs
<?php
function slugify($string) {
    $string = transliterator_transliterate("Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC; [:Punctuation:] Remove; Lower();", $string);
    $string = preg_replace('/[-\s]+/', '-', $string);
    return trim($string, '-');
}
echo slugify("Hello Wörld! Καλημέρα. Привет, 1-й ёжик!. 富士山. 國語");
?>
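The function above relies on transliterator_transliterate, which is provided by PHP's intl extension and returns false on failure. A minimal sketch of a more defensive variant follows; the name slugify_checked and the choice to throw a RuntimeException are assumptions made for illustration, not part of the original gist.

```php
<?php
// Hypothetical defensive variant of slugify(); the name and the error
// handling strategy are illustrative assumptions.
function slugify_checked(string $string): string {
    // transliterator_transliterate() only exists when ext-intl is loaded.
    if (!function_exists('transliterator_transliterate')) {
        throw new RuntimeException('The intl extension is required.');
    }

    $latin = transliterator_transliterate(
        "Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC; [:Punctuation:] Remove; Lower();",
        $string
    );
    // The function returns false on failure, so stop before the regex step.
    if ($latin === false) {
        throw new RuntimeException('Transliteration failed.');
    }

    // preg_replace() can return null on a PCRE error; fall back to ''.
    $slug = preg_replace('/[-\s]+/', '-', $latin) ?? '';
    return trim($slug, '-');
}
```

Calling slugify_checked("Hello World") should return hello-world when the intl extension is available; without it, the sketch fails with an explicit message instead of a fatal "undefined function" error.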
PHP code to slugify strings into neat URLs.
Input: Hello Wörld! Καλημέρα. Привет, 1-й ёжик!. 富士山. 國語.
Output: hello-world-kalemera-privet-1j-ezik-fu-shi-shan-guo-yu
The construct
Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC; [:Punctuation:] Remove;
is part of the transliterator rule syntax for Unicode transformations, used by PHP's transliterator_transliterate function.

Code Explanation:

transliterator_transliterate function: "Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC; [:Punctuation:] Remove; Lower();" performs the following steps in order:
- Any-Latin: Transliterates characters from any script into their Latin equivalents (e.g., Cyrillic Привет → Privet).
- NFD (Normalization Form D, canonical decomposition): Decomposes characters into a base character plus any combining diacritical marks (e.g., é becomes e + ´).
- [:Nonspacing Mark:] Remove: Removes the combining diacritical marks (e.g., e + ´ becomes e).
- NFC (Normalization Form C, canonical composition): Recomposes any remaining decomposed sequences into composed form; the diacritical marks are already gone at this point.
- [:Punctuation:] Remove: Removes punctuation characters (e.g., !, ,, etc.). Hyphens in the input count as punctuation too, which is why 1-й becomes 1j in the output above.
- Lower(): Converts the resulting string to lowercase.

preg_replace function: preg_replace('/[-\s]+/', '-', $string) replaces each run of one or more whitespace characters or hyphens with a single hyphen. This ensures a clean, consistent slug format.

trim function: trim($string, '-') removes any leading or trailing hyphens from the resulting string.
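To see what each step contributes, the stages can be applied one at a time instead of as a single compound ID. The sketch below is only an illustration: slugify_trace is a hypothetical helper name, the last transliterator stage is written as Lower rather than Lower(), and the intl extension is assumed to be installed.

```php
<?php
// Hypothetical helper: runs each stage separately and records the
// intermediate string after every step. Requires ext-intl.
function slugify_trace(string $input): array {
    $stages = [
        'Any-Latin',
        'NFD',
        '[:Nonspacing Mark:] Remove',
        'NFC',
        '[:Punctuation:] Remove',
        'Lower', // the gist writes this stage as "Lower()"
    ];

    $trace = [];
    $current = $input;
    foreach ($stages as $id) {
        $next = transliterator_transliterate($id, $current);
        if ($next === false) {
            throw new RuntimeException("Stage failed: $id");
        }
        $current = $next;
        $trace[$id] = $current;
    }

    // Collapse runs of whitespace and hyphens into one hyphen.
    $current = preg_replace('/[-\s]+/', '-', $current) ?? '';
    $trace['collapse separators'] = $current;

    // Strip the hyphens left over from leading or trailing whitespace.
    $current = trim($current, '-');
    $trace['trim hyphens'] = $current;

    return $trace;
}

// Print every intermediate value for a small input.
foreach (slugify_trace("  Hello Wörld!  ") as $stage => $value) {
    printf("%-28s [%s]\n", $stage, $value);
}
```

For this input the collapse step yields -hello-world-, because the surrounding spaces turn into hyphens, and the final trim reduces it to hello-world. That intermediate value shows why the trim call is needed after preg_replace.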