@codeguy
Created September 24, 2013 13:19
Create slug from string in Javascript
function string_to_slug (str) {
  str = str.replace(/^\s+|\s+$/g, ''); // trim
  str = str.toLowerCase();
  // remove accents, swap ñ for n, etc
  var from = "àáäâèéëêìíïîòóöôùúüûñç·/_,:;";
  var to   = "aaaaeeeeiiiioooouuuunc------";
  for (var i=0, l=from.length ; i<l ; i++) {
    str = str.replace(new RegExp(from.charAt(i), 'g'), to.charAt(i));
  }
  str = str.replace(/[^a-z0-9 -]/g, '') // remove invalid chars
    .replace(/\s+/g, '-') // collapse whitespace and replace by -
    .replace(/-+/g, '-'); // collapse dashes
  return str;
}
@juanlanus

@torma616
AFAIK your version, which looks good, is missing the following line right after the normalize() call:
replace( /[\u0300-\u036f]/g, '' )

With the NFD form, normalize() splits each accented character in two: the base character and its combining accent.
The subsequent replace() line deletes the accents, which all fall in the Unicode "Combining Diacritical Marks" block (U+0300 to U+036F).
Removing the accents requires both steps.

The information behind my slugify version of June 6, 2020 comes from the MDN docs:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/normalize
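
For readers following along, the two steps described above can be sketched as below. This is only an illustration, not @torma616's actual code (which is not shown here): the function name slugify is hypothetical, and the cleanup rules after the accent removal are assumed to match the original gist.

```javascript
// Hypothetical helper; the name and the cleanup rules are assumptions
// borrowed from the original gist, not anyone's published version.
function slugify(str) {
  return str
    .normalize('NFD')                 // split "é" into "e" + combining accent
    .replace(/[\u0300-\u036f]/g, '')  // delete the combining accents
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9 -]/g, '')      // remove invalid chars
    .replace(/\s+/g, '-')             // collapse whitespace into a dash
    .replace(/-+/g, '-');             // collapse repeated dashes
}

console.log(slugify('  Crème Brûlée à la carte  ')); // "creme-brulee-a-la-carte"
```

Unlike the from/to lookup strings in the gist, this approach does not need a hand-maintained list of accented letters, but it only handles characters that actually decompose under NFD.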

@guillefd

Works! thanks.

@nithincb-oss

@codeguy - Could you please add an open-source license to this gist?

@auvansang

auvansang commented Jul 4, 2024

It seems this does not work for the "đ" character.

@juanlanus

@auvansang
You are right: "đ" cannot be normalized, because it is not the combination of a letter with an accent; it is a letter in its own right.
Check this:
https://stackoverflow.com/questions/2362810/why-doesnt-%C4%90-get-flattened-to-d-when-removing-accents-diacritics
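
A quick way to see this, and one possible workaround, is sketched below. The helper names and the extraLetters map are hypothetical, and mapping "đ" to "d" is an assumption that may not suit every application.

```javascript
// Strips combining accents after NFD decomposition.
const stripAccents = (s) =>
  s.normalize('NFD').replace(/[\u0300-\u036f]/g, '');

// Hypothetical map for letters that have no decomposition:
// U+0111 is "đ", U+0110 is its uppercase form. Extend as needed.
const extraLetters = { '\u0111': 'd', '\u0110': 'D' };

function stripAccentsAndMap(s) {
  return stripAccents(s).replace(/[\u0110\u0111]/g, (ch) => extraLetters[ch]);
}

console.log(stripAccents('đường'));       // "đuong": the "đ" survives
console.log(stripAccentsAndMap('đường')); // "duong"
```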

@thierryc

thierryc commented Aug 1, 2024

@auvansang
How should the character "đ" be replaced for you—by "d" or another letter?

@auvansang

@auvansang How should the character "đ" be replaced for you—by "d" or another letter?

Yes, "d" is the correct letter.

@juanlanus

If "đ" were replaced by "d", it would be theoretically possible (albeit not likely) to generate a duplicate slug.
For example, if I had a slug "mad" and the input for a new slug was "mađ", then I'd have a collision: two "mad" slugs.
For some applications this might be totally irrelevant, because the URLs typically have a numeric id before the slug, preventing address duplication.
So, the decision on how to replace "đ" depends on the context.
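
To make the collision concrete, here is a small sketch. The toSlug and toPath helpers, the id values, and the "/id/slug" URL shape are all assumptions for illustration only.

```javascript
// Hypothetical helper: flattens "đ" (U+0111) to "d", then keeps only
// a-z and 0-9, joining the remaining runs with single dashes.
const toSlug = (s) =>
  s.toLowerCase()
    .replace(/\u0111/g, 'd')
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');

// Both inputs collapse to the same slug.
console.log(toSlug('mad') === toSlug('ma\u0111')); // true

// Prefixing a unique id (assumed to come from the database)
// keeps the two URLs distinct even though the slugs collide.
const toPath = (id, title) => `/${id}/${toSlug(title)}`;
console.log(toPath(17, 'mad'));       // "/17/mad"
console.log(toPath(42, 'ma\u0111'));  // "/42/mad"
```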
