Last active
August 21, 2024 14:26
A slugify function for postgres
-- original source: https://medium.com/adhawk-engineering/using-postgresql-to-generate-slugs-5ec9dd759e88
-- https://www.postgresql.org/docs/9.6/unaccent.html
CREATE EXTENSION IF NOT EXISTS unaccent;
-- create the function in the public schema
CREATE OR REPLACE FUNCTION public.slugify(
  v TEXT
) RETURNS TEXT
  LANGUAGE plpgsql
  STRICT IMMUTABLE AS
$function$
BEGIN
  -- 1. trim trailing and leading whitespaces from text
  -- 2. remove accents (diacritic signs) from a given text
  -- 3. lowercase unaccented text
  -- 4. replace non-alphanumeric (excluding hyphen, underscore) with a hyphen
  -- 5. trim leading and trailing hyphens
  RETURN trim(BOTH '-' FROM regexp_replace(lower(unaccent(trim(v))), '[^a-z0-9\\-_]+', '-', 'gi'));
END;
$function$;
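A quick way to try the function, as a minimal sketch: it assumes a UTF-8 database with the unaccent extension installed with its default rules, and the sample inputs are made up for illustration rather than taken from the gist.

```sql
-- Hypothetical sample inputs; assumes public.slugify and unaccent exist as defined above.
SELECT public.slugify('  Hello, World!  ');   -- hello-world
SELECT public.slugify('Crème Brûlée');        -- creme-brulee

-- STRICT means a NULL argument returns NULL without running the function body.
SELECT public.slugify(NULL) IS NULL;          -- true
```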
If you put the hyphen at the end of the character class, there is no need to escape it at all: [^a-z0-9_-]+
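To illustrate the point about hyphen placement, here is a small sketch using regexp_replace directly. The input string is a made-up example. In a POSIX bracket expression a hyphen is literal when it comes last, or first after the ^.

```sql
-- Hyphen last: letters, digits, underscore and hyphen are kept; other runs become '-'.
SELECT regexp_replace('a_b-c d!e', '[^a-z0-9_-]+', '-', 'g');   -- a_b-c-d-e

-- Hyphen first (right after ^) behaves the same way.
SELECT regexp_replace('a_b-c d!e', '[^-a-z0-9_]+', '-', 'g');   -- a_b-c-d-e
```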
Thanks!
Note that I had issues restoring a dump of a database that uses this slugify
function. The solution was to schema-qualify unaccent
(not obvious; it took me hours to find):
CREATE OR REPLACE FUNCTION public.slugify(
  v TEXT
) RETURNS TEXT
  LANGUAGE plpgsql
  STRICT IMMUTABLE AS
$function$
BEGIN
  -- 1. trim trailing and leading whitespaces from text
  -- 2. remove accents (diacritic signs) from a given text
  -- 3. lowercase unaccented text
  -- 4. replace non-alphanumeric (excluding hyphen, underscore) with a hyphen
  -- 5. trim leading and trailing hyphens
  RETURN trim(BOTH '-' FROM regexp_replace(lower(public.unaccent(trim(v))), '[^a-z0-9\\-_]+', '-', 'gi'));
END;
$function$;
@kbsali if public is in your user's search path, you should not need to explicitly prefix unaccent with public.
AFAIK, the default search path is "$user", public.
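One possible explanation for the restore failure, offered as an assumption rather than a diagnosis of the case above: recent pg_dump versions emit a statement that empties search_path for the restore session, so public is not searched when the function body runs (for example while an index that calls slugify is rebuilt). A minimal sketch of how to check the path, and of an alternative to schema-qualifying the call, assuming unaccent is installed in public:

```sql
-- Show the search path of the current session (the default is "$user", public).
SHOW search_path;

-- Alternative to writing public.unaccent in the body: pin a search_path on the
-- function, so unaccent resolves the same way whatever the caller's session uses.
ALTER FUNCTION public.slugify(text) SET search_path = public, pg_temp;
```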
@libovness
Replace the regex with
[^a-z0-9\-_]+
(remove the extra backslash). According to regex101, the extra \ turns \\-_ into a range from \ to _,
which means that \, ], ^, and _ won't be replaced.
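The difference can be checked directly with regexp_replace. This is a sketch with a made-up input string, assuming standard_conforming_strings is on (the default), so that '\\' in the literal reaches the regex engine as two backslashes.

```sql
-- Original pattern: \\-_ is read as a range from \ to _ (code points 92 to 95),
-- so \ ] ^ _ are excluded from replacement. Per the comment above, this is
-- expected to return the input unchanged.
SELECT regexp_replace('a\b]c^d_e', '[^a-z0-9\\-_]+', '-', 'gi');

-- Hyphen placed last: only letters, digits, underscore and hyphen are kept.
SELECT regexp_replace('a\b]c^d_e', '[^a-z0-9_-]+', '-', 'gi');   -- a-b-c-d_e
```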