CREATE EXTENSION IF NOT EXISTS "unaccent" | |
CREATE OR REPLACE FUNCTION slugify("value" TEXT) | |
RETURNS TEXT AS $$ | |
-- removes accents (diacritic signs) from a given string -- | |
WITH "unaccented" AS ( | |
SELECT unaccent("value") AS "value" | |
), | |
-- lowercases the string | |
"lowercase" AS ( | |
SELECT lower("value") AS "value" | |
FROM "unaccented" | |
), | |
-- remove single and double quotes | |
"removed_quotes" AS ( | |
SELECT regexp_replace("value", '[''"]+', '', 'gi') AS "value" | |
FROM "lowercase" | |
), | |
-- replaces anything that's not a letter, number, hyphen('-'), or underscore('_') with a hyphen('-') | |
"hyphenated" AS ( | |
SELECT regexp_replace("value", '[^a-z0-9\\-_]+', '-', 'gi') AS "value" | |
FROM "removed_quotes" | |
), | |
-- trims hyphens('-') if they exist on the head or tail of the string | |
"trimmed" AS ( | |
SELECT regexp_replace(regexp_replace("value", '\-+$', ''), '^\-', '') AS "value" | |
FROM "hyphenated" | |
) | |
SELECT "value" FROM "trimmed"; | |
$$ LANGUAGE SQL STRICT IMMUTABLE; |
CREATE FUNCTION public.set_slug_from_name() RETURNS trigger | |
LANGUAGE plpgsql | |
AS $$ | |
BEGIN | |
NEW.slug := slugify(NEW.name); | |
RETURN NEW; | |
END | |
$$; | |
CREATE TRIGGER "trg_slug_insert" | |
BEFORE INSERT ON "my_table_name" | |
FOR EACH ROW | |
WHEN (NEW.name IS NOT NULL AND NEW.slug IS NULL) | |
EXECUTE PROCEDURE set_slug_from_name(); |
How would you rework this to make sure the slug is unique and if not add a value at the end?
At the moment the trigger and function is not aware of what table it's running against, so there's no immediately obvious way to check the uniqueness. You could use TG_RELID
and pass through the table name to loop and check, but it's maybe not the best place for this logic.
A quick win would be to enforce uniqueness on the name column. Alternatively create the slug with the name and the primary key column:
NEW.slug := slugify(concat(NEW.name, '-', (NEW.id)::varchar);
Assuming you had a name of "Foo bar" and an ID of "1" this would give you "foo-bar-1" and the use of the primary key should always keep it unique, but you might not want to leak your primary key info like this.
This is great. Thank you. I was looking to have another function that checked the table and looped through it to see if the value was unique but this is a good fit for now. You rock.
Man, this is incredibly useful!! Thanks a lot, been searching for this for quite a while before I found your post: https://www.kdobson.net/2019/ultimate-postgresql-slug-function/
Man, this is incredibly useful!! Thanks a lot, been searching for this for quite a while before I found your post: https://www.kdobson.net/2019/ultimate-postgresql-slug-function/
Thanks augnustin. There's a function below which checks for duplicates then adds an incrementing number until it's unique. Not that well tested and probably not best practice:
`
CREATE or replace FUNCTION public.set_slug_from_name() RETURNS trigger
LANGUAGE plpgsql
AS
DECLARE
sql_string varchar;
tmp_slug varchar;
increment integer;
tmp_row record;
tmp_row_count integer;
BEGIN
tmp_row_count = 1;
increment = 0;
while tmp_row_count > 0 LOOP
if increment > 0 then
tmp_slug = slugify(NEW.name || ' ' || increment::varchar);
ELSE
tmp_slug = slugify(NEW.name);
end if;
sql_string = format(' select count(1) cnt from ' || TG_TABLE_NAME || ' where slug = ''' || tmp_slug || '''; ');
for tmp_row in execute(sql_string)
loop
raise notice '%', tmp_row;
tmp_row_count = tmp_row.cnt;
end loop;
increment = increment + 1;
END LOOP;
NEW.slug := tmp_slug;
RETURN NEW;
END
`
Cheers @kez
I don't need that though, because in my case, I simply need to write things like select * from users u where slufigy(users.city) = 'paris';
What do you think about this oneliner:
SELECT TRIM(BOTH '-_' FROM regexp_replace(unaccent(lower('VALUE')), '[^a-z0-9\\-_]+', '-', 'gi'));
- I think "remove single and double quotes" is not required, since the next regex will remove it anyway.
- the trims hyphens in your example does not remove underscore and TRIM is build in function in postgres.
If you don't care that the slug changes when the name changes, you can also make it a generated column since slugify(value text)
is immutable.
CREATE TABLE person (
name text not null,
slug text generated always as (slugify(name)) stored
);
I've rewritten this in PL/PGSQL, making it more readable to me. Also added some german specific umlaut replacement rules, and declared it as PARALLEL SAFE which allows for a substantially neater and faster query plan in my case.
CREATE EXTENSION IF NOT EXISTS "unaccent";
CREATE OR REPLACE FUNCTION slugify(t text) RETURNS text
AS $$
BEGIN
t := regexp_replace(t, '[Ää]', 'ae', 'g');
t := regexp_replace(t, '[Öö]', 'oe', 'g');
t := regexp_replace(t, '[Üü]', 'ue', 'g');
t := unaccent(t);
t := lower(t);
t := regexp_replace(t, '[''"]+', '', 'gi');
t := regexp_replace(t, '[^a-z0-9\\-_]+', '-', 'gi');
t := regexp_replace(t, '\-+$', '');
t := regexp_replace(t, '^\-', '');
RETURN t;
END;
$$ LANGUAGE plpgsql STRICT IMMUTABLE PARALLEL SAFE;
Thanks for sharing this modified version of @ianks function, the removal of the leading and trailing -
didn't work for me. So I replaced it with the trim
string function.
"trimmed" AS (
SELECT trim(both '-' from "value") AS "value"
FROM "hyphenated"
)
the following chars are all not being filtered out:
\ backslash
] right square bracket
^ caret
_ underscore
The issue seems to be a slash too many, \\-_
will match characters from range "\" to "_" (char code 92 to 95).
Removing one of the slashes made it work as intended, i.e. :
-- replaces anything that's not a letter, number, hyphen('-'), or underscore('_') with a hyphen('-')
"hyphenated" AS (
SELECT regexp_replace("value", '[^a-z0-9\-_]+', '-', 'gi') AS "value"
FROM "removed_quotes"
),
the following chars are all not being filtered out:
\ backslash ] right square bracket ^ caret _ underscore [...]
Thank you @NielsRenard! With your fix the slugify
function is now more robust for corner cases like "the title [something]".
@kez: can you please update the code in the gist please? This is a useful utility.
Forked to remove single and double quotes so they don't get switched to hyphens. Also updated the trailing/leading hyphen regex as the
\\-
was not working for me ('\-
does).