RegEx Snippets

Extract URLs

Find all links

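Works well for capturing the full URL in an editor search (for example, in Sublime Text 2). Note that {2,3} limits the top-level domain to two or three letters, so longer TLDs such as .info are cut short.

RegEx: (https?|ftps?)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/?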
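As a rough sketch of how the pattern behaves outside an editor, the Python snippet below runs it over a sample string. The function name `find_urls` and the sample text are illustrative assumptions, not part of the original snippet.

```python
import re

# Same pattern as the editor search; the escapes before ':' and '/' are
# dropped because Python does not need them.
URL_RE = re.compile(r"(https?|ftps?)://[a-zA-Z0-9\-.]+\.[a-zA-Z]{2,3}(/\S*)?/?")

def find_urls(text):
    """Return the full text of every URL match (group 0) found in text."""
    # finditer is used instead of findall, because findall would return
    # tuples of the capture groups rather than the whole match.
    return [m.group(0) for m in URL_RE.finditer(text)]

sample = "Visit http://www.yourwebsite.org/calculator/ or ftp://files.yourwebsite.org today."
print(find_urls(sample))
```

The path part is matched with `\S*`, so a URL followed directly by punctuation or markup (no whitespace) will include those trailing characters in the match.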

Find all links within a specific subfolder structure and replace with alt subfolders

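The following captures the URL in a SQL dump where quotation marks are escaped (\"). The last group stops at the first backslash or quotation mark, so the escaped quote is left out of the capture.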

URL pattern: http://www.yourwebsite.org/calculator/degrees/sociology
RegEx: (https?|ftps?)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/calculator/degrees/([^\\"]+)
Replacement: http://www.yourwebsite.org/degrees/$3/
Result: http://www.yourwebsite.org/degrees/sociology/
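The same find/replace can be sketched in Python, assuming the dump has been read into a string. The function name `move_degree_links` and the sample row are hypothetical; `\3` is Python's spelling of the `$3` backreference.

```python
import re

# Pattern from above as a Python raw string; group 3 is the last path segment.
SUBFOLDER_RE = re.compile(
    r'(https?|ftps?)://[a-zA-Z0-9\-.]+\.[a-zA-Z]{2,3}(/\S*)?/calculator/degrees/([^\\"]+)'
)

def move_degree_links(sql_text):
    """Rewrite /calculator/degrees/<name> links to /degrees/<name>/."""
    return SUBFOLDER_RE.sub(r"http://www.yourwebsite.org/degrees/\3/", sql_text)

# A sample fragment with escaped quotes, as they would appear in a SQL dump.
row = r'<a href=\"http://www.yourwebsite.org/calculator/degrees/sociology\">Sociology</a>'
print(move_degree_links(row))
```

Because the replacement hardcodes the host, any matching URL on another domain would also be rewritten to www.yourwebsite.org.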

Find all links without trailing slashes

This will find all hrefs that do not end in a trailing slash. NOTE: it will also detect links in your <head> and ones ending in .html that purposefully have no trailing slash, so be careful when performing a find/replace.

URL pattern: href="http://www.yourwebsite.org/calculator/degrees/financial"
RegEx: href="(\S)+[^/]"
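A small Python sketch of the caveat above: the pattern flags the link without a trailing slash, skips the one that has it, and also flags the .html link. The sample markup is made up for illustration.

```python
import re

# Same pattern as above: an href whose last character before the closing
# quote is not a forward slash.
HREF_NO_SLASH_RE = re.compile(r'href="(\S)+[^/]"')

html = (
    '<a href="http://www.yourwebsite.org/calculator/degrees/financial">A</a> '
    '<a href="http://www.yourwebsite.org/degrees/">B</a> '
    '<a href="http://www.yourwebsite.org/about.html">C</a>'
)

matches = [m.group(0) for m in HREF_NO_SLASH_RE.finditer(html)]
for match in matches:
    print(match)
```

Reviewing the matches like this before replacing makes it easier to spot links, such as the .html one, that should be left alone.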

Find all `id="*"` attributes

This will find all id attributes quoted with either ' or ".

RegEx: id=("|')[^"']*("|')
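A minimal Python sketch for pulling the id values out of some markup. It uses a variant of the RegEx above: a backreference (`\1`) requires the closing quote to match the opening one, and a second group captures the value. Both changes, and the sample markup, are assumptions made for this example.

```python
import re

# Group 1 is the opening quote, group 2 the id value; \1 requires the same
# quote character to close the attribute.
ID_RE = re.compile(r"""id=(["'])([^"']*)\1""")

html = """<div id="header"><span id='nav'></span><p class="id-less"></p></div>"""

ids = [m.group(2) for m in ID_RE.finditer(html)]
print(ids)
```

The pattern is not anchored to a word boundary, so attributes such as data-id="..." would match as well.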

Find all `<a href=""></a>` anchor tags

RegEx: <(?:\s?)[aA].*?href=[\'\"](?<link>.*?)[\'\"].*?>(?<text>.*)<(?:\s?)\/(?:\s?)[aA](?:\s?)>
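The named groups above use the `(?<name>...)` syntax. Python's `re` module spells them `(?P<name>...)`, so a sketch of the same pattern in Python looks like the following. The sample markup is an assumption for illustration.

```python
import re

# Same pattern as above with Python-style named groups.
ANCHOR_RE = re.compile(
    r"""<(?:\s?)[aA].*?href=['"](?P<link>.*?)['"].*?>(?P<text>.*)<(?:\s?)/(?:\s?)[aA](?:\s?)>"""
)

html = '<a class="btn" href="http://www.yourwebsite.org/degrees/">Degrees</a>'

m = ANCHOR_RE.search(html)
if m:
    print(m.group("link"))
    print(m.group("text"))
```

The `text` group is greedy (`.*`), so when several anchors share one line it extends to the last closing `</a>` on that line. Making it lazy (`.*?`) is one way to keep each anchor separate.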

Replace Subdomain URLs

Convert from http://old.yourwebsite.org/whatever/ to http://new.yourwebsite.com/whatever/

EXPLAINED:

since the pattern will contain literal forward slashes for the url (eg "scheme://domain/path"), we're delimiting the pattern with pipe chars to avoid having to backslash-escape each forward slash
just in case we have a mix of http & https urls, we'll match both with "https?" which means match "http" followed by zero or one "s" chars
we're backslash-escaping the dots in the domain, since in regex syntax an unescaped dot normally means "any single character other than a newline"
we're capturing the scheme before "old" and the ".yourwebsite." after it with parentheses in the pattern, then joining them around "new" in the replacement using backreferences ($1 and $2)
we're making the entire pattern match as case insensitive by adding an "i" flag after the closing pattern delimiter

preg_replace('|(https?://)old(\.yourwebsite\.)org|i', '$1new$2com', $content);
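For comparison, here is a Python sketch of the same conversion (old.yourwebsite.org to new.yourwebsite.com). Python needs no pattern delimiters, and the case-insensitive flag is passed as `re.IGNORECASE`. The function name `swap_subdomain` is hypothetical.

```python
import re

# Capture the scheme and the ".yourwebsite." part, then rebuild the host
# around them with the new subdomain and TLD.
SUBDOMAIN_RE = re.compile(r"(https?://)old(\.yourwebsite\.)org", re.IGNORECASE)

def swap_subdomain(content):
    # \g<1> and \g<2> are Python's equivalents of $1 and $2.
    return SUBDOMAIN_RE.sub(r"\g<1>new\g<2>com", content)

print(swap_subdomain("See http://old.yourwebsite.org/whatever/ for details."))
```

Captured text is reused exactly as it appeared, so an upper-case match keeps its original casing in the captured parts while "new" and "com" are inserted in lower case.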

Rewrite subfolder with keywords

#RewriteRule ^calculator/degrees(?:/([\w-]+?)(?:-in.+)?)?/?$ /degrees/$1/ [L,R=301]
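To see which paths the rule's pattern accepts and what ends up in `$1`, the pattern can be tried in Python. This is only an approximation of the rule, assuming it is applied to a path without a leading slash (as in a per-directory .htaccess context). The `rewrite` helper is hypothetical and does not model the redirect itself.

```python
import re

# Pattern copied from the RewriteRule above.
RULE_RE = re.compile(r"^calculator/degrees(?:/([\w-]+?)(?:-in.+)?)?/?$")

def rewrite(path):
    """Return the target path the rule would build, or None if it does not match."""
    m = RULE_RE.match(path)
    if m is None:
        return None
    # Group 1 is the keyword with any "-in..." suffix stripped; it is unset
    # when the path stops at calculator/degrees.
    return "/degrees/{}/".format(m.group(1) or "")

for path in ("calculator/degrees/sociology-in-texas",
             "calculator/degrees/sociology/",
             "about/"):
    print(path, "->", rewrite(path))
```

The lazy `[\w-]+?` is what lets the optional `-in.+` suffix be stripped from the keyword instead of being swallowed into `$1`.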

resultakak/RegEx Snippets.md
