~
(?:(smile\.|www\.))? # optionally starts with smile. or www.
ama?zo?n\. # also allow shortened amzn.com URLs
(?:
com # match the Amazon domains listed here: .com, .ca, .co.uk, .co.jp, .de, .fr
|
ca
|
co\.uk
|
co\.jp
|
de
|
fr
)
/
(?: # here comes the stuff before the ASIN
exec/obidos/ASIN/ # the possible components of an Amazon URL
|
o/
|
gp/product/
|
(?: # the dp/ format may contain a title
(?:[^"\'/]*)/ # anything but a slash or quote
)? # optional
dp/
| # if amzn.com format, nothing before the ASIN
)
([A-Z0-9]{10}) # capture group $2 will contain the ASIN ($1 is the optional smile. or www. prefix)
(?: # everything after the ASIN
(?:/|\?|\#) # starting with a slash, question mark, or hash
(?:[^"\'\s]*) # everything up to a quote or white space
)? # optional
~isx
Last active August 31, 2022 11:12
A PHP regular expression to match Amazon links and extract the ASIN identifier
@mostmaz:
$asin = preg_replace($regex, '$2', $your_amazon_url);
or use preg_match and read $match[2].
You can replace $2 / $match[2] with $1 / $match[1] if you remove the inner capturing parentheses, changing
(?:(smile\.|www\.))?
to
(?:smile\.|www\.)?
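The preg_match route mentioned in the comment could look roughly like the sketch below. It is only an illustration: the pattern is a condensed copy of the one above with the inner capture group removed as suggested, so the ASIN lands in group 1; the function name extractAsin and the sample URLs are made up for the example.

```php
<?php
// Condensed copy of the gist's pattern with (?:(smile\.|www\.))? changed to
// (?:smile\.|www\.)?, so the ASIN is capture group 1 instead of 2.
// A nowdoc keeps the backslashes exactly as written.
$regex = <<<'REGEX'
~
(?:smile\.|www\.)?      # optional smile. or www. prefix, not captured
ama?zo?n\.              # amazon. or shortened amzn.
(?:com|ca|co\.uk|co\.jp|de|fr)
/
(?:
    exec/obidos/ASIN/
  | o/
  | gp/product/
  | (?:[^"\'/]*/)?dp/   # dp/ format, optionally preceded by a title segment
  |                     # amzn.com format: nothing before the ASIN
)
([A-Z0-9]{10})          # group 1: the ASIN
(?:[/?\#][^"\'\s]*)?    # optional trailing path, query string, or fragment
~isx
REGEX;

// Hypothetical helper: returns the ASIN, or null when the URL does not match.
function extractAsin(string $url, string $regex): ?string
{
    return preg_match($regex, $url, $match) === 1 ? $match[1] : null;
}

var_dump(extractAsin('https://www.amazon.com/Some-Title/dp/B08N5WRWNW/ref=sr_1_1', $regex)); // string(10) "B08N5WRWNW"
var_dump(extractAsin('https://amzn.com/B08N5WRWNW', $regex));                               // string(10) "B08N5WRWNW"
var_dump(extractAsin('https://example.com/dp/B08N5WRWNW', $regex));                         // NULL
```

One reason to prefer preg_match here: preg_replace only substitutes the part of the string the pattern matched, so anything the pattern does not consume, such as a leading https://, stays in the result. With preg_match you read the captured group directly and get the ASIN alone.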