Created September 30, 2011 20:28
YouTube video ID regex
# Parses YouTube URLs directly or from iframe code. Handles:
#   * Address bar on YouTube url (ex: http://www.youtube.com/watch?v=ZFqlHhCNBOI)
#   * Direct http://youtu.be/ url (ex: http://youtu.be/ZFqlHhCNBOI)
#   * Full iframe embed code (ex: <iframe src="http://www.youtube.com/embed/ZFqlHhCNBOI">)
#   * Old <object> tag embed code (ex: <object><param name="movie" value="http://www.youtube.com/v/ZFqlHhCNBOI">...)
/(youtu\.be\/|youtube\.com\/(watch\?(.*&)?v=|(embed|v)\/))([^\?&"'>]+)/
$5 #=> the video ID
# test it on Rubular: http://rubular.com/r/eaJeSMkJvo
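As a minimal sketch of how the pattern above might be used from Ruby: the constant `YOUTUBE_ID_REGEX` and the helper `youtube_id` are names made up for illustration, and the inputs are the four examples from the comments above. It reads capture group 5 from the `MatchData` instead of relying on the `$5` global.

```ruby
# The regex from the gist, unchanged. Capture group 5 holds the video ID.
YOUTUBE_ID_REGEX = /(youtu\.be\/|youtube\.com\/(watch\?(.*&)?v=|(embed|v)\/))([^\?&"'>]+)/

# Hypothetical helper: returns the video ID, or nil when nothing matches.
def youtube_id(text)
  match = YOUTUBE_ID_REGEX.match(text)
  match && match[5]
end

inputs = [
  'http://www.youtube.com/watch?v=ZFqlHhCNBOI',
  'http://youtu.be/ZFqlHhCNBOI',
  '<iframe src="http://www.youtube.com/embed/ZFqlHhCNBOI">',
  '<object><param name="movie" value="http://www.youtube.com/v/ZFqlHhCNBOI">'
]

# Each input should print ZFqlHhCNBOI.
inputs.each { |input| puts youtube_id(input) }
```

Note that the last group is not limited to 11 characters: it returns everything up to the next `?`, `&`, quote, or `>`, so trailing text such as a `#` fragment would be included in the result.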
This seems to be a better regex (with the i and g flags). It is much more flexible about the input text.
The video ID is in the first capture group; all other groups are non-capturing:
(?:youtube(?:-nocookie)?\.com\/(?:[^\/\n\s]+\/\S+\/|(?:v|e(?:mbed)?)\/|\S*?[?&]v=)|youtu\.be\/)([a-zA-Z0-9_-]{11})\W
Tested against this text at regex101.com:
var url = "http://www.youtube.com/user/Scobleizer#p/u/1/1p3vONEsYGo
url = "http://youtu.be/NLqAFTWOVbY
var url = "http://www.youtube.com/embed/NLqTHREEVbY https://www.youtube.com/embed/NLqAFOURVbY
var url = "http://www.youtube.com/v/NLqAFFIVEbY?fs=1&hl=en_US
var url = "http://www.youtube.com/watch?v=NLqASIXrVbY
var url = "http://www.youtube.com/user/Scobleizer#p/u/1/1pSEVENsYGo
var url = "http://www.youtube.com/ytscreeningroom?v=NRHEIGHTx8I
var url = "http://www.youtube.com/user/Scobleizer#p/u/1/1p3NINEsYGo
var url = "http://www.youtube.com/watch?v=JYATEN_TzhA&feature=featured
# Parses YouTube URLs directly or from iframe code. Handles:
# * Address bar on YouTube url (ex: http://www.youtube.com/watch?v=ZFELEVEN-OI)
# * Direct http://youtu.be/ url (ex: http://youtu.be/ZFTWELVEBOI)
# * Full iframe embed code (ex: <iframe src="http://www.youtube.com/embed/13_lHhCNBOI">)
# * Old <object> tag embed code (ex: <object><param name="movie" value="http://www.youtube.com/v/FOURTEEN_--">...)
http://www.youtube.com/user/Scobleizer#p/u/1/1p3vLASTYGo
This last regexp does not work in Ruby:
> url = "http://www.youtube.com/watch?v=NLqASIXrVbY"
=> "http://www.youtube.com/watch?v=NLqASIXrVbY"
> vid_regex = /(?:youtube(?:-nocookie)?\.com\/(?:[^\/\n\s]+\/\S+\/|(?:v|e(?:mbed)?)\/|\S*?[?&]v=)|youtu\.be\/)([a-zA-Z0-9_-]{11})\W/
=> /(?:youtube(?:-nocookie)?\.com\/(?:[^\/\n\s]+\/\S+\/|(?:v|e(?:mbed)?)\/|\S*?[?&]v=)|youtu\.be\/)([a-zA-Z0-9_-]{11})\W/
> url =~ vid_regex
=> nil
> $1
=> nil
@toots it works for me removing the last \W
Same thing for PHP: I had to remove the trailing \W. I also had to enclose it in slashes (/).
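The failure reported above comes from the trailing `\W`, which requires one more non-word character after the ID, so a bare URL that ends right at the ID cannot match. Here is a minimal Ruby sketch of the behaviour. The constant names are made up for illustration, and the negative lookahead `(?![\w-])` is just one possible alternative that keeps an end-of-ID check without consuming a character.

```ruby
# The pattern from the comment above, without the trailing \W.
ID_CORE = /(?:youtube(?:-nocookie)?\.com\/(?:[^\/\n\s]+\/\S+\/|(?:v|e(?:mbed)?)\/|\S*?[?&]v=)|youtu\.be\/)([a-zA-Z0-9_-]{11})/i

# Original form: needs a non-word character right after the ID.
WITH_TRAILING_W = /#{ID_CORE}\W/

# Alternative (an assumption, not from the original comment): only assert
# that the ID is not followed by another ID character.
WITH_LOOKAHEAD = /#{ID_CORE}(?![\w-])/

url = 'http://www.youtube.com/watch?v=NLqASIXrVbY'

p WITH_TRAILING_W.match(url)               # nil: nothing follows the ID
p WITH_TRAILING_W.match(url + '"')[1]      # "NLqASIXrVbY": the quote satisfies \W
p ID_CORE.match(url)[1]                    # "NLqASIXrVbY"
p WITH_LOOKAHEAD.match(url)[1]             # "NLqASIXrVbY"
```

Dropping the suffix entirely is the simplest fix. The lookahead variant additionally rejects a token longer than 11 characters instead of returning its first 11.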
Thanks, I use it in PHP:
function getYoutubeId($url)
{
    // original regex source: https://gist.github.com/afeld/1254889#gistcomment-1253992
    $regex = '/(?:youtube(?:-nocookie)?\.com\/(?:[^\/\n\s]+\/\S+\/|(?:v|e(?:mbed)?)\/|\S*?[?&]v=)|youtu\.be\/)([a-zA-Z0-9_-]{11})/mi';
    preg_match($regex, $url, $matches);
    return isset($matches[1]) ? $matches[1] : null;
}

$url = 'https://www.youtube.com/watch?v=SN102svZHQg';
echo getYoutubeId($url); // SN102svZHQg
private const MAGIC_REGEX = '#^(?:https?://|//)?(?:www\.|m\.|.+\.)?(?:youtu\.be/|youtube\.com/(?:embed/|v/|shorts/|feeds/api/videos/|watch\?v=|watch\?.+&v=))([\w-]{11})(?![\w-])#';
added support for shorts :)
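For readers on Ruby, the pattern above can be ported roughly as follows. This is a sketch under a few assumptions: `%r{...}` replaces PHP's `#` delimiters, `\A` stands in for `^` (which in Ruby matches at the start of every line), the names `SHORTS_AWARE_REGEX` and `youtube_id_with_shorts` are made up, and the Shorts URL is an illustrative example that reuses the ID from the PHP comment above.

```ruby
# Ruby port of the PHP pattern; the video ID is capture group 1.
SHORTS_AWARE_REGEX = %r{\A(?:https?://|//)?(?:www\.|m\.|.+\.)?(?:youtu\.be/|youtube\.com/(?:embed/|v/|shorts/|feeds/api/videos/|watch\?v=|watch\?.+&v=))([\w-]{11})(?![\w-])}

# Hypothetical helper: String#[] with a regex and a group index returns
# that capture, or nil when the pattern does not match.
def youtube_id_with_shorts(url)
  url[SHORTS_AWARE_REGEX, 1]
end

puts youtube_id_with_shorts('https://www.youtube.com/shorts/SN102svZHQg')   # SN102svZHQg
puts youtube_id_with_shorts('https://www.youtube.com/watch?v=SN102svZHQg')  # SN102svZHQg
p    youtube_id_with_shorts('https://example.com/')                         # nil
```

The negative lookahead at the end lets the ID be followed by `&`, `?`, a quote, or the end of the string, so this variant does not have the trailing `\W` problem discussed earlier in the thread.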
Thanks! Just one comment for other people coming across this: you may need to escape the " and ' symbols:
/(youtu\.be\/|youtube\.com\/(watch\?(.*&)?v=|(embed|v)\/))([^\?&\"\'>]+)/