@damusnet
Forked from marcgg/gist:733592
Created August 11, 2012 10:28
Regex to get the Facebook Page ID from a given URL
<?php
$tests = array(
'http://www.facebook.com/page_id' => 'page_id',
'https://www.facebook.com/page_id' => 'page_id',
'http://www.facebook.com/#!/page_id' => 'page_id',
'http://www.facebook.com/pages/Parisé-France/Vanity-Url/123456?v=app_555' => '123456',
'http://www.facebook.com/pages/Vanity-Url/45678' => '45678',
'http://www.facebook.com/#!/page_with_1_number' => 'page_with_1_number',
'http://www.facebook.com/bounce_page#!/pages/Vanity-Url/45678' => '45678',
'http://www.facebook.com/bounce_page#!/my_page_id?v=app_166292090072334' => 'my_page_id',
'http://www.facebook.com/pages/some-café-or-èàù-url/123456' => '123456',
'http://www.facebook.com/some-vanity-url/123456' => '123456',
'http://www.facebook.com/some.page.9' => 'some.page.9',
'http://www.facebook.com/a_page_with_id/123456789?ref=hl' => '123456789',
'vanityurl/123456789?ref=hl' => '123456789',
'pages/really-long-vanity-url-page/123456789?ref=hl' => '123456789',
'http://www.facebook.com/profile.php?id=123456789' => '123456789'
);
echo '<table>';
foreach ($tests as $url => $result) {
echo '<tr><td>';
echo $url;
echo '</td><td>';
echo preg_replace(
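'#'
. '(?:https?://)?' // optional scheme
. '(?:www\.)?' // optional "www." (dot escaped so it only matches a literal dot)
. '(?:facebook\.com/)?' // optional host
. '(?:(?:\w)*\#!/)?' // optional "#!/" hashbang, with or without a bounce page before it
. '(?:pages/)?' // optional "pages/" prefix
. '(?:[?\p{L}\-_]*/)?' // optional vanity segment, Unicode letters allowed
. '(?:[?\w\-_]*/)?' // optional second vanity segment
. '(?:profile\.php\?id=(?=\d.*))?' // optional "profile.php?id=" when a digit follows
. '([\d\-]*)?' // capture group 1: the numeric ID, if there is one
. '(?:\?.*)?' // optional query string
. '#u',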
'$1',
$url
);
echo '</td></tr>';
}
echo '</table>';
?>
@damusnet
Author

Based on a small sample of real-world data, I added a few use cases and improved the regex:

  • handles https pages
  • works without the "facebook.com" part
  • changed the delimiter from / to # to be more readable (based on a comment on this SO question: http://stackoverflow.com/q/11907178/233404)
  • added the u modifier and replaced \w with \p{L} in the first path segment to account for possible accents
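
To make the last two points concrete, here is a minimal sketch. The sample URL is made up for illustration, and it assumes the file is saved as UTF-8. It compares the two delimiters, and an ASCII-only character class against \p{L} with the u modifier:

```php
<?php
// Illustrative sample URL (not from real data); the file must be UTF-8 encoded.
$url = 'https://www.facebook.com/pages/some-café-url/123456';

// With "/" as the delimiter, every literal slash in the pattern needs a backslash.
$withSlash = preg_match('/^https?:\/\/www\.facebook\.com\/pages\//', $url);

// With "#" as the delimiter, slashes are plain characters
// (a literal "#" in the pattern would need escaping instead).
$withHash = preg_match('#^https?://www\.facebook\.com/pages/#', $url);

// ASCII-only class: matching stops at "é", so the segment is rejected.
$asciiOnly = preg_match('#/pages/[A-Za-z\-]+/\d+$#', $url);

// \p{L} plus the u modifier accepts any Unicode letter, accents included.
$unicode = preg_match('#/pages/[\p{L}\-]+/\d+$#u', $url);

var_dump($withSlash, $withHash, $asciiOnly, $unicode);
// int(1) int(1) int(0) int(1)
```

Both delimiters match the same URL, but the # version reads without the backslash noise. Only the \p{L} class gets past the accented segment.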

I also wrapped the whole thing in a small test so that more use cases are easy to add. Hope this helps.
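
If you would rather have the test compare each result against the expected value than read the table by eye, something like the following might work. It is only a sketch: the function name extractFacebookPageId is hypothetical, the pattern is the one from the gist with the literal dots escaped, and only a handful of the cases are repeated here.

```php
<?php
// Hypothetical helper name; the pattern is the gist's, with literal dots escaped.
// preg_replace() returns null on error (e.g. invalid UTF-8 input), hence ?string.
function extractFacebookPageId(string $url): ?string
{
    $pattern = '#'
        . '(?:https?://)?'
        . '(?:www\.)?'
        . '(?:facebook\.com/)?'
        . '(?:(?:\w)*\#!/)?'
        . '(?:pages/)?'
        . '(?:[?\p{L}\-_]*/)?'
        . '(?:[?\w\-_]*/)?'
        . '(?:profile\.php\?id=(?=\d.*))?'
        . '([\d\-]*)?'
        . '(?:\?.*)?'
        . '#u';

    return preg_replace($pattern, '$1', $url);
}

// A subset of the gist's cases: URL => expected ID.
$cases = array(
    'http://www.facebook.com/page_id' => 'page_id',
    'http://www.facebook.com/pages/Parisé-France/Vanity-Url/123456?v=app_555' => '123456',
    'http://www.facebook.com/some.page.9' => 'some.page.9',
    'vanityurl/123456789?ref=hl' => '123456789',
    'http://www.facebook.com/profile.php?id=123456789' => '123456789',
);

foreach ($cases as $url => $expected) {
    $actual = extractFacebookPageId($url);
    // Strict comparison so that a null result counts as a failure.
    echo ($actual === $expected ? 'PASS' : 'FAIL'), "  $url => ", var_export($actual, true), "\n";
}
```

Adding a use case is then one more line in $cases, and a regression shows up as a FAIL line instead of a wrong cell in the table.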