Forked from janogarcia/php_valid_twitter_hashtag_regex.php
Created March 16, 2017 16:24
PHP Twitter Hashtag Validation Regex
<?php
/**
 * PHP Regex to validate a Twitter hashtag
 *
 * Useful for validating a text input (an HTML form in your CMS or custom application) that must be a valid Twitter hashtag.
 * Valid examples: #a, #_, #_1, #_a, #1a, #áéìôü, #123hàsh_täg446
 * Invalid examples: #1, ##hashtag, #hash-tag, #hash.tag, #hash tag, #hashtag!, and any hashtag longer than 140 characters (hash symbol included)
 *
 * Regex explanation:
 * First, the lookahead assertion (?=.{2,140}$) checks the minimum and maximum length (hash symbol included), as explained here http://stackoverflow.com/a/4223213/1441613
 * A hash symbol must be the first character. Either the ASCII hash (#) or the fullwidth hash (＃, U+FF03) is allowed, which can be expressed as (#|\x{ff03}){1} or, with the literal character, (#|＃){1}. PCRE does not accept the \uff03 escape form.
 * A hashtag can contain any Unicode letter, the digits 0-9, and the underscore symbol. That's expressed with the character class [0-9_\p{L}]*, based on http://stackoverflow.com/a/5767106/1441613
 * A hashtag can't be only numeric: it must have at least one letter or the underscore symbol. That condition is checked by ([0-9_\p{L}]*[_\p{L}][0-9_\p{L}]*), similar to http://stackoverflow.com/a/1051998/1441613
 * Finally, the modifier 'u' is added so that both the pattern and the subject string are treated as UTF-8.
 *
 * More info:
 * https://github.com/twitter/twitter-text-conformance
 * https://github.com/nojimage/twitter-text-php
 * https://github.com/ngnpope/twitter-text-php
 */
preg_match('/^(?=.{2,140}$)(#|\x{ff03}){1}([0-9_\p{L}]*[_\p{L}][0-9_\p{L}]*)$/u', '#hashtag');
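As a rough usage sketch, the pattern can be wrapped in a small helper and run against the examples listed in the docblock. The function name `is_valid_hashtag()` and the two example arrays are assumptions made for illustration; they are not part of the original gist.

```php
<?php
// Hypothetical helper: the name is_valid_hashtag() is an assumption, not part of the gist.
function is_valid_hashtag(string $tag): bool
{
    // Same pattern as in the gist, kept in a single-quoted string so that
    // \x{ff03} and \p{L} reach the PCRE engine unchanged.
    $pattern = '/^(?=.{2,140}$)(#|\x{ff03}){1}([0-9_\p{L}]*[_\p{L}][0-9_\p{L}]*)$/u';

    // preg_match() returns 1 on a match, 0 on no match, and false on error
    // (for example when the subject is not valid UTF-8), so compare against 1.
    return preg_match($pattern, $tag) === 1;
}

// Examples taken from the docblock, plus one 141-character hashtag.
$valid   = ['#a', '#_', '#_1', '#_a', '#1a', '#áéìôü', '#123hàsh_täg446'];
$invalid = ['#1', '##hashtag', '#hash-tag', '#hash.tag', '#hash tag', '#hashtag!',
            '#' . str_repeat('a', 140)];

foreach (array_merge($valid, $invalid) as $tag) {
    echo $tag, ' => ', is_valid_hashtag($tag) ? 'valid' : 'invalid', PHP_EOL;
}
```

Comparing the return value with `=== 1` rather than relying on truthiness keeps a `false` error result from being confused with a plain non-match.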