osddeitf/Vietnamese-Regex.md

Last active July 1, 2020 13:38

Regex for capturing Vietnamese characters.

Regex to capture Vietnamese characters (aăâbcdđ..., àáảãạ).

/^[a-z\u0111\u0300\u0301\u0302\u0303\u0306\u0309\u031b\u0323]+$/i

Usage:

regexp.test(str.normalize('NFD'))
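As a minimal sketch of that usage, assuming the pattern above is stored in a variable named `regexp`; the helper name `isVietnameseWord` is made up for illustration and is not part of the original snippet:

```javascript
// The pattern from above. It has no `g` flag, so `test` keeps no state between calls.
const regexp = /^[a-z\u0111\u0300\u0301\u0302\u0303\u0306\u0309\u031b\u0323]+$/i;

// Hypothetical helper: decompose to NFD first so each accented letter becomes
// a base letter followed by its combining marks, then test the result.
function isVietnameseWord(str) {
  return regexp.test(str.normalize('NFD'));
}

console.log(isVietnameseWord('Việt'));     // true
console.log(isVietnameseWord('đường'));    // true
console.log(isVietnameseWord('Việt Nam')); // false (the space is not in the class)
console.log(isVietnameseWord('abc123'));   // false (digits are not in the class)
```

Note that the pattern only matches a single run of letters; whitespace, digits and punctuation all make it fail.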

The Vietnamese alphabet doesn't actually include 'f', 'j', 'w' or 'z'; I'm just lazy, so I use a-z instead.
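If those four letters should be rejected, one possible stricter variant is sketched below. The name `strictRegexp` is an assumption for illustration; the only change is that `a-z` is replaced by ranges that skip f, j, w and z:

```javascript
// Same class as the original, except a-z becomes a-e, g-i, k-v, x, y.
const strictRegexp = /^[a-eg-ik-vxy\u0111\u0300\u0301\u0302\u0303\u0306\u0309\u031b\u0323]+$/i;

console.log(strictRegexp.test('tiếng'.normalize('NFD'))); // true
console.log(strictRegexp.test('wifi'.normalize('NFD')));  // false ('w' and 'f' are excluded)
console.log(strictRegexp.test('pizza'.normalize('NFD'))); // false ('z' is excluded)
```

Loanwords and foreign names written with those letters will be rejected by this variant, so the looser a-z version may still be the better fit for real-world text.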

What are '\uxxxx' characters anyway? Well, good question.

Apart from '\u0111', which is the letter 'đ', they are combining diacritical marks, as extracted from, e.g., 'ắ'.normalize('NFD').
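To see where the escapes come from, a small sketch that lists the code points of a string after NFD decomposition; the helper name `codePoints` is hypothetical:

```javascript
// Hypothetical helper: decompose the string, then format each code point as U+XXXX.
function codePoints(str) {
  return Array.from(str.normalize('NFD'), (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'));
}

console.log(codePoints('ắ').join(' ')); // U+0061 U+0306 U+0301 (a, breve, acute)
console.log(codePoints('đ').join(' ')); // U+0111 (đ has no decomposition)
```

This is also why '\u0111' has to appear in the character class explicitly: 'đ' does not decompose into 'd' plus a combining mark.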