var tweet = "Currently chilling out at W1B 2EL, then on to WC2E 8HA or maybe even L1 8JF! :-)";
// Here's a simple regex that tries to recognise postcode-like strings.
// See http://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Validation
// for the rules on how UK postcodes are formatted.
var postcode_regex = /[A-Z]{1,2}[0-9][0-9A-Z]?\s?[0-9][A-Z]{2}/g;
var postcodes = tweet.match(postcode_regex);
console.log(postcodes);
Thanks for the heads up @MarkBird
Just arrived here in the future from a search engine.
Likewise, thanks @MarkBird!
As someone else arriving in the future from a search engine, I was curious why the UK Gov's validation didn't work as mentioned by @MarkBird.
Turns out that the regex itself appears to be correct, just not when you copy-paste it. One of the important '-' characters gets stripped out when you copy it, because it is treated as a hyphen breaking a word across a line.
I'm not sure what the comment about spaces means; looking at the regex, it requires exactly one space (0x20) character, so it is perhaps a little pedantic for some uses. It successfully finds all the example postcodes in @diegoarcega's test, except WC2A, which is not a complete postcode on its own but a prefix/district. (Full WC2A 0XX postcodes are matched.)
This is the probably correct (but pedantic) regex from the UK Gov document linked by @indirap, to save anyone else from the copypaste issue:
^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2})$
Edit: tested and working against the ONS list of ~2.6 million real postcodes (all have spaces).
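For anyone who wants to try the pattern quickly, here is a minimal JavaScript sketch. The variable names and the sample inputs are my own assumptions for illustration; the pattern is the one pasted above, unchanged.

```javascript
// Pattern pasted from the comment above; it requires exactly one space.
const govPostcodeRegex = /^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2})$/;

// Hypothetical sample inputs: three full postcodes from the gist's tweet,
// one bare district, and one postcode written without its space.
const samples = ["W1B 2EL", "WC2E 8HA", "L1 8JF", "WC2A", "W1B2EL"];

// Pair each sample with whether the pattern accepts it.
const results = samples.map((s) => [s, govPostcodeRegex.test(s)]);
console.log(results);
```

The three full postcodes should come back `true`, while the bare district `WC2A` and the space-less `W1B2EL` should come back `false`, which lines up with the "exactly one space" observation.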
Thanks for the helpful post!
My use case needed it to also work with 0 or 1 spaces ({0,1}), so here is my slightly modded version for anyone's convenience, including future me!
^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) {0,1}[0-9][A-Za-z]{2})$
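A quick sketch of the difference in JavaScript, in case it helps. The variable names and test strings are assumptions of mine; the only change from the stricter pattern is the `{0,1}` after the space (which is equivalent to writing ` ?`).

```javascript
// Same pattern as above, with the separating space made optional via {0,1}.
const lenientPostcodeRegex = /^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) {0,1}[0-9][A-Za-z]{2})$/;

// Hypothetical inputs: with one space, with no space, and with two spaces.
const withSpace = lenientPostcodeRegex.test("W1B 2EL");
const withoutSpace = lenientPostcodeRegex.test("W1B2EL");
const doubleSpace = lenientPostcodeRegex.test("W1B  2EL");

console.log({ withSpace, withoutSpace, doubleSpace });
```

The first two should be accepted and the double-spaced one rejected, since `{0,1}` allows at most a single space.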
I'd advise anyone arriving here from search engines to avoid the regex in the above comment (it's perhaps typical that the UK government would be using a badly formed regex). It doesn't match a lot of valid postcodes (e.g. W1T 1PG), and it's inconsistent in that it matches some postcodes with a space and some without. The one posted by lucasjahn seems to work fine though.
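One thing that may be worth checking when a pasted pattern behaves inconsistently is where its anchors bind. This is my own reading of the UK Gov pattern quoted earlier in the thread, not something the source document states: its top-level `|` splits it into `^(GIR 0AA)` and `(...)$`, so `^` applies only to the first alternative and `$` only to the second. A minimal sketch, with hypothetical variable names and inputs, that shows the effect and one way to anchor both ends by wrapping the whole pattern in a non-capturing group:

```javascript
// The UK Gov pattern quoted earlier in the thread (one mandatory space).
const govPostcodeRegex = /^([Gg][Ii][Rr] 0[Aa]{2})|((([A-Za-z][0-9]{1,2})|(([A-Za-z][A-Ha-hJ-Yj-y][0-9]{1,2})|(([A-Za-z][0-9][A-Za-z])|([A-Za-z][A-Ha-hJ-Yj-y][0-9]?[A-Za-z])))) [0-9][A-Za-z]{2})$/;

// Assumption: wrapping the original source in ^(?: ... )$ forces a whole-string match.
const strictPostcodeRegex = new RegExp("^(?:" + govPostcodeRegex.source + ")$");

// Hypothetical inputs with extra text after or before a postcode.
const trailingText = "GIR 0AA and more";
const leadingText = "see W1B 2EL";

const looseTrailing = govPostcodeRegex.test(trailingText); // only ^ applies to this branch
const looseLeading = govPostcodeRegex.test(leadingText); // only $ applies to this branch
const strictTrailing = strictPostcodeRegex.test(trailingText);
const strictLeading = strictPostcodeRegex.test(leadingText);
const strictExact = strictPostcodeRegex.test("W1B 2EL");

console.log({ looseTrailing, looseLeading, strictTrailing, strictLeading, strictExact });
```

With the original pattern both padded strings are accepted; with the wrapped version only the exact postcode is. Whether that matters depends on whether you are validating a whole field or searching inside longer text.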