@rodneyrehm
Created July 8, 2012 09:20
PHP: parse HTML element attributes
<?php

function parseAttributes($text) {
    $attributes = array();
    $pattern = '#(?(DEFINE)
            (?<name>[a-zA-Z][a-zA-Z0-9-:]*)
            (?<value_double>"[^"]+")
            (?<value_single>\'[^\']+\')
            (?<value_none>[^\s>]+)
            (?<value>((?&value_double)|(?&value_single)|(?&value_none)))
        )
        (?<n>(?&name))(=(?<v>(?&value)))?#xs';

    // Each match is one attribute: "n" is its name, "v" its raw value.
    if (preg_match_all($pattern, $text, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            // Valueless attributes (e.g. "keyword") have no "v" and map to null;
            // otherwise strip the surrounding quotes from the value.
            $attributes[$match['n']] = isset($match['v'])
                ? trim($match['v'], '\'"')
                : null;
        }
    }

    return $attributes;
}

//$text = '<a double="google.com" keyword single=\'google.com\' none=google.com>';
$text = 'double="double.com" keyword single=\'single.com\' none=none.com';
$res = parseAttributes($text);
var_dump($res);
@rickhellewell

Hello! Stumbled (actually, googled) upon this. Nice regex, but it doesn't parse attributes that have spaces around the '=', as in ' title = "something" ' (any number of spaces before or after the = character).
Is there an adjustment for that?
Thanks...Rick..

@ryanbriscall

@rickhellewell Try changing line 12
from: (?<n>(?&name))(=(?<v>(?&value)))?#xs';
to: (?<n>(?&name))(\s*=\s*(?<v>(?&value)))?#xs';
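
A minimal sketch comparing the two versions of that line on the ' title = "something" ' case from the comment above. The helper name `parseAttributesWith` and its `$tail` parameter are hypothetical (not part of the gist); they exist only so both pattern endings can be run against the same DEFINE block.

```php
<?php

// Hypothetical helper: same DEFINE block as the gist, with the last line
// of the pattern passed in as $tail so the two variants can be compared.
function parseAttributesWith($tail, $text) {
    $pattern = '#(?(DEFINE)
            (?<name>[a-zA-Z][a-zA-Z0-9-:]*)
            (?<value_double>"[^"]+")
            (?<value_single>\'[^\']+\')
            (?<value_none>[^\s>]+)
            (?<value>((?&value_double)|(?&value_single)|(?&value_none)))
        )
        ' . $tail . '#xs';

    $attributes = array();
    if (preg_match_all($pattern, $text, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            $attributes[$match['n']] = isset($match['v'])
                ? trim($match['v'], '\'"')
                : null;
        }
    }
    return $attributes;
}

$original = '(?<n>(?&name))(=(?<v>(?&value)))?';
$adjusted = '(?<n>(?&name))(\s*=\s*(?<v>(?&value)))?';
$text = 'title = "something" keyword';

// Original: "title" and "something" are both read as valueless attributes.
var_dump(parseAttributesWith($original, $text));

// Adjusted: "title" maps to "something", "keyword" stays null.
var_dump(parseAttributesWith($adjusted, $text));
```

The x flag only ignores literal whitespace in the pattern, so the escaped `\s*` still matches spaces in the input. Because `\s*` also matches zero characters, attributes written without spaces (`title="something"`) should keep parsing as before.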