Performance Comparison:
Gist franz-josef-kaiser/5416e776bf6b1f2152ee, last active July 13, 2022 08:37
PHP Regex vs. DOMDocument
<?php
$content = 'foo bar baz <img src="http://ak-hdl.buzzfed.com/static/2014-06/6/12/enhanced/webdr08/enhanced-21313-1402070821-11.jpg" width="500" height="1000"/> dragons and ponies.';

// DOMDocument variant: parse the markup once, then query the tree 1000 times.
$dom = new \DOMDocument( '1.0' );
$dom->loadHTML( $content );
for ( $i = 0; $i < 1000; $i++ )
{
	// \DOMNodeList
	$nodes = $dom->getElementsByTagName( 'img' );
	if ( $nodes->length )
	{
		$img    = $nodes->item(0);
		$src    = $img->getAttribute( 'src' );
		$width  = $img->getAttribute( 'width' );
		$height = $img->getAttribute( 'height' );
	}
}

// Regex variant: find every <img> tag, then pull the attributes out of each match.
for ( $i = 0; $i < 1000; $i++ )
{
	if ( preg_match_all( '#<(?P<tag>img)[^<]*?(?:>[\s\S]*?<\/(?P=tag)>|\s*\/>)#', $content, $matches ) )
	{
		foreach ( $matches[0] as $match )
		{
			// Search the matched tag rather than the whole content.
			preg_match( '/ src="([^"]+)"/', $match, $src );
			// $src[1] holds the captured URL; drop any query string.
			list( $image_src ) = explode( '?', $src[1] );
			preg_match( '/ width="([0-9]+)"/', $match, $width );
			preg_match( '/ height="([0-9]+)"/', $match, $height );
		}
	}
}
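The two loops above do not record how long they take, so the comparison still needs a timer around each one. The following is a minimal sketch of one way to do that, assuming PHP 7.3 or later for `hrtime()`. The helper name `time_iterations()` and the shortened example.com URL are hypothetical and not part of the gist.

```php
<?php
// Hypothetical helper (not part of the gist): call $fn $iterations times
// and return the elapsed time in seconds, measured with the monotonic hrtime() clock.
function time_iterations( callable $fn, int $iterations = 1000 ): float
{
	$start = hrtime( true ); // nanoseconds as an integer
	for ( $i = 0; $i < $iterations; $i++ )
	{
		$fn();
	}
	return ( hrtime( true ) - $start ) / 1e9;
}

// Shortened sample content; the URL is a placeholder.
$content = 'foo <img src="http://example.com/a.jpg" width="500" height="1000"/> bar';

// Time a single regex lookup of the src attribute.
$regex_seconds = time_iterations( function () use ( $content ) {
	preg_match( '/ src="([^"]+)"/', $content, $src );
} );

// Time the equivalent DOM lookup; parsing happens once, outside the timed loop.
$dom = new \DOMDocument( '1.0' );
$dom->loadHTML( $content );
$dom_seconds = time_iterations( function () use ( $dom ) {
	$dom->getElementsByTagName( 'img' )->item( 0 )->getAttribute( 'src' );
} );

printf( "regex: %.6f s, DOM: %.6f s\n", $regex_seconds, $dom_seconds );
```

Whether `loadHTML()` belongs inside or outside the timed closure depends on what is being compared: keeping it outside, as here and in the gist, measures only the queries and not the parsing cost.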
This is not a valid test. The DOMNodeList version only looks at the very first returned image, whereas the regex version checks every image. With the test data presented that is OK, because it contains a single image, but not once the content is expanded to contain many images.
Add a loop over $nodes to make this a valid like-for-like test.
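A minimal sketch of what that like-for-like version could look like, with both variants visiting every image. The function names `extract_images_dom()` and `extract_images_regex()`, the simplified `<img>` pattern, and the example.com URLs are assumptions made for illustration and are not taken from the gist.

```php
<?php
// Hypothetical helper: collect src, width and height of every <img> via DOMDocument.
function extract_images_dom( string $html ): array
{
	$dom = new \DOMDocument( '1.0' );
	// Collect libxml parse warnings internally instead of emitting them.
	$previous = libxml_use_internal_errors( true );
	$dom->loadHTML( $html );
	libxml_clear_errors();
	libxml_use_internal_errors( $previous );

	$images = [];
	// DOMNodeList is traversable, so every matching node is visited.
	foreach ( $dom->getElementsByTagName( 'img' ) as $img )
	{
		$images[] = [
			'src'    => $img->getAttribute( 'src' ),
			'width'  => $img->getAttribute( 'width' ),
			'height' => $img->getAttribute( 'height' ),
		];
	}
	return $images;
}

// Hypothetical helper: the same extraction with regular expressions.
// The tag pattern is a simplified assumption, not the gist's original pattern.
function extract_images_regex( string $html ): array
{
	$images = [];
	if ( ! preg_match_all( '#<img\b[^>]*>#i', $html, $tags ) )
	{
		return $images;
	}
	foreach ( $tags[0] as $tag )
	{
		// Search each matched tag, not the whole document.
		preg_match( '/\ssrc="([^"]+)"/', $tag, $src );
		preg_match( '/\swidth="([0-9]+)"/', $tag, $width );
		preg_match( '/\sheight="([0-9]+)"/', $tag, $height );
		$images[] = [
			'src'    => $src[1] ?? '',
			'width'  => $width[1] ?? '',
			'height' => $height[1] ?? '',
		];
	}
	return $images;
}

// Sample content with two images; the URLs are placeholders.
$content = 'foo <img src="http://example.com/a.jpg?x=1" width="500" height="1000"/> bar '
	. '<img src="http://example.com/b.jpg" width="20" height="10"/> baz';

$images_dom   = extract_images_dom( $content );
$images_regex = extract_images_regex( $content );
```

With both functions returning the same structure for the same input, each can be wrapped in the 1000-iteration loop and timed on content that contains many images.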