@franz-josef-kaiser
Last active July 13, 2022 08:37
PHP Regex vs. DOMDocument
<?php
$content = 'foo bar baz <img src="http://ak-hdl.buzzfed.com/static/2014-06/6/12/enhanced/webdr08/enhanced-21313-1402070821-11.jpg" width="500" height="1000"/> dragons and ponies.';

// Variant 1: DOMDocument. The markup is parsed once, outside the loop.
$dom = new \DOMDocument( '1.0' );
$dom->loadHTML( $content );
for ( $i = 0; $i < 1000; $i++ )
{
	// \DOMNodeList
	$nodes = $dom->getElementsByTagName( 'img' );
	if ( $nodes->length )
	{
		// Only the first <img> is inspected here
		$img    = $nodes->item(0);
		$src    = $img->getAttribute( 'src' );
		$width  = $img->getAttribute( 'width' );
		$height = $img->getAttribute( 'height' );
	}
}

// Variant 2: regular expressions
for ( $i = 0; $i < 1000; $i++ )
{
	if ( preg_match_all( '#<(?P<tag>img)[^<]*?(?:>[\s\S]*?<\/(?P=tag)>|\s*\/>)#', $content, $matches ) )
	{
		foreach ( $matches[0] as $match )
		{
			// Search the matched tag, not the whole content
			if ( preg_match( '/ src="([^"]+)"/', $match, $src ) )
			{
				// $src[1] is the captured URL; strip any query string
				list( $image_src ) = explode( '?', $src[1] );
			}
			preg_match( '/ width="([0-9]+)"/', $match, $width );
			preg_match( '/ height="([0-9]+)"/', $match, $height );
		}
	}
}
@andykillen

This is not a valid test. The DOMNodeList version only looks at the first image returned, whereas the regex version checks every image. With the test data presented that's OK, but not once the content is expanded to contain many images.

Add a loop over $nodes to make this a valid like-for-like test.
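
A minimal sketch of such a loop, assuming content with two images. The example.com URLs and the `$images` array are hypothetical and not part of the gist; the query string is stripped only to mirror what the regex version does.

```php
<?php
// Hypothetical test data with two images (not from the original gist)
$content = 'foo <img src="http://example.com/a.jpg?v=1" width="500" height="1000"/> bar '
	. '<img src="http://example.com/b.jpg" width="20" height="10"/> baz';

// Collect parser warnings internally instead of printing them
libxml_use_internal_errors( true );
$dom = new \DOMDocument( '1.0' );
$dom->loadHTML( $content );

$images = array();
// \DOMNodeList is traversable, so every <img> is visited, not just item(0)
foreach ( $dom->getElementsByTagName( 'img' ) as $img )
{
	// Drop the query string, as the regex version does
	list( $src ) = explode( '?', $img->getAttribute( 'src' ) );
	$images[] = array(
		'src'    => $src,
		'width'  => $img->getAttribute( 'width' ),
		'height' => $img->getAttribute( 'height' ),
	);
}
```

Wrapping this `foreach` in the same 1000-iteration loop as the regex version would then compare the same amount of work on both sides.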
