Last active January 22, 2024 13:18
Removing nodes with DomCrawler
<?php
<<<CONFIG
packages:
    - "symfony/dom-crawler: ~2.3"
    - "symfony/css-selector: ~2.3"
CONFIG;

use Symfony\Component\DomCrawler\Crawler;

$html = <<<HTML
<html>
    <div class="content">
        <h2 class="gamma">Excerpt</h2>
        <p>...content html...</p>
    </div>
    <div class="content">
        <h2 class="gamma">Excerpt</h2>
        <p>...more content html...</p>
    </div>
</html>
HTML;

$crawler = new Crawler($html, 'http://localhost');

// remove all h2 nodes inside .content
$crawler->filter('html .content h2')->each(function (Crawler $crawler) {
    foreach ($crawler as $node) {
        $node->parentNode->removeChild($node);
    }
});

// output .content nodes with h2 removed
$crawler->filter('html .content')->each(function (Crawler $crawler) {
    echo $crawler->html();
});
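The removal above works because each `$node` the Crawler yields is an ordinary `DOMNode` from PHP's DOM extension, so `parentNode->removeChild()` edits the underlying document directly. As a rough sketch of the same idea without the Symfony packages, the snippet below uses only `DOMDocument` and `DOMXPath`. The XPath expressions are a hand-written approximation of the `html .content h2` and `html .content` selectors, and the variable names (`$document`, `$xpath`, `$headings`, `$output`) are illustrative assumptions, not part of the gist.

```php
<?php
// Sketch using only PHP's bundled DOM extension (no Symfony packages).

$html = <<<HTML
<html>
<body>
<div class="content">
<h2 class="gamma">Excerpt</h2>
<p>...content html...</p>
</div>
<div class="content">
<h2 class="gamma">Excerpt</h2>
<p>...more content html...</p>
</div>
</body>
</html>
HTML;

$document = new DOMDocument();
$document->loadHTML($html);

$xpath = new DOMXPath($document);

// Assumed XPath approximation of the CSS class selector ".content".
$contentQuery = '//*[contains(concat(" ", normalize-space(@class), " "), " content ")]';

// Copy the matched h2 nodes into an array before removing anything,
// so the removal cannot disturb the iteration.
$headings = iterator_to_array($xpath->query($contentQuery . '//h2'));

foreach ($headings as $node) {
    // Same call the gist makes on the nodes it gets from the Crawler.
    $node->parentNode->removeChild($node);
}

// Serialize what is left of each .content element.
$output = '';
foreach ($xpath->query($contentQuery) as $div) {
    $output .= $document->saveHTML($div) . "\n";
}

echo $output;
```

Unlike `$crawler->html()`, which returns only the inner HTML, `saveHTML($div)` serializes the element itself, so the surrounding `<div class="content">` tags appear in the output here.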
This helped me today. Thanks!
Thanks for this. Awesome example.
+1 Thanks
Thank you, man with a glorious beer
👍
much appreciated
Since each() gets one node at a time, this has the same effect:
$crawler->filter('html .content h2')->each(function (Crawler $crawler) {
    $node = $crawler->getNode(0);
    $node->parentNode->removeChild($node);
});
IMO this is more readable
👍 Nice catch
@NinoSkopac I know it has been several years since your message, but for anyone else who stumbles across this thread as I have: getNode() can no longer be used in WebDriver mode. You get the following error:
Uncaught InvalidArgumentException: The "getNode" method cannot be used in WebDriver mode. Use "getElement" instead.
You're amazing thanks!!
Run this with melody: