@jakzal
Last active January 22, 2024 13:18
Removing nodes with DomCrawler
<?php
<<<CONFIG
packages:
- "symfony/dom-crawler: ~2.3"
- "symfony/css-selector: ~2.3"
CONFIG;
use Symfony\Component\DomCrawler\Crawler;
$html = <<<HTML
<html>
<div class="content">
<h2 class="gamma">Excerpt</h2>
<p>...content html...</p>
</div>
<div class="content">
<h2 class="gamma">Excerpt</h2>
<p>...more content html...</p>
</div>
</html>
HTML;
$crawler = new Crawler($html, 'http://localhost');
// remove all h2 nodes inside .content
$crawler->filter('html .content h2')->each(function (Crawler $crawler) {
    foreach ($crawler as $node) {
        $node->parentNode->removeChild($node);
    }
});
// output .content nodes with h2 removed
$crawler->filter('html .content')->each(function (Crawler $crawler) {
    echo $crawler->html();
});
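
For readers who do not have the Symfony components installed, the same removal can be sketched with PHP's bundled DOM extension. This is not part of the gist. The hand-written XPath expression is an assumption standing in for the CSS selector `.content h2`, and the names `$contentDivs`, `$removed` and `$output` are purely illustrative.

```php
<?php
// Dependency-free sketch (assumes ext-dom is enabled, as it is by default).

$html = <<<HTML
<html>
<div class="content">
<h2 class="gamma">Excerpt</h2>
<p>...content html...</p>
</div>
<div class="content">
<h2 class="gamma">Excerpt</h2>
<p>...more content html...</p>
</div>
</html>
HTML;

// Collect parser notices internally instead of emitting warnings.
libxml_use_internal_errors(true);

$document = new DOMDocument();
$document->loadHTML($html);
$xpath = new DOMXPath($document);

// Assumed XPath equivalent of the CSS selector "div.content".
$contentDivs = '//div[contains(concat(" ", normalize-space(@class), " "), " content ")]';

// Copy the matches into an array first, as a precaution,
// so that removing nodes cannot disturb the iteration.
$removed = 0;
foreach (iterator_to_array($xpath->query($contentDivs . '//h2')) as $node) {
    $node->parentNode->removeChild($node);
    $removed++;
}

// Serialize the inner HTML of each .content node with the h2 removed.
$output = '';
foreach ($xpath->query($contentDivs) as $div) {
    foreach ($div->childNodes as $child) {
        $output .= $document->saveHTML($child);
    }
}

echo $output;
```

The DomCrawler version stays shorter because `filter()` accepts CSS selectors directly. The plain DOM variant trades that convenience for having no Composer dependencies.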
@jakzal

jakzal commented Apr 1, 2015

Run this with melody:

melody run https://gist.github.com/jakzal/8dd52d3df9a49c1e5922

@howtomakeaturn

this helps me today. thanks!

@Verron

Verron commented Jul 19, 2016

Thanks for this. Awesome example.

@Insolita

+1 Thanks

@yog-strina

Thank you, man with a glorious beer

@Exadra37

Exadra37 commented Aug 1, 2017

👍

@NinoSkopac

much appreciated

@NinoSkopac

Since each() gets one node at a time, this has the same effect:

$crawler->filter('html .content h2')->each(function (Crawler $crawler) {
    $node = $crawler->getNode(0);
    $node->parentNode->removeChild($node);
});

IMO this is more readable

@broiniac

broiniac commented Oct 29, 2021

Since each() gets one node at a time, this has the same effect

👍 Nice catch

@ajmeese7

Since each() gets one node at a time, this has the same effect:

$crawler->filter('html .content h2')->each(function (Crawler $crawler) {
    $node = $crawler->getNode(0);
    $node->parentNode->removeChild($node);
});

IMO this is more readable

@NinoSkopac I am aware that it has been several years since your message, but for anyone stumbling across this thread as I have, be aware that you can't use this method in WebDriver mode now. You get the following error:

Uncaught InvalidArgumentException: The "getNode" method cannot be used in WebDriver mode. Use "getElement" instead.

@eduance

eduance commented Jan 22, 2024

You're amazing thanks!!
