@573
Forked from jhy/RemoveComments.java
Created May 9, 2022 13:14
// Example solution for removing comments from HTML with jsoup.
// re: http://groups.google.com/group/jsoup/browse_thread/thread/419b5ac4be88b086
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Node;

public class RemoveComments {
    public static void main(String... args) {
        String h = "<div><!-- no --><p>Hello<!-- gone --></div>";
        Document doc = Jsoup.parse(h);
        removeComments(doc);
        print(doc.html());
    }

    // Recursively removes every comment node below the given node.
    private static void removeComments(Node node) {
        // We remove child nodes while iterating, so a normal for-each over the children
        // would fail with a ConcurrentModificationException. Walk by index instead.
        int i = 0;
        while (i < node.childNodes().size()) {
            Node child = node.childNode(i);
            if (child.nodeName().equals("#comment")) {
                // After removal the next sibling shifts into index i, so do not advance.
                child.remove();
            } else {
                removeComments(child);
                i++;
            }
        }
    }

    private static void print(String msg) {
        System.out.println(msg);
    }
}
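The index-based loop in removeComments can be tried without jsoup on the classpath. The following is a minimal sketch under stated assumptions: TreeNode, IndexedRemovalSketch, sample, and countComments are hypothetical names invented for illustration and are not part of jsoup. It contrasts a for-each loop, which fails on this particular input, with the index walk used in the gist.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

class IndexedRemovalSketch {
    // Hypothetical stand-in for a DOM node: a name plus a mutable child list.
    static final class TreeNode {
        final String name;
        final List<TreeNode> children = new ArrayList<>();

        TreeNode(String name, TreeNode... kids) {
            this.name = name;
            for (TreeNode kid : kids) {
                children.add(kid);
            }
        }
    }

    // Problematic variant: modifies the list that the for-each iterator is walking.
    static void removeWithForEach(TreeNode node) {
        for (TreeNode child : node.children) {
            if (child.name.equals("#comment")) {
                node.children.remove(child);
            } else {
                removeWithForEach(child);
            }
        }
    }

    // Same approach as the gist: advance the index only when nothing was removed.
    static void removeWithIndex(TreeNode node) {
        int i = 0;
        while (i < node.children.size()) {
            TreeNode child = node.children.get(i);
            if (child.name.equals("#comment")) {
                node.children.remove(i); // the next sibling shifts into slot i
            } else {
                removeWithIndex(child);
                i++;
            }
        }
    }

    // Counts comment nodes in the subtree rooted at node (including node itself).
    static int countComments(TreeNode node) {
        int count = node.name.equals("#comment") ? 1 : 0;
        for (TreeNode child : node.children) {
            count += countComments(child);
        }
        return count;
    }

    // Small tree: div[#comment, p[#text, #comment], #comment]
    static TreeNode sample() {
        return new TreeNode("div",
                new TreeNode("#comment"),
                new TreeNode("p", new TreeNode("#text"), new TreeNode("#comment")),
                new TreeNode("#comment"));
    }

    public static void main(String[] args) {
        try {
            removeWithForEach(sample());
            System.out.println("for-each: finished without an exception");
        } catch (ConcurrentModificationException e) {
            System.out.println("for-each: ConcurrentModificationException");
        }

        TreeNode tree = sample();
        removeWithIndex(tree);
        System.out.println("comments left after index walk: " + countComments(tree));
    }
}
```

Note that the for-each variant does not always throw: if the removed element happens to be the second-to-last one in an ArrayList, the iterator simply stops early and skips the last element without any error. That silent failure mode is another reason to prefer the index walk when removing during traversal.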