@shyamsalimkumar
Last active August 21, 2016 00:28

The following lines correspond to the output of lines 79-82. Instead of the expected result, I just get an infinite loop. I presumed that the parent of a node's first child, i.e. `node.FirstChild.Parent`, would be `node` itself rather than `node.Parent`.

The hierarchy is `div.color_picker > div.over_wrap`. It's probably some Golang concept that I've overlooked, or something silly that I can't seem to notice. What am I doing wrong?
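
The presumption itself can be checked in isolation. Below is a minimal sketch that uses a hypothetical stand-in `Node` type carrying only the link fields relevant here (`Parent`, `FirstChild`, `LastChild`, `NextSibling`); `appendChild` and `firstChildParentIsNode` are illustrative helpers, not part of `golang.org/x/net/html`. It only shows what the invariant looks like in a correctly linked tree, so it says nothing by itself about the real parser.

```go
package main

import "fmt"

// Node is a hypothetical stand-in for html.Node, keeping only the
// link fields that matter for this question.
type Node struct {
	Parent      *Node
	FirstChild  *Node
	LastChild   *Node
	NextSibling *Node
	Data        string
}

// appendChild is an illustrative helper that links c as the last child of n.
func appendChild(n, c *Node) {
	c.Parent = n
	if n.LastChild == nil {
		n.FirstChild = c
	} else {
		n.LastChild.NextSibling = c
	}
	n.LastChild = c
}

// firstChildParentIsNode reports whether n's first child points back at n.
func firstChildParentIsNode(n *Node) bool {
	return n.FirstChild != nil && n.FirstChild.Parent == n
}

func main() {
	// Mirror the hierarchy div.color_picker > div.over_wrap > a.
	picker := &Node{Data: "div.color_picker"}
	wrap := &Node{Data: "div.over_wrap"}
	anchor := &Node{Data: "a"}
	appendChild(picker, wrap)
	appendChild(wrap, anchor)

	// The first child's parent is the node itself...
	fmt.Println(firstChildParentIsNode(wrap))
	// ...and not the node's own parent (div.color_picker).
	fmt.Println(wrap.FirstChild.Parent == wrap.Parent)
}
```

If the same invariant holds for the parsed tree, a printed parent of `div.color_picker` would suggest that `child` is no longer one of `node`'s children at the moment it is printed.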

#### Expected result

actual parent
&{0xc0820347e0 0xc082034930 0xc082034a10 0xc082034850 0xc082034a80 3 div div  [{ class over_wrap} { data-cc 0}]}
child's parent
&{0xc0820347e0 0xc082034930 0xc082034a10 0xc082034850 0xc082034a80 3 div div  [{ class over_wrap} { data-cc 0}]}

#### Actual result

actual parent
&{0xc0820347e0 0xc082034930 0xc082034a10 0xc082034850 0xc082034a80 3 div div  [{ class over_wrap} { data-cc 0}]}
child's parent
&{0xc082034460 0xc082034850 0xc082034a80 0xc082034770 0xc082034af0 3 div div  [{ class color_picker}]}
package main

import (
	"bytes"
	"fmt"
	"io"

	"golang.org/x/net/html"
)

func main() {
	b := bytes.NewBufferString(`<html>
<head></head>
<body>
<div class="product-card front">
<div class="product-description">
<a href="">
<p class="product-name">Test text</p>
</a>
<div class="color_picker">
<div class="over_wrap" data-cc="0">
<a href="#"></a>
</div>
</div>
</div>
</div>
</body>
</html>
`)
	parse(b)
}

func parse(r io.Reader) {
	doc, err := html.Parse(r)
	if err != nil {
		fmt.Printf("error: %s\n", err)
		return
	}
	findProductCardFronts(doc)
}

func findProductCardFronts(node *html.Node) {
	var isProductCardFront bool
	if isDivElementNode(node) && getClass(node) == "product-card front" {
		isProductCardFront = true
	}
	for child := node.FirstChild; child != nil; child = child.NextSibling {
		if !isProductCardFront {
			findProductCardFronts(child)
		} else {
			searchForProductDescription(child)
		}
	}
}

func searchForProductDescription(node *html.Node) {
	if isDivElementNode(node) && getClass(node) == "product-description" {
		for child := node.FirstChild; child != nil; child = child.NextSibling {
			colorsAvailable := findColorsAvailable(child)
			fmt.Println(colorsAvailable)
		}
	}
}

func findColorsAvailable(node *html.Node) (colorsAvailable []string) {
	if isDivElementNode(node) && getClass(node) == "color_picker" {
		for child := node.FirstChild; child != nil; child = child.NextSibling {
			findOverWrapDiv(child)
		}
	}
	return
}

func findOverWrapDiv(node *html.Node) {
	if isDivElementNode(node) && getClass(node) == "over_wrap" {
		// Bug under discussion: the post statement reads node.NextSibling
		// rather than child.NextSibling (see the author's comment below).
		for child := node.FirstChild; child != nil; child = node.NextSibling {
			// This section here
			// I expect `node` to be equal to `child.Parent` but it's not.
			fmt.Println("actual parent")
			fmt.Println(node)
			fmt.Println("child's parent")
			fmt.Println(child.Parent)
			// ^ Above section
		}
	}
}

func isElementNode(node *html.Node, data string) bool {
	return node.Type == html.ElementNode && node.Data == data
}

func isAnchorElementNode(node *html.Node) bool {
	return node.Type == html.ElementNode && node.Data == "a"
}

func hasHref(node *html.Node) (href string, foundHref bool) {
	for _, attr := range node.Attr {
		if attr.Key == "href" && len(attr.Val) != 0 {
			href = attr.Val
			foundHref = true
			return
		}
	}
	return
}

func getClass(node *html.Node) (class string) {
	for _, attr := range node.Attr {
		if attr.Key == "class" {
			return attr.Val
		}
	}
	return
}

func isDivElementNode(node *html.Node) bool {
	return node.Type == html.ElementNode && node.Data == "div"
}

func isParagraphElementNode(node *html.Node) bool {
	return node.Type == html.ElementNode && node.Data == "p"
}

func isTextNode(node *html.Node) bool {
	return node.Type == html.TextNode
}

<html>
<head></head>
<body>
<div class="product-card front">
<div class="product-description">
<a href="">
<p class="product-name">Test text</p>
</a>
<div class="color_picker">
<div class="over_wrap" data-cc="0">
<a href="#"></a>
</div>
</div>
</div>
</div>
</body>
</html>
@shyamsalimkumar
Author

Silly, silly me... The loop at https://gist.github.com/shyamsalimkumar/9f62dbe94b3e6f8186cf69aa7a4faf60#file-test-go-L78 is supposed to be `for child := node.FirstChild; child != nil; child = child.NextSibling {`. The post statement was advancing `node` instead of `child`. Thanks to the nice folks at #go-nuts.