-
-
Save rjurney/4368194 to your computer and use it in GitHub Desktop.
I want to extend Pig's existing XMLLoader to go beyond capturing the text inside a tag and to actually create a Pig mapping of the Document Object Model the XML represents. This would be similar to elephant-bird's JsonLoader. Semi-structured data can vary, so this behavior can be risky but... I want people to be able to load JSON and XML data ea…
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
characters = load 'example.xml' using XMLLoader('character'); | |
describe characters | |
{properties:map[], name:chararray, born:datetime, qualification:chararray} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
<book id="b0836217462" available="true"> | |
<isbn> | |
0836217462 | |
</isbn> | |
<title lang="en"> | |
Being a Dog Is a Full-Time Job | |
</title> | |
<author id="CMS"> | |
<name> | |
Charles M Schulz | |
</name> | |
<born> | |
1922-11-26 | |
</born> | |
<dead> | |
2000-02-12 | |
</dead> | |
</author> | |
<character id="PP"> | |
<name> | |
Peppermint Patty | |
</name> | |
<born> | |
1966-08-22 | |
</born> | |
<qualification> | |
bold, brash and tomboyish | |
</qualification> | |
</character> | |
<character id="Snoopy"> | |
<name> | |
Snoopy | |
</name> | |
<born> | |
1950-10-04 | |
</born> | |
<qualification> | |
extroverted beagle | |
</qualification> | |
</character> | |
<character id="Schroeder"> | |
<name> | |
Schroeder | |
</name> | |
<born> | |
1951-05-30 | |
</born> | |
<qualification> | |
brought classical music to the Peanuts strip | |
</qualification> | |
</character> | |
<character id="Lucy"> | |
<name> | |
Lucy | |
</name> | |
<born> | |
1952-03-03 | |
</born> | |
<qualification> | |
bossy, crabby and selfish | |
</qualification> | |
</character> | |
</book> | |
</library> |
could you please send the code
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Hi @rjurney, How can one read such an XML file using Pig XMLLoader to transform the tags into a tabular format? Any/all help will be useful.