@aparrish
Created February 26, 2016 02:19
A quick tutorial on using xmltodict to parse civic data.
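As a rough orientation for the notebook that follows, here is a minimal standard-library sketch of the layout that `xmltodict`-style parsing produces: element names become dictionary keys, attributes become keys prefixed with `@`, and repeated elements collapse into a list. The helper name `element_to_dict` is hypothetical and this is not how `xmltodict` itself is implemented; it also ignores mixed text-plus-attribute content, which the real library handles.

```python
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Hypothetical helper: imitate the xmltodict-style layout for one element."""
    # Attributes become keys prefixed with "@".
    result = {"@" + name: value for name, value in elem.attrib.items()}
    for child in elem:
        value = element_to_dict(child)
        if child.tag in result:
            # A repeated tag is collected into a list of values.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    if not result:
        # No attributes and no children: use the text, or None if it is empty.
        text = (elem.text or "").strip()
        return text or None
    # Simplification: text alongside attributes or children is dropped here.
    return result


xml = """<ShoppingList>
  <Shopper>Allison Parrish</Shopper>
  <ItemsToBuy>
    <item name="milk" aisle="dairy" />
    <item name="flour" aisle="baking needs" />
  </ItemsToBuy>
</ShoppingList>"""

root = ET.fromstring(xml)
shopping = {root.tag: element_to_dict(root)}

# Walk down by element name, then read the "@name" attribute of each item.
names = [item["@name"] for item in shopping["ShoppingList"]["ItemsToBuy"]["item"]]
print(names)
```

Reading a value is then a matter of chaining keys from the root element downward, which is the same access pattern the notebook uses on the real `xmltodict` result.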
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Parsing XML with xmltodict\n",
"\n",
"Yue asked how to parse the XML feed [here](http://advisory.mtanyct.info/LPUWebServices/CurrentLostProperty.aspx), and it turns out that parsing it is harder than it looks. Fetching the data, at least, is easy enough:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"'<?xml version=\\'1.0\\' encoding=\\'utf-8\\'?><LostProperty><responsecode>0</responsecode><message></message><NumberOfLostArticles>213161</NumberOfLostArticles><NumberOfItemsclaimed>70247</NumberOfItemsclaimed><Category Category=\"Home Furnishings\"><SubCategory SubCategory=\"Wall and Window Covering\" count=\"79\"/><SubCategory SubCategory=\"Ornaments\" count=\"49\"/><SubCategory SubCategory=\"Appliances\" count=\"97\"/><SubCategory SubCategory=\"Linen\" count=\"971\"/><SubCategory SubCategory=\"Floor Covering\" count=\"30\"/><SubCategory SubCategory=\"All Other Furnishings\" count=\"556\"/><SubCategory SubCategory=\"Dishware\" count=\"609\"/></Category><Category Category=\"Sports Equipment\"><SubCategory SubCategory=\"Bats\" count=\"13\"/><SubCategory SubCategory=\"Scooter\" count=\"64\"/><SubCategory SubCategory=\"Golf Club\" count=\"5\"/><SubCategory SubCategory=\"Ball\" count=\"228\"/><SubCategory SubCategory=\"Skateboard\" count=\"44\"/><SubCategory SubCategory=\"Binocular\" count=\"20\"/><SubCategory SubCategory=\"Electric Motor Scooter\" count=\"1\"/><SubCategory SubCategory=\"Protective Gear\" count=\"327\"/><SubCategory SubCategory=\"Hockey Stick\" count=\"6\"/><SubCategory SubCategory=\"Sports Racket\" count=\"116\"/><SubCategory SubCategory=\"Bicycle\" count=\"170\"/><SubCategory SubCategory=\"Skis\" count=\"3\"/><SubCategory SubCategory=\"All Other sports equipment\" count=\"465\"/></Category><Category Category=\"Electronics\"><SubCategory SubCategory=\"Digital Camera\" count=\"857\"/><SubCategory SubCategory=\"USB Drive\" count=\"537\"/><SubCategory SubCategory=\"Portable radio\" count=\"292\"/><SubCategory SubCategory=\"Computer \" count=\"972\"/><SubCategory SubCategory=\"GPS Navigation System\" count=\"41\"/><SubCategory SubCategory=\"CD Player\" count=\"144\"/><SubCategory SubCategory=\"Electronic accessories\" count=\"707\"/><SubCategory SubCategory=\"All other Electronics\" count=\"1745\"/><SubCategory SubCategory=\"Digital media/MP3/4 
Player\" count=\"2009\"/><SubCategory SubCategory=\"Record Player\" count=\"32\"/><SubCategory SubCategory=\"Photo/Video Equipment\" count=\"326\"/><SubCategory SubCategory=\"Television set\" count=\"7\"/><SubCategory SubCategory=\"Body Activity Tracker\" count=\"29\"/><SubCategory SubCategory=\"Earphones Speakers\" count=\"627\"/><SubCategory SubCategory=\"Computer External Hard Drive\" count=\"38\"/><SubCategory SubCategory=\"Camera Accessories\" count=\"261\"/><SubCategory SubCategory=\"Clock\" count=\"19\"/><SubCategory SubCategory=\"Walkman\" count=\"81\"/><SubCategory SubCategory=\"DVD Player\" count=\"122\"/><SubCategory SubCategory=\"Camera\" count=\"269\"/><SubCategory SubCategory=\"Calculator\" count=\"449\"/><SubCategory SubCategory=\"Air conditioner\" count=\"2\"/><SubCategory SubCategory=\"Translator\" count=\"30\"/><SubCategory SubCategory=\"Computer Accessories\" count=\"377\"/><SubCategory SubCategory=\"PDA\" count=\"18\"/><SubCategory SubCategory=\"Electronic Tablet\" count=\"781\"/><SubCategory SubCategory=\"Electronic Reading Device\" count=\"631\"/></Category><Category Category=\"Carry Bag / Luggage\"><SubCategory SubCategory=\"Suitcase\" count=\"472\"/><SubCategory SubCategory=\"Handbag\" count=\"3362\"/><SubCategory SubCategory=\"Shoulder bag\" count=\"3335\"/><SubCategory SubCategory=\"Briefcase\" count=\"609\"/><SubCategory SubCategory=\"Shopping bag\" count=\"15146\"/><SubCategory SubCategory=\"Tote bag\" count=\"5944\"/><SubCategory SubCategory=\"Garment/suit bag\" count=\"148\"/><SubCategory SubCategory=\"Duffle bag\" count=\"1154\"/><SubCategory SubCategory=\"Backpack\" count=\"12538\"/><SubCategory SubCategory=\"All other baggage\" count=\"5808\"/><SubCategory SubCategory=\"Messenger bag\" count=\"232\"/></Category><Category Category=\"Eye Wear\"><SubCategory SubCategory=\"Eyeglass case without glasses\" count=\"355\"/><SubCategory SubCategory=\"Eye glasses\" count=\"7648\"/><SubCategory SubCategory=\"Other\" count=\"232\"/><SubCategory 
SubCategory=\"Sunglasses\" count=\"2670\"/></Category><Category Category=\"Medical Equipment &amp; Medication\"><SubCategory SubCategory=\"Insulin Pump\" count=\"52\"/><SubCategory SubCategory=\"Medication\" count=\"4657\"/><SubCategory SubCategory=\"X - Rays\" count=\"101\"/><SubCategory SubCategory=\"Other\" count=\"378\"/><SubCategory SubCategory=\"Walker\" count=\"10\"/><SubCategory SubCategory=\"Wheel Chair\" count=\"13\"/><SubCategory SubCategory=\"Blood Pressure Meter\" count=\"49\"/><SubCategory SubCategory=\"All Other Medical Equipment / Supplies\" count=\"766\"/><SubCategory SubCategory=\"Electric wheelchair/scooter\" count=\"2\"/><SubCategory SubCategory=\"Cane\" count=\"468\"/><SubCategory SubCategory=\"Diabetic Testing Equipment\" count=\"95\"/><SubCategory SubCategory=\"Glucose Meter\" count=\"273\"/></Category><Category Category=\"Tools\"><SubCategory SubCategory=\"Tools\" count=\"933\"/><SubCategory SubCategory=\"Tool Box\" count=\"45\"/><SubCategory SubCategory=\"Tool Belt\" count=\"14\"/></Category><Category Category=\"Keys\"><SubCategory SubCategory=\"Car\" count=\"1833\"/><SubCategory SubCategory=\"House\" count=\"9264\"/><SubCategory SubCategory=\"Electronic key card\" count=\"393\"/><SubCategory SubCategory=\"All Other keys\" count=\"3552\"/></Category><Category Category=\"Miscellaneous\"><SubCategory SubCategory=\"Lunch box\" count=\"1165\"/><SubCategory SubCategory=\"Blue print\" count=\"22\"/><SubCategory SubCategory=\"Photo/Pictures\" count=\"929\"/><SubCategory SubCategory=\"Baby stroller/Carrier\" count=\"118\"/><SubCategory SubCategory=\"Portfolio\" count=\"712\"/><SubCategory SubCategory=\"All Other Misc Items \" count=\"25132\"/><SubCategory SubCategory=\"Art and Crafts\" count=\"524\"/><SubCategory SubCategory=\"Mail\" count=\"2353\"/><SubCategory SubCategory=\"Checkbook\" count=\"854\"/><SubCategory SubCategory=\"Tefillin/Phylacteries (Jewish)\" count=\"14\"/><SubCategory SubCategory=\"Folder\" count=\"4257\"/><SubCategory 
SubCategory=\"Envelope\" count=\"2437\"/><SubCategory SubCategory=\"Stationary Supplies\" count=\"3701\"/></Category><Category Category=\"Cell Phone/Telephone/Communication Device\"><SubCategory SubCategory=\"Walkie Talkie\" count=\"122\"/><SubCategory SubCategory=\"Answering Machine\" count=\"9\"/><SubCategory SubCategory=\"Other\" count=\"53\"/><SubCategory SubCategory=\"Cell phone \" count=\"38373\"/><SubCategory SubCategory=\"Pager\" count=\"70\"/><SubCategory SubCategory=\"Telephone Accessories\" count=\"3412\"/></Category><Category Category=\"Toys\"><SubCategory SubCategory=\"Baby Toys\" count=\"118\"/><SubCategory SubCategory=\"All Other Toys\" count=\"648\"/><SubCategory SubCategory=\"Board Game\" count=\"73\"/><SubCategory SubCategory=\"Doll\" count=\"176\"/><SubCategory SubCategory=\"Cars/Trucks\" count=\"190\"/><SubCategory SubCategory=\"Stuffed Animals\" count=\"270\"/></Category><Category Category=\"Tickets\"><SubCategory SubCategory=\"Transit Check\" count=\"174\"/><SubCategory SubCategory=\"All Other Tickets\" count=\"440\"/><SubCategory SubCategory=\"NJ Transit\" count=\"331\"/><SubCategory SubCategory=\"Metro North Tickets\" count=\"189\"/><SubCategory SubCategory=\"Lottery Tickets\" count=\"584\"/><SubCategory SubCategory=\"Entertainment Tickets\" count=\"105\"/><SubCategory SubCategory=\"MTA Metrocard\" count=\"38873\"/><SubCategory SubCategory=\"LIRR Tickets\" count=\"299\"/><SubCategory SubCategory=\"Airline\" count=\"89\"/></Category><Category Category=\"Book\"><SubCategory SubCategory=\"Photo album\" count=\"31\"/><SubCategory SubCategory=\"Paperback\" count=\"5699\"/><SubCategory SubCategory=\"Address book\" count=\"675\"/><SubCategory SubCategory=\"Soft cover\" count=\"2970\"/><SubCategory SubCategory=\"Other\" count=\"658\"/><SubCategory SubCategory=\"Diary\" count=\"496\"/><SubCategory SubCategory=\"Hard Cover\" count=\"3093\"/></Category><Category Category=\"NYCT Equipment\"><SubCategory SubCategory=\"Vest\" count=\"10\"/><SubCategory 
SubCategory=\"Badge\" count=\"40\"/><SubCategory SubCategory=\"Radio\" count=\"8\"/><SubCategory SubCategory=\"Pass or Badge holder\" count=\"12\"/><SubCategory SubCategory=\"All other NYCT Equiment\" count=\"38\"/><SubCategory SubCategory=\"Epic pass\" count=\"308\"/></Category><Category Category=\"Wallet/Purse\"><SubCategory SubCategory=\"Pouch\" count=\"2565\"/><SubCategory SubCategory=\"Wristlet\" count=\"1620\"/><SubCategory SubCategory=\"Purse\" count=\"10166\"/><SubCategory SubCategory=\"ID Holder\" count=\"7045\"/><SubCategory SubCategory=\"Business Card holder\" count=\"484\"/><SubCategory SubCategory=\"Billfold\" count=\"189\"/><SubCategory SubCategory=\"Handbag\" count=\"529\"/><SubCategory SubCategory=\"Lanyard\" count=\"53\"/><SubCategory SubCategory=\"All Other Wallet / Purse\" count=\"1186\"/><SubCategory SubCategory=\"Fanny pack\" count=\"181\"/><SubCategory SubCategory=\"Wallet\" count=\"35950\"/></Category><Category Category=\"Accessories\"><SubCategory SubCategory=\"Scarf\" count=\"1975\"/><SubCategory SubCategory=\"All Other Accessories\" count=\"1202\"/><SubCategory SubCategory=\"Hat\" count=\"3732\"/><SubCategory SubCategory=\"Gloves\" count=\"3378\"/><SubCategory SubCategory=\"Make-up\" count=\"1761\"/><SubCategory SubCategory=\"Ties\" count=\"262\"/><SubCategory SubCategory=\"Belt\" count=\"254\"/><SubCategory SubCategory=\"Umbrella\" count=\"2823\"/></Category><Category Category=\"Clothing \"><SubCategory SubCategory=\"Jacket\" count=\"3191\"/><SubCategory SubCategory=\"Coat \" count=\"548\"/><SubCategory SubCategory=\"Nightwear\" count=\"185\"/><SubCategory SubCategory=\"Skirt\" count=\"492\"/><SubCategory SubCategory=\"Blouse\" count=\"842\"/><SubCategory SubCategory=\"Sweater\" count=\"2394\"/><SubCategory SubCategory=\"Pants/trousers/shorts\" count=\"5762\"/><SubCategory SubCategory=\"Suit\" count=\"75\"/><SubCategory SubCategory=\"Shirt\" count=\"3951\"/><SubCategory SubCategory=\"Under garment\" count=\"2419\"/><SubCategory 
SubCategory=\"All Other Clothing\" count=\"3535\"/><SubCategory SubCategory=\"T-shirt\" count=\"3123\"/><SubCategory SubCategory=\"Dress\" count=\"630\"/><SubCategory SubCategory=\"Leather Jacket\" count=\"86\"/></Category><Category Category=\"Identification\"><SubCategory SubCategory=\"Debit Card\" count=\"31871\"/><SubCategory SubCategory=\"Badge\" count=\"167\"/><SubCategory SubCategory=\"Passport\" count=\"2416\"/><SubCategory SubCategory=\"Social Security Card\" count=\"6111\"/><SubCategory SubCategory=\"Birth Certificate\" count=\"1030\"/><SubCategory SubCategory=\"School ID\" count=\"14173\"/><SubCategory SubCategory=\"Credit Card\" count=\"17074\"/><SubCategory SubCategory=\"AAA Card\" count=\"656\"/><SubCategory SubCategory=\"Resident Card\" count=\"855\"/><SubCategory SubCategory=\"Employment ID\" count=\"6594\"/><SubCategory SubCategory=\"Identification Card\" count=\"11787\"/><SubCategory SubCategory=\"Library card\" count=\"5877\"/><SubCategory SubCategory=\"All Other Personal Identification\" count=\"9690\"/><SubCategory SubCategory=\"Membership card\" count=\"8999\"/><SubCategory SubCategory=\"Military ID\" count=\"248\"/><SubCategory SubCategory=\"Drivers License\" count=\"15259\"/><SubCategory SubCategory=\"Registration Card\" count=\"386\"/><SubCategory SubCategory=\"Benefit Card\" count=\"10101\"/><SubCategory SubCategory=\"Insurance Card\" count=\"13425\"/><SubCategory SubCategory=\"Death Certificate\" count=\"21\"/></Category><Category Category=\"Entertainment (Music/Movies/Games)\"><SubCategory SubCategory=\"Video Games System\" count=\"411\"/><SubCategory SubCategory=\"All Other Entertainment\" count=\"285\"/><SubCategory SubCategory=\"VHS Tape\" count=\"116\"/><SubCategory SubCategory=\"DVD\" count=\"972\"/><SubCategory SubCategory=\"Blu-Ray\" count=\"4\"/><SubCategory SubCategory=\"Computer Games \" count=\"355\"/><SubCategory SubCategory=\"Compact Disc\" count=\"884\"/></Category><Category Category=\"Footwear\"><SubCategory 
SubCategory=\"Shoes\" count=\"2204\"/><SubCategory SubCategory=\"Boots\" count=\"764\"/><SubCategory SubCategory=\"Slippers\" count=\"523\"/><SubCategory SubCategory=\"Sandals\" count=\"430\"/><SubCategory SubCategory=\"Galoshes\" count=\"22\"/><SubCategory SubCategory=\"All Other Footwear\" count=\"134\"/><SubCategory SubCategory=\"Sneakers\" count=\"2200\"/></Category><Category Category=\"Currency\"><SubCategory SubCategory=\"Gift Card \" count=\"1900\"/><SubCategory SubCategory=\"Payroll check\" count=\"827\"/><SubCategory SubCategory=\"Money Order\" count=\"164\"/><SubCategory SubCategory=\"Personal Check\" count=\"1312\"/><SubCategory SubCategory=\"Travelers Checks \" count=\"21\"/><SubCategory SubCategory=\"Other Currency\" count=\"145\"/><SubCategory SubCategory=\"Cash\" count=\"33063\"/><SubCategory SubCategory=\"Foreign Currency\" count=\"1938\"/></Category><Category Category=\"Musical Instrument\"><SubCategory SubCategory=\"Flute\" count=\"88\"/><SubCategory SubCategory=\"Musical instrument accessories\" count=\"53\"/><SubCategory SubCategory=\"Clarinet\" count=\"76\"/><SubCategory SubCategory=\"All Other Instruments\" count=\"113\"/><SubCategory SubCategory=\"Guitar\" count=\"44\"/><SubCategory SubCategory=\"Violin\" count=\"85\"/><SubCategory SubCategory=\"Harmonica\" count=\"8\"/><SubCategory SubCategory=\"Saxophone\" count=\"32\"/><SubCategory SubCategory=\"Trumpet\" count=\"71\"/></Category><Category Category=\"Jewelry\"><SubCategory SubCategory=\"Fashion (Costume)\" count=\"2518\"/><SubCategory SubCategory=\"Cuff Links\" count=\"5\"/><SubCategory SubCategory=\"Necklace\" count=\"416\"/><SubCategory SubCategory=\"Watch\" count=\"1633\"/><SubCategory SubCategory=\"Ring\" count=\"1062\"/><SubCategory SubCategory=\"Bracelet\" count=\"713\"/><SubCategory SubCategory=\"Earrings\" count=\"382\"/><SubCategory SubCategory=\"All other Jewelry\" count=\"161\"/><SubCategory SubCategory=\"Tie Clips\" count=\"2\"/></Category></LostProperty>'"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import urllib\n",
"\n",
"# fetch data from url (Python 2; on Python 3, use urllib.request.urlopen instead)\n",
"data = urllib.urlopen(\"http://advisory.mtanyct.info/LPUWebServices/CurrentLostProperty.aspx\").read()\n",
"\n",
"# take a look at the data\n",
"data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Okay, so we have a bunch of XML. Gross. `xmltodict` to the rescue! It takes XML and returns ordinary Python dictionaries and lists, which are easier to work with... or are supposed to be, at least."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import xmltodict\n",
"\n",
"# parse using xmltodict\n",
"doc = xmltodict.parse(data)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The value returned from `xmltodict.parse()` is a nested bundle of [`OrderedDict`](https://docs.python.org/2/library/collections.html#collections.OrderedDict) objects (and lists of `OrderedDict`s). It looks weird when you just print it out:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"OrderedDict([(u'LostProperty', OrderedDict([(u'responsecode', u'0'), (u'message', None), (u'NumberOfLostArticles', u'213161'), (u'NumberOfItemsclaimed', u'70247'), (u'Category', [OrderedDict([(u'@Category', u'Home Furnishings'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Wall and Window Covering'), (u'@count', u'79')]), OrderedDict([(u'@SubCategory', u'Ornaments'), (u'@count', u'49')]), OrderedDict([(u'@SubCategory', u'Appliances'), (u'@count', u'97')]), OrderedDict([(u'@SubCategory', u'Linen'), (u'@count', u'971')]), OrderedDict([(u'@SubCategory', u'Floor Covering'), (u'@count', u'30')]), OrderedDict([(u'@SubCategory', u'All Other Furnishings'), (u'@count', u'556')]), OrderedDict([(u'@SubCategory', u'Dishware'), (u'@count', u'609')])])]), OrderedDict([(u'@Category', u'Sports Equipment'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Bats'), (u'@count', u'13')]), OrderedDict([(u'@SubCategory', u'Scooter'), (u'@count', u'64')]), OrderedDict([(u'@SubCategory', u'Golf Club'), (u'@count', u'5')]), OrderedDict([(u'@SubCategory', u'Ball'), (u'@count', u'228')]), OrderedDict([(u'@SubCategory', u'Skateboard'), (u'@count', u'44')]), OrderedDict([(u'@SubCategory', u'Binocular'), (u'@count', u'20')]), OrderedDict([(u'@SubCategory', u'Electric Motor Scooter'), (u'@count', u'1')]), OrderedDict([(u'@SubCategory', u'Protective Gear'), (u'@count', u'327')]), OrderedDict([(u'@SubCategory', u'Hockey Stick'), (u'@count', u'6')]), OrderedDict([(u'@SubCategory', u'Sports Racket'), (u'@count', u'116')]), OrderedDict([(u'@SubCategory', u'Bicycle'), (u'@count', u'170')]), OrderedDict([(u'@SubCategory', u'Skis'), (u'@count', u'3')]), OrderedDict([(u'@SubCategory', u'All Other sports equipment'), (u'@count', u'465')])])]), OrderedDict([(u'@Category', u'Electronics'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Digital Camera'), (u'@count', u'857')]), OrderedDict([(u'@SubCategory', u'USB Drive'), (u'@count', u'537')]), OrderedDict([(u'@SubCategory', u'Portable radio'), 
(u'@count', u'292')]), OrderedDict([(u'@SubCategory', u'Computer '), (u'@count', u'972')]), OrderedDict([(u'@SubCategory', u'GPS Navigation System'), (u'@count', u'41')]), OrderedDict([(u'@SubCategory', u'CD Player'), (u'@count', u'144')]), OrderedDict([(u'@SubCategory', u'Electronic accessories'), (u'@count', u'707')]), OrderedDict([(u'@SubCategory', u'All other Electronics'), (u'@count', u'1745')]), OrderedDict([(u'@SubCategory', u'Digital media/MP3/4 Player'), (u'@count', u'2009')]), OrderedDict([(u'@SubCategory', u'Record Player'), (u'@count', u'32')]), OrderedDict([(u'@SubCategory', u'Photo/Video Equipment'), (u'@count', u'326')]), OrderedDict([(u'@SubCategory', u'Television set'), (u'@count', u'7')]), OrderedDict([(u'@SubCategory', u'Body Activity Tracker'), (u'@count', u'29')]), OrderedDict([(u'@SubCategory', u'Earphones Speakers'), (u'@count', u'627')]), OrderedDict([(u'@SubCategory', u'Computer External Hard Drive'), (u'@count', u'38')]), OrderedDict([(u'@SubCategory', u'Camera Accessories'), (u'@count', u'261')]), OrderedDict([(u'@SubCategory', u'Clock'), (u'@count', u'19')]), OrderedDict([(u'@SubCategory', u'Walkman'), (u'@count', u'81')]), OrderedDict([(u'@SubCategory', u'DVD Player'), (u'@count', u'122')]), OrderedDict([(u'@SubCategory', u'Camera'), (u'@count', u'269')]), OrderedDict([(u'@SubCategory', u'Calculator'), (u'@count', u'449')]), OrderedDict([(u'@SubCategory', u'Air conditioner'), (u'@count', u'2')]), OrderedDict([(u'@SubCategory', u'Translator'), (u'@count', u'30')]), OrderedDict([(u'@SubCategory', u'Computer Accessories'), (u'@count', u'377')]), OrderedDict([(u'@SubCategory', u'PDA'), (u'@count', u'18')]), OrderedDict([(u'@SubCategory', u'Electronic Tablet'), (u'@count', u'781')]), OrderedDict([(u'@SubCategory', u'Electronic Reading Device'), (u'@count', u'631')])])]), OrderedDict([(u'@Category', u'Carry Bag / Luggage'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Suitcase'), (u'@count', u'472')]), OrderedDict([(u'@SubCategory', 
u'Handbag'), (u'@count', u'3362')]), OrderedDict([(u'@SubCategory', u'Shoulder bag'), (u'@count', u'3335')]), OrderedDict([(u'@SubCategory', u'Briefcase'), (u'@count', u'609')]), OrderedDict([(u'@SubCategory', u'Shopping bag'), (u'@count', u'15146')]), OrderedDict([(u'@SubCategory', u'Tote bag'), (u'@count', u'5944')]), OrderedDict([(u'@SubCategory', u'Garment/suit bag'), (u'@count', u'148')]), OrderedDict([(u'@SubCategory', u'Duffle bag'), (u'@count', u'1154')]), OrderedDict([(u'@SubCategory', u'Backpack'), (u'@count', u'12538')]), OrderedDict([(u'@SubCategory', u'All other baggage'), (u'@count', u'5808')]), OrderedDict([(u'@SubCategory', u'Messenger bag'), (u'@count', u'232')])])]), OrderedDict([(u'@Category', u'Eye Wear'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Eyeglass case without glasses'), (u'@count', u'355')]), OrderedDict([(u'@SubCategory', u'Eye glasses'), (u'@count', u'7648')]), OrderedDict([(u'@SubCategory', u'Other'), (u'@count', u'232')]), OrderedDict([(u'@SubCategory', u'Sunglasses'), (u'@count', u'2670')])])]), OrderedDict([(u'@Category', u'Medical Equipment & Medication'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Insulin Pump'), (u'@count', u'52')]), OrderedDict([(u'@SubCategory', u'Medication'), (u'@count', u'4657')]), OrderedDict([(u'@SubCategory', u'X - Rays'), (u'@count', u'101')]), OrderedDict([(u'@SubCategory', u'Other'), (u'@count', u'378')]), OrderedDict([(u'@SubCategory', u'Walker'), (u'@count', u'10')]), OrderedDict([(u'@SubCategory', u'Wheel Chair'), (u'@count', u'13')]), OrderedDict([(u'@SubCategory', u'Blood Pressure Meter'), (u'@count', u'49')]), OrderedDict([(u'@SubCategory', u'All Other Medical Equipment / Supplies'), (u'@count', u'766')]), OrderedDict([(u'@SubCategory', u'Electric wheelchair/scooter'), (u'@count', u'2')]), OrderedDict([(u'@SubCategory', u'Cane'), (u'@count', u'468')]), OrderedDict([(u'@SubCategory', u'Diabetic Testing Equipment'), (u'@count', u'95')]), OrderedDict([(u'@SubCategory', u'Glucose 
Meter'), (u'@count', u'273')])])]), OrderedDict([(u'@Category', u'Tools'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Tools'), (u'@count', u'933')]), OrderedDict([(u'@SubCategory', u'Tool Box'), (u'@count', u'45')]), OrderedDict([(u'@SubCategory', u'Tool Belt'), (u'@count', u'14')])])]), OrderedDict([(u'@Category', u'Keys'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Car'), (u'@count', u'1833')]), OrderedDict([(u'@SubCategory', u'House'), (u'@count', u'9264')]), OrderedDict([(u'@SubCategory', u'Electronic key card'), (u'@count', u'393')]), OrderedDict([(u'@SubCategory', u'All Other keys'), (u'@count', u'3552')])])]), OrderedDict([(u'@Category', u'Miscellaneous'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Lunch box'), (u'@count', u'1165')]), OrderedDict([(u'@SubCategory', u'Blue print'), (u'@count', u'22')]), OrderedDict([(u'@SubCategory', u'Photo/Pictures'), (u'@count', u'929')]), OrderedDict([(u'@SubCategory', u'Baby stroller/Carrier'), (u'@count', u'118')]), OrderedDict([(u'@SubCategory', u'Portfolio'), (u'@count', u'712')]), OrderedDict([(u'@SubCategory', u'All Other Misc Items '), (u'@count', u'25132')]), OrderedDict([(u'@SubCategory', u'Art and Crafts'), (u'@count', u'524')]), OrderedDict([(u'@SubCategory', u'Mail'), (u'@count', u'2353')]), OrderedDict([(u'@SubCategory', u'Checkbook'), (u'@count', u'854')]), OrderedDict([(u'@SubCategory', u'Tefillin/Phylacteries (Jewish)'), (u'@count', u'14')]), OrderedDict([(u'@SubCategory', u'Folder'), (u'@count', u'4257')]), OrderedDict([(u'@SubCategory', u'Envelope'), (u'@count', u'2437')]), OrderedDict([(u'@SubCategory', u'Stationary Supplies'), (u'@count', u'3701')])])]), OrderedDict([(u'@Category', u'Cell Phone/Telephone/Communication Device'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Walkie Talkie'), (u'@count', u'122')]), OrderedDict([(u'@SubCategory', u'Answering Machine'), (u'@count', u'9')]), OrderedDict([(u'@SubCategory', u'Other'), (u'@count', u'53')]), 
OrderedDict([(u'@SubCategory', u'Cell phone '), (u'@count', u'38373')]), OrderedDict([(u'@SubCategory', u'Pager'), (u'@count', u'70')]), OrderedDict([(u'@SubCategory', u'Telephone Accessories'), (u'@count', u'3412')])])]), OrderedDict([(u'@Category', u'Toys'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Baby Toys'), (u'@count', u'118')]), OrderedDict([(u'@SubCategory', u'All Other Toys'), (u'@count', u'648')]), OrderedDict([(u'@SubCategory', u'Board Game'), (u'@count', u'73')]), OrderedDict([(u'@SubCategory', u'Doll'), (u'@count', u'176')]), OrderedDict([(u'@SubCategory', u'Cars/Trucks'), (u'@count', u'190')]), OrderedDict([(u'@SubCategory', u'Stuffed Animals'), (u'@count', u'270')])])]), OrderedDict([(u'@Category', u'Tickets'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Transit Check'), (u'@count', u'174')]), OrderedDict([(u'@SubCategory', u'All Other Tickets'), (u'@count', u'440')]), OrderedDict([(u'@SubCategory', u'NJ Transit'), (u'@count', u'331')]), OrderedDict([(u'@SubCategory', u'Metro North Tickets'), (u'@count', u'189')]), OrderedDict([(u'@SubCategory', u'Lottery Tickets'), (u'@count', u'584')]), OrderedDict([(u'@SubCategory', u'Entertainment Tickets'), (u'@count', u'105')]), OrderedDict([(u'@SubCategory', u'MTA Metrocard'), (u'@count', u'38873')]), OrderedDict([(u'@SubCategory', u'LIRR Tickets'), (u'@count', u'299')]), OrderedDict([(u'@SubCategory', u'Airline'), (u'@count', u'89')])])]), OrderedDict([(u'@Category', u'Book'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Photo album'), (u'@count', u'31')]), OrderedDict([(u'@SubCategory', u'Paperback'), (u'@count', u'5699')]), OrderedDict([(u'@SubCategory', u'Address book'), (u'@count', u'675')]), OrderedDict([(u'@SubCategory', u'Soft cover'), (u'@count', u'2970')]), OrderedDict([(u'@SubCategory', u'Other'), (u'@count', u'658')]), OrderedDict([(u'@SubCategory', u'Diary'), (u'@count', u'496')]), OrderedDict([(u'@SubCategory', u'Hard Cover'), (u'@count', u'3093')])])]), 
OrderedDict([(u'@Category', u'NYCT Equipment'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Vest'), (u'@count', u'10')]), OrderedDict([(u'@SubCategory', u'Badge'), (u'@count', u'40')]), OrderedDict([(u'@SubCategory', u'Radio'), (u'@count', u'8')]), OrderedDict([(u'@SubCategory', u'Pass or Badge holder'), (u'@count', u'12')]), OrderedDict([(u'@SubCategory', u'All other NYCT Equiment'), (u'@count', u'38')]), OrderedDict([(u'@SubCategory', u'Epic pass'), (u'@count', u'308')])])]), OrderedDict([(u'@Category', u'Wallet/Purse'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Pouch'), (u'@count', u'2565')]), OrderedDict([(u'@SubCategory', u'Wristlet'), (u'@count', u'1620')]), OrderedDict([(u'@SubCategory', u'Purse'), (u'@count', u'10166')]), OrderedDict([(u'@SubCategory', u'ID Holder'), (u'@count', u'7045')]), OrderedDict([(u'@SubCategory', u'Business Card holder'), (u'@count', u'484')]), OrderedDict([(u'@SubCategory', u'Billfold'), (u'@count', u'189')]), OrderedDict([(u'@SubCategory', u'Handbag'), (u'@count', u'529')]), OrderedDict([(u'@SubCategory', u'Lanyard'), (u'@count', u'53')]), OrderedDict([(u'@SubCategory', u'All Other Wallet / Purse'), (u'@count', u'1186')]), OrderedDict([(u'@SubCategory', u'Fanny pack'), (u'@count', u'181')]), OrderedDict([(u'@SubCategory', u'Wallet'), (u'@count', u'35950')])])]), OrderedDict([(u'@Category', u'Accessories'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Scarf'), (u'@count', u'1975')]), OrderedDict([(u'@SubCategory', u'All Other Accessories'), (u'@count', u'1202')]), OrderedDict([(u'@SubCategory', u'Hat'), (u'@count', u'3732')]), OrderedDict([(u'@SubCategory', u'Gloves'), (u'@count', u'3378')]), OrderedDict([(u'@SubCategory', u'Make-up'), (u'@count', u'1761')]), OrderedDict([(u'@SubCategory', u'Ties'), (u'@count', u'262')]), OrderedDict([(u'@SubCategory', u'Belt'), (u'@count', u'254')]), OrderedDict([(u'@SubCategory', u'Umbrella'), (u'@count', u'2823')])])]), OrderedDict([(u'@Category', u'Clothing '), 
(u'SubCategory', [OrderedDict([(u'@SubCategory', u'Jacket'), (u'@count', u'3191')]), OrderedDict([(u'@SubCategory', u'Coat '), (u'@count', u'548')]), OrderedDict([(u'@SubCategory', u'Nightwear'), (u'@count', u'185')]), OrderedDict([(u'@SubCategory', u'Skirt'), (u'@count', u'492')]), OrderedDict([(u'@SubCategory', u'Blouse'), (u'@count', u'842')]), OrderedDict([(u'@SubCategory', u'Sweater'), (u'@count', u'2394')]), OrderedDict([(u'@SubCategory', u'Pants/trousers/shorts'), (u'@count', u'5762')]), OrderedDict([(u'@SubCategory', u'Suit'), (u'@count', u'75')]), OrderedDict([(u'@SubCategory', u'Shirt'), (u'@count', u'3951')]), OrderedDict([(u'@SubCategory', u'Under garment'), (u'@count', u'2419')]), OrderedDict([(u'@SubCategory', u'All Other Clothing'), (u'@count', u'3535')]), OrderedDict([(u'@SubCategory', u'T-shirt'), (u'@count', u'3123')]), OrderedDict([(u'@SubCategory', u'Dress'), (u'@count', u'630')]), OrderedDict([(u'@SubCategory', u'Leather Jacket'), (u'@count', u'86')])])]), OrderedDict([(u'@Category', u'Identification'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Debit Card'), (u'@count', u'31871')]), OrderedDict([(u'@SubCategory', u'Badge'), (u'@count', u'167')]), OrderedDict([(u'@SubCategory', u'Passport'), (u'@count', u'2416')]), OrderedDict([(u'@SubCategory', u'Social Security Card'), (u'@count', u'6111')]), OrderedDict([(u'@SubCategory', u'Birth Certificate'), (u'@count', u'1030')]), OrderedDict([(u'@SubCategory', u'School ID'), (u'@count', u'14173')]), OrderedDict([(u'@SubCategory', u'Credit Card'), (u'@count', u'17074')]), OrderedDict([(u'@SubCategory', u'AAA Card'), (u'@count', u'656')]), OrderedDict([(u'@SubCategory', u'Resident Card'), (u'@count', u'855')]), OrderedDict([(u'@SubCategory', u'Employment ID'), (u'@count', u'6594')]), OrderedDict([(u'@SubCategory', u'Identification Card'), (u'@count', u'11787')]), OrderedDict([(u'@SubCategory', u'Library card'), (u'@count', u'5877')]), OrderedDict([(u'@SubCategory', u'All Other Personal 
Identification'), (u'@count', u'9690')]), OrderedDict([(u'@SubCategory', u'Membership card'), (u'@count', u'8999')]), OrderedDict([(u'@SubCategory', u'Military ID'), (u'@count', u'248')]), OrderedDict([(u'@SubCategory', u'Drivers License'), (u'@count', u'15259')]), OrderedDict([(u'@SubCategory', u'Registration Card'), (u'@count', u'386')]), OrderedDict([(u'@SubCategory', u'Benefit Card'), (u'@count', u'10101')]), OrderedDict([(u'@SubCategory', u'Insurance Card'), (u'@count', u'13425')]), OrderedDict([(u'@SubCategory', u'Death Certificate'), (u'@count', u'21')])])]), OrderedDict([(u'@Category', u'Entertainment (Music/Movies/Games)'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Video Games System'), (u'@count', u'411')]), OrderedDict([(u'@SubCategory', u'All Other Entertainment'), (u'@count', u'285')]), OrderedDict([(u'@SubCategory', u'VHS Tape'), (u'@count', u'116')]), OrderedDict([(u'@SubCategory', u'DVD'), (u'@count', u'972')]), OrderedDict([(u'@SubCategory', u'Blu-Ray'), (u'@count', u'4')]), OrderedDict([(u'@SubCategory', u'Computer Games '), (u'@count', u'355')]), OrderedDict([(u'@SubCategory', u'Compact Disc'), (u'@count', u'884')])])]), OrderedDict([(u'@Category', u'Footwear'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Shoes'), (u'@count', u'2204')]), OrderedDict([(u'@SubCategory', u'Boots'), (u'@count', u'764')]), OrderedDict([(u'@SubCategory', u'Slippers'), (u'@count', u'523')]), OrderedDict([(u'@SubCategory', u'Sandals'), (u'@count', u'430')]), OrderedDict([(u'@SubCategory', u'Galoshes'), (u'@count', u'22')]), OrderedDict([(u'@SubCategory', u'All Other Footwear'), (u'@count', u'134')]), OrderedDict([(u'@SubCategory', u'Sneakers'), (u'@count', u'2200')])])]), OrderedDict([(u'@Category', u'Currency'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Gift Card '), (u'@count', u'1900')]), OrderedDict([(u'@SubCategory', u'Payroll check'), (u'@count', u'827')]), OrderedDict([(u'@SubCategory', u'Money Order'), (u'@count', u'164')]), 
OrderedDict([(u'@SubCategory', u'Personal Check'), (u'@count', u'1312')]), OrderedDict([(u'@SubCategory', u'Travelers Checks '), (u'@count', u'21')]), OrderedDict([(u'@SubCategory', u'Other Currency'), (u'@count', u'145')]), OrderedDict([(u'@SubCategory', u'Cash'), (u'@count', u'33063')]), OrderedDict([(u'@SubCategory', u'Foreign Currency'), (u'@count', u'1938')])])]), OrderedDict([(u'@Category', u'Musical Instrument'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Flute'), (u'@count', u'88')]), OrderedDict([(u'@SubCategory', u'Musical instrument accessories'), (u'@count', u'53')]), OrderedDict([(u'@SubCategory', u'Clarinet'), (u'@count', u'76')]), OrderedDict([(u'@SubCategory', u'All Other Instruments'), (u'@count', u'113')]), OrderedDict([(u'@SubCategory', u'Guitar'), (u'@count', u'44')]), OrderedDict([(u'@SubCategory', u'Violin'), (u'@count', u'85')]), OrderedDict([(u'@SubCategory', u'Harmonica'), (u'@count', u'8')]), OrderedDict([(u'@SubCategory', u'Saxophone'), (u'@count', u'32')]), OrderedDict([(u'@SubCategory', u'Trumpet'), (u'@count', u'71')])])]), OrderedDict([(u'@Category', u'Jewelry'), (u'SubCategory', [OrderedDict([(u'@SubCategory', u'Fashion (Costume)'), (u'@count', u'2518')]), OrderedDict([(u'@SubCategory', u'Cuff Links'), (u'@count', u'5')]), OrderedDict([(u'@SubCategory', u'Necklace'), (u'@count', u'416')]), OrderedDict([(u'@SubCategory', u'Watch'), (u'@count', u'1633')]), OrderedDict([(u'@SubCategory', u'Ring'), (u'@count', u'1062')]), OrderedDict([(u'@SubCategory', u'Bracelet'), (u'@count', u'713')]), OrderedDict([(u'@SubCategory', u'Earrings'), (u'@count', u'382')]), OrderedDict([(u'@SubCategory', u'All other Jewelry'), (u'@count', u'161')]), OrderedDict([(u'@SubCategory', u'Tie Clips'), (u'@count', u'2')])])])])]))])"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"doc"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Ugh. That sucks and is weird. Maybe it'll be easier to make sense of this if we start with a simpler example."
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"xml = \"\"\"<?xml version=\"1.0\" encoding=\"utf-8\"?>\n",
"<ShoppingList>\n",
" <Shopper>Allison Parrish</Shopper>\n",
" <ItemsToBuy>\n",
" <item name=\"milk\" aisle=\"dairy\" />\n",
" <item name=\"flour\" aisle=\"baking needs\" />\n",
" <item name=\"apples\" aisle=\"produce\" />\n",
" </ItemsToBuy>\n",
"</ShoppingList>\"\"\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Just a very small bit of XML. Let's say our task is to print out the item *names* of each item. So first we'll parse the XML with `xmltodict`:"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"OrderedDict([(u'ShoppingList', OrderedDict([(u'Shopper', u'Allison Parrish'), (u'ItemsToBuy', OrderedDict([(u'item', [OrderedDict([(u'@name', u'milk'), (u'@aisle', u'dairy')]), OrderedDict([(u'@name', u'flour'), (u'@aisle', u'baking needs')]), OrderedDict([(u'@name', u'apples'), (u'@aisle', u'produce')])])]))]))])"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"shopping = xmltodict.parse(xml)\n",
"shopping"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is still intimidating but is maybe at least readable. The `xmltodict` library returns a *dictionary* whose keys are the names of the elements in the XML document and whose values are the contents of those elements. So we can get the contents of the `ShoppingList` element like so:"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"OrderedDict([(u'Shopper', u'Allison Parrish'), (u'ItemsToBuy', OrderedDict([(u'item', [OrderedDict([(u'@name', u'milk'), (u'@aisle', u'dairy')]), OrderedDict([(u'@name', u'flour'), (u'@aisle', u'baking needs')]), OrderedDict([(u'@name', u'apples'), (u'@aisle', u'produce')])])]))])"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"shopping['ShoppingList']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"... and then the contents of the `Shopper` item:"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"u'Allison Parrish'"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"shopping['ShoppingList']['Shopper']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Getting the *items* from the `ItemsToBuy` element is a little bit more complicated. Because the element has multiple children all with the same tag name, `xmltodict` represents those items as a *list*. We'll start by getting the element itself:"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"[OrderedDict([(u'@name', u'milk'), (u'@aisle', u'dairy')]),\n",
" OrderedDict([(u'@name', u'flour'), (u'@aisle', u'baking needs')]),\n",
" OrderedDict([(u'@name', u'apples'), (u'@aisle', u'produce')])]"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"shopping['ShoppingList']['ItemsToBuy']['item']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can get any individual element using integer indexes:"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"OrderedDict([(u'@name', u'milk'), (u'@aisle', u'dairy')])"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"shopping['ShoppingList']['ItemsToBuy']['item'][0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"XML attributes are represented by `xmltodict` as keys beginning with `@`. So, to print just the name of the item:"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"u'milk'"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"shopping['ShoppingList']['ItemsToBuy']['item'][0]['@name']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Putting it all together, we can loop through all of the items like so:"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"milk\n",
"flour\n",
"apples\n"
]
}
],
"source": [
"for item in shopping['ShoppingList']['ItemsToBuy']['item']:\n",
" print item['@name']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A handy trick for seeing the relationships in the underlying data more clearly is using Python's `json` module to format the data:"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\n",
" \"ShoppingList\": {\n",
" \"Shopper\": \"Allison Parrish\", \n",
" \"ItemsToBuy\": {\n",
" \"item\": [\n",
" {\n",
" \"@name\": \"milk\", \n",
" \"@aisle\": \"dairy\"\n",
" }, \n",
" {\n",
" \"@name\": \"flour\", \n",
" \"@aisle\": \"baking needs\"\n",
" }, \n",
" {\n",
" \"@name\": \"apples\", \n",
" \"@aisle\": \"produce\"\n",
" }\n",
" ]\n",
" }\n",
" }\n",
"}\n"
]
}
],
"source": [
"import json\n",
"print json.dumps(shopping, indent=4)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's getting back to the MTA data. First, we'll use the `json` trick to display the data in a more sensible way."
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\n",
" \"LostProperty\": {\n",
" \"responsecode\": \"0\", \n",
" \"message\": null, \n",
" \"NumberOfLostArticles\": \"213161\", \n",
" \"NumberOfItemsclaimed\": \"70247\", \n",
" \"Category\": [\n",
" {\n",
" \"@Category\": \"Home Furnishings\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Wall and Window Covering\", \n",
" \"@count\": \"79\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Ornaments\", \n",
" \"@count\": \"49\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Appliances\", \n",
" \"@count\": \"97\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Linen\", \n",
" \"@count\": \"971\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Floor Covering\", \n",
" \"@count\": \"30\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"All Other Furnishings\", \n",
" \"@count\": \"556\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Dishware\", \n",
" \"@count\": \"609\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"Sports Equipment\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Bats\", \n",
" \"@count\": \"13\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Scooter\", \n",
" \"@count\": \"64\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Golf Club\", \n",
" \"@count\": \"5\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Ball\", \n",
" \"@count\": \"228\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Skateboard\", \n",
" \"@count\": \"44\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Binocular\", \n",
" \"@count\": \"20\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Electric Motor Scooter\", \n",
" \"@count\": \"1\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Protective Gear\", \n",
" \"@count\": \"327\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Hockey Stick\", \n",
" \"@count\": \"6\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Sports Racket\", \n",
" \"@count\": \"116\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Bicycle\", \n",
" \"@count\": \"170\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Skis\", \n",
" \"@count\": \"3\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"All Other sports equipment\", \n",
" \"@count\": \"465\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"Electronics\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Digital Camera\", \n",
" \"@count\": \"857\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"USB Drive\", \n",
" \"@count\": \"537\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Portable radio\", \n",
" \"@count\": \"292\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Computer \", \n",
" \"@count\": \"972\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"GPS Navigation System\", \n",
" \"@count\": \"41\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"CD Player\", \n",
" \"@count\": \"144\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Electronic accessories\", \n",
" \"@count\": \"707\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"All other Electronics\", \n",
" \"@count\": \"1745\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Digital media/MP3/4 Player\", \n",
" \"@count\": \"2009\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Record Player\", \n",
" \"@count\": \"32\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Photo/Video Equipment\", \n",
" \"@count\": \"326\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Television set\", \n",
" \"@count\": \"7\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Body Activity Tracker\", \n",
" \"@count\": \"29\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Earphones Speakers\", \n",
" \"@count\": \"627\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Computer External Hard Drive\", \n",
" \"@count\": \"38\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Camera Accessories\", \n",
" \"@count\": \"261\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Clock\", \n",
" \"@count\": \"19\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Walkman\", \n",
" \"@count\": \"81\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"DVD Player\", \n",
" \"@count\": \"122\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Camera\", \n",
" \"@count\": \"269\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Calculator\", \n",
" \"@count\": \"449\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Air conditioner\", \n",
" \"@count\": \"2\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Translator\", \n",
" \"@count\": \"30\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Computer Accessories\", \n",
" \"@count\": \"377\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"PDA\", \n",
" \"@count\": \"18\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Electronic Tablet\", \n",
" \"@count\": \"781\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Electronic Reading Device\", \n",
" \"@count\": \"631\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"Carry Bag / Luggage\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Suitcase\", \n",
" \"@count\": \"472\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Handbag\", \n",
" \"@count\": \"3362\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Shoulder bag\", \n",
" \"@count\": \"3335\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Briefcase\", \n",
" \"@count\": \"609\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Shopping bag\", \n",
" \"@count\": \"15146\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Tote bag\", \n",
" \"@count\": \"5944\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Garment/suit bag\", \n",
" \"@count\": \"148\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Duffle bag\", \n",
" \"@count\": \"1154\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Backpack\", \n",
" \"@count\": \"12538\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"All other baggage\", \n",
" \"@count\": \"5808\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Messenger bag\", \n",
" \"@count\": \"232\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"Eye Wear\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Eyeglass case without glasses\", \n",
" \"@count\": \"355\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Eye glasses\", \n",
" \"@count\": \"7648\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Other\", \n",
" \"@count\": \"232\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Sunglasses\", \n",
" \"@count\": \"2670\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"Medical Equipment & Medication\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Insulin Pump\", \n",
" \"@count\": \"52\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Medication\", \n",
" \"@count\": \"4657\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"X - Rays\", \n",
" \"@count\": \"101\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Other\", \n",
" \"@count\": \"378\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Walker\", \n",
" \"@count\": \"10\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Wheel Chair\", \n",
" \"@count\": \"13\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Blood Pressure Meter\", \n",
" \"@count\": \"49\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"All Other Medical Equipment / Supplies\", \n",
" \"@count\": \"766\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Electric wheelchair/scooter\", \n",
" \"@count\": \"2\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Cane\", \n",
" \"@count\": \"468\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Diabetic Testing Equipment\", \n",
" \"@count\": \"95\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Glucose Meter\", \n",
" \"@count\": \"273\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"Tools\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Tools\", \n",
" \"@count\": \"933\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Tool Box\", \n",
" \"@count\": \"45\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Tool Belt\", \n",
" \"@count\": \"14\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"Keys\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Car\", \n",
" \"@count\": \"1833\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"House\", \n",
" \"@count\": \"9264\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Electronic key card\", \n",
" \"@count\": \"393\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"All Other keys\", \n",
" \"@count\": \"3552\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"Miscellaneous\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Lunch box\", \n",
" \"@count\": \"1165\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Blue print\", \n",
" \"@count\": \"22\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Photo/Pictures\", \n",
" \"@count\": \"929\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Baby stroller/Carrier\", \n",
" \"@count\": \"118\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Portfolio\", \n",
" \"@count\": \"712\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"All Other Misc Items \", \n",
" \"@count\": \"25132\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Art and Crafts\", \n",
" \"@count\": \"524\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Mail\", \n",
" \"@count\": \"2353\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Checkbook\", \n",
" \"@count\": \"854\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Tefillin/Phylacteries (Jewish)\", \n",
" \"@count\": \"14\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Folder\", \n",
" \"@count\": \"4257\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Envelope\", \n",
" \"@count\": \"2437\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Stationary Supplies\", \n",
" \"@count\": \"3701\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"Cell Phone/Telephone/Communication Device\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Walkie Talkie\", \n",
" \"@count\": \"122\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Answering Machine\", \n",
" \"@count\": \"9\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Other\", \n",
" \"@count\": \"53\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Cell phone \", \n",
" \"@count\": \"38373\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Pager\", \n",
" \"@count\": \"70\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Telephone Accessories\", \n",
" \"@count\": \"3412\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"Toys\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Baby Toys\", \n",
" \"@count\": \"118\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"All Other Toys\", \n",
" \"@count\": \"648\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Board Game\", \n",
" \"@count\": \"73\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Doll\", \n",
" \"@count\": \"176\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Cars/Trucks\", \n",
" \"@count\": \"190\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Stuffed Animals\", \n",
" \"@count\": \"270\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"Tickets\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Transit Check\", \n",
" \"@count\": \"174\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"All Other Tickets\", \n",
" \"@count\": \"440\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"NJ Transit\", \n",
" \"@count\": \"331\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Metro North Tickets\", \n",
" \"@count\": \"189\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Lottery Tickets\", \n",
" \"@count\": \"584\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Entertainment Tickets\", \n",
" \"@count\": \"105\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"MTA Metrocard\", \n",
" \"@count\": \"38873\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"LIRR Tickets\", \n",
" \"@count\": \"299\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Airline\", \n",
" \"@count\": \"89\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"Book\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Photo album\", \n",
" \"@count\": \"31\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Paperback\", \n",
" \"@count\": \"5699\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Address book\", \n",
" \"@count\": \"675\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Soft cover\", \n",
" \"@count\": \"2970\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Other\", \n",
" \"@count\": \"658\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Diary\", \n",
" \"@count\": \"496\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Hard Cover\", \n",
" \"@count\": \"3093\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"NYCT Equipment\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Vest\", \n",
" \"@count\": \"10\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Badge\", \n",
" \"@count\": \"40\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Radio\", \n",
" \"@count\": \"8\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Pass or Badge holder\", \n",
" \"@count\": \"12\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"All other NYCT Equiment\", \n",
" \"@count\": \"38\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Epic pass\", \n",
" \"@count\": \"308\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"Wallet/Purse\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Pouch\", \n",
" \"@count\": \"2565\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Wristlet\", \n",
" \"@count\": \"1620\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Purse\", \n",
" \"@count\": \"10166\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"ID Holder\", \n",
" \"@count\": \"7045\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Business Card holder\", \n",
" \"@count\": \"484\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Billfold\", \n",
" \"@count\": \"189\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Handbag\", \n",
" \"@count\": \"529\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Lanyard\", \n",
" \"@count\": \"53\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"All Other Wallet / Purse\", \n",
" \"@count\": \"1186\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Fanny pack\", \n",
" \"@count\": \"181\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Wallet\", \n",
" \"@count\": \"35950\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"Accessories\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Scarf\", \n",
" \"@count\": \"1975\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"All Other Accessories\", \n",
" \"@count\": \"1202\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Hat\", \n",
" \"@count\": \"3732\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Gloves\", \n",
" \"@count\": \"3378\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Make-up\", \n",
" \"@count\": \"1761\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Ties\", \n",
" \"@count\": \"262\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Belt\", \n",
" \"@count\": \"254\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Umbrella\", \n",
" \"@count\": \"2823\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"Clothing \", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Jacket\", \n",
" \"@count\": \"3191\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Coat \", \n",
" \"@count\": \"548\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Nightwear\", \n",
" \"@count\": \"185\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Skirt\", \n",
" \"@count\": \"492\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Blouse\", \n",
" \"@count\": \"842\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Sweater\", \n",
" \"@count\": \"2394\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Pants/trousers/shorts\", \n",
" \"@count\": \"5762\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Suit\", \n",
" \"@count\": \"75\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Shirt\", \n",
" \"@count\": \"3951\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Under garment\", \n",
" \"@count\": \"2419\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"All Other Clothing\", \n",
" \"@count\": \"3535\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"T-shirt\", \n",
" \"@count\": \"3123\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Dress\", \n",
" \"@count\": \"630\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Leather Jacket\", \n",
" \"@count\": \"86\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"Identification\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Debit Card\", \n",
" \"@count\": \"31871\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Badge\", \n",
" \"@count\": \"167\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Passport\", \n",
" \"@count\": \"2416\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Social Security Card\", \n",
" \"@count\": \"6111\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Birth Certificate\", \n",
" \"@count\": \"1030\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"School ID\", \n",
" \"@count\": \"14173\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Credit Card\", \n",
" \"@count\": \"17074\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"AAA Card\", \n",
" \"@count\": \"656\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Resident Card\", \n",
" \"@count\": \"855\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Employment ID\", \n",
" \"@count\": \"6594\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Identification Card\", \n",
" \"@count\": \"11787\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Library card\", \n",
" \"@count\": \"5877\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"All Other Personal Identification\", \n",
" \"@count\": \"9690\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Membership card\", \n",
" \"@count\": \"8999\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Military ID\", \n",
" \"@count\": \"248\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Drivers License\", \n",
" \"@count\": \"15259\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Registration Card\", \n",
" \"@count\": \"386\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Benefit Card\", \n",
" \"@count\": \"10101\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Insurance Card\", \n",
" \"@count\": \"13425\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Death Certificate\", \n",
" \"@count\": \"21\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"Entertainment (Music/Movies/Games)\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Video Games System\", \n",
" \"@count\": \"411\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"All Other Entertainment\", \n",
" \"@count\": \"285\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"VHS Tape\", \n",
" \"@count\": \"116\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"DVD\", \n",
" \"@count\": \"972\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Blu-Ray\", \n",
" \"@count\": \"4\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Computer Games \", \n",
" \"@count\": \"355\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Compact Disc\", \n",
" \"@count\": \"884\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"Footwear\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Shoes\", \n",
" \"@count\": \"2204\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Boots\", \n",
" \"@count\": \"764\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Slippers\", \n",
" \"@count\": \"523\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Sandals\", \n",
" \"@count\": \"430\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Galoshes\", \n",
" \"@count\": \"22\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"All Other Footwear\", \n",
" \"@count\": \"134\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Sneakers\", \n",
" \"@count\": \"2200\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"Currency\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Gift Card \", \n",
" \"@count\": \"1900\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Payroll check\", \n",
" \"@count\": \"827\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Money Order\", \n",
" \"@count\": \"164\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Personal Check\", \n",
" \"@count\": \"1312\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Travelers Checks \", \n",
" \"@count\": \"21\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Other Currency\", \n",
" \"@count\": \"145\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Cash\", \n",
" \"@count\": \"33063\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Foreign Currency\", \n",
" \"@count\": \"1938\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"Musical Instrument\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Flute\", \n",
" \"@count\": \"88\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Musical instrument accessories\", \n",
" \"@count\": \"53\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Clarinet\", \n",
" \"@count\": \"76\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"All Other Instruments\", \n",
" \"@count\": \"113\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Guitar\", \n",
" \"@count\": \"44\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Violin\", \n",
" \"@count\": \"85\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Harmonica\", \n",
" \"@count\": \"8\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Saxophone\", \n",
" \"@count\": \"32\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Trumpet\", \n",
" \"@count\": \"71\"\n",
" }\n",
" ]\n",
" }, \n",
" {\n",
" \"@Category\": \"Jewelry\", \n",
" \"SubCategory\": [\n",
" {\n",
" \"@SubCategory\": \"Fashion (Costume)\", \n",
" \"@count\": \"2518\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Cuff Links\", \n",
" \"@count\": \"5\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Necklace\", \n",
" \"@count\": \"416\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Watch\", \n",
" \"@count\": \"1633\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Ring\", \n",
" \"@count\": \"1062\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Bracelet\", \n",
" \"@count\": \"713\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Earrings\", \n",
" \"@count\": \"382\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"All other Jewelry\", \n",
" \"@count\": \"161\"\n",
" }, \n",
" {\n",
" \"@SubCategory\": \"Tie Clips\", \n",
" \"@count\": \"2\"\n",
" }\n",
" ]\n",
" }\n",
" ]\n",
" }\n",
"}\n"
]
}
],
"source": [
"print json.dumps(doc, indent=4)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `Category` key points to a list of `Category` elements, each of which itself has a list of `SubCategory` elements. In order to get the subcategory labels and the corresponding counts, we need to loop over all of the `Category`s and, for each `Category`, loop over all of the `SubCategory`s. It'd look something like this:"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Wall and Window Covering: 79\n",
"Ornaments: 49\n",
"Appliances: 97\n",
"Linen: 971\n",
"Floor Covering: 30\n",
"All Other Furnishings: 556\n",
"Dishware: 609\n",
"Bats: 13\n",
"Scooter: 64\n",
"Golf Club: 5\n",
"Ball: 228\n",
"Skateboard: 44\n",
"Binocular: 20\n",
"Electric Motor Scooter: 1\n",
"Protective Gear: 327\n",
"Hockey Stick: 6\n",
"Sports Racket: 116\n",
"Bicycle: 170\n",
"Skis: 3\n",
"All Other sports equipment: 465\n",
"Digital Camera: 857\n",
"USB Drive: 537\n",
"Portable radio: 292\n",
"Computer : 972\n",
"GPS Navigation System: 41\n",
"CD Player: 144\n",
"Electronic accessories: 707\n",
"All other Electronics: 1745\n",
"Digital media/MP3/4 Player: 2009\n",
"Record Player: 32\n",
"Photo/Video Equipment: 326\n",
"Television set: 7\n",
"Body Activity Tracker: 29\n",
"Earphones Speakers: 627\n",
"Computer External Hard Drive: 38\n",
"Camera Accessories: 261\n",
"Clock: 19\n",
"Walkman: 81\n",
"DVD Player: 122\n",
"Camera: 269\n",
"Calculator: 449\n",
"Air conditioner: 2\n",
"Translator: 30\n",
"Computer Accessories: 377\n",
"PDA: 18\n",
"Electronic Tablet: 781\n",
"Electronic Reading Device: 631\n",
"Suitcase: 472\n",
"Handbag: 3362\n",
"Shoulder bag: 3335\n",
"Briefcase: 609\n",
"Shopping bag: 15146\n",
"Tote bag: 5944\n",
"Garment/suit bag: 148\n",
"Duffle bag: 1154\n",
"Backpack: 12538\n",
"All other baggage: 5808\n",
"Messenger bag: 232\n",
"Eyeglass case without glasses: 355\n",
"Eye glasses: 7648\n",
"Other: 232\n",
"Sunglasses: 2670\n",
"Insulin Pump: 52\n",
"Medication: 4657\n",
"X - Rays: 101\n",
"Other: 378\n",
"Walker: 10\n",
"Wheel Chair: 13\n",
"Blood Pressure Meter: 49\n",
"All Other Medical Equipment / Supplies: 766\n",
"Electric wheelchair/scooter: 2\n",
"Cane: 468\n",
"Diabetic Testing Equipment: 95\n",
"Glucose Meter: 273\n",
"Tools: 933\n",
"Tool Box: 45\n",
"Tool Belt: 14\n",
"Car: 1833\n",
"House: 9264\n",
"Electronic key card: 393\n",
"All Other keys: 3552\n",
"Lunch box: 1165\n",
"Blue print: 22\n",
"Photo/Pictures: 929\n",
"Baby stroller/Carrier: 118\n",
"Portfolio: 712\n",
"All Other Misc Items : 25132\n",
"Art and Crafts: 524\n",
"Mail: 2353\n",
"Checkbook: 854\n",
"Tefillin/Phylacteries (Jewish): 14\n",
"Folder: 4257\n",
"Envelope: 2437\n",
"Stationary Supplies: 3701\n",
"Walkie Talkie: 122\n",
"Answering Machine: 9\n",
"Other: 53\n",
"Cell phone : 38373\n",
"Pager: 70\n",
"Telephone Accessories: 3412\n",
"Baby Toys: 118\n",
"All Other Toys: 648\n",
"Board Game: 73\n",
"Doll: 176\n",
"Cars/Trucks: 190\n",
"Stuffed Animals: 270\n",
"Transit Check: 174\n",
"All Other Tickets: 440\n",
"NJ Transit: 331\n",
"Metro North Tickets: 189\n",
"Lottery Tickets: 584\n",
"Entertainment Tickets: 105\n",
"MTA Metrocard: 38873\n",
"LIRR Tickets: 299\n",
"Airline: 89\n",
"Photo album: 31\n",
"Paperback: 5699\n",
"Address book: 675\n",
"Soft cover: 2970\n",
"Other: 658\n",
"Diary: 496\n",
"Hard Cover: 3093\n",
"Vest: 10\n",
"Badge: 40\n",
"Radio: 8\n",
"Pass or Badge holder: 12\n",
"All other NYCT Equiment: 38\n",
"Epic pass: 308\n",
"Pouch: 2565\n",
"Wristlet: 1620\n",
"Purse: 10166\n",
"ID Holder: 7045\n",
"Business Card holder: 484\n",
"Billfold: 189\n",
"Handbag: 529\n",
"Lanyard: 53\n",
"All Other Wallet / Purse: 1186\n",
"Fanny pack: 181\n",
"Wallet: 35950\n",
"Scarf: 1975\n",
"All Other Accessories: 1202\n",
"Hat: 3732\n",
"Gloves: 3378\n",
"Make-up: 1761\n",
"Ties: 262\n",
"Belt: 254\n",
"Umbrella: 2823\n",
"Jacket: 3191\n",
"Coat : 548\n",
"Nightwear: 185\n",
"Skirt: 492\n",
"Blouse: 842\n",
"Sweater: 2394\n",
"Pants/trousers/shorts: 5762\n",
"Suit: 75\n",
"Shirt: 3951\n",
"Under garment: 2419\n",
"All Other Clothing: 3535\n",
"T-shirt: 3123\n",
"Dress: 630\n",
"Leather Jacket: 86\n",
"Debit Card: 31871\n",
"Badge: 167\n",
"Passport: 2416\n",
"Social Security Card: 6111\n",
"Birth Certificate: 1030\n",
"School ID: 14173\n",
"Credit Card: 17074\n",
"AAA Card: 656\n",
"Resident Card: 855\n",
"Employment ID: 6594\n",
"Identification Card: 11787\n",
"Library card: 5877\n",
"All Other Personal Identification: 9690\n",
"Membership card: 8999\n",
"Military ID: 248\n",
"Drivers License: 15259\n",
"Registration Card: 386\n",
"Benefit Card: 10101\n",
"Insurance Card: 13425\n",
"Death Certificate: 21\n",
"Video Games System: 411\n",
"All Other Entertainment: 285\n",
"VHS Tape: 116\n",
"DVD: 972\n",
"Blu-Ray: 4\n",
"Computer Games : 355\n",
"Compact Disc: 884\n",
"Shoes: 2204\n",
"Boots: 764\n",
"Slippers: 523\n",
"Sandals: 430\n",
"Galoshes: 22\n",
"All Other Footwear: 134\n",
"Sneakers: 2200\n",
"Gift Card : 1900\n",
"Payroll check: 827\n",
"Money Order: 164\n",
"Personal Check: 1312\n",
"Travelers Checks : 21\n",
"Other Currency: 145\n",
"Cash: 33063\n",
"Foreign Currency: 1938\n",
"Flute: 88\n",
"Musical instrument accessories: 53\n",
"Clarinet: 76\n",
"All Other Instruments: 113\n",
"Guitar: 44\n",
"Violin: 85\n",
"Harmonica: 8\n",
"Saxophone: 32\n",
"Trumpet: 71\n",
"Fashion (Costume): 2518\n",
"Cuff Links: 5\n",
"Necklace: 416\n",
"Watch: 1633\n",
"Ring: 1062\n",
"Bracelet: 713\n",
"Earrings: 382\n",
"All other Jewelry: 161\n",
"Tie Clips: 2\n"
]
}
],
"source": [
"for category in doc['LostProperty']['Category']:\n",
"    for subcat in category['SubCategory']:\n",
"        # attribute values are strings, so plain concatenation works\n",
"        print(subcat['@SubCategory'] + \": \" + subcat['@count'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You could also collect these values into a list to work with later (e.g., to sort them alphabetically or select one at random)."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
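The notebook's closing suggestion, collecting the subcategory values into a list, might look something like the sketch below. To keep it runnable without a network call, `doc` here is a small hand-written stand-in with the same shape that xmltodict produces (nested dicts, attribute keys prefixed with `@`), using a few values copied from the feed above. The names `items`, `alphabetical`, `by_count`, and `lucky` are illustrative only and do not appear in the original notebook.

```python
import random

# Hand-written stand-in for the parsed document (an assumption for this sketch):
# same nesting and '@'-prefixed attribute keys that xmltodict produces.
doc = {
    'LostProperty': {
        'Category': [
            {'@Category': 'Home Furnishings',
             'SubCategory': [
                 {'@SubCategory': 'Ornaments', '@count': '49'},
                 {'@SubCategory': 'Linen', '@count': '971'},
             ]},
            {'@Category': 'Sports Equipment',
             'SubCategory': [
                 {'@SubCategory': 'Bats', '@count': '13'},
                 {'@SubCategory': 'Skis', '@count': '3'},
             ]},
        ]
    }
}

# Collect (name, count) pairs. Counts arrive as strings, so convert them to
# int; strip() removes the stray trailing spaces some names have in the feed.
items = []
for category in doc['LostProperty']['Category']:
    for subcat in category['SubCategory']:
        items.append((subcat['@SubCategory'].strip(), int(subcat['@count'])))

# Reorder alphabetically by name.
alphabetical = sorted(items)

# Reorder by count, largest first.
by_count = sorted(items, key=lambda pair: pair[1], reverse=True)

# Select one entry at random.
lucky = random.choice(items)

print(alphabetical)
print(by_count[0])
print(lucky)
```

One caveat if you adapt this to the live feed: xmltodict returns a single dict rather than a list when an element has only one child of a given name, so a category with exactly one `SubCategory` would need special handling (for instance, the `force_list` argument to `xmltodict.parse`).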