Last active
December 21, 2015 12:59
Quick code to extract NDC values and specific attributes (here, the SPLSIZE characteristic) from an FDA SPL XML document. Provided as-is.
# Input: FDA SPL XML file (can be found at http://dailymed.nlm.nih.gov/dailymed/)
# Output: Array of hashes containing NDCs and their sizes
# This code is provided AS-IS: THE INFORMATION CONTAINED HEREIN OR ANY SITE-RELATED SERVICE, IS PROVIDED "AS IS" WITH NO REPRESENTATIONS OR WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT. YOU ASSUME TOTAL RESPONSIBILITY AND RISK FOR YOUR USE OF THIS SITE, SITE-RELATED SERVICES, AND ANY HYPERLINKED WEBSITES.
require 'pp'
require 'rexml/document'

xml_file = "12e3a517-4858-42f3-ab2e-062f1a592737.xml" # or use ARGV[0] to take in a command-line argument
xml_doc = REXML::Document.new(File.read(xml_file))
data_to_return = []

# find every SPLSIZE characteristic in the document
size_elements = xml_doc.elements.to_a("//code[@code='SPLSIZE']")
size_elements.each do |size_element|
  ndc_data = {}
  characteristic = size_element.parent
  # characteristic -> subjectOf -> enclosing manufacturedProduct
  manufactured_product_parent = characteristic.parent.parent

  # look up/backwards for NDC information (codeSystem 2.16.840.1.113883.6.69 identifies NDC codes)
  manufactured_products = manufactured_product_parent.elements.to_a("manufacturedProduct")
  manufactured_products.each do |manufactured_product|
    manufactured_product.elements.to_a("code[@codeSystem='2.16.840.1.113883.6.69']").each do |ndc|
      ndc_data[:ndc] = ndc.attributes["code"]
    end
  end

  # store the size (value and unit) for this NDC
  size_values = characteristic.elements.to_a("value")
  size_values.each do |size_value|
    ndc_data[:unit] = size_value.attributes["unit"]
    ndc_data[:size_value] = size_value.attributes["value"]
  end

  data_to_return << ndc_data
end

puts data_to_return.inspect
# [
#   {:unit=>"mm", :ndc=>"61748-015", :size_value=>"18"},
#   {:unit=>"mm", :ndc=>"61748-018", :size_value=>"20"}
# ]
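
To try the traversal without downloading a label from DailyMed, here is a minimal sketch that wraps the same logic in a method and runs it on an inline fragment. The fragment (`SAMPLE_SPL`) and the method name `extract_ndc_sizes` are made up for illustration. The fragment keeps only the elements the script walks and omits the namespace declaration and everything else a real SPL file carries, so treat it as an assumption about the shape of the data rather than a valid SPL document. Unlike the script above, this version sets a key to `nil` when an element is missing instead of leaving the key out.

```ruby
require 'rexml/document'

# Hypothetical, heavily trimmed SPL-like fragment (not a complete or valid SPL file).
SAMPLE_SPL = <<~XML
  <document>
    <subject>
      <manufacturedProduct>
        <manufacturedProduct>
          <code code="61748-015" codeSystem="2.16.840.1.113883.6.69"/>
        </manufacturedProduct>
        <subjectOf>
          <characteristic>
            <code code="SPLSIZE"/>
            <value unit="mm" value="18"/>
          </characteristic>
        </subjectOf>
      </manufacturedProduct>
    </subject>
    <subject>
      <manufacturedProduct>
        <manufacturedProduct>
          <code code="61748-018" codeSystem="2.16.840.1.113883.6.69"/>
        </manufacturedProduct>
        <subjectOf>
          <characteristic>
            <code code="SPLSIZE"/>
            <value unit="mm" value="20"/>
          </characteristic>
        </subjectOf>
      </manufacturedProduct>
    </subject>
  </document>
XML

# Code system OID used by the original script to pick out NDC codes.
NDC_CODE_SYSTEM = '2.16.840.1.113883.6.69'

# Returns one hash per SPLSIZE characteristic found in the XML string.
def extract_ndc_sizes(xml_string)
  doc = REXML::Document.new(xml_string)
  doc.elements.to_a("//code[@code='SPLSIZE']").map do |size_code|
    characteristic = size_code.parent
    # characteristic -> subjectOf -> enclosing manufacturedProduct
    outer_product = characteristic.parent.parent

    # Like the original loops, keep the last match if there are several.
    ndc = nil
    outer_product.elements.to_a('manufacturedProduct').each do |inner|
      inner.elements.to_a("code[@codeSystem='#{NDC_CODE_SYSTEM}']").each { |c| ndc = c }
    end
    value = characteristic.elements.to_a('value').last

    {
      ndc: ndc && ndc.attributes['code'],
      unit: value && value.attributes['unit'],
      size_value: value && value.attributes['value']
    }
  end
end

results = extract_ndc_sizes(SAMPLE_SPL)
results.each { |r| puts "#{r[:ndc]}: #{r[:size_value]} #{r[:unit]}" }
# 61748-015: 18 mm
# 61748-018: 20 mm
```

To run it against a real file, pass the file contents instead, for example `extract_ndc_sizes(File.read(ARGV[0]))`.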
Revision #1 -> #2 changes: fixing indentation formatting