Created April 26, 2010 16:24
kleinm-apim24:Desktop KleinM$ irb
>> require 'open-uri'
=> true
>> raw = open('https://catalog.library.jhu.edu/mods/?format=marc&bib=1144347') { |io| io.read }
=> "\nLDR:\302\24001341nam 2200301 a 450 \n005: 19971120234400.0\n008: 890316s1988 caua b 101 0 eng \n010: \302\240\302\240$a 86025055\n020: \302\240\302\240$a0520055756 (alk. paper)\n035: \302\240\302\240$a1144347\n035: \302\240\302\240$aADR9480EI\n035: \302\240\302\240$a(CStRLIN)MDJGADR9480-B\n040: \302\240\302\240$dMdBJ\n043: \302\240\302\240$aaz-----\n049: \302\24000\302\240$aJHE\n050: \302\24000\302\240$aBP63.A4$bS646 1988\n082: \302\24000\302\240$a297/.14/0954$219\n245: \302\24000\302\240$aShari\314\204\312\273at and ambiguity in South Asian Islam /$cedited by Katherine P. Ewing.\n260: \302\24000\302\240$aBerkeley :$bUniversity of California Press,$cc1988.\n300: \302\24000\302\240$axiii, 321 p. :$bill. ;$c24 cm.\n500: \302\24000\302\240$aPapers from the conference entitled \"South Asian Islam: moral principles in tension\" held at the Pendle Hill Conference Center in Pennsylvania, May 22-24, 1981.\n500: \302\24000\302\240$a\"Sponsored by the Joint Committee on South Asia of the Social Science Research Council and the American Council of Learned Societies\"--P. facing t.p.\n504: \302\24000\302\240$aIncludes bibliographies and index.\n650: \302\240 0\302\240$aIslam$zSouth Asia$vCongresses.\n650: \302\240 0\302\240$aIslamic law$zSouth Asia$vCongresses.\n700: \302\240 0\302\240$aEwing, Katherine Pratt\n710: \302\240 0\302\240$aJoint Committee on South Asia\n910: \302\240 0\302\240$a1144347$bHorizon bib#\n999: \302\240 0\302\240$lmain$nelc$s1$wi$aBP63.A4$bS783 1988$mc. 1$61$tnorm$c4$103/16/1989$211/02/1995$711/02/1995$i31151005777655$z96f@$zNCC:L$xemsel$henorm$gemain\n\n"
# Copy the title out of the 245 field ($a) above and paste it in
>> title = "Shari\314\204\312\273at and ambiguity in South Asian Islam"
=> "Shari\314\204\312\273at and ambiguity in South Asian Islam"
>> print title
Sharīʻat and ambiguity in South Asian Islam=> nil
# Convert title to an array of hex code points (not bytes): unpack('U*') decodes the UTF-8 into integers, then Integer#to_s(16) formats each one
>> bytes = title.unpack('U*').collect { |b| b.to_s(16) }
=> ["53", "68", "61", "72", "69", "304", "2bb", "61", "74", "20", "61", "6e", "64", "20", "61", "6d", "62", "69", "67", "75", "69", "74", "79", "20", "69", "6e", "20", "53", "6f", "75", "74", "68", "20", "41", "73", "69", "61", "6e", "20", "49", "73", "6c", "61", "6d"]
# Unicode (module) to the rescue!
>> require 'unicode'
=> true
# Convert title to its precomposed form (NFKC): "i" (69) + combining macron (304) becomes a single "ī" (12b)
>> title = Unicode.normalize_KC(title)
=> "Shar\304\253\312\273at and ambiguity in South Asian Islam"
>> bytes = title.unpack('U*').collect { |b| b.to_s(16) }
=> ["53", "68", "61", "72", "12b", "2bb", "61", "74", "20", "61", "6e", "64", "20", "61", "6d", "62", "69", "67", "75", "69", "74", "79", "20", "69", "6e", "20", "53", "6f", "75", "74", "68", "20", "41", "73", "69", "61", "6e", "20", "49", "73", "6c", "61", "6d"]
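The session above relies on the third-party `unicode` gem. As a rough sketch of the same composition step on a newer interpreter, assuming Ruby 2.2 or later (where `String#unicode_normalize` ships with the standard library), something like the following should give the same code points. The variable names are illustrative only, and NFC is used here in place of the NFKC call in the transcript; for this particular title the two forms agree.

```ruby
# Assumption: Ruby 2.2+, so String#unicode_normalize is available without a gem.
# Decomposed title: "i" followed by U+0304 COMBINING MACRON, then U+02BB.
title = "Shari\u0304\u02BBat and ambiguity in South Asian Islam"

# Code points of the decomposed string, as hex strings.
decomposed = title.unpack('U*').map { |cp| cp.to_s(16) }

# NFC composes "i" + U+0304 into the single code point U+012B.
composed_title = title.unicode_normalize(:nfc)
composed = composed_title.unpack('U*').map { |cp| cp.to_s(16) }

puts decomposed[4, 3].inspect  # the "i", macron, and U+02BB positions
puts composed[4, 2].inspect    # the precomposed "ī" and U+02BB positions
```

The composed array is one element shorter than the decomposed one, matching the difference between the two hex listings in the transcript.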