@sween
Created July 2, 2009 19:40
Jabber IM with circuitshaman <[email protected]> 7/2/09 2:38 PM
Ron Sweeney
ping
2:54 PM
circuitshaman
pong
3:02 PM
Ron Sweeney
are you familiar with document based db's ?
circuitshaman
familiar, yes
circuitshaman
used, no
circuitshaman
couchdb is the thing though
circuitshaman
fedora doesn't scale worth a darn
circuitshaman
it's poo
Ron Sweeney
wrong, but lets have another discussion about that...
circuitshaman
haha
Ron Sweeney
get off the couch
Ron Sweeney
anyway...
circuitshaman
no good?
Ron Sweeney
triple stores...
Ron Sweeney
mongodb > couch
circuitshaman
yes, I know those
circuitshaman
sweet
Ron Sweeney
if triple stores are just a storage mechanism for triples (pairs I assume)...
circuitshaman
usually fours in fact
circuitshaman
but at least triples
Ron Sweeney
ok, so these are like hash of hash of hashes ?
circuitshaman
well in the abstract sense, which is to say what a programmer cares about, they really are triples
circuitshaman
but the storage mechanism/method changes from triple store to triple store
circuitshaman
subject predicate object
circuitshaman
myOnt:Banana rdfs:subClassOf owl:Thing
circuitshaman
that's a triple
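[Editor's sketch, not part of the original chat: one way to picture the subject/predicate/object shape in Python. The `Triple` class and the `store` set are illustrative names only; as the chat notes, real triple stores each use their own storage method.]

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A single statement: subject, predicate, object (hypothetical helper)."""
    subject: str
    predicate: str
    obj: str


# The example triple from the chat, kept as plain prefixed names.
banana = Triple("myOnt:Banana", "rdfs:subClassOf", "owl:Thing")

# A toy "store" is then just a collection of such statements.
store = {banana}

print(banana.subject, banana.predicate, banana.obj)
```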
circuitshaman
but I guess I'm confused about your actual question
Ron Sweeney
not really, red light blinking on my forehead
Ron Sweeney
if I confused you, its because I dont know what im talking about...
Ron Sweeney
my underlying interest was to see if a document based db would be an alternative to a triple store...
circuitshaman
nope
circuitshaman
very very different needs
circuitshaman
I need to do some serious writing on the subject because there is jack squat out there to explain what this stuff really is and what it's good for
Ron Sweeney
right... im having a hard time mapping the niche it provides
Ron Sweeney
s/it/they/g
circuitshaman
it's kind of a chicken and egg situation. The problems that it's good for solving are usually ones that people don't think are solvable or at least not pressing
circuitshaman
here's an example at a high level
circuitshaman
When I say TB, what do you think?
Ron Sweeney
tuberculosis, terabyte
circuitshaman
right, now how did you get there?
circuitshaman
and are those two things related?
Ron Sweeney
not at all
circuitshaman
but in your head there is a mapping that happens on the acronym TB for both
Ron Sweeney
yup
circuitshaman
what's even more important is that while you know TB is an acronym for both, tuberculosis and terabytes are very different things and totally unrelated
circuitshaman
you have an understanding of those words well beyond their spelling
Ron Sweeney
right, and one does not "overpower" the other, they are equal.
Ron Sweeney
meaning, one is not a "better" representation of TB
circuitshaman
sure, they're just related. But you know that relation isn't important in this case
circuitshaman
but if I asked, what do tuberculosis and terabyte have in common you'd have to conclude TB
Ron Sweeney
right, like that show on NPR
Ron Sweeney
dont let my banter hinder your lesson
circuitshaman
so with the idea that there is a far greater depth to a relationship than a hash map think of a relational database of the taxonomy of life
circuitshaman
and with that you also have diseases that are related to species
circuitshaman
and those species are susceptible to those diseases after a certain age.
circuitshaman
I'm not 
Ron Sweeney
ok, now stop.
Ron Sweeney
take species...
Ron Sweeney
its a "model" so to speak with properties too, like => 'age'
circuitshaman
well here's the point I'm making, try to come up with for a second the models that would make this work in a relational database
Ron Sweeney
ok, im tracking you...
circuitshaman
the whole taxonomy of species (kingdom, phyla, etc.), diseases that may have their own taxonomy, relationships between the two
circuitshaman
it gets insanely complicated very fast
Ron Sweeney
yeah, im there too.
circuitshaman
and the queries are ever more disgusting even if you manage to come up with a schema
Ron Sweeney
the queries themselves even seem hard to "validate" or "test"
circuitshaman
right
Ron Sweeney
but, that is insanely powerful
Ron Sweeney
because, you know the answer to a question that hasn't been asked before really.
circuitshaman
exactly!
Ron Sweeney
i think this chat window is going to end up as a gist
circuitshaman
can you imagine a researcher trying to figure out which animal to test his new drug on for a particular disease
circuitshaman
haha
Ron Sweeney
you'd first need to know which animal is susceptible
circuitshaman
right, and at what age they are susceptible
circuitshaman
and maybe you want it to be a certain size
Ron Sweeney
weight
Ron Sweeney
right.
circuitshaman
right, the relational query is mammoth
circuitshaman
but it would look something like this in sparql:
Ron Sweeney
so populating this beast is just as easy ?
circuitshaman
SELECT ?animal WHERE { ?animal owl:Is_A Animal . ?animal susceptible_to MyDisease ...} and so on
circuitshaman
it's pattern matching
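[Editor's sketch, not part of the original chat: a toy version of the pattern matching described here, over an in-memory set of tuples. It is not how a real SPARQL engine works. `owl:Is_A` and `susceptible_to` are the chat's shorthand, and the sample rows, `match`, and `query` are made up for illustration.]

```python
def match(pattern, triple, bindings):
    """Extend bindings so pattern fits triple; return None if it cannot."""
    new = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable: bind it, or check it agrees with an earlier binding.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            # A constant that does not match this triple.
            return None
    return new


def query(store, patterns):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in store:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Hypothetical sample data, only here to exercise the query.
store = {
    ("Tiger", "owl:Is_A", "Animal"),
    ("Mouse", "owl:Is_A", "Animal"),
    ("Mouse", "susceptible_to", "MyDisease"),
    ("Banana", "owl:Is_A", "Plant"),
}

results = query(store, [
    ("?animal", "owl:Is_A", "Animal"),
    ("?animal", "susceptible_to", "MyDisease"),
])
animals = sorted(b["?animal"] for b in results)
print(animals)
```

Each extra condition (age, weight, and so on) would be one more pattern in the list, rather than another join against a fixed schema.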
circuitshaman
adding to this beasty is just adding a series of declarations, or triples
circuitshaman
Tiger owl:Is_A Animal
Ron Sweeney
ohhhhhhh
circuitshaman
and whether Animal is described further doesn't matter
Ron Sweeney
tiger owl platypus bsddevil:Is_animal
Ron Sweeney
thats a triple ?
circuitshaman
the store knows that Animal exists if only to be related to Tiger by owl:Is_A
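[Editor's sketch, not part of the original chat: a toy illustration of the point that a term "exists" as soon as some triple mentions it. The `add` and `known_terms` helpers are hypothetical and do not correspond to any real triple store API.]

```python
store = set()


def add(subject, predicate, obj):
    """Declare one statement; no schema or prior definition is needed."""
    store.add((subject, predicate, obj))


def known_terms():
    """Every subject or object mentioned by at least one triple."""
    return {s for s, _, _ in store} | {o for _, _, o in store}


# The declaration from the chat. Animal is never described on its own,
# yet the store now "knows" it because Tiger is related to it.
add("Tiger", "owl:Is_A", "Animal")

print(sorted(known_terms()))
```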
circuitshaman
no, but I think you're going for:
circuitshaman
tiger owl:Is_A bsddevil
circuitshaman
owl is the web ontology language
Ron Sweeney
hahahaha
circuitshaman
it's a namespace of already declared things
Ron Sweeney
yeah, I actually knew that kind of
Ron Sweeney
just because this is going to be a gist...
Ron Sweeney
hahaa
circuitshaman
haha
circuitshaman
so obviously this is powerful stuff, no?
circuitshaman
it's using old computer science techniques (pattern matching and mathematical graphs) to attack the problem of complex relationships and semantic information
Ron Sweeney
absolutely... maybe this will help me get a grip a bit and use active_sesame to exercise what I've seen here
circuitshaman
well let me give you the new version
circuitshaman
it will make way more sense
circuitshaman
the original version is quite the hack and obfuscates way too much
Ron Sweeney
let me give you a use case..., you tell me if this could help
circuitshaman
sure
Ron Sweeney
ICD-10
circuitshaman
sounds good so far 
Ron Sweeney
the relationship of the code to the procedure to the related procedures to make up a single code for cascading procedures
circuitshaman
yes
circuitshaman
that would be an excellent and extensible use case
Ron Sweeney
dood, you're a rock star.
circuitshaman
hahaha
circuitshaman
I'd love to work on it with you
circuitshaman
having someone actively involved in solving a real problem would help me finish the library
Ron Sweeney
I have 2 weeks off this month...im hacking every single day of them...
circuitshaman
plus I'd love a test writer, it's never been my strong suit
circuitshaman
hahaha
circuitshaman
I'll throw up the latest version as a branch on github
circuitshaman
in the meantime you'll want to grab allegrograph
Ron Sweeney
yup, seems we've gotten to this point before, I will make it happen now with my recent enlightenment...
circuitshaman
haha
circuitshaman
you also have a clear problem
circuitshaman
and I've recently started writing it again
circuitshaman
it needs to be called open sesame though
Ron Sweeney
hahahah
Ron Sweeney
awesome, gotta run, have a good holiday weekend. thanks too!
circuitshaman
no problem, enjoy the fight!
circuitshaman
that's way cool
Ron Sweeney
yup, we'll take lotsa pics
circuitshaman
nice