@Asher-
Created October 3, 2015 02:15
3:58:21 AM Asher: maaku - you described what you want to build as "basically be datalog over a probabilistic logic database" - would you be building your own implementation based off that premise?
3:58:45 AM Asher: database implementation
3:59:26 AM Asher: i guess the question is the same as: is the language you're developing identical with the database that implements it, or are you thinking you will rely on existing database tools to adapt to your needs?
4:01:52 AM maaku: "is the language you're developing identical with the database that implements it" <-- yes
4:02:01 AM maaku: "will you rely on existing database tools" <-- yes
4:02:01 AM Asher: cool
4:02:07 AM Asher: what existing tools?
4:02:11 AM maaku: not sure why you think those are opposing
4:02:27 AM Asher: i'm trying to understand how much of a new database architecture you will be making
4:02:50 AM Asher: or have in mind, whether or not you make it
4:03:08 AM maaku: i'd rather not maintain my own db technology
4:03:24 AM maaku: but unfortunately the state of the art of probabilistic databases sucks :(
4:03:44 AM Asher: what about the state of the art of graph databases?
4:04:08 AM maaku: good for what they do, but not for what I'm doing
4:04:15 AM maaku: it really needs to be relational
4:04:26 AM maaku: and I don't know of any relational graph databases...
4:04:45 AM Asher: can you describe the requirement that is the limit case there?
4:05:37 AM Asher: in what sense does it need to be relational that would not be defined by the specific graph connections?
zadock [~outsider@cthulhu.tuiasi.ro] entered the room. (4:15:30 AM)
4:21:12 AM Asher: given that i'm building my own database, i'm curious what it would have to do to satisfy your needs
4:21:33 AM maaku: So I come from this assumption which you may not agree with: the relational model is the best model for storing and querying data that computer science has come up with.
4:21:45 AM maaku: I very much agree with Date and Darwen’s “third manifesto” (thethirdmanifesto.com) -- the relational model is the natural representation of structured data on a computer.
4:22:08 AM maaku: (I just wish that either one of them learned even a smidgen of programming language theory :\ )
4:22:12 AM Asher: if i exclude my own work then i agree with you
4:22:32 AM maaku: If you google ‘datalog’ you’ll find plenty of references referring to it as sql-in-prolog, or prolog-in-sql. Unfortunately that’s very much missing the point.
4:22:41 AM Asher: i think that the set theoretic approach does one better than the relational model
4:22:43 AM maaku: Datalog is a homoiconic relational database language. Datalog is to tables what Lisp is to lists.
4:23:05 AM Asher: ok so when you say relational model you mean specifically something like datalog and not something like SQL?
4:23:09 AM maaku: Asher: the real relational model (not SQL standards) is set theoretic
4:23:13 AM maaku: right
4:23:35 AM Asher: i'll have to look more into datalog
4:23:37 AM maaku: I mean Codd's relational model, which was set theoretic data storage
4:23:50 AM Asher: have a link?
4:23:54 AM maaku: In datalog, code is represented as tables, and tables represent code.
4:23:58 AM Asher: or is that what you linked
4:24:08 AM maaku: Let me find the best reference I have (it's an AI paper!)
4:24:43 AM Asher: i'm confused about the third manifesto
4:24:49 AM Asher: is the manifesto available on the website?
4:25:21 AM maaku: Asher: this is the paper that turned me on to datalog: http://www.cs.jhu.edu/~nwf/datalog20-paper.pdf
4:25:43 AM Asher: ah here is the manifesto http://www.dcs.warwick.ac.uk/~hugh/TTM/DTATRM.pdf
4:27:59 AM maaku: Asher: not sure I'd recommend the manifesto as a technical plan. As I mentioned it's missing a half century of programming language and type theory :\
4:28:28 AM maaku: but specifically relating to the data model, I believe it is right
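
maaku's remark that in Datalog "code is represented as tables" can be made concrete with a toy evaluator. This is only a sketch, not any real Datalog engine: the names `is_var`, `unify`, `solve` and `evaluate`, the capitalized-variable convention, and the `parent`/`ancestor` example are all assumptions made for illustration. Facts and rules are both plain tuples, so a rule could be stored in a table like any other row.

```python
# Toy Datalog: facts and rules are both plain (hashable) tuples.
# All names and conventions here are illustrative assumptions.

def is_var(term):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()

def unify(pattern, fact, env):
    """Match one body atom against one fact; return extended bindings or None."""
    if len(pattern) != len(fact):
        return None
    env = dict(env)  # copy so the caller's bindings are not mutated
    for p, f in zip(pattern, fact):
        if is_var(p):
            if env.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return env

def solve(body, facts, env):
    """Yield every set of bindings that satisfies all atoms in the body."""
    if not body:
        yield env
        return
    for fact in facts:
        extended = unify(body[0], fact, env)
        if extended is not None:
            yield from solve(body[1:], facts, extended)

def evaluate(facts, rules):
    """Naive bottom-up evaluation: apply rules until no new facts appear."""
    facts = set(facts)
    while True:
        derived = set()
        for head, body in rules:
            for env in solve(body, facts, {}):
                derived.add(tuple(env.get(t, t) for t in head))
        if derived <= facts:
            return facts
        facts |= derived

FACTS = {("parent", "alice", "bob"), ("parent", "bob", "carol")}
RULES = [
    (("ancestor", "X", "Y"), (("parent", "X", "Y"),)),
    (("ancestor", "X", "Z"), (("parent", "X", "Y"), ("ancestor", "Y", "Z"))),
]
# Because rules are tuples too, they fit in a set just like facts do.
RULE_TABLE = {("rule", head, body) for head, body in RULES}
```

Calling `evaluate(FACTS, RULES)` derives the `ancestor` rows, including `("ancestor", "alice", "carol")`. A real engine would use semi-naive evaluation and indexes rather than this brute-force loop.
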
4:29:41 AM maaku: So my ideal language is: Datalog with an Agda/Idris-like dependent type system (or if that's too scary, Haskell-like will do)
4:29:51 AM maaku: Where data is stored in relational tables a la third manifesto
4:30:26 AM maaku: And because datalog, code is data and data is code, so it's as easy as lisp to self-modify.
4:30:52 AM maaku: And probabilistic semantics are added.
4:31:25 AM maaku: the probabilistic bit is still an open area of research ... it mucks up some of the relational db assumptions
4:32:20 AM maaku: Here's a good overview of prob. dbms: http://www.cs.stanford.edu/people/chrismre/papers/cacm-paper-full.pdf
4:32:45 AM Asher: thanks
4:34:16 AM maaku: A prob. database has a concept of 'lineage' -- a constructed view/query has to keep track of how it was calculated, as the way in which you construct the query matters
4:34:44 AM maaku: Querying different ways gets different results, in contrast to an exact, binary truth database.
4:35:00 AM maaku: This scares relational people.
4:36:00 AM Asher: for sure
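
One way to read the point about lineage, assuming a tuple-independent probabilistic database (an assumption, since the chat does not name a model): if probabilities are combined step by step along a query plan, two plans for the same query can disagree, while evaluating the lineage formula over possible worlds gives a single answer. The sketch below uses hypothetical base tuples `r`, `s1`, `s2` and a query whose lineage is `r AND (s1 OR s2)`; the probabilities are made up.

```python
from itertools import product

# Hypothetical base tuples with made-up marginal probabilities.
PROB = {"r": 0.5, "s1": 0.5, "s2": 0.5}

def lineage(world):
    # Lineage of the query answer: r AND (s1 OR s2).
    return world["r"] and (world["s1"] or world["s2"])

def exact_probability(formula, prob):
    """Sum the weights of all possible worlds in which the lineage holds."""
    names = sorted(prob)
    total = 0.0
    for bits in product([False, True], repeat=len(names)):
        world = dict(zip(names, bits))
        weight = 1.0
        for name in names:
            weight *= prob[name] if world[name] else 1.0 - prob[name]
        if formula(world):
            total += weight
    return total

def independent_or(*ps):
    """Probability that at least one of several independent events occurs."""
    none = 1.0
    for p in ps:
        none *= 1.0 - p
    return 1.0 - none

def plan_join_then_project(prob):
    # Joins r with each s row first, then merges the two joined rows as if
    # they were independent. They are not: both depend on r.
    return independent_or(prob["r"] * prob["s1"], prob["r"] * prob["s2"])

def plan_project_then_join(prob):
    # Merges s1 and s2 first (they really are independent), then joins with r.
    return prob["r"] * independent_or(prob["s1"], prob["s2"])
```

With these numbers `plan_join_then_project(PROB)` gives 0.4375, while `plan_project_then_join(PROB)` and `exact_probability(lineage, PROB)` both give 0.375. Enumerating worlds is exponential, so this only illustrates why the lineage has to be kept around, not how to evaluate it efficiently.
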
4:36:37 AM maaku: But my driving insight here is that 'lineage' here means causal model, and these lineages that have to be annoyingly tracked are actually the real structure of the PGM being computed
4:36:50 AM Asher: right
4:37:27 AM Asher: i would suggest that the primary task in architectural design is figuring out how to best separate the lineage from the storage without losing functionality by separating them
4:37:50 AM Asher: the primary barrier i see in many contexts is the additional structural data (like lineage) makes it very difficult to dynamically organize or query the primary data
4:38:12 AM Asher: which isn't to argue against that additional data (like lineage), only to say that encapsulation i think is the primary task/difficulty
4:39:55 AM maaku: Indeed, I think that's the only part here that is frontier research
4:40:24 AM Asher: i think you're right about the importance of lineage tho
4:40:43 AM Asher: i think actually this is the point where all the suggestions about the importance of compression for AI come in
4:40:46 AM maaku: I'm not sure what the API would look like to interface with these 'lineages' (PGMs), but that is sort of the key part of the architecture
4:41:10 AM Asher: you don't actually want (most of the time) the exact lineage... you actually want an accurate abstraction of the lineage
4:41:25 AM Asher: so what you actually want is a compressed image of the lineage
4:41:29 AM Asher: which is consolidated in some complex way
4:41:41 AM Asher: that also relates it to other lineages
4:41:58 AM Asher: the points of compression become points of association
4:42:30 AM maaku: Compression meaning abstraction? E.g. model generation?
4:42:50 AM Asher: that might be one approach, but not the only possibility
4:43:05 AM Asher: it could be much simpler, for example in the brain when patterns overlap
4:43:20 AM Asher: i would suggest the brain tries to organize the overlap so that when there is overlap, that is a desired association
4:43:33 AM Asher: much of which is determined, as you pointed out, by lineage
4:43:40 AM maaku: right, ok
4:43:48 AM Asher: so if there is overlap you only have to write it down "once"
4:44:09 AM Asher: so the brain effectively compresses the information by re-using a portion of the activation pattern
4:44:29 AM maaku: that's part of my motivation for the strong, expressive type system glossed over above
4:44:54 AM maaku: i'm imagining a component that is able to compare or index these based on their expressed type or properties
4:45:01 AM maaku: but there are probably other approaches too
4:45:31 AM maaku: (the other reason for typing is evolutionary cross over, but that's somewhat unrelated to this discussion)
4:45:47 AM Asher: right
4:46:00 AM Asher: i wouldn't describe my approach as a type system, but i can easily see how it maps directly on to what you're talking about
4:46:08 AM Asher: the difference is that my approach is a way of organizing sets in dual relations
4:46:15 AM Asher: intrinsic/extrinsic definition
4:46:32 AM Asher: and the evolutionary cross over is the "debate" over whether a defining member is intrinsic or extrinsic
4:47:04 AM Asher: the dual relation (a "term") permits a definition of other "terms" by inclusion
4:47:13 AM Asher: so a term is defined by its intrinsic elements
4:47:23 AM Asher: but when a term is used to define another term, that other term becomes one of the term's extrinsic elements
4:47:45 AM Asher: it's all based on Tarski's theory of reference
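
The intrinsic/extrinsic description above can be read as a graph with links in both directions. The sketch below is a guess at that structure, not Asher's actual design: the `Term` class, the `define` helper, and the `bird`/`wing`/`feather` example are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # identity-based equality keeps Term hashable
class Term:
    name: str
    # Terms that define this term (its intrinsic elements).
    intrinsic: set = field(default_factory=set, repr=False)
    # Terms that this term helps define (its extrinsic elements).
    extrinsic: set = field(default_factory=set, repr=False)

def define(name, *members):
    """Create a term from its intrinsic members and record the back-links."""
    term = Term(name)
    for member in members:
        term.intrinsic.add(member)
        # Using a term in a definition makes the new term one of the
        # used term's extrinsic elements.
        member.extrinsic.add(term)
    return term

wing = define("wing")
feather = define("feather")
bird = define("bird", wing, feather)
```

Here `bird.intrinsic` is `{wing, feather}` and `bird` appears in both `wing.extrinsic` and `feather.extrinsic`, so following the links gives a graph of terms.
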
4:48:02 AM maaku: Asher: btw above you asked for link to set-theoretic relational model, this is the one that started it all: https://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf
4:48:10 AM Asher: thanks!
4:49:24 AM maaku: Codd's 1970 paper that introduced the term 'relational model'. Unfortunately 'relational' became a buzzword after that, and some hybrid, impure database systems got standardized as SQL
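
The set-theoretic reading of the relational model mentioned here can be sketched by treating a relation as a heading plus a set of tuples. The helper names `project` and `natural_join` and the employee/department data are illustrative assumptions, not taken from Codd's paper.

```python
# A relation is modelled as (heading, rows): a tuple of attribute names
# and a set of equally long tuples. The example data is made up.

def project(heading, rows, attrs):
    """Keep only the named attributes; duplicate rows collapse in the set."""
    idx = [heading.index(a) for a in attrs]
    return tuple(attrs), {tuple(row[i] for i in idx) for row in rows}

def natural_join(h1, r1, h2, r2):
    """Combine rows that agree on every attribute the two headings share."""
    shared = [a for a in h1 if a in h2]
    extra = [a for a in h2 if a not in h1]
    heading = tuple(h1) + tuple(extra)
    rows = set()
    for a in r1:
        for b in r2:
            if all(a[h1.index(s)] == b[h2.index(s)] for s in shared):
                rows.add(tuple(a) + tuple(b[h2.index(e)] for e in extra))
    return heading, rows

EMP_HEADING = ("name", "dept")
EMP_ROWS = {("ann", "db"), ("bob", "db"), ("cy", "ai")}
DEPT_HEADING = ("dept", "floor")
DEPT_ROWS = {("db", 1), ("ai", 2)}
```

`project(EMP_HEADING, EMP_ROWS, ("dept",))` returns two rows, not three, because a set cannot hold duplicates. SQL tables, by contrast, are bags and keep duplicates unless `DISTINCT` is used.
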
4:51:05 AM maaku: Asher: it's interesting to parse your description of dual relations
4:51:33 AM maaku: I agree and I think we're working towards the same things, but because of vocabulary differences it took me a while to be sure!
4:52:42 AM maaku: right so you end up with a graph of terms using that system?
4:53:53 AM maaku: i guess what i've worried most about is things like inheritance -- when can you say a type/term is a subclass/subset of another?
4:54:03 AM maaku: and harder perhaps, what if it is context dependent?
4:55:09 AM maaku: well I need to finish reading your thesis
4:55:29 AM maaku: but for now I'm going to sign off and get some sleep..
doomlord left the room (quit: Quit: My MacBook Pro has gone to sleep. ZZZzzz…). (5:01:32 AM)
5:01:44 AM Asher: that's what the extrinsic relations describe
5:01:49 AM Asher: but we shall continue later :)
5:01:51 AM Asher: sleep well!
5:02:05 AM Asher: i look forward to elaborating and hearing your thoughts