Since the previous response already used Datascript as the Datalog engine, adapting the CQL-to-Clojure approach to focus solely on Datascript is mostly a matter of streamlining: all knowledge graph (KG) reasoning moves into Datascript, while CQL’s Java libraries are still used via interop for schema definition and category-theoretic operations. This removes redundant steps and keeps Datascript as the central hub for querying and reasoning, feeding your Neo4j-based RAG system with models like Grok 3, Qwen-QwQ, and DeepSeek. Here’s how we can adapt and refine this.
- Neo4j: Remains the persistent KG store.
- CQL (Java): Used via interop to define schemas and instances, providing a category-theoretic foundation.
- Datascript: Acts as the in-memory Datalog database for all querying and reasoning, replacing Cypher for reasoning queries (one Cypher query is still used to export the data from Neo4j).
- Clojure: Orchestrates interop, data conversion, and RAG integration.
The workflow:
- Define a CQL schema and instance from Neo4j data.
- Convert CQL instance data into Datascript facts.
- Use Datascript’s Datalog to define rules and run queries.
- Feed results to reasoning models.
We’ll use CQL’s Java libraries to define the schema and populate it with Neo4j data, then extract it for Datascript.
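(ns cql-datascript-rag
  (:require [clojure.string]
            [datascript.core :as d])
  ;; NOTE: the CQL class names and methods used here (AqlCompiler, .run, .getSchema)
  ;; are illustrative and have not been checked against the CQL jar; adjust them
  ;; to the API of the CQL version on your classpath.
  (:import [catdata.aql AqlEnv AqlOptions Schema Instance]
           [catdata.aql.semantics AqlCompiler]))

(def env (AqlEnv.))
(def opts (AqlOptions.))
(def compiler (AqlCompiler. opts))

;; CQL source text: a typeside with one type and a schema with three entities.
(def schema-str
  "typeside Ty = literal {
     types string
   }
   schema S = literal : Ty {
     entities Author Article Topic
     foreign_keys
       published : Article -> Author
       in_topic : Article -> Topic
     attributes
       name : Author -> string
       title : Article -> string
       topic_name : Topic -> string
   }")

(.run env compiler schema-str)
(def schema (.getSchema env "S"))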
Pull data from Neo4j and create a dynamic CQL instance:
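(require '[neo4j-clj.core :as neo4j])
(import 'java.net.URI)
;; neo4j-clj's connect takes a java.net.URI; adjust host and credentials to your setup.
(def conn (neo4j/connect (URI. "bolt://localhost:7687") "neo4j" "password"))

(defn ids-for
  "Assigns one generator id per distinct value, so rows that share an author or topic share an id."
  [prefix values]
  (into {} (map-indexed (fn [i v] [v (str prefix i)]) (distinct values))))

(defn cql-pairs
  "Renders [generator value] pairs as the body of a CQL multi_equations entry."
  [pairs]
  (clojure.string/join ", " (map (fn [[g v]] (str g " " v)) (distinct pairs))))

;; Assumes titles identify articles and each article has one author and one topic,
;; as the foreign keys in schema S require.
(defn neo4j-to-cql-instance []
  (with-open [session (neo4j/get-session conn)]
    (let [rows     (doall (neo4j/execute session
                            "MATCH (a:Author)-[:PUBLISHED]->(art:Article)-[:IN_TOPIC]->(t:Topic)
                             RETURN a.name, art.title, t.name"))
          authors  (ids-for "a" (map :a.name rows))
          articles (ids-for "art" (map :art.title rows))
          topics   (ids-for "t" (map :t.name rows))
          quoted   (fn [ids] (map (fn [[v id]] [id (pr-str v)]) ids))
          ;; Literal instances declare generators and equations; validate the text in the CQL IDE.
          instance-str
          (str "instance I = literal : S {\n"
               "  generators\n"
               "    " (clojure.string/join " " (vals authors)) " : Author\n"
               "    " (clojure.string/join " " (vals articles)) " : Article\n"
               "    " (clojure.string/join " " (vals topics)) " : Topic\n"
               "  multi_equations\n"
               "    published -> {" (cql-pairs (map (fn [r] [(articles (:art.title r)) (authors (:a.name r))]) rows)) "}\n"
               "    in_topic -> {" (cql-pairs (map (fn [r] [(articles (:art.title r)) (topics (:t.name r))]) rows)) "}\n"
               "    name -> {" (cql-pairs (quoted authors)) "}\n"
               "    title -> {" (cql-pairs (quoted articles)) "}\n"
               "    topic_name -> {" (cql-pairs (quoted topics)) "}\n"
               "}")]
      (.run env compiler instance-str)
      (.getInstance env "I"))))

(def instance (neo4j-to-cql-instance))
;; Example output: Authors {a0="Jane Doe", a1="John Smith"}, Articles {art0="AI Impacts", art1="AI Ethics"}, Topics {t0="AI"}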
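The id assignment is worth checking in isolation, because an author or topic that appears in several rows should map to a single generator; otherwise two articles in the same topic point at different Topic generators and no shared-topic rule can match. A minimal sketch, assuming rows are maps keyed :a.name, :art.title and :t.name (the shape the Cypher query is assumed to produce). The sample rows and the helper name assign-ids are hypothetical, not part of any library:

```clojure
;; Hypothetical sample rows, shaped like the maps the Cypher query is assumed to return.
(def sample-rows
  [{:a.name "Jane Doe"   :art.title "AI Impacts" :t.name "AI"}
   {:a.name "John Smith" :art.title "AI Ethics"  :t.name "AI"}])

(defn assign-ids
  "Maps each distinct value to a generator id such as \"a0\", in first-seen order."
  [prefix values]
  (into {} (map-indexed (fn [i v] [v (str prefix i)]) (distinct values))))

(def author-ids (assign-ids "a" (map :a.name sample-rows)))
(def topic-ids  (assign-ids "t" (map :t.name sample-rows)))

author-ids ;; => {"Jane Doe" "a0", "John Smith" "a1"}
topic-ids  ;; => {"AI" "t0"}
```

Both rows share the single topic id t0, which is what lets a shared-topic rule connect the two authors later.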
Extract data from the CQL instance and load it into Datascript.
;; NOTE: .generators, .id, .get and .fk are placeholder accessors, not verified CQL API;
;; replace them with whatever your CQL version exposes for reading an instance.
(defn cql-to-datascript [cql-instance]
  (let [authors  (.generators cql-instance "Author")
        articles (.generators cql-instance "Article")
        topics   (.generators cql-instance "Topic")]
    (concat
      (map (fn [a] {:db/id (.id a) :author/name (.get a "name")}) authors)
      (map (fn [art] {:db/id         (.id art)
                      :article/title (.get art "title")
                      :published/by  {:db/id (.id (.fk art "published"))}
                      :in/topic      {:db/id (.id (.fk art "in_topic"))}})
           articles)
      (map (fn [t] {:db/id (.id t) :topic/name (.get t "topic_name")}) topics))))

(def ds-facts (cql-to-datascript instance))
;; => ({:db/id "a0" :author/name "Jane Doe"} ... {:db/id "art0" :article/title "AI Impacts" :published/by {:db/id "a0"} :in/topic {:db/id "t0"}} ...)
;; String :db/id values are treated as temporary ids and replaced with integer entity ids on transact.

;; Both relationships are references to other entities.
(def ds-schema {:published/by {:db/valueType :db.type/ref :db/cardinality :db.cardinality/one}
                :in/topic     {:db/valueType :db.type/ref :db/cardinality :db.cardinality/one}})
(def ds-conn (d/create-conn ds-schema))
(d/transact! ds-conn ds-facts)
Now, we’ll use Datascript’s Datalog exclusively for querying and reasoning, replacing Cypher.
Define rules for inference:
(def rules
  '[;; Two distinct authors collaborate if each published an article in the same topic.
    [(collaborates ?auth1 ?auth2)
     [?art1 :published/by ?auth1]
     [?art2 :published/by ?auth2]
     [?art1 :in/topic ?topic]
     [?art2 :in/topic ?topic]
     [(not= ?auth1 ?auth2)]]
    ;; An author contributes to a topic if one of their articles is in it.
    [(contributes ?auth ?topic)
     [?art :published/by ?auth]
     [?art :in/topic ?topic]]])

(def collaborators
  (d/q '[:find ?name1 ?name2
         :in $ %
         :where
         [?auth1 :author/name ?name1]
         [?auth2 :author/name ?name2]
         (collaborates ?auth1 ?auth2)]
       @ds-conn rules))
;; d/q returns a set, and the rule is symmetric, so each pair appears in both orders:
;; => #{["Jane Doe" "John Smith"] ["John Smith" "Jane Doe"]}

(def contributions
  (d/q '[:find ?name ?topic-name
         :in $ %
         :where
         [?auth :author/name ?name]
         [?topic :topic/name ?topic-name]
         (contributes ?auth ?topic)]
       @ds-conn rules))
;; => #{["Jane Doe" "AI"] ["John Smith" "AI"]}
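The rules can be tried without Neo4j or CQL by transacting a few hand-written facts. The sketch below is an assumption-laden variation rather than the setup above: the facts are hypothetical sample data, demo-conn, demo-rules and demo-pairs are illustrative names, and the rule compares entity ids with < instead of not=, a swapped-in tweak so that each collaboration is reported once rather than in both orders.

```clojure
(require '[datascript.core :as d])

;; In-memory database with the two reference attributes used by the rules.
(def demo-conn
  (d/create-conn {:published/by {:db/valueType :db.type/ref}
                  :in/topic     {:db/valueType :db.type/ref}}))

;; Hypothetical sample facts; negative numbers are temporary ids.
(d/transact! demo-conn
  [{:db/id -1 :author/name "Jane Doe"}
   {:db/id -2 :author/name "John Smith"}
   {:db/id -3 :topic/name "AI"}
   {:db/id -4 :article/title "AI Impacts" :published/by -1 :in/topic -3}
   {:db/id -5 :article/title "AI Ethics"  :published/by -2 :in/topic -3}])

(def demo-rules
  '[[(collaborates ?auth1 ?auth2)
     [?art1 :published/by ?auth1]
     [?art2 :published/by ?auth2]
     [?art1 :in/topic ?topic]
     [?art2 :in/topic ?topic]
     ;; Ordering on entity ids keeps one pair per collaboration.
     [(< ?auth1 ?auth2)]]])

(def demo-pairs
  (d/q '[:find ?name1 ?name2
         :in $ %
         :where
         (collaborates ?auth1 ?auth2)
         [?auth1 :author/name ?name1]
         [?auth2 :author/name ?name2]]
       @demo-conn demo-rules))

;; demo-pairs is a set containing a single pair made up of the two author names.
(println demo-pairs)
```

Whether to keep both orderings is a design choice: the symmetric form is convenient for lookups by either author, while the ordered form avoids repeating the same fact in a prompt.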
Serialize Datascript results into prompts for your reasoning models.
(defn build-prompt [results label]
  (str "Knowledge Graph:\n"
       (clojure.string/join "\n" (map #(str "- " label ": " %) results))
       "\nQuery: What do these authors have in common?"))
The collab-prompt definition is meant to take the results of a Datascript Datalog query (in this case, the collaborators query, which returns pairs of collaborating authors like [["Jane Doe" "John Smith"]]), format them into a readable string, and pass them to the build-prompt function to create a prompt for your RAG models.
Here’s what it should look like and why:
(def collaborators
  (d/q '[:find ?name1 ?name2
         :in $ %
         :where
         [?auth1 :author/name ?name1]
         [?auth2 :author/name ?name2]
         (collaborates ?auth1 ?auth2)]
       @ds-conn rules))

(def collab-prompt
  (build-prompt
    (map #(str (first %) " collaborates with " (second %)) collaborators)
    "Collaboration"))
Assuming collaborators contains the single pair ["Jane Doe" "John Smith"] (the symmetric rule also yields the reversed pair, omitted here for brevity), the result would be:
"Knowledge Graph:\n- Collaboration: Jane Doe collaborates with John Smith\nQuery: What do these authors have in common?"
- The Query: (d/q ...) retrieves pairs of author names (?name1, ?name2) where the collaborates rule holds true (i.e., they’ve published articles on the same topic, like "AI").
  - Result: [["Jane Doe" "John Smith"]].
- The Mapping: (map #(str (first %) " collaborates with " (second %)) collaborators) processes each pair:
  - (first ["Jane Doe" "John Smith"]) → "Jane Doe"
  - (second ["Jane Doe" "John Smith"]) → "John Smith"
  - (str "Jane Doe" " collaborates with " "John Smith") → "Jane Doe collaborates with John Smith"
  - Result: ("Jane Doe collaborates with John Smith").
- The Prompt: build-prompt takes this formatted list and a label ("Collaboration"), constructing:
  Knowledge Graph:
  - Collaboration: Jane Doe collaborates with John Smith
  Query: What do these authors have in common?
This prompt is the bridge between your Datascript reasoning and the RAG models (Grok 3, Qwen-QwQ, DeepSeek). It needs to be clear and concise so the models can interpret the relationships (e.g., collaboration) and answer the query effectively.
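Because query results arrive as an unordered set, the same facts can produce differently ordered prompts from run to run. A small sketch of one way to keep the prompt stable and reusable, assuming it is acceptable to sort the fact lines and pass the question in as an argument; build-prompt-with-question is a hypothetical variant for illustration, not a replacement prescribed by the approach above:

```clojure
(require '[clojure.string :as str])

(defn build-prompt-with-question
  "Hypothetical variant of build-prompt: sorts the facts so the prompt is
   deterministic, and takes the question as an argument."
  [facts label question]
  (str "Knowledge Graph:\n"
       (str/join "\n" (map #(str "- " label ": " %) (sort facts)))
       "\nQuery: " question))

;; Example with a hand-written result set standing in for the query output.
(def demo-prompt
  (build-prompt-with-question
    (map (fn [[a b]] (str a " collaborates with " b))
         #{["Jane Doe" "John Smith"]})
    "Collaboration"
    "What do these authors have in common?"))

(println demo-prompt)
;; Knowledge Graph:
;; - Collaboration: Jane Doe collaborates with John Smith
;; Query: What do these authors have in common?
```

A deterministic prompt also makes it easier to compare answers across the three models, since each one sees exactly the same text.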
Here’s how it fits into the broader example:
(ns cql-datascript-rag
  (:require [clojure.string]
            [datascript.core :as d]
            [neo4j-clj.core :as neo4j])
  (:import [catdata.aql AqlEnv AqlOptions Schema Instance]
           [catdata.aql.semantics AqlCompiler]))

;; ... (schema, instance, and Datascript setup as before) ...

(def rules
  '[[(collaborates ?auth1 ?auth2)
     [?art1 :published/by ?auth1]
     [?art2 :published/by ?auth2]
     [?art1 :in/topic ?topic]
     [?art2 :in/topic ?topic]
     [(not= ?auth1 ?auth2)]]])

;; Formats one line per result under a fixed question.
(defn build-prompt [results label]
  (str "Knowledge Graph:\n"
       (clojure.string/join "\n" (map #(str "- " label ": " %) results))
       "\nQuery: What do these authors have in common?"))

(def collaborators
  (d/q '[:find ?name1 ?name2
         :in $ %
         :where
         [?auth1 :author/name ?name1]
         [?auth2 :author/name ?name2]
         (collaborates ?auth1 ?auth2)]
       @ds-conn rules))

(def collab-prompt
  (build-prompt
    (map #(str (first %) " collaborates with " (second %)) collaborators)
    "Collaboration"))

(println collab-prompt)
;; Output:
;; Knowledge Graph:
;; - Collaboration: Jane Doe collaborates with John Smith
;; Query: What do these authors have in common?
Illustrative responses the models might give to this prompt (hypothetical, not captured output):
- Grok 3: “Jane Doe and John Smith collaborate on AI-related work, likely due to their shared interest in the topic.”
- Qwen-QwQ: “They collaborate because their articles share the AI topic.”
- DeepSeek: “Their collaboration suggests a potential for joint AI research.”
If the context also includes article titles and topics, the responses could be more detailed (again hypothetical):
- Grok 3: “Jane Doe and John Smith share a focus on AI, as their articles are both linked to the AI topic, suggesting a collaborative interest.”
- Qwen-QwQ: “Step 1: Jane’s article ‘AI Impacts’ is in AI. Step 2: John’s ‘AI Ethics’ is in AI. Step 3: They collaborate via shared topic.”
- DeepSeek: “Their collaboration on AI topics could lead to future joint research, possibly on ethical AI impacts.”
CQL’s category-theoretic features can enhance Datascript reasoning:
- Functors: Define a CQL query to map the schema to a simpler one, then load into Datascript:
  (def query-str
    "query Q = literal : S -> T {
       entity Person -> {from a:Author attributes name -> name(a)}
       entity Subject -> {from t:Topic attributes topic_name -> topic_name(t)}
     }")
  ;; The target schema T (entities Person and Subject) is not defined above and must be declared first.
  (.run env compiler query-str)
  ;; .eval is a placeholder, not verified CQL API; in CQL itself this step is written "instance J = eval Q I".
  (def query-result (.eval env "Q" instance))
  ;; Convert query-result to Datascript and query further
- Natural Transformations: Loosely speaking, Datascript rules can play the role of transformations between relationships (e.g., contributes is derived by composing published/by and in/topic).
(ns cql-datascript-rag
  (:require [clojure.string]
            [datascript.core :as d]
            [neo4j-clj.core :as neo4j])
  (:import [catdata.aql AqlEnv AqlOptions Schema Instance]
           [catdata.aql.semantics AqlCompiler]
           [java.net URI]))

(def env (AqlEnv.))
(def opts (AqlOptions.))
(def compiler (AqlCompiler. opts))
;; neo4j-clj's connect takes a java.net.URI.
(def conn (neo4j/connect (URI. "bolt://localhost:7687") "neo4j" "password"))

;; Schema and instance
(def schema-str
  "typeside Ty = literal { types string }
   schema S = literal : Ty {
     entities Author Article Topic
     foreign_keys published : Article -> Author in_topic : Article -> Topic
     attributes name : Author -> string title : Article -> string topic_name : Topic -> string
   }")
(.run env compiler schema-str)
;; neo4j-to-cql-instance, cql-to-datascript and build-prompt are the functions defined earlier.
(def instance (neo4j-to-cql-instance))

;; Datascript setup
(def ds-facts (cql-to-datascript instance))
(def ds-schema {:published/by {:db/valueType :db.type/ref :db/cardinality :db.cardinality/one}
                :in/topic     {:db/valueType :db.type/ref :db/cardinality :db.cardinality/one}})
(def ds-conn (d/create-conn ds-schema))
(d/transact! ds-conn ds-facts)

;; Rules and query
(def rules
  '[[(collaborates ?auth1 ?auth2)
     [?art1 :published/by ?auth1]
     [?art2 :published/by ?auth2]
     [?art1 :in/topic ?topic]
     [?art2 :in/topic ?topic]
     [(not= ?auth1 ?auth2)]]])
(def result
  (d/q '[:find ?name1 ?name2
         :in $ %
         :where
         (collaborates ?auth1 ?auth2)
         [?auth1 :author/name ?name1]
         [?auth2 :author/name ?name2]]
       @ds-conn rules))

;; Prompt
(println (build-prompt (map #(str (first %) " collaborates with " (second %)) result) "Collaboration"))
By adapting to Datascript, we’ve centralized querying and reasoning in a single Datalog engine, using CQL’s Java libraries for schema definition and Neo4j for data storage. This setup leverages category theory (via CQL) and Clojure’s functional power, delivering rich context to your RAG models. Want to tweak a rule, add a new query, or test this with a specific dataset?