name	omop-kg
description	OMOP CDM Knowledge Graph — ontology of 643 canonical concepts, 1067 relationships, 1099 doc chunks with embeddings. Extracted from CommonDataModel docs + The Book of OHDSI. Use when user asks about OMOP CDM tables, fields, concepts, ETL, cohorts, vocabularies, or needs to search OMOP documentation.
allowed-tools	Bash(curl https://omop-kg.apki.dev)

OMOP CDM Knowledge Graph

643 canonical concepts, 1057 raw extractions, 1067 canonical relationships, 1099 chunks, 66 pages, 1666 embeddings.

Sources: OHDSI CommonDataModel v5.4 docs (38 pages), The Book of OHDSI (23 chapters), LLM relationship refinements.

UI Browser

Interactive knowledge graph browser at http://localhost:3333 (bun --hot ui.ts from the project root).

/ — A-Z dictionary with stats (concepts, relationships, chunks, embeddings)
/concept/<name> — concept detail: description, pages, raw extractions, relationships, similar concepts (BM25+vector RRF), relevant chunks (BM25+vector RRF with highlights)
/search?q=<query> — hybrid search: concepts (BM25+vector) + chunks (BM25+vector), grouped by page with heading hierarchy
/page/<id> — page view: summary, audience, topics, defines, key takeaways, extracted concepts + relationships
/chunks/<page_id> — hierarchical chunk browser with markdown rendering, anchor links (#chunk-<id>)
/graph — d3.js force-directed graph (click=highlight, dblclick=ego-graph, empty=top connected nodes)
/api/sql — readonly SQL API (POST JSON {"sql":"..."}, returns JSON array)

Hybrid Search (recommended)

BM25 + vector RRF search across concepts and chunks in one call:

curl -s 'https://omop-kg.apki.dev/api/search?q=drug+era&limit=10'

Returns {query, concepts: [...], chunks: [...]} — each with bm25, cos, rrf scores and match (both/bm25/semantic).

SQL Query Tool

Readonly SQL endpoint. POST JSON with sql field, get JSON array back.

curl -s https://omop-kg.apki.dev/api/sql -d '{"sql":"SELECT name, type FROM cdm_canonical_concepts LIMIT 5"}'

Tables

cdm_canonical_concepts (643 rows) — deduplicated concepts

Column	Type	Description
id	TEXT PK	lower(name)
name	TEXT	Canonical name
kind	TEXT	`class` or `instance`
type	TEXT	See types below
category	TEXT	See categories below
description	TEXT	Best description
alt_names	TEXT[]	Alternative names merged into this
defined_pages	TEXT[]	Pages where concept is defined

cdm_concepts (1057 rows) — raw LLM extractions

Column	Type	Description
id	SERIAL PK
name	TEXT	Concept name as extracted
kind	TEXT	`class` or `instance`
type	TEXT	Concept type
category	TEXT	Domain category
description	TEXT	Description with doc quotes
relevance	TEXT	`defined` / `described` / `mentioned`
section	TEXT	Section heading
page_id	TEXT FK	Source page id

cdm_canonical_relationships (1067 rows) — deduplicated relationships

Column	Type	Description
id	SERIAL PK
source	TEXT	Source concept name
relation	TEXT	Relation type
target	TEXT	Target concept name
contexts	TEXT[]	All context descriptions
pages	TEXT[]	All source pages
count	INT	How many raw sources

cdm_relationships (1126 rows) — raw extracted relationships

Column	Type	Description
source	TEXT	Source concept
relation	TEXT	Relation type
target	TEXT	Target concept
context	TEXT	Brief explanation
page_id	TEXT FK	Source page id

cdm_pages (66 rows) — documentation pages with full text

Column	Type	Description
id	TEXT PK	Page slug (e.g. `faq`, `book-CommonDataModel`)
title	TEXT	Page title
summary	TEXT	LLM-generated summary
content	TEXT	Full markdown content
size_bytes	INT	Content length
meta	JSONB	`{audience, topics, defines, key_takeaways}`

cdm_chunks (1099 rows) — markdown-parsed documentation chunks

Column	Type	Description
id	SERIAL PK
page_id	TEXT FK	Source page
h1–h5	TEXT	Heading hierarchy
depth	INT	Heading depth
seq	INT	Order within page
type	TEXT	`paragraph` / `code` / `list` / `table` / `frontmatter` / `image`
lang	TEXT	Language for code blocks
content	TEXT	Full chunk text
size	INT	Content length

BM25 index on content: WHERE id @@@ 'content:drug AND content:era'

cdm_embeddings (1666 rows) — OpenAI text-embedding-3-large 1536d

Column	Type	Description
id	TEXT PK	`concept-name` or `chunk-<id>`
name	TEXT	Display name
source	TEXT	`concept` or `chunk`
source_id	INT	Chunk id if source=chunk
embedding	vector(1536)	Embedding vector

cdm_tables (39 rows) / cdm_fields (432 rows) — CDM v5.4 schema

SELECT field_name, datatype, is_required, is_fk, fk_table, fk_domain
FROM cdm_fields WHERE table_name = 'drug_era' ORDER BY id;

OMOP CDM v5.4 Tables (39)

Clinical Events (CDM schema): person*, observation_period*, visit_occurrence, visit_detail, condition_occurrence, drug_exposure, procedure_occurrence, device_exposure, measurement, observation, note, note_nlp, specimen, death, episode, episode_event, fact_relationship

Health System: location, care_site, provider

Health Economics: payer_plan_period, cost

Derived: drug_era, condition_era, dose_era

Vocabulary (VOCAB schema): concept, vocabulary, domain, concept_class, concept_relationship, relationship, concept_synonym, concept_ancestor, source_to_concept_map, drug_strength

Results: cohort, cohort_definition

Metadata: cdm_source, metadata

* = required tables (person, observation_period)

cdm_merge_epochs / cdm_merge_decisions — dedup audit trail

Types (11)

Type	Count
table	105
pattern	104
process	103
tool	68
field	59
vocabulary	54
convention	48
concept-domain	34
concept-class	30
standard	23
relationship-type	9

Categories (10)

Category	Count
analytics	193
vocabulary	108
clinical	84
etl	55
derived	38
metadata	36
quality	33
health-system	31
administration	28
financial	20

Relation Types (24)

Relation	Count
uses	305
has-part	174
related-to	131
constrains	84
part-of	71
constrained-by	50
references	47
is-a	37
derives-from	31
stored-by	26
used-by	26
has-subtype	19
alternative-to	18
maps-to	17
source-of	12

Reading Documentation Pages

List all pages

SELECT id, left(summary, 100) as summary, meta->>'audience' as audience
FROM cdm_pages WHERE content IS NOT NULL ORDER BY id;

Page table of contents (heading hierarchy)

SELECT h1, h2, h3, min(seq) as seq
FROM cdm_chunks WHERE page_id = 'book-Cohorts' AND type != 'frontmatter'
GROUP BY h1, h2, h3 ORDER BY min(seq);

Read a specific section by seq number

SELECT content FROM cdm_chunks WHERE page_id = 'book-Cohorts' AND seq = 26;

Read all chunks of a section by heading

SELECT seq, type, content FROM cdm_chunks
WHERE page_id = 'book-Cohorts' AND h2 = 'Implementing a Cohort Using ATLAS'
ORDER BY seq;

Example Queries

-- Find concept
SELECT name, type, category, left(description, 300)
FROM cdm_canonical_concepts WHERE name = 'Drug Era';

-- Fuzzy search
SELECT name, type, category
FROM cdm_canonical_concepts WHERE name ILIKE '%cohort%' ORDER BY name;

-- Relationships for a concept
SELECT source, relation, target, pages
FROM cdm_canonical_relationships
WHERE source = 'Drug Era' OR target = 'Drug Era'
ORDER BY relation;

-- CDM table fields
SELECT field_name, datatype, is_required, is_fk, fk_table, fk_domain
FROM cdm_fields WHERE table_name = 'condition_occurrence' ORDER BY id;

-- Table descriptions
SELECT name, left(description, 200) FROM cdm_tables ORDER BY name;

-- BM25 search in chunks
SELECT id, page_id, h1, h2, type
FROM cdm_chunks WHERE id @@@ 'content:drug AND content:era'
LIMIT 10;

-- Vector similar concepts
SELECT e2.name, round((1 - (e1.embedding <=> e2.embedding))::numeric, 3) as cos
FROM cdm_embeddings e1
JOIN cdm_embeddings e2 ON e1.id != e2.id AND e2.source = 'concept'
WHERE e1.id = 'drug era' AND e1.source = 'concept'
ORDER BY e1.embedding <=> e2.embedding LIMIT 10;

-- Vector similar chunks to a concept
SELECT c.page_id, c.h1, c.h2, c.type, round((1 - (e.embedding <=> ce.embedding))::numeric, 3) as cos
FROM cdm_embeddings e
JOIN cdm_embeddings ce ON ce.source = 'chunk'
JOIN cdm_chunks c ON c.id = ce.source_id
WHERE e.id = 'drug era' AND e.source = 'concept'
ORDER BY e.embedding <=> ce.embedding LIMIT 10;

-- Page summaries
SELECT id, left(summary, 200), meta->>'audience' as audience
FROM cdm_pages WHERE meta IS NOT NULL ORDER BY id;

-- Most connected concepts
SELECT name, type, category,
  (SELECT count(*) FROM cdm_canonical_relationships r WHERE r.source = c.name OR r.target = c.name) as rels
FROM cdm_canonical_concepts c ORDER BY rels DESC LIMIT 20;

-- Concepts defined on a page
SELECT name, kind, type, relevance
FROM cdm_concepts WHERE page_id = 'book-Cohorts' ORDER BY relevance, name;

niquola/SKILL.md

Select an option

No results found