| created | 2026-04-12 |
|---|---|
| did | did:repo:53936a6c815841cab48caa0ac46e37364a197e86 |
| github | https://gist.github.com/ChristopherA/151aefa6a6bde1ce4fa6b1182656cebe |
| purpose | Agent reference for wikilinks and named edges in plain-markdown knowledge graphs |
| copyright | ©2026 by @ChristopherA, licensed under CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/) |
| summary | How wikilinks and named edge predicates work in a plain-markdown knowledge graph. Covers syntax, YAML vs body predicates, node types (atomic and compound), annotated predicates, vocabulary curation (folksonomy vs ontology), classification predicates (conforms_to:: over is_a::) and predicate conflation, naming sovereignty across collaborating systems, and agent traversal patterns. |
- conforms_to::[[Reference Guide Contract]]
- authored_by::[[Christopher Allen]]↗
- has_status::[[Growing Stage]]↗
- in_domain::[[Deep Context Architecture]]↗
A collection of markdown files becomes a knowledge graph through two mechanisms: wikilinks connect files to each other, and named edges (predicates) label what those connections mean. The labels are where the power lives — they let an agent distinguish a citation from a counterargument from a loose association. But labels require vocabulary discipline, and that discipline operates differently within a single system than it does across systems, where the pressure to standardize can destroy exactly what naming was meant to preserve.
This guide describes [[Author-Declared Edges|author-declared edges]] — predicates written by hand, inline in the markdown itself. That contrasts with [[Inferred Edges|inferred edges]], where a tool extracts relationships automatically from text or conversation. Both have their place. The declared form is what makes [[Vocabulary Plurality|vocabulary plurality]] possible across collaborators who bring different traditions — and it makes [[Provenance Per Edge|provenance per edge]] (who said so, author or system; and which author) a design constraint worth taking seriously.
Everything here applies to any collection of markdown files — no special software required. A text editor and a terminal are sufficient.
- Wikilinks: Connections Between Markdown Files
- Named Edges: Typed, Directional Relationships
- YAML Frontmatter vs. Body Predicates
- Folksonomy vs Ontology: Growing a Predicate Vocabulary
- Notes, Context Nodes, and Compound Nodes
- Reading the Graph: Traversal for Agents
- Practical Operations
- Summary
A wikilink is a reference from one markdown file to another, written with double brackets:
[[Some Concept Name]]
The target is a filename (without the .md extension). When File A contains [[File B]], that creates an edge — a connection between two nodes.
Without wikilinks, a collection of markdown files is a flat set — hundreds of files with no connections between them. With wikilinks, each file connects to related files, forming chains and clusters that an agent or human can traverse to build understanding. Every file becomes a node, every wikilink an edge, and the collection becomes a [[Knowledge Graph|knowledge graph]] — no database, no schema, no tooling required.
A plain wikilink answers the question: "What is this file about?" A file containing [[Elliptic Curve Cryptography]] says this file is about elliptic curve cryptography. It connects the file to its subjects — but it does not describe the nature of the connection. For that, you need named edges.
If File A links to [[File B]], any tool that indexes the collection can discover:
- Outgoing links from File A (by reading the file)
- Incoming links to File B (by searching all files for `[[File B]]`)
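Discovering incoming links requires a search across all files: the graph exists in the files but is not automatically indexed. A command like `rg '\[\[File B\]\]' --type md` finds every file containing the plain link `[[File B]]`. Links that carry a heading or pipe alias (both described later in this guide) need a looser pattern such as `rg '\[\[File B[\]#|]' --type md`.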
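The same discovery can be sketched in Python for tools that prefer an in-memory index over repeated searches. This is a minimal illustration, not a reference implementation: `build_link_index` and the `files` mapping (node name to file text) are hypothetical, and the pattern also tolerates the heading and alias suffixes described later in this guide.

```python
import re
from collections import defaultdict

# Captures the target of [[Target]], [[Target#Heading]], or [[Target|Alias]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")

def build_link_index(files):
    """files: dict of node name (filename without .md) -> file text.
    Returns (outgoing, incoming), each a dict of node name -> set of names."""
    outgoing = defaultdict(set)
    incoming = defaultdict(set)
    for name, text in files.items():
        for match in WIKILINK.finditer(text):
            target = match.group(1).strip()
            outgoing[name].add(target)   # read directly from the file
            incoming[target].add(name)   # only known after scanning every file
    return dict(outgoing), dict(incoming)

# Hypothetical three-file collection.
files = {
    "File A": "See [[File B]] and [[File C|the third file]].",
    "File B": "Builds on [[File C#Details]].",
    "File C": "No links here.",
}
outgoing, incoming = build_link_index(files)
```

The incoming index is only complete once every file has been scanned, which is the cost the paragraph above describes.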
Wikilink targets are filenames, so the quality of the link depends on the quality of the name.
Avoid single-word wikilinks. A link like [[Security]] or [[Design]] is too broad to be useful. It collides with dozens of possible meanings, creates ambiguity about what the target file actually covers, and makes the graph noisy. Prefer multi-word descriptive names — what [[Ward Cunningham]]'s original wiki culture (the first wiki, [[WikiWikiWeb]], 1995) called [[Short Noun Phrases|short noun phrases]]:
| Avoid | Prefer |
|---|---|
| `[[Security]]` | `[[Self-Sovereign Identity]]` |
| `[[Design]]` | `[[Pattern Language Design]]` |
| `[[Trust]]` | `[[Trust Establishment Protocol]]` |
| `[[Model]]` | `[[Principal-Agent Relationship in Augmented Knowledge Work]]` |
Good wikilink names are 2-7 words, read naturally as inline text, and are specific enough that a second author writing about the same concept would independently generate the same title.
A wikilink can point to a specific heading inside a file using the # separator:
[[File Name#Section Heading]]
This creates an edge to a specific section rather than the whole file. It is the granularity step between linking to an entire document and transclusion (embedding specific content inline).
When to use heading links:
- Large files covering multiple concepts. A file about "Graph Maintenance" might have separate sections on predicate audits, ghost link discovery, and vocabulary curation. A heading link lets you point to the specific section that matters: `derived_from::[[Graph Maintenance#Vocabulary Curation]]`.
- Container files. Some files serve as indexes or collections where each heading is a distinct idea. Heading links let predicates point to the right idea without requiring everything to be split into separate files.
- Inline references in prose. When citing a specific argument or definition within a larger document, the heading link tells the reader (or agent) exactly where to look.
The heading must match exactly — including capitalization and spacing. If the target heading is renamed, the link breaks silently.
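Because a mismatched heading fails silently, a small check can surface broken heading links before a reader hits them. The sketch below assumes ATX-style `#` headings in the target file; `heading_exists` is a hypothetical helper, not part of any existing tool.

```python
import re

# ATX heading: one to six '#' characters, whitespace, then the heading text.
HEADING = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)

def heading_exists(file_text, heading):
    """True only if some heading in file_text equals `heading` exactly,
    including capitalization and internal spacing."""
    return heading in HEADING.findall(file_text)

# Hypothetical target file for [[Graph Maintenance#Vocabulary Curation]].
target_text = "# Graph Maintenance\n\n## Vocabulary Curation\nSome text.\n"
found = heading_exists(target_text, "Vocabulary Curation")
wrong_case = heading_exists(target_text, "vocabulary curation")
```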
When a wikilink appears inline in prose, the full title can be awkward to read. A pipe alias displays shorter text while preserving the graph edge:
[[Principal-Agent Relationship in Augmented Knowledge Work|principal-agent relationship]]
This renders as "principal-agent relationship" in the text but links to the full-titled file. The syntax is [[Target|Display Text]].
When to use pipe aliases:
- The target title is too long for natural prose flow
- The grammatical context needs a different form (plural, possessive, lowercase)
- The concept has a well-known short name that readers expect
The trade-off: pipe aliases are fragile. When the target file is renamed, a tool may update the link target, but the display text stays as written and can become stale. And the display text is invisible to simple wikilink searches (`rg '\[\[Old Name\]\]'` will not find a pipe alias that displays "Old Name" but targets a different file). Use them when readability demands it, not as a default.
Heading links and pipe aliases compose: [[File Name#Section|display text]] links to a specific section with custom display text.
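The composed syntax can be taken apart with one regular expression. This is a minimal sketch under the assumption that targets never contain `#`, `|`, or `]`; the `WikiLink` tuple and `parse_wikilinks` function are hypothetical names used only for illustration.

```python
import re
from typing import NamedTuple, Optional

class WikiLink(NamedTuple):
    target: str               # filename without .md
    heading: Optional[str]    # text after '#', if any
    display: Optional[str]    # text after '|', if any

PATTERN = re.compile(r"\[\[([^\]|#]+)(?:#([^\]|]+))?(?:\|([^\]]+))?\]\]")

def parse_wikilinks(text):
    """Return every wikilink in text, split into target, heading, and display."""
    return [
        WikiLink(m.group(1).strip(), m.group(2), m.group(3))
        for m in PATTERN.finditer(text)
    ]

links = parse_wikilinks(
    "Compare [[Some Concept Name]] with [[File Name#Section|display text]]."
)
```

Keeping the three parts separate lets a search match on `target` alone, which avoids the blind spot that display text creates for plain-text searches.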
A wikilink can point to a file that does not yet exist. This is a ghost link — a reference to a concept the author considers important enough to name but has not yet written about. Ghost links are a planning tool: they show where the graph wants to grow.
A wikilink can also point to a node that exists — just not in this repo. It might live in the author's personal vault, in a collaborator's wiki, or in a published work elsewhere. Without a marker, these are indistinguishable from ghost links: both look like [[Some Concept]] and both fail to resolve to a local file.
The convention is to append ↗ to mark a wikilink as an external reference:
[[A Spectrum of Consent]]↗
The arrow says: this node exists, just not here. It is a reference, not a gap. A reader or agent encountering the mark knows to look for the target in the author's broader ecosystem rather than treating it as graph growth the author hasn't pursued yet. Unmarked unresolved wikilinks remain ghost links — the graph signaling where it wants to grow.
The marker is a courtesy, not an enforcement. It is useful in two directions: it tells readers "this is not a broken link" when the gist, post, or document is read standalone, and it tells agents traversing the author's full corpus "this edge crosses a repo boundary." This guide itself uses ↗ throughout — every marked wikilink points to a node in the author's broader knowledge garden.
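An agent can apply the convention mechanically: a link that resolves locally is an ordinary edge, an unresolved link followed by ↗ is an external reference, and anything else is a ghost link. The sketch below assumes `local_nodes` is a set of filenames (without `.md`) already gathered from the collection; `classify_links` is a hypothetical helper.

```python
import re

# Target, optional heading/alias suffix, closing brackets, optional arrow marker.
LINK = re.compile(r"\[\[([^\]|#]+)[^\]]*\]\](↗?)")

def classify_links(text, local_nodes):
    """Map each link target in text to 'resolved', 'external', or 'ghost'.
    If a target appears more than once, the last occurrence wins."""
    result = {}
    for match in LINK.finditer(text):
        target, arrow = match.group(1).strip(), match.group(2)
        if target in local_nodes:
            result[target] = "resolved"
        elif arrow:
            result[target] = "external"   # exists, just not in this repo
        else:
            result[target] = "ghost"      # where the graph wants to grow
    return result

sample = "See [[Knowledge Graph]], [[A Spectrum of Consent]]↗ and [[Unwritten Idea]]."
kinds = classify_links(sample, {"Knowledge Graph"})
```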
The granularity spectrum for referencing content in another file runs from coarse to fine:
- Wikilink — points to the whole file
- Heading link — points to a section within a file
- Transclusion — embeds specific content from another file inline, from a section down to a single paragraph or character range
Transclusion means the referenced content appears in place — not as a link to follow but as content that renders where it is referenced. This enables a [[Single Source of Truth|single source of truth]]: write a definition once, transclude it wherever it is needed, and updates propagate automatically.
The concept is architecturally powerful (it descends from [[Ted Nelson]]'s [[Project Xanadu]]) but adds significant complexity: tracking what is transcluded from where, handling updates when source content changes, and managing the boundary between "my content" and "content I am displaying from elsewhere." For most knowledge graphs in plain markdown, wikilinks and heading links provide sufficient granularity. Transclusion remains a future capability rather than current practice.
Wikilinks connect files, but they don't explain the connection. A [[Named Edge|named edge]] (also called a predicate or [[Typed Relation|typed relation]]) labels the kind of relationship between two files. The syntax is:
- predicate_name::[[Target Node]]
The predicate names the relationship. The double colon (::) separates the predicate from the target. The wikilink identifies the destination.
A plain wikilink says "these two files are connected." A named edge says "these two files are connected in this specific way."
| Syntax | Question answered | Example |
|---|---|---|
| `[[Elliptic Curve Cryptography]]` | "What is this about?" | This file is about elliptic curve cryptography |
| `derived_from::[[Applied Cryptography Handbook]]` | "How does this relate?" | This file was derived from that source |
| `contradicts::[[Centralized Key Management]]` | "How does this relate?" | This file contradicts that approach |
Plain wikilinks create unlabeled edges. Named edges create labeled edges. The label is what makes the graph semantically rich — an agent traversing the graph can distinguish citations from counterarguments from loose associations.
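For an agent, a labeled edge is naturally a (source, predicate, target) triple. The following sketch extracts such triples from a file's text, assuming predicates are lowercase with underscores and are written as `- predicate::[[Target]]` list items; `parse_edges` is a hypothetical function name.

```python
import re

# A list item of the form "- predicate::[[Target]]", with optional indentation.
EDGE = re.compile(
    r"^[ \t]*-[ \t]+([a-z_]+)::\[\[([^\]|#]+)[^\]]*\]\]", re.MULTILINE
)

def parse_edges(source, text):
    """Return (source, predicate, target) triples for every named edge in text.
    Plain wikilinks in prose are ignored because they carry no label."""
    return [(source, m.group(1), m.group(2).strip()) for m in EDGE.finditer(text)]

body = (
    "- derived_from::[[Applied Cryptography Handbook]]\n"
    "- contradicts::[[Centralized Key Management]]\n"
    "This file is about [[Elliptic Curve Cryptography]].\n"
)
triples = parse_edges("My Note", body)
```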
Named edges are written as list items in the body of the markdown file, typically grouped near the top (after any YAML frontmatter) or in a Relations section at the bottom:
```markdown
---
created: 2026-03-05
summary: "A brief description"
---
- conforms_to::[[Pattern Form Contract]]
- has_status::[[Seed Stage]]
- in_domain::[[Deep Context Architecture]]

# File Title

Content begins here...
```

Or in a dedicated Relations section:

```markdown
## Relations
- relates_to::[[Some Concept]]
- derived_from::[[Source Document]]
- contradicts::[[Opposing View]]
```

Both placements work. The classification predicates (`conforms_to::`, `has_status::`, `in_domain::`) typically go at the top; semantic predicates often go in a Relations section at the bottom. Why predicates belong in the body rather than YAML frontmatter is addressed in the YAML vs Body Predicates section below.
A predicate line can carry an indented annotation that explains why the relationship matters — context that the predicate name alone cannot carry:
```markdown
## Relations
- relates_to::[[Predicate Maintenance Recipes Over Tools]]
  - The maintenance pattern that catches vocabulary drift when convention enforcement misses it.
- relates_to::[[Vocabulary Lifecycle Through Tending]]
  - The lifecycle model explains the degradation mechanism this anti-pattern names; the seed/weed/fertilize framework provides the ongoing discipline to prevent it.
- extracted_from::[[Compound Nodes for Knowledge Management]]
  - The vocabulary gloss section, lines 91-115.
```

The annotation is an indented list item directly beneath the predicate. It is not a separate predicate; it is context about the relationship.
Why this matters for agents: A predicate like relates_to::[[Some Pattern]] tells you there is a connection but not why it is important. The annotation tells you what to expect if you follow the link, and whether following it is worth the cost. An agent building context can read annotations to decide which edges to traverse without reading the target files. This is a form of [[Progressive Disclosure|progressive disclosure]]: the predicate gives direction, the annotation gives rationale, and the target file gives depth.
Annotations are optional but recommended. Classification predicates at the top of a file rarely need them — conforms_to::[[Pattern Form Contract]] is self-explanatory. Semantic predicates in a Relations section benefit from them, especially when the relationship is non-obvious or when the target file is large.
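Reading annotations without opening target files can be sketched as a line-by-line pass that attaches each more-indented list item to the predicate above it. This is an illustrative sketch only: `parse_annotated_edges` is a hypothetical name, and it assumes annotations are indented with spaces or tabs directly beneath their predicate line.

```python
import re

PREDICATE = re.compile(r"^(\s*)-\s+([a-z_]+)::\[\[([^\]]+)\]\]")
ITEM = re.compile(r"^(\s*)-\s+(.*)$")

def parse_annotated_edges(text):
    """Return a list of dicts with 'predicate', 'target', and 'annotations'."""
    edges = []
    current = None   # the most recent predicate line, if still open
    indent = 0
    for line in text.splitlines():
        pm = PREDICATE.match(line)
        if pm:
            indent = len(pm.group(1))
            current = {"predicate": pm.group(2), "target": pm.group(3), "annotations": []}
            edges.append(current)
            continue
        im = ITEM.match(line)
        if im and current is not None and len(im.group(1)) > indent:
            current["annotations"].append(im.group(2).strip())
        else:
            current = None   # any other line closes the open predicate
    return edges

relations = (
    "## Relations\n"
    "- relates_to::[[Vocabulary Lifecycle Through Tending]]\n"
    "  - The lifecycle model explains the degradation mechanism.\n"
    "- extracted_from::[[Compound Nodes for Knowledge Management]]\n"
)
annotated = parse_annotated_edges(relations)
```

An agent can then rank edges by their annotations and follow only those whose rationale matches the task at hand.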
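Prefer multi-word predicates with underscores. Single-word noun predicates are too vague, collision-prone, and ambiguous. The exception this guide's own vocabulary makes is a single verb that already reads as a sentence fragment, such as `extends::` or `contradicts::`.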
| Avoid | Prefer | Why |
|---|---|---|
| `source::` | `derived_from::` | "Source" could mean anything: the origin? the format? the repository? |
| `type::` | `conforms_to::` | "Type" is overloaded; `conforms_to::[[X Form Contract]]` names contract-compliance, not identity |
| `status::` | `has_status::` | More specific, reads as a sentence |
| `link::` | `relates_to::` | "Link" describes the mechanism, not the relationship |
| `parent::` | `extends::` | "Parent" implies hierarchy; "extends" describes the relationship |
Multi-word predicates read as sentence fragments: "this node conforms_to Pattern Form Contract," "this node derived_from Source Document," "this node contradicts Opposing View."
A node can carry multiple lines with the same predicate name. This is how multi-valued relationships work:
```markdown
- in_domain::[[Decentralized Identity]]
- in_domain::[[Self-Sovereign Identity]]
- assisted_by::[[Alice]]
- assisted_by::[[Paco]]
- assisted_by::[[Guillermo de Baskerville]]
```

Each line is a separate edge in the graph. A file with two `in_domain::` lines belongs to two domains. A file with three `assisted_by::` lines has three assistants. Do not assume one predicate per type; check for all instances.
Classification (what kind of thing is this?):
- `conforms_to::[[X Form Contract]]`: the structural contract this node satisfies (see [[Form Type|form type]]↗). Prefer over `is_a::[[X Form]]`: a node conforms to a specification, it is not identical to one
- `has_status::[[Status Name]]`: lifecycle stage
- `in_domain::[[Domain Name]]`: knowledge area
- `in_precinct::[[Precinct Name]]`: organizational unit that determines which structural contracts apply. A node in a garden precinct carries form-type obligations; a node in a household precinct serves operational capture
Provenance (where did this come from?):
- `derived_from::[[Source Document]]`: synthesized from that source
- `extracted_from::[[Source Document]]`: pulled out of that document. This is a [[Construction Predicate|construction predicate]]: when the source is later archived or reincarnated as a living node, upgrade to `implements::`, `embodies::`, or another specific predicate once the mature relationship is clear
- `informed_by::[[Reference Work]]`: drew on that reference
- `abstracted_from::[[Specific Instance]]`: generalized from that instance
- `motivated_by::[[Case or Experience]]`: this pattern arose from that case
- `grounded_in::[[Value or Principle]]`: this principle derives from that value
- `established_by::[[Experience or Event]]`: this conviction emerged from that experience
Structural (how does this relate?):
- `relates_to::[[Related Concept]]`: general connection (use a more specific predicate when one fits)
- `implements::[[Pattern or Decision]]`: enacts that pattern
- `embodies::[[Principle Form]]`: this protocol or practice expresses that principle
- `extends::[[Base Concept]]`: builds on that concept
- `contradicts::[[Opposing View]]`: in tension with that view
- `constrains::[[Agent or Process]]`: this boundary limits that agent's authority
- `composes_with::[[Related Concept]]`: works together with that concept
Lifecycle (what happened over time?):
- `supersedes::[[Old Version]]`: replaced that
- `evolved_into::[[New Version]]`: this became that (the reverse direction of `supersedes::`)
- `validated_by::[[Supporting Evidence]]`: confirmed by that evidence
- `invalidated_by::[[Counter-Evidence Case]]`: this case broke that assumption
Generative (what does this produce or require?):
- `proposes::[[Candidate Hypothesis]]`: this inquiry puts forward that hypothesis
- `directed_at::[[Person or Group]]`: this question requires that person's or group's judgment. A [[Boundary Marker|boundary marker]]: it says who must decide, not just who is involved
- `resolved_by::[[Answer or Case]]`: this inquiry was answered by that case, pattern, or reference
- `generates::[[Produced Node]]`: this node produced that node through investigation or use
- `anticipates::[[Future Scenario]]`: this scenario imagines consequences of those forces or drivers
- `signaled_by::[[Observable Development]]`: this scenario would be confirmed by that observable development
- `prepares_for::[[Decision or Strategy]]`: this scenario informs that decision or strategy
- `renders_as::[[Runtime Artifact]]`: this design node has a runtime rendition at that path or format (e.g., a persona definition renders as an agent configuration file)
Named edges are directional. The file containing the predicate is the source; the wikilink target is the destination. File A containing derived_from::[[File B]] means "A was derived from B" — not the reverse. Direction is a convention, not enforced by syntax.
Named edges are one-directional in the files. If File A says extends::[[File B]], File B has no automatic awareness that it is extended by File A. Discovering incoming edges requires searching across all files:
`rg 'extends::\[\[File B\]\]' --type md`

This is a limitation of the plain-text approach. The graph exists in the files, but incoming edges are not free: they require a search.
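The reverse lookup can also be sketched in Python, which makes it easy to ask for every predicate pointing at a node rather than one predicate at a time. The `files` mapping and the `incoming_edges` function are hypothetical; the pattern additionally accepts heading and alias suffixes on the target.

```python
import re

def incoming_edges(files, target, predicate=None):
    """files: dict of node name -> file text.
    Return sorted (source, predicate) pairs for named edges pointing at target.
    If predicate is given, only edges with that predicate are returned."""
    name = re.escape(predicate) if predicate else r"[a-z_]+"
    pattern = re.compile(
        r"^[ \t]*-[ \t]+(" + name + r")::\[\["
        + re.escape(target) + r"(?:[#|][^\]]*)?\]\]",
        re.MULTILINE,
    )
    hits = []
    for source, text in files.items():
        for match in pattern.finditer(text):
            hits.append((source, match.group(1)))
    return sorted(hits)

# Hypothetical collection.
files = {
    "File A": "- extends::[[File B]]\n",
    "File B": "Body text only.\n",
    "File C": "- contradicts::[[File B#Scope]]\n- extends::[[File D]]\n",
}
all_incoming = incoming_edges(files, "File B")
only_extends = incoming_edges(files, "File B", "extends")
```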
Three target forms work with predicates:
| Syntax | Use |
|---|---|
| `predicate::[[Internal File]]` | Link between files in the same collection |
| `predicate::[Display Text](https://...)` | Link to an external resource with readable label |
| `predicate::https://...` | Bare URL for quick external references |
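A parser that handles all three target forms only needs to try them in order. The sketch below is one possible reading of the table, with hypothetical names (`parse_predicate_line`, and the kind labels `internal`, `labeled_url`, `bare_url`); it assumes URLs begin with `http://` or `https://`.

```python
import re

HEAD = re.compile(r"^[ \t]*-[ \t]+([a-z_]+)::(.*)$")

# Tried in order; the wikilink form must come before the markdown-link form.
FORMS = [
    ("internal", re.compile(r"^\[\[([^\]|#]+)[^\]]*\]\]")),
    ("labeled_url", re.compile(r"^\[[^\]]+\]\((https?://[^)\s]+)\)")),
    ("bare_url", re.compile(r"^(https?://\S+)")),
]

def parse_predicate_line(line):
    """Return (predicate, kind, target) for a predicate line, else None."""
    head = HEAD.match(line)
    if not head:
        return None
    predicate, rest = head.group(1), head.group(2).strip()
    for kind, pattern in FORMS:
        match = pattern.match(rest)
        if match:
            return predicate, kind, match.group(1).strip()
    return None

internal = parse_predicate_line("- derived_from::[[Source Document]]")
labeled = parse_predicate_line("- informed_by::[Example Site](https://example.com/page)")
bare = parse_predicate_line("- informed_by::https://example.com")
```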
Predicates are graph edges — but markdown files have two places to put metadata: the [[YAML Frontmatter|YAML frontmatter]] block and the body. The decision of what goes where follows a litmus test.
Is the value a fixed scalar, or a connection to a concept that could have its own file?
- Scalars go in YAML frontmatter: dates, summaries, word counts, slugs. These are properties of this file. They don't point to other nodes in the graph.
- Relationships go as body predicates: type declarations, domain membership, provenance chains, structural connections. These point to concepts with their own definitions — concepts that are (or could be) separate files.
| Mechanism | Question answered | Example | Graph edge? |
|---|---|---|---|
| YAML field | "What are this file's scalar properties?" | `created: 2026-03-05`, `summary: "..."` | No |
| Wikilink | "What is this about?" | `[[Elliptic Curve Cryptography]]` in prose | Yes, unlabeled |
| Body predicate | "How does this relate?" | `derived_from::[[Source Document]]` | Yes, labeled |
- `created: 2026-03-05`: a date. Dates don't have their own files. YAML.
- `conforms_to::[[Pattern Form Contract]]`: "Pattern Form Contract" has its own file with a definition and structural specification. Body predicate.
- `summary: "A brief description"`: a text property of this file, not a connection. YAML.
- `in_domain::[[Deep Context Architecture]]`: "Deep Context Architecture" has its own file. Body predicate.
- `attendee::[[Person Name]]`: a person with their own file. Body predicate (not `attendees: [name, name]` in YAML).
- `publication_year: 2008`: a scalar number. YAML.
- `cites_work_by::[[Person Name]]`: a person. Body predicate.
Tags (tags: [pattern, seed, deep-context] in YAML) create [[Flat Taxonomy|flat sets]] — a bag of items with no connections between them. Replacing tags with predicates and wikilinks converts those flat sets into graph edges:
| Tags (flat set) | Predicates (graph edges) |
|---|---|
| `tags: [type/pattern]` | `conforms_to::[[Pattern Form Contract]]` |
| `tags: [status/seed]` | `has_status::[[Seed Stage]]` |
| `tags: [deep-context]` | `in_domain::[[Deep Context Architecture]]` |
The tag #deep-context on 50 files creates a bag with no hub. The predicate in_domain::[[Deep Context Architecture]] on 50 files creates 50 edges pointing to a navigable page that can itself link outward. Tags classify; links connect.
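Converting tags to predicates is mostly a lookup. The sketch below uses the three tag-to-predicate pairs from the table above as its mapping; the `tags_to_predicates` function is hypothetical, and any real collection would need its own mapping chosen by its authors.

```python
# Mapping taken from the table above; extend it per collection.
TAG_TO_PREDICATE = {
    "type/pattern": ("conforms_to", "Pattern Form Contract"),
    "status/seed": ("has_status", "Seed Stage"),
    "deep-context": ("in_domain", "Deep Context Architecture"),
}

def tags_to_predicates(tags, mapping=TAG_TO_PREDICATE):
    """Return (predicate_lines, unmapped_tags) for a list of tag strings.
    Unmapped tags are returned for human review, not guessed at."""
    lines, unmapped = [], []
    for tag in tags:
        if tag in mapping:
            predicate, target = mapping[tag]
            lines.append(f"- {predicate}::[[{target}]]")
        else:
            unmapped.append(tag)
    return lines, unmapped

lines, unmapped = tags_to_predicates(["type/pattern", "deep-context", "misc"])
```

Returning unmapped tags instead of inventing predicates for them keeps the decision about new vocabulary with the author.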
Putting relationships in YAML frontmatter (as arrays or key-value pairs) hides them from the graph. YAML is for machines that parse structured metadata; body predicates are for agents and humans who read files and follow connections. A predicate in the body is visible content that participates in the knowledge graph. A relationship buried in YAML is invisible to anyone reading the file linearly and requires specialized parsing to extract.
The preceding sections cover the mechanics: how to write predicates and where to place them. This section addresses the harder question — how to manage the vocabulary those predicates draw from, and why that management operates differently within a system than across systems.
Predicates are freeform strings. Nothing enforces a controlled vocabulary — an author can write any predicate they want. This creates a tension: how much structure should the vocabulary have?
Pure [[Folksonomy|folksonomy]] (bottom-up): Let predicates emerge from use. Authors invent whatever predicates feel natural. Over time, patterns emerge — some predicates get used frequently, others are one-offs. Periodic review normalizes the vocabulary.
- Advantage: Low friction. Authors capture relationships naturally without consulting a vocabulary list.
- Risk: Semantic drift. `knows`, `met`, `connected_to`, and `spoke_with` may all be used for the same relationship but are stored as different predicates. Queries for one type miss instances stored under another. This compounds: each redundant predicate becomes precedent for more invention, degrading retrieval reliability.
Pure [[Ontology|ontology]] (top-down): Define a fixed vocabulary upfront. Every predicate is declared in a schema. Undeclared predicates are rejected.
- Advantage: Consistent, queryable, no drift.
- Risk: Rigid. Authors spend time consulting the vocabulary instead of writing. New relationships that don't fit the schema get forced into ill-fitting predicates or go unrecorded.
Start with a small core vocabulary (10-20 predicates organized by category). Let it grow through use. Prune periodically. This requires ongoing vocabulary curation — not just enforcement at creation time:
- Awareness: Know what predicates exist and what each one means. A documented vocabulary makes the intended distinctions visible.
- Review: Periodically audit the graph for drift: redundant predicates that mean the same thing, ambiguous predicates used inconsistently, or predicates whose meaning has shifted.
- Consolidation: When redundant predicates appear, merge them. This requires judgment about which term best captures the meaning, and a sweep to update existing uses.
- Clarification: When a predicate is used ambiguously (does `relates_to` mean "is influenced by" or "shares a topic with"?), either tighten its definition or split it into more specific predicates.
- Enforcement: Reject undeclared predicates at creation time, via linting, skill-level convention, or review. Enforcement catches new drift but does not fix existing drift.
Enforcement alone is insufficient. A system can reject undeclared predicates and still accumulate confusion if the declared vocabulary has overlapping or unclear terms. The discipline is curation; enforcement is one tool within it.
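The review and enforcement steps can share one small audit. The sketch below counts predicate usage across a collection and compares it with a declared vocabulary; `audit_vocabulary`, the `files` mapping, and the `declared` set are all hypothetical, and the audit only reports. Deciding what to merge or split remains a human judgment.

```python
import re
from collections import Counter

PREDICATE_NAME = re.compile(r"^[ \t]*-[ \t]+([a-z_]+)::", re.MULTILINE)

def audit_vocabulary(files, declared):
    """files: dict of node name -> text; declared: set of allowed predicates.
    Returns (counts, undeclared, unused):
      counts     - Counter of every predicate found
      undeclared - dict of predicate -> count for predicates not in declared
      unused     - sorted list of declared predicates that never appear"""
    counts = Counter()
    for text in files.values():
        counts.update(PREDICATE_NAME.findall(text))
    undeclared = {p: n for p, n in counts.items() if p not in declared}
    unused = sorted(set(declared) - set(counts))
    return counts, undeclared, unused

# Hypothetical collection and vocabulary.
files = {
    "a": "- derived_from::[[S]]\n- source::[[S]]\n",
    "b": "- derived_from::[[T]]\n",
}
counts, undeclared, unused = audit_vocabulary(files, {"derived_from", "relates_to"})
```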
The same reasoning that applies to single-word wikilinks applies to predicates, but the failure mode is worse:
- Collision: `type::` could mean form type, content type, media type, or classification type. In a graph with hundreds of nodes, ambiguity in predicates corrupts every query that traverses them.
- [[Precedent Poisoning|Precedent poisoning]]: Once a vague predicate exists in the graph, agents (human or automated) find it during traversal and treat it as precedent: "this graph uses `source::`, so I should too." One uncurated predicate becomes a template for hundreds.
- No disambiguation path: A single-word predicate has no internal structure to disambiguate. `derived_from::` can be tightened to `abstracted_from::` or `extracted_from::` because the compound term carries enough meaning to split. `source::` cannot; it must be replaced entirely.
Multi-word predicates with underscores (derived_from, relates_to, has_status) resist these failures because they carry enough semantic content to be self-documenting, distinguishable, and splittable.
is_a:: is the shortest honest question a graph asks: what kind of thing is this? It is also, in most declarations, a semantic shortcut that over-claims.
is_a::[[Gloss Form]] reads as identity — this node is a Gloss Form. But a node is never identical to the form it follows. The Form document is a specification: required sections, expected predicates, structural conventions. A node conforms to that specification. The Form is a contract; the node is an instance that meets the contract. Collapsing the two into "is a" loses the distinction.
conforms_to::[[Gloss Form Contract]] names the relationship honestly:
- The target is a contract (the `Contract` suffix makes the form's nature explicit: it is a specification, not a category)
- The node complies with that contract (the predicate describes conformance, not identity)
- The node can conform to multiple contracts at once: `conforms_to::[[Gloss Form Contract]]` alongside `conforms_to::[[Wilderness Glossary Entry]]` is coherent, where two `is_a::` declarations would read as contradiction
That last property matters most for plural-vocabulary collaboration. Different communities define different contracts for overlapping node kinds. A node satisfying both belongs equally to both traditions — not one true identity with foreign synonyms, but genuine multi-contract conformance.
Use conforms_to::[[X Form Contract]] in place of is_a::[[X Form]]. The predicate is longer but names what is actually happening. The target's Contract suffix signals that the form is a specification, not a class.
"For magic consists in this, the true naming of a thing." — Ursula K. Le Guin, [[A Wizard of Earthsea (Le Guin 1968)]]
Choosing conforms_to:: over is_a:: is not cosmetic. It is a claim about who names relationships in a graph and what a name does.
Naming acts in two directions. A system can name us — enrolling authors as subjects of its schema, normalizing every vocabulary into a shared ontology because the graph holds the predicate set and the authors don't. Or authors can name for themselves — declaring their own edges in their own vocabulary and interoperating as peers rather than being extracted into a single normalized form. The first is power-over. The second is peer-sovereignty in the sense the [[Self-Sovereign Identity|self-sovereign identity]]↗ tradition meant it: not isolation, but the ability to negotiate as peers rather than being enrolled as subjects.
Le Guin and Vernor Vinge's [[True Names (Vinge 1981)]] arrive at the same philosophical spine from opposite directions — Le Guin working inward toward self-knowledge, Vinge projecting outward toward networks and infrastructure. Both describe naming as the act by which power is claimed or surrendered; the names a system adopts shape every inference that flows through it. In a knowledge graph, the predicate vocabulary is the place where [[Naming Sovereignty|naming sovereignty]] is either exercised or surrendered. An edge labeled is_a:: has already decided, usually without deliberation, that node-to-form is a relation of identity. An edge labeled conforms_to:: decides instead that node-to-form is a relation of compliance. Both are sovereign choices. The first is smuggled through convention; the second is made in the open.
Every collaborative graph answers the question [[Whose Vocabulary Wins|whose vocabulary wins]]? — usually silently. A system that infers all its edges has answered it: the system's vocabulary wins, and authors' distinctions dissolve into the normalization. A system built on author-declared predicates answers it differently: both embodies:: and grounded_in:: survive side by side, and the agent's role shifts from extractor (the system names the relationships; authors are its subjects) to translator (authors name in their own vocabularies; the agent interoperates between them without collapsing either). That shift — from extractor to translator — is peer-sovereignty in agent-mediated form.
is_a:: is not unique. Any predicate that answers more than one question at once smuggles a decision past deliberation. [[Predicate Conflation|Conflation]] is the failure mode; [[Unconflation Discipline|unconflation]] is the discipline that pulls the separate questions apart so each can be answered honestly.
is_a::[[Gloss Form]] fuses three different questions into one declaration:
- Structural contract compliance ("meets the specification")
- Category membership ("belongs to the class")
- Type identity ("is fundamentally this kind of thing")
has_status::[[Growing Stage]] fuses at least five:
- Authorial maturity — how developed is the node? (Seed / Growing / Evergreen)
- Epistemic confidence — how settled is the claim? (Tentative / Confident / Canonical)
- Curation state — has it been reviewed, linked, integrated? (Uncurated / Curated / Annotated)
- Public visibility — who can see it? (Private / Shared / Published)
- Lifecycle state — is it active, superseded, pruned?
These axes are orthogonal. A node can be Evergreen (mature) but Tentative (uncertain), Canonical (settled) but Private (unshared), Growing (in progress) but Pruned (abandoned mid-flight). Forcing one has_status:: line to carry all five means the author chooses which axis wins and the rest are smuggled in by implication — or omitted entirely. Splitting into has_maturity::, has_confidence::, has_curation_state::, has_visibility::, and has_lifecycle_state:: multiplies predicates not out of verbosity but because each axis is a separate question that deserves its own answer.
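A first pass at the split can be mechanical when the status value already names its axis. The sketch below rewrites `has_status::` lines using the example values listed above, keyed on the first word of the target. The function name `unconflate_status` and the `Active` lifecycle value's capitalization are assumptions, and unknown values are left untouched for the author to decide.

```python
import re

# First word of the target -> axis predicate, using the values listed above.
AXIS_BY_VALUE = {
    "Seed": "has_maturity", "Growing": "has_maturity", "Evergreen": "has_maturity",
    "Tentative": "has_confidence", "Confident": "has_confidence",
    "Canonical": "has_confidence",
    "Uncurated": "has_curation_state", "Curated": "has_curation_state",
    "Annotated": "has_curation_state",
    "Private": "has_visibility", "Shared": "has_visibility",
    "Published": "has_visibility",
    "Active": "has_lifecycle_state", "Superseded": "has_lifecycle_state",
    "Pruned": "has_lifecycle_state",
}

STATUS_LINE = re.compile(
    r"^([ \t]*-[ \t]+)has_status::\[\[(\w+)([^\]]*)\]\]", re.MULTILINE
)

def unconflate_status(text):
    """Rewrite has_status:: lines to the axis-specific predicate when known."""
    def rewrite(match):
        axis = AXIS_BY_VALUE.get(match.group(2))
        if axis is None:
            return match.group(0)  # unknown value: leave for human judgment
        return f"{match.group(1)}{axis}::[[{match.group(2)}{match.group(3)}]]"
    return STATUS_LINE.sub(rewrite, text)

before = "- has_status::[[Growing Stage]]\n- has_status::[[Published]]\n- has_status::[[Draft]]"
after = unconflate_status(before)
```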
The same pattern repeats. in_domain:: conflates topic (intellectual subject), organizational home (which precinct owns it), and traversal scope (what an agent should pull). relates_to:: is the catch-all that fuses every semantic relationship that didn't earn a more specific predicate — addressed separately below as [[The relates_to Trap|the relates_to:: trap]]. Wherever a short predicate carries a stack of distinct relationships, it is a candidate for unconflation.
Conflation has a temporal form: [[Choosing Too Soon|choosing too soon]]. is_a:: commits to an identity before the relationship is understood well enough to name honestly. Committing early and committing wrongly are intertwined — the second is_a:: line contradicts the first, so revision means retraction. conforms_to:: stays revisable: a second conforms_to::[[Wilderness Glossary Entry]] accumulates alongside conforms_to::[[Gloss Form Contract]] without contradicting it. Predicates that can accumulate outlast predicates that assert identity.
The sovereignty argument carries straight through. Conflation is not just modelling sloppiness — it is a place where the graph decides on behalf of the author which axis matters, which relationship is primary, which commitment is terminal. Every conflated predicate is a small instance of power-over: the vocabulary imposes a resolution the author never chose. Every unconflated predicate is a small act of peer-sovereignty: the author names each axis in its own right and leaves the others free to vary. Unconflation is naming discipline; it is also political hygiene.
Every knowledge graph with named edges converges on some classification predicate — is_a::, conforms_to::, has_form::, structured_as::. The syntactic slot is universal. What goes in that slot — the predicate name itself and the target vocabulary behind it — is system-specific.
An agent entering a new collection should discover both:
- Which classification predicate the system uses: audit leading predicates with `rg -oI '^- [a-z_]+::\[\[' --type md | sort | uniq -c | sort -rn`. The `-I` flag drops the filename prefix so the same predicate is counted together across files.
- What targets that predicate ranges over: read the Form documents (or their equivalents) to learn the contracts nodes conform to.
The is_a:: / conforms_to:: distinction is itself one such difference. A system using is_a:: is not wrong; it is using the shortcut this guide chose to unfold.
This applies to all predicates, not just classification. One system's derived_from:: might be another system's informed_by::. One system might use predicates the other has never seen: explores::, raises::, grounded_in::, assisted_by::. When working across systems, the first task is [[Vocabulary Discovery|vocabulary discovery]]↗ — what predicates does this collection actually use? — not vocabulary enforcement.
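As a rough illustration of vocabulary discovery, the sketch below counts which predicates a collection uses and collects the targets each one ranges over. It assumes predicates sit on list lines shaped like `- predicate_name::[[Target]]`; the function name `discover_vocabulary` and the regular expression are illustrative choices, not part of any specification.

```python
import re
from collections import Counter, defaultdict
from pathlib import Path

# Assumed line shape: "- predicate_name::[[Target]]" or "- predicate_name::[[Target|alias]]".
PREDICATE_LINE = re.compile(r"^\s*-\s+([a-z_]+)::\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def discover_vocabulary(root):
    """Count predicate usage and collect the targets each predicate ranges over."""
    counts = Counter()
    targets = defaultdict(set)
    for path in Path(root).rglob("*.md"):
        for line in path.read_text(encoding="utf-8").splitlines():
            match = PREDICATE_LINE.match(line)
            if match:
                predicate, target = match.group(1), match.group(2).strip()
                counts[predicate] += 1
                targets[predicate].add(target)
    return counts, targets
```

Calling `counts.most_common()` on the result lists the collection's predicates by frequency, and `targets["conforms_to"]` (or whichever classification predicate the system turns out to use) shows the contracts nodes declare. Nothing here rewrites a file; the sketch only reads.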
relates_to:: is the most over-used predicate in any knowledge graph. It is the correct fallback when no more specific predicate fits, but agents (human and automated) default to it reflexively even when a more specific predicate exists.
Before writing relates_to::, ask: "Can I name the kind of relationship?" If a file was derived from a source, use derived_from::. If it extends another concept, use extends::. If it contradicts something, use contradicts::. relates_to:: should be a last resort, not a first instinct.
In a graph with hundreds of nodes, a query for "everything that relates to X" returns noise. A query for "everything derived from X" returns signal. Every relates_to:: that could have been a more specific predicate is a missed opportunity for the graph to carry meaning.
When you need to express a relationship and are unsure which predicate to use:
- Check the existing vocabulary. Search the collection for predicates already in use: `rg -oI '^- [a-z_]+::' --type md | sort -u` (`-I` drops the filename prefix so each predicate is listed once). Use what exists before inventing.
- Check if a more specific predicate fits. If the vocabulary has `relates_to::` but the relationship is actually provenance, use `derived_from::` or `extracted_from::`.
- If nothing fits, invent, but deliberately. Choose a multi-word name that reads as a sentence fragment. Document the new predicate's meaning if the collection has a vocabulary reference.
- Do not invent synonyms. If `derived_from::` exists, do not create `sourced_from::` for the same meaning. Synonym predicates are the primary mechanism of vocabulary drift.
[[Vocabulary Lifecycle Through Tending|Vocabulary maintenance is gardening work, not engineering work]]↗. The metaphor is precise: a vocabulary is a living system. Some terms thrive and spread. Others wither from disuse. Weeds (redundant or ambiguous predicates) crowd out useful terms if left untended. Pruning improves the whole garden.
Three gardening activities apply directly:
- Weeding — removing malformed, redundant, or ambiguous predicates
- Seeding — introducing specific new predicates where only broad ones exist
- Fertilizing — enriching predicates with clearer definitions and documented scope
A vocabulary that is planted but never tended becomes overgrown. A vocabulary that is over-controlled never adapts. The middle path — plant thoughtfully, tend regularly — produces a vocabulary that grows with the knowledge it describes.
Everything above addresses vocabulary management within a system — one knowledge graph, one community, one shared set of predicates. But when multiple systems meet, a different dynamic takes over, and the instinct to standardize can destroy exactly the value each system built.
Different communities develop different vocabularies because [[Naming Carries Relational Weight|naming is architectural, not decorative]]↗. When you name a concept, a role, or a relationship, the name activates a web of associations — etymology, cultural resonance, prior use, semantic neighbors. Those associations become the context through which every subsequent inference runs. Choosing "estate" over "workspace" encodes stewardship and generational continuity. Choosing "autonomia" over "autonomy" encodes formal precision in a tradition-neutral namespace. Choosing "gardener" over "content curator" encodes a living-systems relationship to knowledge. Each naming tradition carries design decisions that would be destroyed by flattening them into a [[Shared Language Community|shared vocabulary]]↗.
The pressure to converge is real. When communities collaborate, [[Mutual Intelligibility|mutual intelligibility]] pulls toward a single shared vocabulary — it is easier to work together if everyone uses the same terms. But convergence has costs that are not equally distributed:
- [[Consensus Creates Priesthoods|Consensus creates priesthoods]]. When a group agrees on a vocabulary, the people who participated in creating it have natural authority. Newcomers face a choice: learn the existing vocabulary (which they did not help shape) or go create their own community with their own language. This is why the same concepts get reinvented under new names every decade: not from ignorance of prior work, but from [[Vocabulary Alienation|alienation by prior vocabulary]].
- Standardized language becomes dogma. When a vocabulary is constrained by authority (a [[Standards Body|standards body]], a schema enforcer, a dominant project), it stops evolving with the community's understanding. Spain's [[Real Academia Española|Royal Academy of Language]] cannot keep up with street usage in Buenos Aires. A fixed ontology cannot keep up with the evolving understanding of practitioners who are doing the work.
- [[Vocabulary Flattening|Flattening]] destroys encoded meaning. If three systems use `derived_from::`, `informed_by::`, and `tiene_origen_en::` for overlapping but not identical concepts, merging them into one predicate does not unify; it erases the distinctions each system needed to make. The merged term is less precise than any of the originals.
The alternative to convergence is [[Mutual Intelligibility|mutual intelligibility]] through translation. Each system keeps its own vocabulary. [[Translation Layer|Translation layers]] — glosses, citations, typed cross-references — make each system's terms legible to the others without requiring anyone to abandon their naming. An agent working across systems learns to read multiple vocabularies, not to enforce one.
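One way to picture a translation layer is as a small gloss table that an agent consults while reading, leaving the foreign file untouched. The sketch below is only that: the `Gloss` record, its `fit` field, and the note texts are hypothetical placeholders invented for illustration, not claims about how any real system defines these predicates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gloss:
    foreign: str  # predicate as written in the other system
    local: str    # nearest predicate in the reader's own vocabulary
    fit: str      # "close" or "partial": how much meaning survives translation
    note: str     # what the foreign term carries that the local one does not


# Hypothetical glosses; the notes are placeholders, not facts about a real system.
GLOSSES = {
    "informed_by": Gloss("informed_by", "derived_from", "partial",
                         "placeholder: influence without direct derivation"),
    "tiene_origen_en": Gloss("tiene_origen_en", "derived_from", "partial",
                             "placeholder: origin as the source tradition frames it"),
}


def read_foreign_edge(predicate, target):
    """Return a reading of a foreign edge that keeps the original predicate visible."""
    gloss = GLOSSES.get(predicate)
    if gloss is None:
        # Unknown term: report it as-is rather than guessing an equivalent.
        return {"predicate": predicate, "target": target, "read_as": None, "fit": "unknown"}
    return {"predicate": predicate, "target": target,
            "read_as": gloss.local, "fit": gloss.fit, "note": gloss.note}
```

The design choice worth noticing is that the original predicate is always returned alongside the local reading, so translation never overwrites the other system's naming.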
This matters for agents because the instinct to "normalize" or "standardize" vocabulary when encountering a foreign [[Knowledge Graph|knowledge graph]] is exactly the wrong move. The first task when entering a new system is [[Vocabulary Discovery|vocabulary discovery]] — what predicates does this collection use? what do they mean here? — not vocabulary enforcement. Different classification predicates, different predicate inventories, different naming conventions are not inconsistencies to fix. They are design decisions to respect.
The deeper principle: naming is a form of [[Naming Sovereignty|sovereignty]]. Who gets to name a thing determines how everyone else thinks about it. A vocabulary that accommodates diversity — that translates rather than converges — preserves the [[Self-Sovereign Identity|autonomy]]↗ of each community to name its own concepts according to its own values and traditions.
Vocabulary diversity is about which predicates a system uses. But naming conventions shape something more basic — how those predicates and their targets are spelled. Two systems can share a concept and still fail to interoperate because one writes [[Self-Sovereign Identity]] and the other writes [[self-sovereign-identity]].
Different knowledge systems use different naming conventions for wikilink targets. An agent working across systems should be aware of common patterns:
Descriptive titles with spaces (most common in personal knowledge systems):

```
[[Self-Sovereign Identity]]
[[Principal-Agent Relationship in Augmented Knowledge Work]]
[[Informal Edges Poison the Graph]]
```

Slug-style with hyphens (common in systems influenced by web URLs or non-English languages):

```
[[convergencia-independiente-como-validacion]]
[[protecting-my-perspective]]
```

Descriptive titles with spaces are generally preferred — they read as natural language in prose and in graph visualizations. Slug-style names with hyphens trade readability for URL compatibility. When building a knowledge graph primarily for human and agent consumption (not web routing), prefer spaces.
Title Case vs. sentence case varies by system. What matters for graph integrity is consistency within a system — if the same concept is named two different ways, the graph fractures into disconnected subgraphs.
When entering a new collection, discover the local naming convention by reading existing files before creating new ones. Do not impose a convention from a different system.
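Discovering the local convention can be approximated by surveying the wikilink targets already present. The sketch below is a minimal heuristic under stated assumptions: a target containing a space counts as a descriptive title, a target with hyphens and no spaces counts as a slug, and anything else is a single bare word. The names `classify_name` and `survey_naming` are illustrative.

```python
import re
from collections import Counter

# Matches [[Target]] and [[Target|alias]], capturing only the target.
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def classify_name(name):
    """Label a wikilink target as 'spaced', 'slug', or 'single' (one bare word)."""
    if " " in name:
        return "spaced"
    if "-" in name:
        return "slug"
    return "single"


def survey_naming(text):
    """Count naming styles among the wikilink targets found in a block of text."""
    return Counter(classify_name(m.group(1).strip()) for m in WIKILINK.finditer(text))
```

Run over the concatenated text of a collection, the dominant style in the resulting counter is a reasonable first guess at the convention to follow when creating new nodes. The heuristic does not look at capitalization, so Title Case versus sentence case still needs a manual read.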
Predicates attach to notes — markdown files in the collection. But not every markdown file participates in the graph equally, and not every node is a single file.
Any markdown file in the collection is a note. It becomes a [[Context Node|context node]] when it participates in the typed graph through predicate edges — when it carries conforms_to::, relates_to::, or other predicates that connect it to the rest of the graph. A note with only YAML frontmatter and no predicates exists in the collection but has no edges for an agent to follow. Adding classification predicates is what makes a file visible to graph traversal.
A [[Form Contract|form contract]] is a specification — required sections, expected predicates, and structural conventions that nodes following it must satisfy. Pattern Form Contract, Decision Form Contract, Citation Form Contract, Model Form Contract, Inquiry Form Contract: each names a contract, not a category. There is one Pattern Form Contract but potentially many pattern nodes, each of which conforms to it.
A node declares which contract it meets with a classification predicate. Having some classification predicate is universal syntax (every graph has one); which predicate a system uses is a vocabulary choice. This guide recommends conforms_to::[[X Form Contract]] because it names what is actually happening (compliance with a specification) rather than asserting identity. The full argument for that shift, and why conflation and choosing-too-soon are the deeper failure modes, is developed in the Folksonomy vs Ontology section above.
When entering a new system, reading the form-contract definitions tells you what to expect from every node that satisfies them.
Not every concept fits in a single file. A [[Compound Node|compound node]] uses a folder to contain multiple files that together constitute one concept:

```
Some Concept/
├── Some Concept.md          ← lead file
├── Analysis.md              ← sibling file
├── Renditions/
│   └── source-article.md    ← format-transformed copy of external source
└── Archives/
    └── original-slides.pdf  ← preserved binary original
```

The [[Lead File|lead file]] shares the folder's name and serves as the primary access point — the file that gives an agent or human the most useful context first. When a wikilink targets [[Some Concept]], the lead file is where traversal arrives. Sibling files within the compound folder carry related analysis, extracted insights, or supporting material that would make the lead file unwieldy.
This means a wikilink target can resolve to a folder, not just a single file. An agent encountering a compound node should read the lead file first, then decide whether sibling files are worth the reading budget.
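The resolution rule can be sketched in a few lines. This assumes a flat collection in which a node named `X` is either `X.md` directly under the root or a folder `X/` whose lead file is `X/X.md`; real collections may nest nodes more deeply, and the helper names here are hypothetical.

```python
from pathlib import Path


def resolve_target(root, name):
    """Return the file a wikilink target resolves to, or None for a ghost link."""
    root = Path(root)
    atomic = root / f"{name}.md"        # single-file node
    lead = root / name / f"{name}.md"   # compound node: folder plus lead file
    if atomic.is_file():
        return atomic
    if lead.is_file():
        return lead
    return None


def sibling_files(lead):
    """List the other markdown files beside a compound node's lead file.

    Subfolders such as Renditions/ are deliberately not descended into.
    """
    return sorted(p for p in lead.parent.glob("*.md") if p != lead)
```

An agent would call `resolve_target` first, read the file it returns, and only then look at `sibling_files` to decide whether the extra material is worth the reading budget. `sibling_files` is meaningful only for a lead file inside a compound folder.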
External sources and binary files cannot carry predicates directly. Three conventions bring them into the graph:
[[Rendition|Renditions]] are format-transformed markdown copies of external sources — searchable, readable, carrying their own predicates. A PDF article becomes a markdown rendition in Renditions/. Each rendition carries derived_from:: pointing to the canonical source. This is the mechanism by which external knowledge enters the typed graph.
[[Archive|Archives]] preserve binary originals — PDFs, slide decks, recordings — within Archives/ in a compound node folder. They exist for reference but are not themselves graph nodes (they carry no predicates).
[[Sidecar File|Sidecars]] (.sidecar.md) serve as metadata envelopes for binaries that cannot carry their own frontmatter. A sidecar links to its binary via artifact:: and to the canonical source via derived_from::. Create sidecars only when a binary is actively cited or needs to be discoverable by agent traversal.
Some predicates are temporary scaffolding: they describe a relationship that is true now but should be upgraded as the graph matures. The primary example is extracted_from::. When a node is first pulled out of a source document, the extraction provenance matters. But once the source is archived or the extracted node develops into a living concept, the predicate should be upgraded to implements::, extends::, or embodies::, whichever specific predicate names the mature relationship. Do not upgrade to relates_to::; that reverts the edge from provenance to the catch-all (see the relates_to:: trap above).
Any predicate that describes a construction act rather than a semantic relationship is a candidate for upgrade. [[Temporary Predicate Scaffolding|Recognizing construction predicates]]↗ prevents the graph from accumulating stale provenance edges that describe how nodes were built rather than how they relate.
The preceding sections cover how a graph is written — its vocabulary, its node anatomy, its naming conventions. But an agent's most common task is reading an existing graph — entering a collection of files it didn't create and building understanding from the edges it finds.
When an agent encounters a new file, the predicates tell it what kind of file this is and how it connects before reading the body content. This enables progressive disclosure:
- Read the classification predicates (`conforms_to::`, `has_status::`, `in_domain::`). These tell you the node's contract, maturity, and knowledge area. You now know whether this is a pattern, a citation, a decision, or something else entirely, without reading a single paragraph.
- Scan the semantic predicates. `derived_from::`, `extends::`, and `contradicts::` tell you the file's relationships. Combined with any annotations, you can decide which edges are worth following.
- Read the body, but only after predicates have given you orientation. For large files, the predicates may tell you that following a different edge is more productive than reading this file in full.
This is the opposite of how prose is normally read (top to bottom, all of it). [[Predicate First Reading|Predicate-first reading]] lets an agent navigate the graph efficiently, spending reading budget on the files that matter most.
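A minimal sketch of predicate-first reading, assuming predicates appear on list lines shaped like `- predicate::[[Target]]` and that the classification set is the three predicates named above (other systems will use different ones). The function name `orient` is an illustrative choice.

```python
import re

# Assumed line shape: "- predicate_name::[[Target]]" or "- predicate_name::[[Target|alias]]".
PREDICATE_LINE = re.compile(r"^\s*-\s+([a-z_]+)::\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")

# Classification predicates named in this guide; treat as a per-system setting.
CLASSIFICATION = {"conforms_to", "has_status", "in_domain"}


def orient(text):
    """Split a file's predicate edges into classification and semantic groups."""
    classification, semantic = [], []
    for line in text.splitlines():
        match = PREDICATE_LINE.match(line)
        if not match:
            continue  # prose, frontmatter, and annotations are skipped at this stage
        edge = (match.group(1), match.group(2).strip())
        if edge[0] in CLASSIFICATION:
            classification.append(edge)
        else:
            semantic.append(edge)
    return {"classification": classification, "semantic": semantic}
```

The returned dictionary is enough for an agent to decide whether to read the body or follow an edge instead; the body text itself is never interpreted here.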
An agent researching a topic traverses the graph by:
- Starting at a known node (e.g., a domain page or a node the user pointed to)
- Reading its predicates to discover connections
- Following the most relevant edges (guided by predicate names and annotations)
- Reading those target files' predicates to discover the next layer
- Continuing until context is sufficient or budget is exhausted
Each hop costs reading budget. Annotations on predicates help the agent decide which hops are worth taking — "this connection explains the mechanism" is more useful than "this connection exists."
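The traversal loop above can be sketched as a breadth-first walk with a budget. This assumes the graph has already been loaded into a dictionary mapping each node name to its `(predicate, target)` pairs, and that reading any node costs one unit; both are simplifications, and `traverse` is a hypothetical helper.

```python
from collections import deque


def traverse(graph, start, budget, follow=None):
    """Breadth-first walk over typed edges, stopping when the reading budget runs out.

    graph maps a node name to a list of (predicate, target) pairs.
    follow, if given, is the set of predicate names considered worth a hop.
    """
    visited, order = {start}, []
    queue = deque([start])
    while queue and budget > 0:
        node = queue.popleft()
        order.append(node)  # "reading" the node spends one unit of budget
        budget -= 1
        for predicate, target in graph.get(node, []):
            if follow is not None and predicate not in follow:
                continue  # skip edges whose predicate is not worth the hop
            if target not in visited:
                visited.add(target)
                queue.append(target)
    return order
```

Passing a `follow` set such as `{"derived_from", "extends"}` is one way to let predicate names guide which hops are taken. Annotation-based ranking would need a richer edge record than this sketch carries.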
To find everything that points to a specific node (not just what that node points to), search for the target across all files:

```bash
# Find all files that link to "Target Node" in any way,
# including aliased links such as [[Target Node|alias]]
rg '\[\[Target Node(\|[^\]]*)?\]\]' --type md

# Find all files that link to "Target Node" via a specific predicate
rg 'derived_from::\[\[Target Node(\|[^\]]*)?\]\]' --type md
```

Incoming edges are often more informative than outgoing edges. A node's outgoing predicates tell you where it came from and what it connects to. Its incoming predicates (discovered by search) tell you what depends on it, what it validates, and how central it is to the graph.
Reading the graph is inference; maintaining it is upkeep. The recipes below cover the mechanical side — the rg commands, rename scripts, and audit patterns that keep a typed graph healthy over time.
Named edges live in plain text, so maintenance uses standard text-processing tools. No graph database or specialized software required.
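Find all files using a specific predicate:

```bash
rg 'derived_from::' --type md
```

Find all files linking to a specific target:

```bash
rg '\[\[Target Node(\|[^\]]*)?\]\]' --type md
```

Rename a predicate across all files:

```bash
# --null and xargs -0 keep filenames that contain spaces intact.
# The empty '' after -i is BSD/macOS sed syntax; with GNU sed, use `sed -i` alone.
rg -l --null 'old_predicate::' --type md | xargs -0 sed -i '' 's/old_predicate::/new_predicate::/g'
```

Audit predicate vocabulary (frequency count):

```bash
# -I drops the filename prefix so each predicate is counted across all files
rg -oI '^- [a-z_]+::' --type md | sort | uniq -c | sort -rn
```

Discover ghost links (referenced but nonexistent files): extract all `[[targets]]` from predicate lines and compare them against existing filenames.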
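The ghost-link recipe is described in words only, so here is one possible sketch of it. It assumes a target resolves by filename stem anywhere in the collection (`Target.md`, including a compound node's lead file) and only inspects predicate lines, not inline wikilinks. `ghost_links` is a hypothetical helper name.

```python
import re
from pathlib import Path

# Captures the target of "- predicate::[[Target]]" or "- predicate::[[Target|alias]]".
PREDICATE_TARGET = re.compile(r"^\s*-\s+[a-z_]+::\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def ghost_links(root):
    """Return predicate targets that have no matching markdown file in the collection."""
    files = list(Path(root).rglob("*.md"))
    known = {p.stem for p in files}  # "Target.md" -> "Target"
    targets = set()
    for path in files:
        for line in path.read_text(encoding="utf-8").splitlines():
            match = PREDICATE_TARGET.match(line)
            if match:
                targets.add(match.group(1).strip())
    return sorted(targets - known)
```

Because a ghost link is a planning signal rather than an error, the output is better read as a list of places the graph wants to grow than as a list of things to delete.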
Audience-vocabulary check (when writing for a different system): If a file will be read outside its home system, scan for system-specific vocabulary in examples and predicate targets. Terms that are meaningful inside one knowledge graph (e.g., a specific domain name, a local form type) become opaque outside it. Replace with terms the target audience recognizes, or define them inline.
When adding wikilinks and predicates to a file that doesn't have them yet, follow these steps:
Step 1: Top — YAML frontmatter and classification predicates. Add YAML frontmatter for scalar properties (created, author, summary). Below the frontmatter closing ---, add classification predicates: conforms_to:: (what contract does this node meet?), has_status:: (how mature?), in_domain:: (what knowledge area?), and any structural predicates like principal:: or belongs_to::.
Step 2: Body — inline wikilinks. Read through the prose and identify concepts that are (or should be) first-class nodes — concepts important enough to have their own file with their own definition. Wikilink those. Use pipe aliases where the display text should differ from the canonical name. Leave generic terms, adjectives, and passing references as plain text — not every noun deserves a wikilink.
Step 3: Bottom — Relations section. Add a ## Relations section at the end. List the file's key relationships as annotated predicates. Choose specific predicates (implements::, embodies::, grounded_in::) over generic relates_to:: when the relationship type is clear. Annotate each predicate with a sentence explaining why the relationship matters.
Step 4: Cross-check — inline gaps. After adding Relations predicates, scan the body for topical gaps. If a concept appears in Relations but is never mentioned inline where it's topically relevant (e.g., in a project list, a technical context section, or ongoing commitments), add it there too. Relations declare structural connections; inline wikilinks place concepts in narrative context. Both are needed.
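The Step 4 cross-check lends itself to a small script. The sketch below assumes the Relations section starts at the first literal `## Relations` heading and treats everything before it as body, which means classification predicates at the top of the file also count as body mentions. The function name is illustrative.

```python
import re

# Matches [[Target]] and [[Target|alias]], capturing only the target.
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def relations_missing_inline(text, heading="## Relations"):
    """List targets named under the Relations heading that never appear before it."""
    body, found, relations = text.partition(heading)
    if not found:
        return []  # no Relations section, so there is nothing to cross-check
    in_body = {m.group(1).strip() for m in WIKILINK.finditer(body)}
    in_relations = {m.group(1).strip() for m in WIKILINK.finditer(relations)}
    return sorted(in_relations - in_body)
```

Each name in the result is a candidate for an inline mention, not a requirement: whether the concept is topically relevant somewhere in the prose remains a judgment call.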
Self-demonstration check. If the file you're writing describes conventions, it should use those conventions. A reference guide about predicates should have its own YAML frontmatter, classification predicates, and Relations section. A style guide should follow its own style. This is easy to miss because the author is focused on explaining, not applying.
| Concept | Syntax | Purpose |
|---|---|---|
| Wikilink | `[[Multi-Word Name]]` | Unlabeled connection between files |
| Named edge | `predicate_name::[[Target Node]]` | Labeled, directional relationship |
| Ghost link | `[[Nonexistent File]]` | Planning signal: the graph wants to grow here |
| Classification predicate | `conforms_to::`, `has_status::`, `in_domain::`, `in_precinct::` | What contract does this node satisfy? |
| Semantic predicate | `derived_from::`, `extends::`, `contradicts::` | How does this relate? |
| Generative predicate | `proposes::`, `resolved_by::`, `generates::` | What does this produce or require? |
| Construction predicate | `extracted_from::` (temporary) | Scaffolding that should be upgraded as the graph matures |
| Context node | Note with predicates | A file that participates in the typed graph |
| Compound node | Folder with lead file | Multiple files constituting a single concept |
| Rendition | Markdown in `Renditions/` | External source transformed into a graph-participatory file |

The core principles:
- Avoid single-word wikilinks: they are too broad to be useful
- Avoid single-word predicates: they collide, poison precedent, and resist disambiguation
- Predicates go in file bodies, not YAML frontmatter, because they participate in the graph
- YAML frontmatter is for scalars (dates, summaries); body predicates are for relationships (connections to concepts with their own files)
- Annotate predicates when the relationship needs context, since annotations enable progressive disclosure
- Every graph has some classification predicate (`conforms_to::`, `is_a::`, `has_form::`); the syntactic slot is universal but the predicate name and target vocabulary are system-specific, so discover both before writing
- Prefer `conforms_to::[[X Form Contract]]` over `is_a::[[X Form]]`: a node conforms to a specification, it is not identical to one; conformance stays revisable and pluralizable where identity forces premature commitment
- Unconflate predicates that answer more than one question: `has_status::` fuses maturity, confidence, curation, visibility, and lifecycle; split each into its own predicate so every axis can be named honestly
- Prefer specific predicates over `relates_to::`, because specificity is what makes the graph queryable
- Vocabulary requires curation, not just enforcement: tend the garden
- [[Vocabulary Collision Navigation|Vocabulary diversity across systems is a feature, not a problem]]↗: translate, don't converge
- Read predicates first, body second: classification and relationship predicates orient an agent before the prose content does
- The graph lives in plain text: a text editor and `rg` are sufficient tools

- informed_by::[[Typed Relations as Simple Graphs in Plain Markdown]]↗
  - The foundational specification for the `predicate::[[target]]` convention this guide explains.
- informed_by::[[Classification via Predicates Not Tags]]↗
  - The decision that established the YAML-vs-predicate litmus test.
- informed_by::[[Informal Edges Poison the Graph]]↗
  - The anti-pattern that motivates the vocabulary curation guidance.
- informed_by::[[Deep Context Graph Vocabulary]]↗
  - The semantic predicate catalog this guide draws its predicate inventory from: all five categories with full definitions.
- informed_by::[[Vocabulary Collision Navigation]]↗
  - The pattern that explains how multiple naming traditions navigate shared conceptual territory without converging; the source for the cross-system vocabulary diversity section.
- informed_by::[[Naming Carries Relational Weight]]↗
  - The conviction that naming is architectural, not decorative: names encode design decisions and activate semantic fields that shape downstream inference.
- informed_by::[[Renditions and Archives Replace Sources]]↗
  - The decision that established renditions and archives as how external sources enter the graph.
- informed_by::[[Sidecar Files as Metadata Envelopes]]↗
  - The decision that binary files get metadata through sidecar markdown files carrying predicates on behalf of binaries that cannot carry their own.
- engages_with::[[Predicate Vocabulary Stabilization]]↗
  - The open question of when to formalize a predicate vocabulary, directly relevant to the folksonomy/ontology discussion here.
- draws_on::[[Vocabulary Lifecycle Through Tending]]↗
  - The gardening metaphor section draws on this model's seed/weed/fertilize framework.
- grounded_in::[[Peters (2008) Tag Gardening for Folksonomy Enrichment]]
  - Academic grounding for the gardening metaphor applied to vocabulary maintenance.
- contends_with::[[Shared Language Community]]↗
  - Shared language is tribal: it helps groups form and take shortcuts, but also creates barriers for outsiders who didn't participate in its creation.
- contends_with::[[Progressive Canonization]]
  - The proposal that sovereign nodes should gradually converge on shared language, and the counter-argument that convergence creates priesthoods and alienates newcomers.
- informed_by::[[Deep Context Shared Languages]]
  - The foundational observation that shared language enables deep context within communities: hashtags, allegories, and shorthand that compress meaning for insiders.
- informed_by::[[Temporary Predicate Scaffolding]]↗
  - The pattern of construction predicates that should be upgraded as the graph matures: provenance edges describing how nodes were built rather than how they relate.
- composes_with::[[Lead File Selection Guidance]]↗
  - How to identify the primary access point of a compound node: the file an agent should read first.
- situated_in::[[Deep Context as an Architecture for Captured Reasoning]]↗
  - The architectural vision within which this guide's conventions operate: typed edges as captured reasoning, not just organization.