Added YAML frontmatter to all 577 scraped markdown files in ~/content/links/ so QMD can provide meaningful search results with source URLs and summaries.
helpers.py - Two new functions:
strip_frontmatter(text) - strips YAML frontmatter from markdown, returns body
ensure_frontmatter(content_path, url, summary) - reads file, strips existing frontmatter, rewrites with current metadata
run_sync_links.py - After scraping and summarization (before git commits), calls ensure_frontmatter on all processed content files. New content gets frontmatter automatically.
run_show_link.py - Strips embedded frontmatter before display to avoid duplication (show-link generates its own frontmatter output).
run_backfill_frontmatter.py + cli.py - New backfill-frontmatter command that adds frontmatter to all existing files. One-time use, idempotent.
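The two helpers.py functions described above could look roughly like the following. This is a minimal sketch, not the actual implementation: the regex, the boolean return value of `ensure_frontmatter`, and the use of `json.dumps` to quote the summary are all assumptions made for illustration.

```python
import json
import re
from pathlib import Path

# Assumed pattern: a leading '---' line, any content, then a closing '---' line.
_FRONTMATTER_RE = re.compile(r"\A---\r?\n.*?\r?\n---[ \t]*(?:\r?\n|\Z)", re.DOTALL)


def strip_frontmatter(text: str) -> str:
    """Return the markdown body with a leading frontmatter block removed."""
    return _FRONTMATTER_RE.sub("", text, count=1).lstrip("\n")


def ensure_frontmatter(content_path, url: str, summary: str) -> bool:
    """Rewrite the file with current metadata; return True if it changed.

    The bool return is an assumption, added so callers can count updates.
    """
    path = Path(content_path)
    original = path.read_text(encoding="utf-8")
    body = strip_frontmatter(original)
    # json.dumps emits a double-quoted string, which YAML also accepts
    # for ordinary text, so quotes and colons in the summary stay safe.
    quoted = json.dumps(summary, ensure_ascii=False)
    updated = f"---\nurl: {url}\nsummary: {quoted}\n---\n\n{body}"
    if updated == original:
        return False  # already current, nothing to write
    path.write_text(updated, encoding="utf-8")
    return True
```

Because existing frontmatter is stripped before the new block is written, running this twice on the same file leaves it unchanged, which is what makes a backfill safe to repeat. One caveat of this sketch: a file whose body begins with a `---` horizontal rule and contains a later `---` line would be mistaken for frontmatter.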
updated: 577
skipped: 0
total: 577
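Counts like these could be produced by a loop of the following shape. This is a hedged sketch only: the `backfill` function, the `(content_path, url, summary)` entry shape, and the reading of "skipped" as "file already up to date" are assumptions, not details taken from run_backfill_frontmatter.py.

```python
from collections import Counter
from typing import Callable, Iterable, Tuple

# Hypothetical entry shape: (content_path, url, summary)
Entry = Tuple[str, str, str]


def backfill(entries: Iterable[Entry],
             ensure: Callable[[str, str, str], bool]) -> Counter:
    """Call `ensure` on every entry and tally the outcomes.

    `ensure` is expected to return True when it rewrote the file.
    """
    counts = Counter(updated=0, skipped=0, total=0)
    for content_path, url, summary in entries:
        changed = ensure(content_path, url, summary)
        counts["updated" if changed else "skipped"] += 1
        counts["total"] += 1
    return counts
```

Under these assumptions, a second run over unchanged files would report every file as skipped, which is one way to check that the command is idempotent.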
---
url: https://example.com/article
summary: "One-sentence summary from links.yaml"
---

| Aspect | Status |
|---|---|
| qmd.yaml exists | Yes (name: content-links, multi: false) |
| Frontmatter on files | Done (577/577) |
| HTML exclusion | Already handled by .stignore whitelist |
| Syncthing scope | Only *.md, qmd.yaml, links.yaml sync |
| QMD context annotations | Not yet set up (run on artbird after sync) |
After Syncthing delivers the updated files to artbird:
# On artbird - register context annotation
qmd context add qmd://content-links "Personal link collection - saved web articles, tweets, GitHub repos, tools, and resources. Each document is the full scraped content of a saved URL with frontmatter containing the source URL and summary."

Then search works:
knowctl semantic-search content-links "RAG pipeline"

cwd: /Users/mike/code/arthack
session-id: f17ba06c-e387-4237-b24a-146ba8082ed0
path: /Users/mike/docs/links-qmd-indexing-2026-03-22.md

cd /Users/mike/code/arthack && claude --resume f17ba06c-e387-4237-b24a-146ba8082ed0