goranefbl · April 18, 2026 13:57
diff --git a/gistfile1.txt b/gistfile1.txt
 # AI Knowledge Assistant – Product + Documents RAG Platform

 ## Purpose

 Build a multi-tenant AI knowledge assistant that can:

 - answer questions using organization-specific content
 - use both structured product content and uploaded PDFs
 - ingest website content from WordPress via REST API
 - keep content fresh via sync jobs and webhooks
 - provide full traceability for every answer
 - allow admins to manage what the AI knows
 - support an internal admin testing experience
 - support embedding on external websites such as WordPress

 This is **not** just a chatbot UI. It is an **admin-controlled AI knowledge system** with ingestion, synchronization, retrieval, traceability, and embed capabilities.

 ---

 ## Phase 1 Scope

 Phase 1 must include:

 1. Authentication using Better Auth
 2. Admin panel
 3. Organization-scoped knowledge management
 4. PDF upload and ingestion
 5. WordPress content ingestion via REST API
 6. WooCommerce product ingestion
 7. Webhook-driven freshness where possible
 8. Background workers for ingestion/re-indexing
 9. Chat testing inside admin
 10. Public/full-page embed endpoint for website use
 11. Traceable answers that show exactly which sources were used

 Phase 1 does **not** need:

 - complex analytics dashboards
 - billing/subscriptions
 - advanced role hierarchy beyond admin/member if unnecessary
 - fine-tuning custom models
 - direct live querying of WordPress or WooCommerce during user chat
 - long-term conversation memory or summarization (a short sliding window of recent turns IS included — see Extensions §8)
 - direct editing of chunk-level embeddings
 - multi-language support (deferred; admin and end users operate in a single language)

 ---

 ## Core Product Concept

 The system should be modeled around **knowledge**, not around a specific industry.

 Do **not** design the product around WooCommerce products only.

 The correct abstraction is:

 ```text
 Sources → Documents → Chunks → Retrieval → LLM Answer
 ```

 This lets the same product work for:

 - a WooCommerce store with product descriptions
 - an events business with event descriptions
 - a plugin company with technical docs and PDFs
 - a blog-heavy website with article search and guidance

 Admins should feel they are managing **what the AI knows**, not embeddings or vector internals.

 ---

 ## High-Level Architecture

 ```text
 Next.js App
 ├── Admin UI
 ├── Public Chat Page
 ├── Embedded Chat Page
 ├── API Routes
 └── Better Auth

 Background Worker
 ├── PDF ingestion jobs
 ├── WordPress sync jobs
 ├── WooCommerce sync jobs
 ├── Re-index jobs
 └── webhook follow-up jobs

 PostgreSQL
 ├── auth tables
 ├── sources
 ├── documents
 ├── chunks
 ├── products or knowledge records
 ├── chat logs / retrieval logs
 └── settings

 Redis + BullMQ
 ├── ingestion queue
 ├── sync queue
 ├── webhook queue
 └── reindex queue

 LLM Provider
 └── OpenAI
 ```

 ---

 ## Retrieval Model (RAG)

 This product uses **Retrieval-Augmented Generation (RAG)**.

 ### Correct RAG flow

 1. User asks a question
 2. The question is converted to an embedding vector
 3. The system searches stored chunk embeddings in Postgres using pgvector
 4. The system returns the most relevant chunks
 5. Those chunks, along with relevant metadata, are sent to the LLM
 6. The LLM generates an answer using only that context
 7. The API returns:
   - the answer
   - the list of sources/chunks used

 ### Important rule

 Do **not** fetch live WordPress pages or live WooCommerce products during chat requests.

 All external content must be ingested ahead of time into our database.

 That means:

 - WordPress REST API is a **content source**
 - WooCommerce REST API or webhooks are **content sources**
 - PDFs are **content sources**

 But the chat system always queries **our own database**, not the remote systems directly.

 ---

 ## Vector Search – Detailed Explanation

 Claude should understand exactly how vector search is expected to work.

 ### Goal

 Vector search finds text by **semantic meaning**, not just exact keywords.

 Example:

 - User asks: “Which plugin helps me reward customers for inviting friends?”
 - The system may retrieve chunks mentioning:
  - referral programs
  - invite friends
  - customer rewards
  - affiliate-like cashback for referrals

 Even if the exact wording does not match.

 ### Embeddings

 Every chunk of text is converted into a vector using an embedding model.

 Example:

 ```text
 "Referral rewards are given after the referred order is completed"
 → [0.123, -0.918, 0.442, ...]
 ```

 The same is done for the user question.

 ### Storage

 Each chunk record stores:

 - content text
 - embedding vector
 - document ID
 - metadata

 ### Query flow

 When the user asks a question:

 1. Generate embedding for the question
 2. Compare that vector against stored chunk vectors in Postgres
 3. Return the nearest chunks using pgvector similarity search
 4. Send only the top relevant chunks to the LLM

 ### Why chunking matters

 Do **not** embed whole PDFs or whole pages as one large unit.

 Instead:

 - split content into chunks of around 500–800 tokens
 - include small overlap between chunks

 This improves:

 - retrieval precision
 - traceability
 - answer quality

 ### What vector search returns

 Vector search returns the top relevant chunks, typically 3–8 depending on tuning.

 Each returned result should include:

 - chunk content
 - score/distance
 - document title
 - document ID
 - source type
 - source URL if available
 - external ID if available

 ### Important distinction

 - Vector search = retrieval
 - LLM = answer generation

 The vector database does **not** answer questions by itself.
 It only returns the most relevant context.

 ### Quality rules

 - only search active documents
 - optionally filter by organization
 - optionally filter by source type
 - optionally boost product-type documents if the mode is product recommendation
 - return metadata with every retrieval result for debugging and traceability

 ---

 ## Why This Is a Knowledge Platform, Not a Hardcoded Product Chatbot

 The data model and admin UI must be generic enough so that one organization can use it for:

 - WooCommerce products
 - WordPress posts/pages
 - uploaded PDFs
 - pasted manual knowledge

 Another organization may use it for:

 - event descriptions
 - help center content
 - manuals
 - training documents

 Therefore, the frontend admin should not primarily display “products” as the central abstraction.

 The central abstraction should be:

 - Sources
 - Documents
 - Test Chat

 A WooCommerce product is just one type of document.

 A WordPress page is just one type of document.

 A PDF is just one type of document.

 ---

 ## Tech Stack

 ### Required

 - Next.js (App Router)
 - Node.js
 - PostgreSQL
 - pgvector
 - Better Auth
 - Redis
 - BullMQ
 - OpenAI API

 ### Optional utilities

 - pdf-parse or equivalent for PDFs
 - html-to-text or equivalent for stripping HTML
 - zod for validation
 - drizzle or direct SQL (avoid heavy abstraction if possible)

 ### Recommendation

 Do not over-abstract the core retrieval logic behind a large framework if it reduces debuggability.

 Keep these steps explicit:

 - embed()
 - searchChunks()
 - buildPrompt()
 - callLLM()

 ---

 ## Multi-Tenancy

 The platform should be organization-scoped from the start.

 Every important record must belong to an organization:

 - sources
 - documents
 - chunks
 - settings
 - chat tests / logs
 - webhook configs
 - sync runs

 Even if Phase 1 only has a few organizations, this is the correct foundation.

 ---

 ## Authentication

 Use **Better Auth** with email/password and database-backed sessions.

 ### Auth requirements

 - login page
 - protected admin routes
 - organization-aware user access
 - role field available for future use

 ### User creation

 For now, users may be inserted manually into the database or seeded via a script.

 No public registration flow is required unless explicitly added later.

 ---

 ## Core Data Model

 The most important design decision is to introduce a top-level **Source** abstraction.

 ### Sources

 A source represents where knowledge comes from.

 Examples:

 - WooCommerce source
 - WordPress source
 - PDF source collection
 - Manual content source

 Suggested fields:

 - id
 - organization_id
 - name
 - type (`woocommerce`, `wordpress`, `pdf`, `manual`)
 - status (`active`, `disabled`)
 - config JSONB
 - last_sync_at
 - created_at
 - updated_at

 Examples of `config`:

 - WordPress base URL, auth, selected post types
 - WooCommerce store URL and API credentials
 - PDF settings if needed
 - manual source metadata

 ### Documents

 Documents are the normalized content records that the AI actually knows about.

 Every ingested thing becomes a document.

 Examples:

 - a WooCommerce product description
 - a WordPress page
 - a blog article
 - a PDF file
 - a manually created knowledge entry

 Suggested fields:

 - id
 - organization_id
 - source_id
 - title
 - type (`woo_product`, `wp_post`, `wp_page`, `pdf`, `manual`, `event`, etc.)
 - status (`active`, `disabled`, `draft`, `deleted`)
 - source_url
 - external_id
 - raw_content
 - normalized_content
 - override_content (nullable; for future editable overrides)
 - sync_hash
 - last_synced_at
 - metadata JSONB
 - created_at
 - updated_at

 Notes:

 - `external_id` stores remote IDs like Woo product ID or WP post ID
 - `source_url` stores the public URL if available
 - `raw_content` may contain HTML or extracted text
 - `normalized_content` is the cleaned text used for chunking
 - `override_content` is optional for later, allowing admin-written replacements
 - `sync_hash` helps detect content changes

 ### Chunks

 Chunks are the searchable retrieval units.

 Suggested fields:

 - id
 - organization_id
 - document_id
 - content
 - embedding vector
 - chunk_index
 - token_count
 - metadata JSONB
 - created_at

 Chunk metadata may include:

 - title
 - source_type
 - source_url
 - external_id
 - document_type
 - product SKU if applicable
 - product/category tags if applicable

 ### Settings

 Organization-level assistant settings.

 Suggested fields:

 - id
 - organization_id
 - system_prompt
 - mode (`recommendation`, `support`, `search`)
 - retrieval_limit
 - response_style
 - created_at
 - updated_at

 ### Retrieval Logs / Chat Logs

 These are important for debugging and trust.

 Suggested fields:

 - id
 - organization_id
 - user_id nullable
 - session_id nullable
 - query
 - final_answer
 - used_document_ids JSONB
 - used_chunk_ids JSONB
 - retrieval_debug JSONB
 - mode
 - created_at

 These logs let us inspect why the assistant answered the way it did.

 ---

 ## Product-Specific Knowledge vs Generic Knowledge

 Do not create a completely separate system for products.

 Instead, products should be represented as documents with structured metadata.

 For example, a WooCommerce product document can include metadata like:

 - product_id
 - sku
 - price
 - categories
 - tags
 - permalink
 - stock status if needed
 - short description
 - full description

 This lets retrieval work across:

 - product descriptions
 - PDFs
 - manuals
 - blog posts

 In future, ranking can prefer products when the question appears commercial.

 ---

 ## Ingestion Sources

 Phase 1 needs these source types:

 ### 1. PDF Upload

 Admins can upload PDFs in the admin panel.

 Flow:

 1. upload PDF
 2. save file (local or object storage)
 3. create document record
 4. extract text
 5. clean text
 6. chunk text
 7. create embeddings
 8. store chunks

 ### 2. WordPress Content via REST API

 Use the WordPress REST API as an ingestion source.

 Do **not** query it live during user chat.

 Use it to fetch content on sync.

 Endpoints may include:

 - `/wp-json/wp/v2/posts`
 - `/wp-json/wp/v2/pages`
 - optionally custom post types

 The content returned is usually HTML and must be cleaned into text.

 We should support:

 - manual sync
 - scheduled sync
 - webhook-triggered sync where possible

 ### 3. WooCommerce Products

 Use WooCommerce as an ingestion source for product knowledge.

 Product content can come from:

 - name
 - short description
 - full description
 - attributes
 - categories/tags
 - possibly FAQ/meta if needed later

 We should support:

 - initial full sync
 - incremental sync
 - webhook-triggered updates

 ### 4. Manual Knowledge Entries

 Admin can create knowledge entries directly inside the app.

 Useful for:

 - support notes
 - “things the AI should say”
 - event descriptions
 - internal definitions

 These become documents and get chunked/indexed like any other source.

 ---

 ## Content Normalization

 All content sources must be normalized before chunking.

 ### PDF normalization

 - extract text
 - remove repeated headers/footers when possible
 - normalize whitespace

 ### WordPress normalization

 - fetch rendered HTML from REST API
 - remove HTML tags
 - strip navigation-like artifacts if present
 - normalize whitespace
 - preserve useful headings if possible

 ### WooCommerce normalization

 - combine selected product fields into a single normalized text representation
 - preserve product name prominently
 - optionally include structured metadata in metadata JSONB, not necessarily inline in text

 ### Manual content normalization

 - save text as-is after basic cleanup

 ---

 ## Chunking Strategy

 Suggested defaults:

 - chunk target size: 500–800 tokens
 - overlap: 50–100 tokens

 Each document is split into ordered chunks.

 Store `chunk_index` to preserve sequence.

 The chunking utility should aim to split on logical boundaries where possible:

 - headings
 - paragraphs
 - list boundaries

 Avoid splitting in the middle of sentences if possible.

 ---

 ## Embedding Strategy

 Use OpenAI embeddings.

 Suggested initial model:

 - `text-embedding-3-small`

 Rules:

 - generate embeddings for every chunk
 - generate embedding for every user query
 - store embeddings in pgvector
 - re-embed chunks when content changes

 ---

 ## Retrieval Strategy

 The retrieval pipeline should be explicit and debuggable.

 ### Standard retrieval flow

 1. load organization settings
 2. embed the user query
 3. query chunks within the same organization
 4. restrict to active documents only
 5. optionally filter by mode or source type
 6. return top relevant chunks with metadata
 7. build prompt from those chunks
 8. call LLM
 9. return answer and sources

 ### Retrieval rules

 - default top-k should be configurable
 - retrieval results should include similarity score/distance
 - future ranking can bias:
  - product docs for recommendation mode
  - PDF/manual docs for support mode
  - blog/article docs for search mode

 ### Important rule

 Never send too much context to the model.
 Prefer a curated set of top chunks over large text dumps.

 ---

 ## Traceability Requirements

 This is a core product requirement.

 Every answer should be explainable.

 ### API response should include

 - answer text
 - source list
 - optionally retrieval debug data for admin-only test mode

 ### Source item structure

 Each source item should include at least:

 - document_id
 - document_title
 - chunk_id
 - source_type
 - source_url if available
 - snippet excerpt

 ### Admin test mode should additionally show

 - similarity score or rank
 - full retrieved chunks
 - which prompt mode was used
 - final prompt preview if needed for debugging
 - whether content came from products, pages, PDFs, or manual knowledge

 ---

 ## Freshness and Sync Strategy

 We need both sync jobs and webhooks.

 ### Why both are needed

 - sync jobs are reliable baseline reconciliation
 - webhooks give near-real-time freshness

 ### WooCommerce freshness

 Register product-related webhooks where appropriate, such as:

 - product.created
 - product.updated
 - possibly product.deleted

 On webhook:

 1. validate webhook
 2. identify product
 3. fetch latest product data if necessary
 4. upsert corresponding document
 5. delete old chunks
 6. regenerate chunks and embeddings

 Also support scheduled reconciliation sync to catch missed webhooks.

 ### WordPress freshness

 WordPress does not provide a strong native webhook system by default.

 Recommended approaches:

 #### Preferred
 Use a plugin such as WP Webhooks to notify our system when posts/pages change.

 #### Fallback
 Run scheduled sync using REST API and compare `modified` timestamps or a content hash.

 On webhook or sync update:

 1. find changed page/post
 2. fetch latest content from REST API
 3. upsert document
 4. replace chunks

 ### PDF freshness

 PDFs are manually managed.

 Freshness actions:

 - upload new file
 - replace file
 - delete/disable file
 - re-index file

 ---

 ## Admin UX Philosophy

 Admins should manage **knowledge**, not vectors.

 Do not expose embeddings or low-level AI jargon as the primary interface.

 The core admin experience should center around:

 1. Sources
 2. Documents
 3. Test Chat
 4. Settings

 ---

 ## Admin Frontend Information Architecture

 ### 1. Dashboard

 Simple overview:

 - total active sources
 - total active documents
 - total chunks
 - last sync status
 - recent ingestion errors
 - quick links to test chat and manage content

 ### 2. Sources Page

 Purpose: manage integrations and knowledge sources.

 Display a list of sources such as:

 - WooCommerce Store
 - WordPress Site
 - PDFs
 - Manual Content

 Each source card or row should show:

 - source name
 - type
 - status
 - last sync time
 - sync health
 - actions:
  - sync now
  - configure
  - disable
  - view documents

 Important: sources are the top-level admin abstraction for data origin.

 ### 3. Documents Page

 Purpose: manage actual knowledge items.

 This is the most important page.

 Table/list fields:

 - title
 - source
 - type
 - status
 - last synced
 - actions

 Actions:

 - view detail
 - enable/disable
 - resync/reindex
 - delete
 - optionally edit (future)

 This page must work well across industries.

 Examples visible in the same UI:

 - a WooCommerce product description
 - a WordPress blog article
 - a PDF manual
 - a manual knowledge note
 - an event description

 Do not hardcode the page title or UX around “products” only.

 ### 4. Document Detail Page

 Purpose: inspect what the assistant knows for a specific item.

 Should show:

 - title
 - source info
 - source type
 - source URL
 - external ID
 - last synced
 - document status

 Sections:

 #### Content Preview
 Show normalized content preview.

 #### Chunks
 Show chunk list in order.

 For each chunk, display:

 - chunk index
 - snippet
 - token count if available
 - metadata
 - optional embedding debug only if needed for internal dev

 #### Actions
 - reindex
 - disable
 - delete
 - future: override/edit content

 This page is crucial for debugging misinformation.

 ### 5. Test Chat Page

 This is a must-have for Phase 1.

 Purpose:

 - allow admin to test the assistant before embedding publicly
 - inspect sources used
 - validate whether retrieval quality is good

 UI should include:

 - chat input
 - response output
 - sources panel
 - debug panel

 Debug panel may show:

 - retrieved documents
 - retrieved chunks
 - similarity scores
 - prompt mode
 - final prompt preview (optional)
 - source types involved

 This is where traceability becomes usable.

 ### 6. Settings Page

 At minimum include:

 - system prompt editor
 - mode selector
 - retrieval limit
 - maybe source preferences later

 The system prompt must be editable in admin.

 ---

 ## Should Admin Be Able to Edit Content?

 ### Recommendation for Phase 1

 Use **read-only synced documents** plus **manual knowledge entries**.

 That means:

 - WooCommerce and WordPress content are primarily synced from source
 - admin can enable/disable/reindex them
 - admin can add manual entries for exceptions or important clarifications

 ### Optional future feature

 Add `override_content` to documents so admin can override imported content without editing the source platform.

 But this does not need to be in MVP unless explicitly requested.

 ---

 ## Public Chat / Embed Strategy

 The system must support both:

 1. admin-only test chat
 2. public/full-page embedded chat

 ### Recommended Phase 1 embed approach

 Expose a public full-page chat route that can be embedded via iframe into WordPress.

 Example:

 ```html
 <iframe src="https://app.example.com/embed/{organization-or-assistant-key}"></iframe>
 ```

 This is much simpler and faster than building a full JavaScript widget first.

 ### Phase 1 also includes a JS widget

 In addition to the full-page iframe, ship a lightweight JavaScript widget:

 ```html
 <script src="https://app.example.com/widget.js?key=..."></script>
 ```

 The widget renders a floating chat bubble that opens into a chat panel on any page. Both embed modes (iframe and widget) point at the same public chat API. See Extensions §11 for detailed widget requirements.

 ### Why not just give WP an API endpoint?

 That is possible, but then the WordPress side must build and maintain the UI.
 For MVP, our app should own the chat UI.

 ---

 ## Public Assistant Security Model

 Public embed routes should not expose organization internals.

 Use an assistant/public token or embed key tied to an organization or assistant configuration.

 The public route should know:

 - which organization’s data to query
 - which prompt/settings to use

 Do not use the admin session for embeds.

 ---

 ## Prompting Strategy

 The system prompt must be organization-configurable.

 ### Initial modes

 - `recommendation`
 - `support`
 - `search`

 ### Example behavior

 #### recommendation
 Favor recommending relevant products/services when available.

 #### support
 Favor accurate technical/support answers from PDFs/manual docs.

 #### search
 Favor website/blog/article discovery and concise answers with references.

 ### Prompt rules

 The prompt should emphasize:

 - only use provided context
 - if the answer is not in context, say so
 - do not fabricate
 - keep answer aligned with chosen mode
 - cite or summarize relevant source-backed details

 ---

 ## Background Jobs

 Use BullMQ workers for non-trivial operations.

 Required jobs:

 - ingest-pdf
 - sync-wordpress
 - sync-woocommerce
 - reindex-document
 - process-webhook

 Do not block the request-response cycle with heavy embedding work if avoidable.

 Typical flow:

 1. API request creates source or upload
 2. enqueue job
 3. worker processes
 4. source/document statuses update

 ---

 ## Recommended Status Fields

 ### Source status

 - active
 - disabled
 - syncing
 - error

 ### Document status

 - active
 - disabled
 - syncing
 - error
 - deleted

 These statuses should drive admin visibility and retrieval filters.

 Only active documents should participate in retrieval.

 ---

 ## Suggested API Surface

 This is indicative, not strict.

 ### Auth
 - `POST /api/auth/...` via Better Auth

 ### Sources
 - `GET /api/admin/sources`
 - `POST /api/admin/sources`
 - `GET /api/admin/sources/:id`
 - `PUT /api/admin/sources/:id`
 - `POST /api/admin/sources/:id/sync`

 ### Documents
 - `GET /api/admin/documents`
 - `GET /api/admin/documents/:id`
 - `POST /api/admin/documents/manual`
 - `POST /api/admin/documents/:id/reindex`
 - `PUT /api/admin/documents/:id/status`
 - `DELETE /api/admin/documents/:id`

 ### Uploads
 - `POST /api/admin/upload/pdf`

 ### Webhooks
 - `POST /api/webhooks/woocommerce/:sourceId`
 - `POST /api/webhooks/wordpress/:sourceId`

 ### Chat
 - `POST /api/chat` (admin/internal)
 - `POST /api/embed/:key/chat` (public embed)
 - optional `GET /embed/:key` full-page public chat UI

 ### Settings
 - `GET /api/admin/settings`
 - `PUT /api/admin/settings`

 ---

 ## WordPress Ingestion Design

 ### Source configuration

 A WordPress source should support config like:

 - base URL
 - REST auth if needed
 - selected content types:
  - posts
  - pages
  - maybe custom post types later
 - sync mode:
  - manual
  - scheduled
  - webhook + scheduled fallback

 ### Sync flow

 1. fetch content from REST API
 2. extract:
   - id
   - title
   - slug
   - link
   - modified date
   - content.rendered
 3. clean HTML into normalized text
 4. upsert document
 5. delete old chunks
 6. regenerate chunks + embeddings

 ### Live chat rule

 Again: the WP REST API is for ingestion only, not runtime Q&A.

 ---

 ## WooCommerce Ingestion Design

 ### Source configuration

 A WooCommerce source should support config like:

 - store URL
 - API credentials
 - optional source filters if needed later

 ### Sync flow

 For each product:

 1. fetch product fields
 2. build normalized text from relevant fields
 3. upsert document with type `woo_product`
 4. replace chunks

 Potential product text composition:

 - product name
 - short description
 - full description
 - selected attributes or tags if helpful

 Do not clutter normalized text with too much raw structured data.
 Store structured values in metadata where appropriate.

 ---

 ## Manual Knowledge Design

 Manual entries are important because synced source content is not always enough.

 Allow admin to create a manual document with:

 - title
 - content
 - type `manual`
 - status active/disabled

 This gives the organization a way to teach the AI extra information without editing the external platforms.

 ---

 ## Traceability UX Requirements

 Traceability is a core promise.

 ### Public chat

 Public users may see:

 - answer
 - concise “Sources” list

 ### Admin test chat

 Admins should see richer traceability:

 - answer
 - sources
 - retrieved chunks
 - similarity rank/score
 - prompt mode
 - debug metadata

 This is what lets the team inspect misinformation and decide whether to:

 - disable a document
 - fix source content
 - add manual knowledge
 - adjust prompt/settings

 ---

 ## What We Reuse Conceptually From the Existing SaaS Architecture

 The existing SaaS architecture principles are highly reusable:

 ### Reusable principles

 - multi-tenant scoping
 - source-of-truth mindset
 - background jobs
 - explicit sync model
 - webhook + scheduled reconciliation
 - auditability
 - admin operational visibility

 ### Equivalent mapping

 Old inventory-oriented concepts map to AI knowledge concepts like this:

 - store integration → source integration
 - synced product/order record → document
 - stock movement logs → retrieval/chat logs
 - sync jobs → ingestion jobs
 - health dashboards → source sync status and ingestion status
 - admin debug tooling → test chat + source inspection

 ---

 ## Implementation Philosophy

 Claude should build this with these priorities:

 1. clarity over over-engineering
 2. debuggability over magic
 3. source-first knowledge management
 4. generic abstractions instead of industry-specific assumptions
 5. strong admin visibility from day one

 The admin should be able to answer:

 - What does the AI know?
 - Where did that answer come from?
 - Which document caused bad information?
 - How do I disable or fix it?
 - Is my WordPress/WooCommerce content synced and fresh?

 ---

 ## Deliverables Expected From Claude

 Claude should build:

 ### Backend / infrastructure
 - Better Auth integration
 - organization-aware DB schema
 - source/document/chunk models
 - pgvector support
 - ingestion workers
 - chat API
 - embed/public chat API
 - webhook endpoints
 - sync logic for WordPress and WooCommerce

 ### Admin UI
 - login
 - dashboard
 - sources page
 - documents list
 - document detail
 - test chat
 - settings page

 ### Public UI
 - full-page embeddable chat route

 ### Behavior
 - PDF upload and ingestion
 - WordPress sync via REST API
 - WooCommerce sync
 - chunking and embedding
 - retrieval + answer generation
 - traceable source-backed responses

 ---

 ## Non-Negotiable Rules

 1. Do not query live WordPress or WooCommerce during chat requests
 2. Only query our own indexed data during chat
 3. Always return traceability info for admin testing
 4. Only active documents participate in retrieval
 5. All data must be organization-scoped
 6. Keep retrieval pipeline explicit and inspectable
 7. Build the system as a generic knowledge platform, not a Woo-only chatbot

 ---

 ## Extensions: Production Requirements for Real Use Cases

 This section overrides and extends earlier phase definitions based on the two confirmed Phase 1 clients:

 1. A **WooCommerce store** — product recommendations and commerce Q&A
 2. A **WordPress nightlife events site** — what's on tonight / this weekend / which venue / ticket links

 These extensions are **required for Phase 1**, not optional. Where this section conflicts with earlier text, this section wins.

 ---

 ### 1. Time-Aware Retrieval (events)

 Purpose: event content is only useful before the event ends.

 #### Required document fields (first-class columns, not JSONB)

 - `event_start` (timestamptz, nullable)
 - `event_end` (timestamptz, nullable)
 - `venue` (text, nullable)
 - `city` (text, nullable)

 #### Status behavior

 - A document with `event_end < now()` auto-transitions to `expired` status.
 - `expired` documents are excluded from retrieval by default.
 - Expired documents remain visible in admin so recurring events can be restored/extended.

 #### Query-time behavior

 - Parse temporal intent in the user query *before* vector search:
  - "tonight" → today, from now → end of day
  - "tomorrow" → next calendar day
  - "this weekend" → upcoming Saturday + Sunday
  - weekday names → upcoming occurrence of that weekday
  - "next week" → Monday–Sunday of next week
  - explicit dates → parsed to a range
 - If temporal intent is detected: apply `WHERE event_start BETWEEN ... AND ...` *before* vector search.
 - If no temporal intent: default to future events only (`event_end >= now()`).

 #### Non-event documents

 - For documents without `event_start` (products, PDFs, manual knowledge, blog posts), temporal filters are skipped — missing dates count as "always valid", not "expired".

 ---

 ### 2. Custom Post Types, ACF, and Arbitrary Meta Fields

 The client is the site administrator on the WP side. Assume they can expose anything via REST. The ingestion config must take advantage of that.

 #### WordPress source configuration (extended)

 Each WordPress source must let the admin configure:

 - Base URL
 - Auth (application password or bearer token)
 - Selected post types — arbitrary (`post`, `page`, `tribe_events`, `event`, `product`, custom slugs)
 - **Per-post-type field mapping:**
  - **Fields concatenated into normalized text** (e.g. `title`, `content.rendered`, `acf.description`, `meta.venue_description`)
  - **Fields mapped to structured metadata columns** (e.g. `acf.event_date` → `event_start`, `acf.venue_name` → `venue`, `acf.ticket_url` → `metadata.ticket_url`, `acf.price` → `metadata.price`, `_thumbnail` → `primary_image_url`)
  - **Fields ignored**

 This mapping lives on the source record (JSONB config) and is editable in admin.

 #### WooCommerce meta / attributes

 Same pattern: admin maps product attributes, ACF, and custom meta into either normalized text or structured metadata. Examples:

 - `attributes.color`, `attributes.size` → `metadata.tags` (filterable)
 - ACF product fields → text or metadata
 - Custom meta (warranty, return policy snippets) → text
 - `images[0].src` → `primary_image_url`

 ---

 ### 3. Sitemap Ingestion

 #### Scope: per-sitemap, not whole-site

 Do **not** crawl the whole site from `/sitemap.xml`. That pulls noise: `/cart`, `/checkout`, `/my-account`, archive pages, paginated loops.

 Instead:

 - Admin specifies one or more specific sitemap URLs.
 - Most WP SEO plugins (Yoast, Rank Math, SEOPress) split sitemaps per post type:
  - `sitemap-posts.xml`
  - `sitemap-pages.xml`
  - `sitemap-products.xml`
  - `sitemap-events.xml` / `sitemap-tribe_events.xml`
 - Admin picks exactly which sitemaps are in scope.

 #### Optional URL filters

 Each sitemap config may include:

 - `include_patterns` (regex or glob allowlist)
 - `exclude_patterns` (regex or glob denylist)

 Applied after sitemap fetch, before ingestion.

 #### Sitemap is a fallback, not the primary path

 When REST API exposes the content type cleanly, REST is preferred — richer fields, faster incremental sync. Use sitemap when:

 - A post type isn't exposed in REST
 - Orphaned URLs aren't returned by REST filters
 - Rendered HTML is easier to extract than raw post content

 Sitemap crawl: fetch URL → extract main content HTML → clean to text → ingest.

 ---

 ### 4. Hybrid Search (Vector + Keyword)

 Pure cosine similarity misses exact matches: product names, SKUs, DJ names, event titles, specific dates.

 #### Implementation

 - Maintain pgvector embeddings (semantic)
 - Maintain a Postgres `tsvector` column on chunks (and/or documents) for full-text search (lexical)
 - On every query:
  1. Run vector search (top-k, e.g. k=20)
  2. Run `tsvector` search (top-k, e.g. k=20)
  3. Merge with **Reciprocal Rank Fusion (RRF)** using the canonical formula: `score = Σ 1 / (k + rank_i)` across result lists, with `k = 60`.
  4. Take top N after fusion (e.g. N=6) into the LLM prompt.

 #### Weighting

 - Default: equal weight RRF
 - `recommendation` mode: boost documents of type `woo_product`
 - `events` mode (new — see §9): boost documents with `event_start` inside the query's date window

 ---

 ### 5. Query-Time Metadata Filters

 RAG alone cannot answer "techno events under €30 this weekend" or "red dresses under €50 in stock."

 #### Required

 Retrieval must accept structured filters applied to document/chunk metadata:

 - Date range (`event_start` between X and Y)
 - Price range (`metadata.price` between X and Y, if indexed)
 - Category / tag (from metadata)
 - Stock status (Woo)
 - City / venue

 Filters are applied as SQL `WHERE` clauses *before* vector + keyword search.

 #### Extracting filters from the query

 Split by filter type — do not run everything through an LLM, and do not rely on regex for everything:

 1. **Rule-based parser for temporal intent** (`tonight`, `tomorrow`, weekday names, "this weekend", explicit dates). Deterministic, no LLM call, near-zero latency. Dates are simple and high-frequency — a rule layer is more reliable than an LLM here.
 2. **LLM-based query planner for everything else** (price ranges, categories, stock, city, complex intent). A small prompt extracts `{price_range, categories, stock_only, city, ...}` from the user question.

 Both run in parallel; results are merged into a single filter object. Log the extracted filters in retrieval debug.

 ---

 ### 6. Structured Responses (Cards)

 The chat API must return structured output, not just prose.

 #### Response shape

 ```json
 {
  "answer": "Three events match 'techno Friday'.",
  "cards": [
    {
      "type": "event",
      "document_id": "...",
      "title": "Amelie Lens @ Warehouse",
      "image": "https://.../image.jpg",
      "url": "https://.../event/amelie-lens",
      "date": "2026-04-24T22:00:00Z",
      "venue": "Warehouse, Belgrade",
      "price": "€25",
      "cta": { "label": "Get tickets", "url": "https://tickets.example/..." }
    },
    {
      "type": "product",
      "document_id": "...",
      "title": "Red Summer Dress",
      "image": "https://.../dress.jpg",
      "url": "https://shop.example/product/red-dress",
      "price": "€39.99",
      "in_stock": true,
      "cta": { "label": "Add to cart", "url": "..." }
    }
  ],
  "sources": [ ... ]
 }
 ```

 #### Card population

 Cards are built from the **top-N documents after hybrid retrieval + RRF fusion** — not from whichever documents the LLM happens to "reference" in its prose (parsing answer text for citations is fragile and inconsistent).

 **Do not ask the LLM to generate card JSON** — it hallucinates fields.

 1. LLM generates the `answer` text only.
 2. Server code builds `cards[]` from the metadata of the top-N fused retrieval results (the same documents passed to the LLM as context).
 3. Both are returned together.

 Card types supported in Phase 1: `event`, `product`, `article` (generic link with title/excerpt/image).

 ---

 ### 7. Images and Media

 Add to the documents table:

 - `primary_image_url` (text, nullable)

 Populated during ingestion:

 - Woo: `images[0].src`
 - WP: featured image URL (requires `_embed=1` or explicit media fetch)
 - Events: featured image or ACF-mapped hero image
 - PDF / manual: null (unless admin attaches one)

 The embed UI renders cards with images where available.

 ---

 ### 8. Conversation Memory (Sliding Window + Query Rewriting)

 Phase 1 includes short conversation memory. Long-term / summary memory is deferred.

 #### Sliding window

 - Keep the last **N = 6 messages** (3 user + 3 assistant exchanges) in the LLM prompt.
 - Drop older messages from prompt context (they're still stored in chat logs for admin review).
 - 6 is a good default. Expose as an org setting if needed later.

 #### Do not send full history every turn

 - Cost and latency grow linearly with history length.
 - Older turns drag retrieval in wrong directions.
 - 6 messages cover "cheaper ones", "tomorrow?", "is it wheelchair accessible?", "tell me more about the second one".

 #### Query rewriting for retrieval (important)

 Embedding "cheaper ones" alone retrieves nothing useful. Before retrieval:

 1. If the latest user message is a follow-up (short, referential: "cheaper", "tomorrow?", "the second one"), run a cheap LLM call that rewrites it into a standalone question using the last 2–4 turns.
   - Example: "cheaper ones" + prior context "events this Friday" → "cheaper techno events this Friday"
 2. Embed and search using the rewritten query.
 3. The rewritten query also feeds the filter extractor (§5).

 This is small but high-impact. Log both the original and rewritten query in chat logs.

 ---

 ### 9. System Prompt Presets

 Admins are typically non-technical. A blank textarea is a trap. Ship presets.

 #### What a preset is

 A preset bundles:

 - A **mode** (`recommendation`, `support`, `search`, plus the new `events`)
 - A **pre-written system prompt template** with `{brand}` and other placeholders
 - **Default retrieval settings** (top-k, filters, boosts)
 - **Default card types** to render

 #### Presets shipped in Phase 1

 **1. E-commerce / Shopping Assistant** — for the Woo client
 - Mode: `recommendation`
 - Prompt (excerpt): *"You are a shopping assistant for {brand}. Recommend only products present in the provided context. Reference each recommendation by name. Never invent SKUs, prices, or stock status. If nothing matches, say so and suggest the closest alternative. Keep replies short."*
 - Card type: `product`
 - Boost: `woo_product` documents

 **2. Events Concierge** — for the nightlife client
 - Mode: `events`
 - Prompt (excerpt): *"You are an events concierge for {brand}. Recommend events matching the user's date and interest, using only the provided context. Always show events with date, venue, and ticket link. If no events match the requested date, say so clearly and suggest the closest alternatives in date. Never invent events or venues. Keep replies short."*
 - Card type: `event`
 - Default filter: future events only, unless user explicitly asks about past events
 - Boost: `event` documents

 **3. Support / Helpdesk**
 - Mode: `support`
 - Prompt (excerpt): *"You are a support agent for {brand}. Answer using the provided documentation only. If the answer is not in context, say 'I don't have that information' and suggest contacting support. Quote the relevant snippet when helpful. Never guess."*
 - Card type: `article`
 - Boost: `pdf`, `manual`, `wp_page`

 Admin picks a preset, edits `{brand}` and any wording they want, saves. They can always go fully custom later.

 ---

 ### 10. Manual Knowledge Authoring UX

 Manual documents are how admins teach the AI things that aren't on their site yet: dress code, age policy, refund policy, event FAQs, venue directions, "please never recommend the discontinued X line".

 #### Editor requirements

 - Markdown or rich-text editor (not a `<textarea>`)
 - Fields: title, body, optional tags, optional expiration date
 - Save creates a document with `type = 'manual'`; chunks + embeds immediately
 - Edit re-chunks and re-embeds on save

 #### Organization / discovery

 - Tag manual docs (e.g. `policy`, `faq`, `venue-info`)
 - Filter the documents page by tag and by `type = 'manual'`
 - Allow duplicating an existing manual doc as a template

 ---

 ### 11. Embeddable JS Widget (Phase 1)

 The iframe full-page embed still ships, but the JS widget is also Phase 1.

 #### Widget requirements

 - A single `<script src="https://app.example.com/widget.js?key=...">` tag
 - Floating chat bubble (position configurable)
 - Opens into a chat panel
 - Renders cards (event, product, article) with images and CTAs
 - Branded to match the org's primary color + logo (configured in admin)
 - Mobile responsive
 - No jQuery / framework bloat — vanilla JS or Preact. Target **<100KB gzipped** for the full widget bundle (realistic for chat UI + cards; push lower if practical).

 #### Widget vs iframe

 - iframe = full-page dedicated embed
 - Widget = chat bubble overlay on existing pages

 Both call the same public chat API.

 ---

 ### 12. Public Embed Security

 #### Required Phase 1 protections

 1. **Origin allowlist**: each public key has an allowed origins list. Chat API checks `Origin`/`Referer`. Mismatch = reject.
 2. **Rate limiting**:
   - Per key: e.g. 60 req/min, 1000/day (defaults, org-configurable)
   - Per IP: e.g. 10 req/min
   - Return 429 with `Retry-After`
 3. **Key rotation**: admin can rotate; old key stops after a grace period.
 4. **Input size caps**: max user message length (e.g. 1000 chars), max history length sent.
 5. **No admin-debug leakage**: public response must not include retrieval debug info.
 6. **CORS**: configured to match origin allowlist, not `*`.

 #### Optional Phase 1

 - CAPTCHA after N abusive requests from the same IP
 - Basic prompt-injection filter on user input (strip obvious patterns; log-only)

 ---

 ### 13. User Feedback and Admin Feedback Review

 #### Public chat UI

 After each assistant response, show 👍 / 👎 buttons. On click:

 - `POST /api/embed/:key/feedback` with `{ message_id, rating, optional_comment }`
 - Persist to `chat_feedback` table

 #### Schema: `chat_feedback`

 - `id`
 - `organization_id`
 - `chat_log_id` (links to the retrieval/chat log)
 - `rating` (`up` | `down`)
 - `comment` (text, nullable)
 - `user_session_id` (nullable)
 - `created_at`

 #### Admin "Feedback" page

 A dedicated admin page, newest first:

 - Columns: timestamp, rating, user query, assistant answer, comment
 - Row click → full retrieval trace (same view as admin Test Chat debug panel)
 - Filters: rating = down, date range, source types used
 - Quick actions from the row:
  - Disable one of the cited documents
  - Open the cited document for editing
  - Add a manual knowledge entry to correct the answer
  - Mark feedback as "addressed"

 This is the closed loop: real users flag bad answers → admin sees context → admin fixes knowledge → next time is better.

 ---

 ### 14. Cost and Usage Controls

 Each organization should have:

 - Monthly token budget (embeddings + completions)
 - Current usage counter, reset monthly
 - Soft warning at 80%, hard stop at 100% (configurable)
 - Per-request token log (tokens in, tokens out, model, cost estimate)
 - Admin view: "Usage this month" with simple chart and cost estimate

 This protects against runaway traffic and makes the economics obvious.

 ---

 ### 15. Updated Data Model Summary

 Compared to the earlier Core Data Model, add:

 #### documents (new columns)

 - `document_role` text not null — behavior role (`product` | `event` | `article` | `support` | `manual`). Drives retrieval boosting, card rendering, and UI filtering. See §18.
 - `source_modified_at` timestamptz null — the `modified` timestamp as reported by the source (WP `modified`, Woo `date_modified`). Drives freshness conflict resolution. See §23.
 - `event_start` timestamptz null
 - `event_end` timestamptz null
 - `venue` text null
 - `city` text null
 - `primary_image_url` text null
 - `tags` text[] null (manual docs + mapped from source)
 - `expires_at` timestamptz null (manual docs with expiry)

 #### sources (extended config JSONB)

 - For WP: `post_types[]`, `field_mapping{}`, `sitemap_urls[]`, `include_patterns[]`, `exclude_patterns[]`
 - For Woo: `attribute_mapping{}`, `meta_mapping{}`

 #### new table: `chat_feedback`

 As described in §13.

 #### new table: `usage_ledger` (or columns on organization)

 Monthly tokens and costs.

 #### new table: `assistant_keys`

 Public embed keys with origin allowlist, rate limits, rotation state.

 ---

 ### 16. Updated Retrieval Pipeline

 Replace the earlier retrieval flow with:

 1. Load organization settings and mode preset.
 2. **Rewrite query** if it's a short follow-up (using last 2–4 turns).
 3. **Extract filters** from rewritten query (date range, price, category, stock, city).
 4. **Pre-filter SQL**: select active, non-expired documents in the org, matching filters.
 5. Run **vector search** (pgvector) on the pre-filtered set, top 20.
 6. Run **keyword search** (tsvector) on the pre-filtered set, top 20.
 7. **Fuse** with RRF, take top 6.
 8. Apply mode-specific boosts (product docs for `recommendation`, event docs for `events`).
 9. Build prompt: preset system prompt + last 6 conversation messages + retrieved chunks.
 10. Call LLM → generate `answer`.
 11. Build `cards[]` server-side from metadata of retrieved documents actually referenced.
 12. Return `{ answer, cards, sources, debug? }`.
 13. Log to `chat_logs`: original query, rewritten query, extracted filters, retrieved chunk IDs, scores, final prompt (admin debug only).

 ---

 ### 17. Updated Non-Negotiable Rules

 Add to the original list:

 8. Expired event documents must not appear in retrieval unless explicitly requested.
 9. Manual knowledge and synced content retrieve through the same pipeline — no separate path.
 10. Card structured data is built server-side from top-N fused retrieval results, not generated by the LLM and not derived from LLM answer text.
 11. Public embed endpoints must enforce origin allowlist + rate limits.
 12. Every chat response must include traceable source IDs.
 13. Every chat request MUST persist retrieval debug: original query, rewritten query, extracted filters, retrieved chunk IDs, fusion scores, final prompt.
 14. Failed ingestion (embedding error, normalization failure) marks the document `status = error`. Prior successful chunks are preserved and keep serving retrieval; partial chunk writes are never committed.
 15. No LLM call is made when retrieval returns zero qualifying chunks — return a deterministic fallback response instead (§20).
 16. All public and admin API routes are versioned from day one (`/api/admin/v1/...`, `/api/embed/v1/...`). v1 contracts are frozen once deployed; breaking changes go to v2.
 17. Freshness conflicts resolve on `source_modified_at`, not `last_synced_at`. Late-arriving webhooks carrying older data are ignored.

 ---

 ### 18. Document Role (behavior, separate from type)

 `type` tracks **origin** (`woo_product`, `wp_post`, `wp_page`, `pdf`, `manual`, `event`, custom CPT slugs).
 `document_role` tracks **behavior** — how retrieval, ranking, and card UI should treat it.

 Allowed values:

 - `product` — renders as product card, boosted in `recommendation` mode
 - `event` — renders as event card, date filters apply, boosted in `events` mode
 - `article` — renders as article card, used in `search` mode
 - `support` — renders as article card, boosted in `support` mode
 - `manual` — admin-authored; admin picks the effective role on creation

 Why the split matters:

 - A WP CPT `tribe_events` post and a manually-entered event both get `document_role = 'event'` and flow through the same retrieval/card path.
 - A WP `page` documenting return policy gets `document_role = 'support'` even though `type = wp_page`.
 - Retrieval boosting, filtering, and card rendering switch on `document_role`, not on `type`.

 How it's set:

 - Source field mapping (§2) specifies the role per post type during ingestion.
 - Manual docs let the admin choose a role at creation.
 - Sensible defaults: `woo_product` → `product`; `tribe_events` / `event` CPT → `event`; PDFs → `support`; WP posts → `article`; WP pages → `article` unless the admin overrides.

 ---

 ### 19. Mode → Role Priority Ranking

 Modes determine which roles are boosted in hybrid retrieval (applied **after** RRF fusion, not as a hard filter):

 | Mode | Priority order (highest first) |
 | ---- | ------------------------------ |
 | `recommendation` | `product` > `support` > `article` > `manual` |
 | `events` | `event` > `article` > `manual` |
 | `support` | `support` > `manual` > `article` |
 | `search` | neutral (no role boost) |

 Boost is applied by multiplying the fused score by a role factor (e.g. preferred role ×1.3, next ×1.1, others ×1.0), then re-ranking the top-N. Do **not** exclude non-preferred roles — a product FAQ (role `support`) should still appear for a recommendation-mode query if it's genuinely the best match, just ranked slightly lower than equivalent products.

 ---

 ### 20. No-Results Fallback

 If hybrid retrieval returns zero chunks above the configured similarity threshold, or the pre-filter removes everything:

 1. **Do not call the LLM with empty context** — it will hallucinate.
 2. Return a deterministic fallback:
   - `answer`: preset-configured fallback text (e.g. *"I don't have information about that. Want me to search more broadly?"*)
   - `cards`: empty
   - `sources`: empty
   - `debug.reason`: `"no_results_above_threshold"` or `"pre_filter_empty"`
 3. Optionally re-run retrieval **without filters** and offer the top results as "closest matches" in a second response, clearly labeled as such ("No exact matches — here are some alternatives").

 Log every zero-result event to a dedicated admin view. These are the highest-value signals for what knowledge is missing.

 ---

 ### 21. Re-Indexing Triggers

 A document must be re-embedded when:

 - `sync_hash` of the source content changes (update detected on sync)
 - The source's field mapping changes (admin edits what's included in normalized text)
 - The chunking strategy changes (global setting change)
 - The embedding model changes (global setting change)
 - Admin clicks "Reindex" manually
 - A previously failed embedding is retried

 Re-embedding flow:

 1. Mark document `status = syncing`
 2. Normalize content from the latest source data
 3. Chunk
 4. Embed all chunks (atomic: all-or-nothing)
 5. **In one transaction**: delete old chunks, insert new chunks
 6. Mark document `status = active`
 7. On failure at any step: leave old chunks in place, set `status = error` (§22)

 Do not re-embed on every sync — only when `sync_hash` differs. Idempotent syncs are cheap.

 ---

 ### 22. Partial Ingestion and Error Handling

 #### Document-level failure

 If ingestion fails at any step (fetch, normalize, chunk, embed):

 - Set `status = error`
 - Store reason in `metadata.last_error` and `metadata.last_error_at`
 - **Preserve prior chunks** — the previous successful version continues serving retrieval until the next successful reindex.
 - Surface the error in admin (sources page + document detail)

 Exception: if this is the first-ever ingestion (no prior chunks), the document enters `error` with zero chunks and is excluded from retrieval.

 #### Chunk-level failure

 If one chunk fails to embed but others succeed (e.g. transient API error):

 - Retry with exponential backoff up to N times (e.g. N=3)
 - If it still fails: mark document `status = error`, do **not** replace prior chunks
 - Never partially commit — all chunks for a document are inserted together or none are

 #### Sync-level failure

 If a source sync fails (e.g. WP API down):

 - Existing documents stay as-is and remain in retrieval
 - Mark the source `status = error` and record the reason
 - Next scheduled sync retries automatically

 ---

 ### 23. Freshness Conflict Resolution

 Webhooks and scheduled syncs can race. Resolve by **source-side** timestamp, not our ingestion timestamp.

 Rule:

 - Each document stores `source_modified_at` — the `modified` timestamp as reported by the source (WP `modified`, Woo `date_modified`).
 - An incoming update (from webhook or sync) is applied only if its `source_modified_at` is **strictly newer** than the stored value.
 - Stale webhooks (delivered late, carrying older data) are silently ignored and logged.
 - If the incoming update lacks `source_modified_at` (rare), fall back to comparing our `last_synced_at`.

 Why not `last_synced_at`: `last_synced_at` is *when we synced*, not *when the source changed*. A late webhook could otherwise clobber newer content we already have.

 ---

 ### 24. API Versioning

 All public and admin routes are versioned from day one:

 - `/api/admin/v1/...`
 - `/api/embed/v1/:key/chat`
 - `/api/embed/v1/:key/feedback`
 - `/api/webhooks/v1/woocommerce/:sourceId`
 - `/api/webhooks/v1/wordpress/:sourceId`

 Rules:

 - v1 contracts are frozen once deployed. Breaking changes go to v2.
 - Additive changes (new optional fields in responses, new optional request parameters) do not require a version bump.
 - `widget.js` pins a default API version but can be overridden per embed during migration periods.

 ---

 ## Nice-to-Have Future Features (Not Required for Phase 1)

 - content overrides on synced documents
 - source tagging and filtering beyond manual-doc tags
 - deep analytics on user questions (beyond up/down feedback)
 - per-mode retrieval tuning dashboard
 - role permissions beyond admin/member
 - long-term conversation memory / summarization across sessions
 - source confidence tuning
 - assistant personas per organization beyond presets
 - CAPTCHA / advanced abuse protection

 ---

 ## Final Summary

 We are building a **multi-tenant, admin-controlled AI knowledge assistant platform**.

 It ingests:

 - WooCommerce products
 - WordPress content
 - PDFs
 - manual knowledge

 It normalizes them into:

 - sources
 - documents
 - chunks

 It answers questions through:

 - vector retrieval from pgvector
 - LLM generation using retrieved context only

 It provides:

 - full traceability
 - admin testing
 - source/document management
 - public embeddable chat

 This should be implemented as a generic, reusable knowledge system with explicit ingestion and retrieval pipelines.
No results found