smart search analysis.md

What the source says

1. Image embeddings are precomputed and stored

Background Smart Search jobs encode each asset once and save the embedding:

server/src/services/smart-info.service.ts
- handleEncodeClip(...) calls machineLearningRepository.encodeImage(...)
- then stores it with searchRepository.upsert(...)

So the library is indexed ahead of time.
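
A minimal sketch of that encode-once-then-upsert pattern, not Immich's actual code. Every name here (InMemoryEmbeddingStore, fakeEncodeImage, indexAssets) is a hypothetical stand-in, and the "encoder" is a toy function rather than a CLIP model:

```typescript
type Embedding = number[];

interface Asset {
  id: string;
  path: string;
}

// Hypothetical stand-in for the table that holds one embedding per asset.
class InMemoryEmbeddingStore {
  private readonly rows = new Map<string, Embedding>();

  // Insert the row, or overwrite it if the asset was already indexed.
  upsert(assetId: string, embedding: Embedding): void {
    this.rows.set(assetId, embedding);
  }

  get(assetId: string): Embedding | undefined {
    return this.rows.get(assetId);
  }

  count(): number {
    return this.rows.size;
  }
}

// Toy encoder: NOT a CLIP model, just a deterministic placeholder vector.
function fakeEncodeImage(path: string): Embedding {
  return [path.length, path.charCodeAt(0) || 0];
}

// The background job: encode each asset and store the result.
function indexAssets(assets: Asset[], store: InMemoryEmbeddingStore): void {
  for (const asset of assets) {
    store.upsert(asset.id, fakeEncodeImage(asset.path));
  }
}

const indexStore = new InMemoryEmbeddingStore();
const library: Asset[] = [
  { id: "a1", path: "/photos/beach.jpg" },
  { id: "a2", path: "/photos/dog.jpg" },
];

indexAssets(library, indexStore);
indexAssets(library, indexStore); // re-running the job does not duplicate rows
console.log(indexStore.count());
```

Because the write is an upsert keyed by asset id, running the job again replaces rows instead of adding new ones.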

2. But a live text smart search still uses the ML service

When you search with text, Immich does more than query the database.

server/src/services/search.service.ts
- searchSmart(...) calls machineLearningRepository.encodeText(dto.query, ...)
- unless that exact query embedding is already in the in-memory embeddingCache

So for a text query, Immich first turns your search text into a CLIP embedding at query time.
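
The cache-or-encode flow can be sketched roughly as follows. This is an illustration under assumptions: fakeEncodeText, queryCache and embeddingForQuery are made-up names, the real call is asynchronous and goes to the ML service, and the real cache key may include more than the query text.

```typescript
type QueryEmbedding = number[];

// Hypothetical encoder stub that counts how often the "ML service" is called.
let encoderCalls = 0;
function fakeEncodeText(query: string): QueryEmbedding {
  encoderCalls += 1;
  return query.split("").map((ch) => ch.charCodeAt(0) / 255);
}

// Plain Map used as the cache here; eviction is ignored in this sketch.
const queryCache = new Map<string, QueryEmbedding>();

function embeddingForQuery(query: string): QueryEmbedding {
  const cached = queryCache.get(query);
  if (cached !== undefined) {
    return cached; // exact-match hit: the encoder is not called
  }
  const embedding = fakeEncodeText(query);
  queryCache.set(query, embedding);
  return embedding;
}

embeddingForQuery("sunset at the beach"); // miss: encoder runs
embeddingForQuery("sunset at the beach"); // hit: served from the cache
embeddingForQuery("Sunset at the beach"); // miss: a different string
console.log(encoderCalls);
```

Only an identical query string skips the encoder, so in this sketch even a change in capitalization triggers a fresh encode.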

3. That text embedding is produced by the machine-learning backend

server/src/repositories/machine-learning.repository.ts
- encodeText(...) sends a /predict request with ModelTask.SEARCH + ModelType.TEXTUAL

machine-learning/immich_ml/main.py
- /predict accepts text
- runs inference via run_inference(...)
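
To make the shape of such a request concrete, here is a rough sketch of how a text /predict payload might be assembled. Treat the details as assumptions: the field names entries and text, the string values "clip" and "textual", the helper buildTextPredictRequest and the model name are illustrative and are not confirmed by the lines quoted above.

```typescript
// Assumed string values for the constants named above; verify against the
// Immich source before relying on them.
const ModelTask = { SEARCH: "clip" } as const;
const ModelType = { TEXTUAL: "textual" } as const;

interface TextPredictRequest {
  entries: string; // JSON describing which model task/type to run (assumed field)
  text: string; // the raw search text (assumed field)
}

// Hypothetical helper: builds the payload but does not send it anywhere.
function buildTextPredictRequest(query: string, modelName: string): TextPredictRequest {
  const entries = {
    [ModelTask.SEARCH]: {
      [ModelType.TEXTUAL]: { modelName },
    },
  };
  return { entries: JSON.stringify(entries), text: query };
}

const predictRequest = buildTextPredictRequest("red car", "example-clip-model");
console.log(predictRequest.entries);
```

The point is only that the request names a task (search) and a model type (textual) alongside the text, so the ML service knows to run the text encoder rather than the image encoder.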

4. GPU can accelerate that live text-embedding step

machine-learning/immich_ml/models/clip/textual.py
- the textual CLIP encoder supports ONNX Runtime execution providers for hardware acceleration:
  - CUDAExecutionProvider
  - ROCMExecutionProvider
  - OpenVINOExecutionProvider

So if you have ML hardware acceleration enabled, a live text search can use it for the text encoder.

5. The actual similarity lookup is then done in Postgres

server/src/queries/search.repository.sql
- smart search ranks results with:
  - order by smart_search.embedding <=> $6

That means the DB compares the query vector against stored image vectors.
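
In pgvector-style SQL, `<=>` is the cosine distance operator, so the ordering above puts the stored vectors with the smallest cosine distance to the query first. A brute-force sketch of the same ranking, with made-up two-dimensional vectors and hypothetical names (the database would normally use a vector index instead of comparing every row):

```typescript
type Vec = number[];

// Cosine distance = 1 - cosine similarity; 0 means same direction.
function cosineDistance(a: Vec, b: Vec): number {
  if (a.length !== b.length) {
    throw new Error("dimension mismatch");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy "stored image embeddings"; real CLIP vectors have hundreds of dimensions.
const storedRows = [
  { assetId: "forest", embedding: [0, 1] },
  { assetId: "dunes", embedding: [0.8, 0.6] },
  { assetId: "beach", embedding: [1, 0] },
];

const queryVector: Vec = [1, 0];

// Equivalent in spirit to: order by embedding <=> query
const rankedIds = storedRows
  .slice()
  .sort(
    (p, q) =>
      cosineDistance(p.embedding, queryVector) - cosineDistance(q.embedding, queryVector),
  )
  .map((row) => row.assetId);

console.log(rankedIds);
```

Here "beach" points the same way as the query (distance 0), "dunes" is close, and "forest" is orthogonal (distance 1), so they come back in that order.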

Bottom line

The most accurate description is:

Indexing phase: ML encodes images once and stores embeddings.
Live text search phase: ML encodes the search text into an embedding.
Retrieval phase: Postgres/VectorChord does the vector similarity search.

So the comment saying “smart search just uses the index; there’s little processing once the index is there” is only partly true.

It’s true that image embeddings are precomputed.
It’s not true that live text search is purely DB-only: Immich still runs the textual CLIP model for text queries, unless that exact query is cached.

Why some people see “zero delay” on CPU

CPU-only search can still feel instant in practice because:

the text model may already be loaded
the query embedding may be cached
the chosen CLIP model may be small/fast
the DB search itself can be quick

There’s even an in-process query embedding cache in:

server/src/services/search.service.ts
- private embeddingCache = new LRUMap<string, string>(100);
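
For intuition about what a 100-entry LRU cache does, here is a tiny least-recently-used map. TinyLruMap is a hypothetical stand-in written for this note, not the LRUMap class Immich imports, and it uses a limit of 2 so the eviction is easy to see:

```typescript
// Relies on Map preserving insertion order: the first key is the oldest.
class TinyLruMap<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) {
      return undefined;
    }
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.limit) {
      // Evict the least recently used entry (first key in insertion order).
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }

  has(key: K): boolean {
    return this.entries.has(key);
  }
}

const lru = new TinyLruMap<string, string>(2);
lru.set("cat", "[0.1,0.2]");
lru.set("dog", "[0.3,0.4]");
lru.get("cat"); // "cat" is now the most recently used
lru.set("bird", "[0.5,0.6]"); // over the limit: evicts "dog"
console.log(lru.has("cat"), lru.has("dog"), lru.has("bird"));
```

With a limit of 100, repeated or recent queries stay cached, while queries that have not been used for a while eventually fall out and need the text encoder again.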

One important nuance

If the search uses queryAssetId instead of text, Immich reuses the stored asset embedding from the DB, so that path does not need live text encoding.
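
The two paths can be pictured as a simple branch. This is a sketch under assumptions: the names (SmartSearchInput, storedEmbeddings, stubEncodeText, resolveQueryVector) are hypothetical, the lookup is an in-memory Map rather than a DB query, and the text encoder is a stub.

```typescript
type Vector = number[];

interface SmartSearchInput {
  query?: string;
  queryAssetId?: string;
}

// Stand-in for embeddings already stored by the indexing job.
const storedEmbeddings = new Map<string, Vector>([["asset-1", [0.2, 0.9]]]);

// Stub text encoder that records how often it is invoked.
let textEncoderCalls = 0;
function stubEncodeText(query: string): Vector {
  textEncoderCalls += 1;
  return [query.length, 0];
}

function resolveQueryVector(input: SmartSearchInput): Vector {
  if (input.queryAssetId !== undefined) {
    // "Find similar to this asset": reuse the stored vector, no text encoding.
    const stored = storedEmbeddings.get(input.queryAssetId);
    if (stored === undefined) {
      throw new Error("no embedding stored for " + input.queryAssetId);
    }
    return stored;
  }
  if (input.query !== undefined) {
    // Text search: the query must be encoded first.
    return stubEncodeText(input.query);
  }
  throw new Error("either query or queryAssetId is required");
}

const fromAsset = resolveQueryVector({ queryAssetId: "asset-1" });
const fromText = resolveQueryVector({ query: "dog" });
console.log(fromAsset, fromText, textEncoderCalls);
```

Only the text branch touches the encoder, which is why an asset-based search depends on the database alone.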

Docs that also confirm this

docs/docs/features/searching.md
- says larger CLIP models are slower both when indexing and when searching

docs/docs/administration/system-settings.md
- says a more powerful GPU can help with batch imports or for faster search

docs/docs/features/ml-hardware-acceleration.md
- explicitly says to check GPU provider logs when a Smart Search job starts or when you search with text in Immich

The same doc also notes:
- ARM NN does not improve search latency
- but Smart Search jobs still use ARM NN

Answer to the original “no GPU / no iGPU” question

Yes, Immich will still work on CPU only. GPU/iGPU is optional acceleration, not a requirement.

What you should expect on CPU-only:

Smart Search indexing: slower
Face recognition: slower
Live text search: still works; latency is usually acceptable but depends on the CLIP model size
Video transcoding: CPU fallback, can spike CPU usage

apetersson/smart search analysis.md
