rag-reranking-gpt-colbert.ipynb
@virattt Do you know the difference between using:

query_embedding = model(**query_encoding).last_hidden_state.squeeze(0)

versus

query_embedding = model(**query_encoding).last_hidden_state.mean(dim=1)

I have tested both, and it seems that squeeze(0) returns better-quality similar documents (though that may just be the use case I tried).
query_embedding = model(**query_encoding).last_hidden_state.squeeze(0) is the right choice for ColBERT-style reranking, since it keeps one embedding vector per token, whereas

query_embedding = model(**query_encoding).last_hidden_state.mean(dim=1)

returns a single vector averaged over all tokens.
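For anyone comparing the two in practice, here is a minimal sketch (not the notebook's exact code) of how per-token embeddings feed ColBERT-style MaxSim scoring, while mean pooling reduces to ordinary single-vector cosine similarity. It assumes a BERT-style encoder from Hugging Face transformers; the model name, query, and document text are placeholders.

```python
# Minimal sketch, assuming a BERT-style encoder from Hugging Face transformers.
# The model name, query, and document below are placeholders, not the gist's code.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

query_encoding = tokenizer("What did the author work on?", return_tensors="pt")
doc_encoding = tokenizer("The author wrote essays and built software.", return_tensors="pt")

with torch.no_grad():
    query_hidden = model(**query_encoding).last_hidden_state  # (1, num_query_tokens, hidden)
    doc_hidden = model(**doc_encoding).last_hidden_state      # (1, num_doc_tokens, hidden)

# Option 1: keep one embedding per token (what squeeze(0) gives you).
query_tokens = query_hidden.squeeze(0)  # (num_query_tokens, hidden)
doc_tokens = doc_hidden.squeeze(0)      # (num_doc_tokens, hidden)

# ColBERT-style MaxSim: each query token matches its most similar document
# token; the document score is the sum of those per-token maxima.
q = F.normalize(query_tokens, dim=-1)
d = F.normalize(doc_tokens, dim=-1)
maxsim_score = (q @ d.T).max(dim=1).values.sum().item()

# Option 2: mean pooling (what mean(dim=1) gives you) collapses everything
# into a single vector, so scoring is plain cosine similarity.
query_vec = query_hidden.mean(dim=1)  # (1, hidden)
doc_vec = doc_hidden.mean(dim=1)      # (1, hidden)
cosine_score = F.cosine_similarity(query_vec, doc_vec).item()

print(f"MaxSim (per-token):   {maxsim_score:.4f}")
print(f"Cosine (mean-pooled): {cosine_score:.4f}")
```

Note that the raw MaxSim score grows with the number of query tokens, so it is only meaningful for ranking documents against the same query, whereas the mean-pooled cosine score stays in [-1, 1].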