@acheshkov
Last active March 4, 2024 08:03
About Retrieval From Project

Task: Given a graph of snippets $G$ and a query $q$, find the subset of nodes relevant to $q$.

Notes:

  • The project is represented as a graph of snippets
  • If the nodes are tokens and the graph contains every possible edge, the graph is equivalent to the input sequence of a transformer model
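
To make the second note concrete, here is a minimal sketch, assuming the graph is stored as a list of snippets plus a set of directed edges. The names `SnippetGraph`, `adjacency`, and `fully_connected` are hypothetical and exist only for this illustration. When every edge is present, the adjacency matrix is all `True`, which has the same shape as an unmasked self-attention pattern over the tokens. The sketch does not model token order.

```python
from dataclasses import dataclass, field


@dataclass
class SnippetGraph:
    """A project as a graph: nodes are snippets, edges relate pairs of nodes."""
    snippets: list[str]
    edges: set[tuple[int, int]] = field(default_factory=set)

    def adjacency(self) -> list[list[bool]]:
        # adjacency()[i][j] is True when there is an edge from node i to node j.
        n = len(self.snippets)
        return [[(i, j) in self.edges for j in range(n)] for i in range(n)]


def fully_connected(tokens: list[str]) -> SnippetGraph:
    # Degenerate case from the note: one node per token, every edge present
    # (self-loops included, since a token also attends to itself).
    n = len(tokens)
    return SnippetGraph(tokens, {(i, j) for i in range(n) for j in range(n)})


graph = fully_connected(["def", "f", "(", ")"])
mask = graph.adjacency()  # all True: every token can "see" every other token
```

A sparser edge set would correspond to a restricted attention pattern, which is one way to read the graph as a generalisation of the flat sequence.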

Model 1

Calculate the relevance between each node (snippet) and the query.

Possible Downsides:

  1. The relevance of a snippet to the query is computed independently of the other snippets
  2. The right granularity for a snippet in the graph is unclear
  3. It is not clear how to integrate such a retrieval system into an end-to-end NN

Possible Positives:

  1. Easy to parallelize
  2. Embeddings of snippets may be precalculated
  3. Retrieval is $O(n)$, where $n$ is the number of snippets in the project
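
Model 1 could be sketched roughly as follows. This is an illustration under stated assumptions, not a definitive implementation: `embed` is a hypothetical stand-in for a learned encoder (here an L2-normalised bag of words), relevance is assumed to be cosine similarity, and `retrieve` is a hypothetical helper name. Snippet embeddings are computed once up front, and each query then makes a single pass over the $n$ snippets.

```python
import heapq
import math
from collections import Counter


def embed(text: str) -> dict[str, float]:
    """Hypothetical stand-in for a learned encoder: L2-normalised bag of words."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {w: c / norm for w, c in counts.items()} if norm else {}


def relevance(query_vec: dict[str, float], snippet_vec: dict[str, float]) -> float:
    """Dot product of unit vectors, i.e. cosine similarity (an assumed choice)."""
    return sum(v * snippet_vec.get(w, 0.0) for w, v in query_vec.items())


def retrieve(query: str, snippet_vecs: list[dict[str, float]], k: int = 2) -> list[int]:
    """Score every snippet independently and return the indices of the top k."""
    q = embed(query)
    scores = ((relevance(q, vec), i) for i, vec in enumerate(snippet_vecs))
    return [i for _, i in heapq.nlargest(k, scores)]


snippets = [
    "def parse json config file",
    "def render html template",
    "def load json from disk",
]
# Precalculated once per project, independent of any query.
snippet_vecs = [embed(s) for s in snippets]
top = retrieve("parse json", snippet_vecs, k=2)  # indices of the 2 best snippets
```

Each score depends only on one snippet and the query, so the scoring loop parallelises trivially. The same property is what Downside 1 describes: no snippet's score can be influenced by its neighbours in the graph.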

Downside-1

First, I'd like to know how common this scenario is. How often does the relevance between a snippet and the query depend on other snippets in the project? Any thoughts?
