Notes:
- The project is represented as a graph of snippets
- If the nodes are tokens and the graph contains every possible edge, the graph is equivalent to the input sequence of a transformer model, because self-attention lets every token attend to every other token
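A minimal sketch of the equivalence in the last note, under simplifying assumptions that are mine and not part of the notes: queries, keys and values are all the raw token vectors (no learned projections), positional information is ignored, and "all possible edges" includes self-loops. The function names and the toy vectors are hypothetical. Attention restricted to graph neighbours then gives the same output as ordinary self-attention when the graph is complete:

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def full_attention(x):
    """Plain self-attention; Q = K = V = x is a simplifying assumption."""
    d = len(x[0])
    out = []
    for q in x:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in x])
        out.append([sum(w * v[c] for w, v in zip(weights, x)) for c in range(d)])
    return out

def graph_attention(x, adj):
    """Attention in which node i only sees nodes j with adj[i][j] == 1."""
    d = len(x[0])
    out = []
    for i, q in enumerate(x):
        nbrs = [x[j] for j in range(len(x)) if adj[i][j]]
        weights = softmax([dot(q, k) / math.sqrt(d) for k in nbrs])
        out.append([sum(w * v[c] for w, v in zip(weights, nbrs)) for c in range(d)])
    return out

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy token vectors
complete = [[1] * 3 for _ in range(3)]          # every edge, self-loops included
same = all(
    math.isclose(a, b)
    for row_a, row_b in zip(graph_attention(tokens, complete), full_attention(tokens))
    for a, b in zip(row_a, row_b)
)
print(same)  # True
```

Removing edges from `complete` makes the two outputs diverge, which is the sense in which a sparser snippet graph restricts what a transformer-style model can see.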
Calculate the relevance between each node (snippet) and the query.
Possible Downsides:
- The relevance of a snippet to the query is computed independently of the other snippets
- The granularity of a snippet in the graph is unclear
- It is not clear how to integrate such a retrieval system into an end-to-end NN
Possible Positives:
- Easy to parallelize
- Embeddings of snippets may be precalculated
- Retrieval is $O(n)$, where $n$ is the number of snippets in the project
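To make the trade-offs above concrete, here is a rough sketch of independent snippet scoring. Everything in it is an assumption for illustration: the snippet names, the hand-written 2-d vectors standing in for precomputed embeddings, the choice of cosine similarity as the relevance measure, and the `retrieve` helper itself. Each snippet is scored against the query on its own, so scoring is one pass over $n$ snippets, and picking the top $k$ with `heapq.nlargest` adds $O(n \log k)$.

```python
import heapq
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(p * q for p, q in zip(a, b)) / (norm_a * norm_b)

def retrieve(query_vec, snippet_vecs, k=2):
    """Score every snippet independently of the others and return the top-k names."""
    best = heapq.nlargest(
        k, snippet_vecs.items(), key=lambda item: cosine(query_vec, item[1])
    )
    return [name for name, _ in best]

# Hypothetical precomputed embeddings; a real system would get these from an encoder.
snippet_vecs = {
    "parse_args": [0.9, 0.1],
    "load_config": [0.2, 0.8],
    "main": [0.7, 0.7],
}
query_vec = [1.0, 0.0]

print(retrieve(query_vec, snippet_vecs))  # ['parse_args', 'main']
```

Because each score depends only on one snippet and the query, the loop parallelizes trivially and the snippet vectors can be cached, but the score of `main` cannot change based on what `parse_args` contains, which is exactly the first downside listed.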
First, I'd like to know how common this scenario is. How often does the relevance between a snippet and the query depend on other snippets in the project? Any thoughts?