Information Retrieval Papers
- ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT (Khattab et al., 2020)
https://arxiv.org/abs/2004.12832
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (Lewis et al., 2020)
https://arxiv.org/abs/2005.11401
- Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval (Xiong et al., 2020)
https://arxiv.org/abs/2007.00808
- Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation (Hofstätter et al., 2020)
https://arxiv.org/abs/2010.02666
- BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models (Thakur et al., 2021)
https://arxiv.org/abs/2104.08663
- SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking (Formal et al., 2021)
https://arxiv.org/abs/2107.05720
- InPars: Data Augmentation for Information Retrieval using Large Language Models (Bonifacio et al., 2022)
https://arxiv.org/abs/2202.05144
- Transformer Memory as a Differentiable Search Index (Tay et al., 2022)
https://arxiv.org/abs/2202.06991
- RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder (Xiao et al., 2022)
https://arxiv.org/abs/2205.12035
- Promptagator: Few-shot Dense Retrieval From 8 Examples (Dai et al., 2022)
https://arxiv.org/abs/2209.11755
- MTEB: Massive Text Embedding Benchmark (Muennighoff et al., 2022)
https://arxiv.org/abs/2210.07316
- Task-aware Retrieval with Instructions (Asai et al., 2022)
https://arxiv.org/abs/2211.09260
- Text Embeddings by Weakly-Supervised Contrastive Pre-training (Wang et al., 2022)
https://arxiv.org/abs/2212.03533
- BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation (Chen et al., 2024)
https://arxiv.org/abs/2402.03216
- Generative Representational Instruction Tuning (Muennighoff et al., 2024)
https://arxiv.org/abs/2402.09906
- LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders (BehnamGhader et al., 2024)
https://arxiv.org/abs/2404.05961