# Small and Large Language Models
This summary collects insights from five recent arXiv papers on small and large language models (LLMs), covering their applications, performance, and efficiency improvements.
## Summary of Work
1. **Dynamic Scoring with Enhanced Semantics for HOI Detection**
- Uses vision-language models (VLMs) to improve human-object interaction (HOI) detection in a training-free setting.
- Utilizes small visual cues and textual features for robust multimodal understanding.
2. **DynaSearcher: Small-Scale Model Enhanced Search**
- Multi-step retrieval agents powered by LLMs face issues like factual inconsistency and inefficient search.
- Introduces dynamic knowledge graph augmentation with reinforcement learning.
- Achieves competitive accuracy using small models and limited compute resources.
3. **R-Stitch: Efficient Chain-of-Thought Reasoning**
- Speeds up chain-of-thought reasoning through hybrid decoding that combines a small and a large model.
- The small model generates tokens by default; the large model intervenes selectively.
- Reports an inference speed-up of up to 85% with minimal accuracy loss.
4. **R4ec: Reasoning, Reflection, and Refinement in Recommendation Systems**
- Applies LLMs to recommendation systems using slow, reflective, System-2-like thinking.
- Combines actor and reflection models to enhance recommendations.
5. **Reinforcement Learning Fine-Tunes Sparse Subnetworks in LLMs**
- RL fine-tuning changes only a small fraction of parameters in LLMs.
- The same sparse subnetworks are updated consistently, which enables efficient fine-tuning.
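
The hybrid decoding idea summarized for R-Stitch (item 3) can be illustrated with a toy loop. This is only a sketch under stated assumptions, not the paper's implementation: the summary does not say how the large model decides to intervene, so the confidence threshold used here is an assumption, and `hybrid_decode`, `toy_small`, and `toy_large` are hypothetical names invented for the example.

```python
from typing import Callable, List, Tuple

# A "model" here is any function that maps the tokens so far to
# (next_token, confidence). Real LLMs are replaced by toy stand-ins below.
Model = Callable[[List[str]], Tuple[str, float]]


def hybrid_decode(
    small: Model,
    large: Model,
    prompt: List[str],
    max_new_tokens: int = 16,
    threshold: float = 0.8,  # assumed switching rule, not taken from the paper
    eos: str = "<eos>",
) -> Tuple[List[str], int]:
    """Decode with the small model by default; call the large model
    only when the small model's confidence falls below `threshold`."""
    tokens = list(prompt)
    large_calls = 0
    for _ in range(max_new_tokens):
        token, confidence = small(tokens)
        if confidence < threshold:
            # Selective intervention: discard the small model's proposal.
            token, _ = large(tokens)
            large_calls += 1
        tokens.append(token)
        if token == eos:
            break
    return tokens[len(prompt):], large_calls


def toy_small(tokens: List[str]) -> Tuple[str, float]:
    # Hypothetical small model: unsure whenever the context length is a multiple of 3.
    n = len(tokens)
    return f"s{n}", (0.5 if n % 3 == 0 else 0.9)


def toy_large(tokens: List[str]) -> Tuple[str, float]:
    # Hypothetical large model: always confident.
    return f"L{len(tokens)}", 0.99


generated, large_calls = hybrid_decode(toy_small, toy_large, ["q"], max_new_tokens=6)
print(generated)    # ['s1', 's2', 'L3', 's4', 's5', 'L6']
print(large_calls)  # 2
```

In this toy run the large model produces only two of the six tokens, which is where a hybrid scheme of this kind would save compute, provided the small model is cheap and usually confident.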
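
The sparsity claim in item 5 can be checked on any pair of checkpoints by counting how many parameters actually changed. The following is a minimal sketch, assuming weights are available as plain name-to-list mappings; the function names, the `tol` parameter, and the example values are illustrative and do not come from the paper.

```python
from typing import Dict, List

# Hypothetical checkpoint format: tensor name -> flat list of parameter values.
Weights = Dict[str, List[float]]


def changed_fraction_per_tensor(
    before: Weights, after: Weights, tol: float = 0.0
) -> Dict[str, float]:
    """For each tensor, the fraction of entries that moved by more than `tol`."""
    report = {}
    for name, old in before.items():
        new = after[name]
        changed = sum(1 for a, b in zip(old, new) if abs(a - b) > tol)
        report[name] = changed / len(old)
    return report


def overall_changed_fraction(
    before: Weights, after: Weights, tol: float = 0.0
) -> float:
    """Fraction of all parameters that moved by more than `tol`."""
    total = sum(len(values) for values in before.values())
    changed = sum(
        1
        for name, old in before.items()
        for a, b in zip(old, after[name])
        if abs(a - b) > tol
    )
    return changed / total


# Made-up weights: only one of eight parameters differs after "fine-tuning".
base = {"layer0": [0.1, 0.2, 0.3, 0.4], "layer1": [1.0, 2.0, 3.0, 4.0]}
tuned = {"layer0": [0.1, 0.2, 0.3, 0.5], "layer1": [1.0, 2.0, 3.0, 4.0]}

per_tensor = changed_fraction_per_tensor(base, tuned)
overall = overall_changed_fraction(base, tuned)
print(per_tensor)  # {'layer0': 0.25, 'layer1': 0.0}
print(overall)     # 0.125
```

Running the same comparison across several fine-tuning runs would show whether the changed entries overlap, which is the sense of "consistently updated" in the summary above.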
## Papers
- [Dynamic Scoring with Enhanced Semantics for Training-Free Human-Object Interaction Detection (arXiv:2507.17456)](http://arxiv.org/pdf/2507.17456v1)
- [DynaSearcher: Dynamic Knowledge Graph Augmented Search Agent via Multi-Reward Reinforcement Learning (arXiv:2507.17365)](http://arxiv.org/pdf/2507.17365v1)
- [R-Stitch: Dynamic Trajectory Stitching for Efficient Reasoning (arXiv:2507.17307)](http://arxiv.org/pdf/2507.17307v1)
- [R4ec: A Reasoning, Reflection, and Refinement Framework for Recommendation Systems (arXiv:2507.17249)](http://arxiv.org/pdf/2507.17249v1)
- [Reinforcement Learning Fine-Tunes a Sparse Subnetwork in Large Language Models (arXiv:2507.17107)](http://arxiv.org/pdf/2507.17107v1)
This selection highlights practical approaches to using both small and large language models, emphasizing efficiency, multi-step reasoning, reinforcement learning adaptations, and real-world applications.