# Summary of Research on Small Large Language Models
This summary provides an overview of recent research on small language models, and on techniques that pair small models with large language models (LLMs) to improve efficiency, accuracy, or scalability.
## Summary of Work
1. **Dynamic Scoring with Enhanced Semantics for Training-Free Human-Object Interaction Detection**
   - Uses vision-language models, which can include smaller models, to obtain richer semantic understanding without task-specific training, and reports competitive results, particularly on rare interactions.
2. **DynaSearcher: Dynamic Knowledge Graph Augmented Search Agent via Multi-Reward Reinforcement Learning**
   - Shows that a multi-step retrieval agent built on small-scale models can match the accuracy of much larger LLMs at lower computational cost by combining reinforcement learning with knowledge-graph-guided search (a reward sketch follows this list).
3. **R-Stitch: Dynamic Trajectory Stitching for Efficient Reasoning**
   - Proposes a hybrid decoding method that switches between a small language model (SLM) and a large language model (LLM) based on token-level confidence, cutting inference latency by up to 85% while maintaining accuracy; see the decoding sketch below.
4. **R4ec: A Reasoning, Reflection, and Refinement Framework for Recommendation Systems**
   - Applies a System-2-style reasoning framework with an actor model and a reflection model, potentially using smaller models for the iterative refinement steps, and reports practical gains in recommendation quality and revenue; see the refinement-loop sketch below.
5. **Reinforcement Learning Fine-Tunes a Sparse Subnetwork in Large Language Models**
   - Finds that reinforcement-learning fine-tuning updates only a small subset of parameters (5-30%) and leaves most weights untouched, suggesting that effective sparse subnetworks inside large models can be fine-tuned in a parameter-efficient way; see the measurement sketch below.
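
DynaSearcher's central mechanism is a composite reward that balances answer accuracy against knowledge-graph grounding and retrieval cost. The sketch below illustrates the multi-reward idea only; the components, weights, and the `multi_reward` function are assumptions for illustration, not the paper's exact formulation.

```python
# Illustrative multi-reward signal for a KG-augmented search agent.
# All components and weights are assumptions, not DynaSearcher's
# published reward definition.

def multi_reward(answer_correct: bool, kg_facts_used: int,
                 num_search_calls: int,
                 w_acc: float = 1.0, w_kg: float = 0.2,
                 w_cost: float = 0.05) -> float:
    """Scalar reward combining task outcome, knowledge-graph
    grounding, and retrieval efficiency for RL training."""
    r_acc = 1.0 if answer_correct else 0.0  # did the agent answer correctly?
    r_kg = min(kg_facts_used, 5) / 5.0      # reward grounding, capped at 5 facts
    r_cost = float(num_search_calls)        # penalize extra retrieval steps
    return w_acc * r_acc + w_kg * r_kg - w_cost * r_cost
```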
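
R-Stitch's hybrid decoding amounts to token-level routing: decode with the SLM while it is confident, and hand individual steps to the LLM when it is not. A minimal sketch of that control flow, with the two models abstracted as callables; the confidence measure (max softmax probability) and the 0.8 threshold are illustrative assumptions.

```python
# Sketch of confidence-based SLM/LLM token routing in the spirit of
# R-Stitch. Model steps are abstracted as callables; in practice each
# would wrap a causal LM, with confidence = softmax(logits).max().
from typing import Callable, List, Tuple

NextToken = Callable[[List[int]], Tuple[int, float]]  # ids -> (token_id, confidence)

def hybrid_decode(prompt_ids: List[int],
                  slm_step: NextToken,
                  llm_step: NextToken,
                  threshold: float = 0.8,   # assumed routing threshold
                  max_new_tokens: int = 64,
                  eos_id: int = 0) -> List[int]:
    """Decode with the small model by default; defer a step to the
    large model whenever the small model's confidence drops below
    `threshold`."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        tok, conf = slm_step(ids)
        if conf < threshold:   # SLM unsure: stitch in the LLM for this step
            tok, _ = llm_step(ids)
        ids.append(tok)
        if tok == eos_id:
            break
    return ids
```

The latency savings come from most steps staying on the cheap SLM, with the LLM consulted only on the hard tokens.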
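
R4ec's reasoning-reflection-refinement cycle reduces to an actor model drafting an answer and a reflection model critiquing it until no further issues are found. A minimal sketch, assuming generic `actor` and `reflector` callables and a simple "OK" stopping convention that is not from the paper:

```python
# Illustrative reason / reflect / refine loop in the spirit of R4ec.
# `actor` and `reflector` stand in for the two models; the prompts and
# the "OK" stopping rule are assumptions for illustration.
from typing import Callable

def refine_loop(query: str,
                actor: Callable[[str], str],
                reflector: Callable[[str, str], str],
                max_rounds: int = 3) -> str:
    """Actor drafts an answer; reflector critiques it; actor revises
    until the reflector signals no remaining issues."""
    answer = actor(query)                      # initial reasoning pass
    for _ in range(max_rounds):
        critique = reflector(query, answer)    # System-2-style reflection
        if critique.strip().upper() == "OK":   # assumed "no issues" signal
            break
        answer = actor(f"{query}\n\nPrevious answer: {answer}\n"
                       f"Critique: {critique}\nRevise accordingly.")
    return answer
```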
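
The sparse-subnetwork finding can be checked directly on a pair of checkpoints by counting how many weights actually moved during RL fine-tuning. A minimal PyTorch sketch; the checkpoint paths in the usage comment are hypothetical.

```python
# Measure how sparse a fine-tuning update is: compare state dicts
# before and after RL training and count changed parameters.
import torch

def updated_fraction(before: dict, after: dict, tol: float = 0.0) -> float:
    """Fraction of parameters whose value changed by more than `tol`."""
    changed, total = 0, 0
    for name, w0 in before.items():
        w1 = after[name]
        changed += (w1 - w0).abs().gt(tol).sum().item()
        total += w0.numel()
    return changed / total

# Usage (paths are hypothetical):
# before = torch.load("base_model.pt")      # state_dict before RL
# after  = torch.load("rl_tuned_model.pt")  # state_dict after RL
# print(f"{updated_fraction(before, after):.1%} of weights updated")
```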
## Papers
1. Dynamic Scoring with Enhanced Semantics for Training-Free Human-Object Interaction Detection (arXiv:2507.17456v1) - [Link](http://arxiv.org/pdf/2507.17456v1)
2. DynaSearcher: Dynamic Knowledge Graph Augmented Search Agent via Multi-Reward Reinforcement Learning (arXiv:2507.17365v1) - [Link](http://arxiv.org/pdf/2507.17365v1)
3. R-Stitch: Dynamic Trajectory Stitching for Efficient Reasoning (arXiv:2507.17307v1) - [Link](http://arxiv.org/pdf/2507.17307v1)
4. R4ec: A Reasoning, Reflection, and Refinement Framework for Recommendation Systems (arXiv:2507.17249v1) - [Link](http://arxiv.org/pdf/2507.17249v1)
5. Reinforcement Learning Fine-Tunes a Sparse Subnetwork in Large Language Models (arXiv:2507.17107v1) - [Link](http://arxiv.org/pdf/2507.17107v1)