Created
July 24, 2025 14:15
# Small and Large Language Models

This summary covers recent papers on small language models and large language models (LLMs), including their applications, performance, and efficiency improvements.

## Summary of Work

1. **Dynamic Scoring with Enhanced Semantics for HOI Detection**
   - Uses vision-language models (VLMs) to improve human-object interaction (HOI) detection without heavy training.
   - Utilizes small visual cues and textual features for robust multimodal understanding.
2. **DynaSearcher: Small-Scale Model Enhanced Search**
   - Addresses problems that LLM-powered multi-step retrieval agents face, such as factual inconsistency and inefficient search.
   - Introduces dynamic knowledge graph augmentation with reinforcement learning.
   - Achieves competitive accuracy using small models and limited compute resources.
3. **R-Stitch: Efficient Chain-of-Thought Reasoning**
   - Improves reasoning efficiency in LLMs through hybrid decoding that combines a small model and a large model.
   - The small model generates tokens by default; the large model intervenes selectively.
   - Reports up to 85% inference speed-up with minimal accuracy loss.
4. **R4ec: Reasoning, Reflection, and Refinement in Recommendation Systems**
   - Applies LLMs to recommendation systems with System-2-like slow, reflective thinking.
   - Combines an actor model and a reflection model to enhance recommendations.
5. **Reinforcement Learning Fine-Tunes Sparse Subnetworks in LLMs**
   - RL fine-tuning changes only a small fraction of the parameters in an LLM.
   - The same sparse subnetworks are updated consistently, which enables efficient fine-tuning.
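
The claim in item 5, that RL fine-tuning changes only a small fraction of parameters, can be checked by comparing two checkpoints and counting the entries that differ. The sketch below is a minimal illustration, not the paper's procedure: the function name `update_sparsity`, the tolerance parameter `tol`, and the toy checkpoints (plain Python lists keyed by hypothetical layer names) are all assumptions made for this example.

```python
from typing import Dict, List


def update_sparsity(
    before: Dict[str, List[float]],
    after: Dict[str, List[float]],
    tol: float = 0.0,
) -> float:
    """Return the fraction of parameters whose value changed by more than tol."""
    changed = 0
    total = 0
    for name, old_values in before.items():
        new_values = after[name]
        for old, new in zip(old_values, new_values):
            total += 1
            if abs(new - old) > tol:
                changed += 1
    return changed / total if total else 0.0


# Hypothetical toy checkpoints: flat lists of weights keyed by layer name.
before = {
    "layer0.weight": [0.10, -0.20, 0.30, 0.40],
    "layer1.weight": [0.50, 0.60, -0.70, 0.80],
}
after = {
    "layer0.weight": [0.10, -0.20, 0.35, 0.40],  # one entry moved
    "layer1.weight": [0.50, 0.60, -0.70, 0.80],  # untouched
}

fraction = update_sparsity(before, after)
print(f"{fraction:.1%} of parameters changed")
```

With real models the same comparison would run over tensors rather than lists, and a small nonzero `tol` may be useful to ignore numerical noise.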
## Papers

- [Dynamic Scoring with Enhanced Semantics for Training-Free Human-Object Interaction Detection (arXiv:2507.17456)](http://arxiv.org/pdf/2507.17456v1)
- [DynaSearcher: Dynamic Knowledge Graph Augmented Search Agent via Multi-Reward Reinforcement Learning (arXiv:2507.17365)](http://arxiv.org/pdf/2507.17365v1)
- [R-Stitch: Dynamic Trajectory Stitching for Efficient Reasoning (arXiv:2507.17307)](http://arxiv.org/pdf/2507.17307v1)
- [R4ec: A Reasoning, Reflection, and Refinement Framework for Recommendation Systems (arXiv:2507.17249)](http://arxiv.org/pdf/2507.17249v1)
- [Reinforcement Learning Fine-Tunes a Sparse Subnetwork in Large Language Models (arXiv:2507.17107)](http://arxiv.org/pdf/2507.17107v1)

This selection highlights practical approaches to using both small and large language models, emphasizing efficiency, multi-step reasoning, reinforcement learning adaptations, and real-world applications.
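
As one concrete illustration of the efficiency theme, the hybrid decoding idea summarized for R-Stitch (a small model generates by default and a large model steps in selectively) can be sketched as a decoding loop. This is a minimal sketch under stated assumptions, not the paper's implementation: the switching rule (fall back to the large model when the small model's confidence is below a fixed `threshold`), the `Model` callable interface, and the toy models are all hypothetical and exist only to make the example runnable.

```python
from typing import Callable, List, Tuple

# Assumed interface: a "model" maps the tokens so far to (next_token, confidence).
Model = Callable[[List[str]], Tuple[str, float]]


def hybrid_decode(
    small: Model,
    large: Model,
    prompt: List[str],
    max_new_tokens: int,
    threshold: float = 0.7,
    eos: str = "<eos>",
) -> Tuple[List[str], int]:
    """Decode with the small model, deferring to the large one on low confidence."""
    tokens = list(prompt)
    large_calls = 0
    for _ in range(max_new_tokens):
        token, confidence = small(tokens)
        if confidence < threshold:
            # Assumed rule: the large model replaces the uncertain token.
            token, _ = large(tokens)
            large_calls += 1
        tokens.append(token)
        if token == eos:
            break
    return tokens[len(prompt):], large_calls


# Toy stand-ins for real models (hypothetical, scripted outputs).
def toy_small(tokens: List[str]) -> Tuple[str, float]:
    script = [("the", 0.9), ("answer", 0.95), ("5", 0.4), ("<eos>", 0.9)]
    return script[len(tokens) - 1]


def toy_large(tokens: List[str]) -> Tuple[str, float]:
    return ("4", 0.99)


generated, large_calls = hybrid_decode(toy_small, toy_large, ["Q"], max_new_tokens=8)
print(generated, large_calls)
```

In this toy run the large model is called once, for the single low-confidence token; the rest of the sequence comes from the small model, which is where the latency savings would come from in a real system.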