# Summary of Research on Small Large Language Models
This summary provides an overview of recent research on small language models, and on techniques that pair small models with large language models (LLMs) to improve efficiency, accuracy, or scalability.
## Summary of Work
1. **Dynamic Scoring with Enhanced Semantics for Training-Free Human-Object Interaction Detection**
   - Uses vision-language models, which can include smaller models, to obtain richer semantic understanding without task-specific training, and reports competitive results, particularly on rare interactions.
2. **DynaSearcher: Dynamic Knowledge Graph Augmented Search Agent via Multi-Reward Reinforcement Learning**
   - Shows that a multi-step retrieval agent built on small-scale models can match the accuracy of much larger LLMs at lower computational cost by combining reinforcement learning with knowledge-graph-guided search (a reward sketch follows this list).
3. **R-Stitch: Dynamic Trajectory Stitching for Efficient Reasoning**
   - Proposes a hybrid decoding method that switches between a small language model (SLM) and a large language model (LLM) based on token-level confidence, cutting inference latency by up to 85% while maintaining accuracy; see the decoding sketch below.
4. **R4ec: A Reasoning, Reflection, and Refinement Framework for Recommendation Systems**
   - Applies a System-2-style reasoning framework with an actor model and a reflection model, potentially using smaller models for the iterative refinement steps, and reports practical gains in recommendation quality and revenue; see the refinement-loop sketch below.
5. **Reinforcement Learning Fine-Tunes a Sparse Subnetwork in Large Language Models**
   - Finds that reinforcement-learning fine-tuning updates only a small subset of parameters (5-30%) and leaves most weights untouched, suggesting that effective sparse subnetworks inside large models can be fine-tuned in a parameter-efficient way; see the measurement sketch below.
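
DynaSearcher's central mechanism is a composite reward that balances answer accuracy against knowledge-graph grounding and retrieval cost. The sketch below illustrates the multi-reward idea only; the components, weights, and the `multi_reward` function are assumptions for illustration, not the paper's exact formulation.

```python
# Illustrative multi-reward signal for a KG-augmented search agent.
# All components and weights are assumptions, not DynaSearcher's
# published reward definition.

def multi_reward(answer_correct: bool, kg_facts_used: int,
                 num_search_calls: int,
                 w_acc: float = 1.0, w_kg: float = 0.2,
                 w_cost: float = 0.05) -> float:
    """Scalar reward combining task outcome, knowledge-graph
    grounding, and retrieval efficiency for RL training."""
    r_acc = 1.0 if answer_correct else 0.0  # did the agent answer correctly?
    r_kg = min(kg_facts_used, 5) / 5.0      # reward grounding, capped at 5 facts
    r_cost = float(num_search_calls)        # penalize extra retrieval steps
    return w_acc * r_acc + w_kg * r_kg - w_cost * r_cost
```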
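
R-Stitch's hybrid decoding amounts to token-level routing: decode with the SLM while it is confident, and hand individual steps to the LLM when it is not. A minimal sketch of that control flow, with the two models abstracted as callables; the confidence measure (max softmax probability) and the 0.8 threshold are illustrative assumptions.

```python
# Sketch of confidence-based SLM/LLM token routing in the spirit of
# R-Stitch. Model steps are abstracted as callables; in practice each
# would wrap a causal LM, with confidence = softmax(logits).max().
from typing import Callable, List, Tuple

NextToken = Callable[[List[int]], Tuple[int, float]]  # ids -> (token_id, confidence)

def hybrid_decode(prompt_ids: List[int],
                  slm_step: NextToken,
                  llm_step: NextToken,
                  threshold: float = 0.8,   # assumed routing threshold
                  max_new_tokens: int = 64,
                  eos_id: int = 0) -> List[int]:
    """Decode with the small model by default; defer a step to the
    large model whenever the small model's confidence drops below
    `threshold`."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        tok, conf = slm_step(ids)
        if conf < threshold:   # SLM unsure: stitch in the LLM for this step
            tok, _ = llm_step(ids)
        ids.append(tok)
        if tok == eos_id:
            break
    return ids
```

The latency savings come from most steps staying on the cheap SLM, with the LLM consulted only on the hard tokens.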
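
R4ec's reasoning-reflection-refinement cycle reduces to an actor model drafting an answer and a reflection model critiquing it until no further issues are found. A minimal sketch, assuming generic `actor` and `reflector` callables and a simple "OK" stopping convention that is not from the paper:

```python
# Illustrative reason / reflect / refine loop in the spirit of R4ec.
# `actor` and `reflector` stand in for the two models; the prompts and
# the "OK" stopping rule are assumptions for illustration.
from typing import Callable

def refine_loop(query: str,
                actor: Callable[[str], str],
                reflector: Callable[[str, str], str],
                max_rounds: int = 3) -> str:
    """Actor drafts an answer; reflector critiques it; actor revises
    until the reflector signals no remaining issues."""
    answer = actor(query)                      # initial reasoning pass
    for _ in range(max_rounds):
        critique = reflector(query, answer)    # System-2-style reflection
        if critique.strip().upper() == "OK":   # assumed "no issues" signal
            break
        answer = actor(f"{query}\n\nPrevious answer: {answer}\n"
                       f"Critique: {critique}\nRevise accordingly.")
    return answer
```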
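
The sparse-subnetwork finding can be checked directly on a pair of checkpoints by counting how many weights actually moved during RL fine-tuning. A minimal PyTorch sketch; the checkpoint paths in the usage comment are hypothetical.

```python
# Measure how sparse a fine-tuning update is: compare state dicts
# before and after RL training and count changed parameters.
import torch

def updated_fraction(before: dict, after: dict, tol: float = 0.0) -> float:
    """Fraction of parameters whose value changed by more than `tol`."""
    changed, total = 0, 0
    for name, w0 in before.items():
        w1 = after[name]
        changed += (w1 - w0).abs().gt(tol).sum().item()
        total += w0.numel()
    return changed / total

# Usage (paths are hypothetical):
# before = torch.load("base_model.pt")      # state_dict before RL
# after  = torch.load("rl_tuned_model.pt")  # state_dict after RL
# print(f"{updated_fraction(before, after):.1%} of weights updated")
```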
## Papers
1. Dynamic Scoring with Enhanced Semantics for Training-Free Human-Object Interaction Detection (arXiv:2507.17456v1) - [Link](http://arxiv.org/pdf/2507.17456v1)
2. DynaSearcher: Dynamic Knowledge Graph Augmented Search Agent via Multi-Reward Reinforcement Learning (arXiv:2507.17365v1) - [Link](http://arxiv.org/pdf/2507.17365v1)
3. R-Stitch: Dynamic Trajectory Stitching for Efficient Reasoning (arXiv:2507.17307v1) - [Link](http://arxiv.org/pdf/2507.17307v1)
4. R4ec: A Reasoning, Reflection, and Refinement Framework for Recommendation Systems (arXiv:2507.17249v1) - [Link](http://arxiv.org/pdf/2507.17249v1)
5. Reinforcement Learning Fine-Tunes a Sparse Subnetwork in Large Language Models (arXiv:2507.17107v1) - [Link](http://arxiv.org/pdf/2507.17107v1)