hugobowne · July 24, 2025 14:15
diff --git a/Small and Large Language Models - Summary of Recent Research b/Small and Large Language Models - Summary of Recent Research
 # Small and Large Language Models

 This summary covers recent papers on small language models and large language models (LLMs), including their applications, performance, and efficiency improvements.

 ## Summary of Work

 1. **Dynamic Scoring with Enhanced Semantics for HOI Detection**
   - Focuses on using Vision-Language Models (VLMs) to improve human-object interaction (HOI) detection in a training-free setting.
   - Uses small visual cues and textual features for robust multimodal understanding.

 2. **DynaSearcher: Small-Scale Model Enhanced Search**
   - Multi-step retrieval agents powered by LLMs face issues like factual inconsistency and inefficient search.
   - Introduces dynamic knowledge graph augmentation with reinforcement learning.
   - Achieves competitive accuracy using small models and limited compute resources.

 3. **R-Stitch: Efficient Chain-of-Thought Reasoning**
   - Speeds up chain-of-thought reasoning through hybrid decoding with a small and a large model.
   - The small model generates tokens by default; the large model intervenes selectively.
   - Reports up to 85% inference speed-up with minimal accuracy loss.

 4. **R4ec: Reasoning, Reflection, and Refinement in Recommendation Systems**
   - Applies LLMs to recommendation systems with System-2-like slow, reflective thinking.
   - Combines actor and reflection models to enhance recommendations.

 5. **Reinforcement Learning Fine-Tunes Sparse Subnetworks in LLMs**
   - RL fine-tuning changes only a small fraction of an LLM's parameters.
   - Sparse subnetworks are consistently updated, enabling efficient fine-tuning.
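
The sparse-update observation in item 5 can be checked on any pair of checkpoints by counting how many parameters actually changed. The sketch below is a minimal illustration, not the paper's code. It assumes weights are available as plain Python lists keyed by parameter name; the names `changed_mask`, `changed_fraction`, `mask_overlap`, and the tolerance `tol` are hypothetical.

```python
from typing import Dict, List

# Assumption: a checkpoint is a dict of flat float lists keyed by parameter name.
Weights = Dict[str, List[float]]
Mask = Dict[str, List[bool]]


def changed_mask(before: Weights, after: Weights, tol: float = 0.0) -> Mask:
    """Mark each parameter whose value moved by more than `tol`."""
    mask: Mask = {}
    for name, old in before.items():
        new = after[name]
        if len(old) != len(new):
            raise ValueError(f"shape mismatch for {name!r}")
        mask[name] = [abs(o - n) > tol for o, n in zip(old, new)]
    return mask


def changed_fraction(before: Weights, after: Weights, tol: float = 0.0) -> float:
    """Fraction of all parameters that were updated by fine-tuning."""
    mask = changed_mask(before, after, tol)
    total = sum(len(flags) for flags in mask.values())
    if total == 0:
        raise ValueError("no parameters to compare")
    return sum(sum(flags) for flags in mask.values()) / total


def mask_overlap(a: Mask, b: Mask) -> float:
    """Jaccard overlap of two update masks (1.0 if neither run changed anything)."""
    both = either = 0
    for name, flags_a in a.items():
        for fa, fb in zip(flags_a, b[name]):
            both += fa and fb
            either += fa or fb
    return both / either if either else 1.0


# Toy checkpoints: one of eight parameters moves during "fine-tuning".
base = {"layer.0": [0.1, 0.2, 0.3, 0.4], "layer.1": [1.0, 2.0, 3.0, 4.0]}
tuned = {"layer.0": [0.1, 0.25, 0.3, 0.4], "layer.1": [1.0, 2.0, 3.0, 4.0]}
print(changed_fraction(base, tuned))
```

Comparing the masks from two independent fine-tuning runs with `mask_overlap` gives a rough measure of whether the same subnetwork is updated each time.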

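The hybrid decoding idea in item 3 (small model by default, large model only when needed) can be sketched as a token-level routing loop. This is an illustration under stated assumptions, not R-Stitch's implementation: the summary does not say how intervention is triggered, so the confidence threshold used here is an assumption, and `hybrid_decode`, `scripted_model`, and the toy token scripts are hypothetical.

```python
from typing import Callable, List, Tuple

# Assumption: a "model" maps the tokens so far to (next_token, confidence in [0, 1]).
Model = Callable[[List[str]], Tuple[str, float]]


def hybrid_decode(
    small: Model,
    large: Model,
    prompt: List[str],
    threshold: float = 0.8,
    max_new_tokens: int = 16,
    eos: str = "<eos>",
) -> Tuple[List[str], List[str]]:
    """Decode with the small model, deferring to the large one on low confidence.

    Returns the generated tokens and, per token, which model produced it.
    """
    tokens = list(prompt)
    sources: List[str] = []
    for _ in range(max_new_tokens):
        token, confidence = small(tokens)
        source = "small"
        if confidence < threshold:
            # Assumed trigger: the small model is unsure, so ask the large model.
            token, _ = large(tokens)
            source = "large"
        tokens.append(token)
        sources.append(source)
        if token == eos:
            break
    return tokens[len(prompt):], sources


def scripted_model(script: List[Tuple[str, float]], prompt_len: int) -> Model:
    """Toy stand-in for a real model: replays a fixed (token, confidence) script."""
    def step(tokens: List[str]) -> Tuple[str, float]:
        return script[len(tokens) - prompt_len]
    return step


prompt = ["Q:"]
small = scripted_model(
    [("2", 0.95), ("+", 0.9), ("2", 0.95), ("=", 0.9), ("5", 0.4), ("<eos>", 0.99)],
    len(prompt),
)
large = scripted_model(
    [("2", 1.0), ("+", 1.0), ("2", 1.0), ("=", 1.0), ("4", 1.0), ("<eos>", 1.0)],
    len(prompt),
)
generated, sources = hybrid_decode(small, large, prompt)
print(generated, sources)
```

In this toy run the large model is called once, at the single low-confidence step. Any speed-up in a real system depends on how rarely that fallback fires and on the relative cost of the two models.
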
 ## Papers

 - [Dynamic Scoring with Enhanced Semantics for Training-Free Human-Object Interaction Detection (arXiv:2507.17456)](http://arxiv.org/pdf/2507.17456v1)
 - [DynaSearcher: Dynamic Knowledge Graph Augmented Search Agent via Multi-Reward Reinforcement Learning (arXiv:2507.17365)](http://arxiv.org/pdf/2507.17365v1)
 - [R-Stitch: Dynamic Trajectory Stitching for Efficient Reasoning (arXiv:2507.17307)](http://arxiv.org/pdf/2507.17307v1)
 - [R4ec: A Reasoning, Reflection, and Refinement Framework for Recommendation Systems (arXiv:2507.17249)](http://arxiv.org/pdf/2507.17249v1)
 - [Reinforcement Learning Fine-Tunes a Sparse Subnetwork in Large Language Models (arXiv:2507.17107)](http://arxiv.org/pdf/2507.17107v1)

 This selection highlights practical approaches to using both small and large language models, emphasizing efficiency, multi-step reasoning, reinforcement learning adaptations, and real-world applications.