@ruvnet
Last active March 10, 2024 17:00
Mixture-of-LoRAs: An Efficient Multitask Tuning for Large Language Models

Here are some of the key benefits of combining multiple LoRA adapters using techniques like Mixture-of-LoRAs (MoA):

  1. Efficient multi-task learning: MoA trains multiple task-specific LoRA adapters separately and then combines them for multi-task inference, which avoids interference between heterogeneous tasks during training.

  2. Parameter efficiency: A LoRA adapter trains only a small set of low-rank weights rather than the entire model, so combining multiple LoRAs is far cheaper than combining multiple fully fine-tuned models.

  3. Improved single-task performance: The MoA routing mechanism lets the model draw on complementary knowledge across tasks, which can improve single-task performance compared to a LoRA trained on that task's data alone. The adapters specialize while also benefiting from positive transfer.

  4. Flexible composition: LoRA adapters can be mixed and matched after training to build models with different combinations of capabilities. Adapters can be added, removed, or updated individually.

  5. Fast adaptation: Each LoRA adapter can be efficiently updated on new data from its task, without having to retrain the entire multi-task model. This allows fast adaptation to new domains.

  6. Computational efficiency: MoA uses a simple but effective routing mechanism to select the relevant adapters for an input, avoiding computation in irrelevant adapters. The routing parameters are lightweight.
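To make the parameter-efficiency point concrete, here is a rough count for a single square projection matrix. The hidden size, rank, and number of tasks are assumed, illustrative values and are not taken from the paper; the only formula used is the standard LoRA factorization, where the update is the product of a `d_out x rank` matrix and a `rank x d_in` matrix.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights in one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def full_params(d_in: int, d_out: int) -> int:
    """Weights in the dense matrix that full fine-tuning would update."""
    return d_in * d_out


# Assumed, illustrative values (not taken from the paper).
hidden_size = 4096
rank = 8
num_tasks = 3

per_adapter = lora_params(hidden_size, hidden_size, rank)
dense = full_params(hidden_size, hidden_size)

print(f"one LoRA pair:      {per_adapter:,} weights")
print(f"{num_tasks} LoRA pairs:       {num_tasks * per_adapter:,} weights")
print(f"one dense matrix:   {dense:,} weights")
print(f"dense / LoRA ratio: {dense // per_adapter}x")
```

With these assumed sizes, three adapters together hold roughly 1.2% of the weights of the single dense matrix they modify.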

In summary, MoA provides an elegant way to get the benefits of both specialization and positive transfer in a modular, computationally efficient multi-task architecture based on composable LoRA adapters. This opens up applications in flexibly combining domain-specific expertise in large language models.
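The routing idea can be illustrated with a toy, pure-Python sketch of top-1 routing over LoRA adapters on a single linear layer. This is not the paper's implementation: the `LoRAAdapter` class, the `routed_forward` function, the linear router, and all sizes are hypothetical, and both low-rank matrices are randomly initialized here only so that the toy update is non-zero.

```python
import math
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class LoRAAdapter:
    """Hypothetical toy adapter holding a low-rank pair (A, B) and a scale."""

    def __init__(self, d_in, d_out, rank, scale, rng):
        self.A = [[rng.gauss(0.0, 0.02) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[rng.gauss(0.0, 0.02) for _ in range(rank)] for _ in range(d_out)]
        self.scale = scale

    def delta(self, x):
        """Low-rank update scale * B @ (A @ x)."""
        return [self.scale * y for y in matvec(self.B, matvec(self.A, x))]


def routed_forward(x, W, adapters, router_W):
    """Frozen base output plus the update of the single highest-scoring adapter."""
    probs = softmax(matvec(router_W, x))
    chosen = max(range(len(adapters)), key=lambda i: probs[i])
    base = matvec(W, x)
    update = adapters[chosen].delta(x)  # the other adapters are never evaluated
    return [b + u for b, u in zip(base, update)], chosen, probs


rng = random.Random(0)
d_model, rank, num_adapters = 4, 2, 3
W = [[rng.gauss(0.0, 0.1) for _ in range(d_model)] for _ in range(d_model)]
router_W = [[rng.gauss(0.0, 1.0) for _ in range(d_model)] for _ in range(num_adapters)]
adapters = [LoRAAdapter(d_model, d_model, rank, scale=2.0, rng=rng) for _ in range(num_adapters)]

x = [1.0, -0.5, 0.25, 2.0]
y, chosen, probs = routed_forward(x, W, adapters, router_W)
print("chosen adapter:", chosen)
print("router probabilities:", [round(p, 3) for p in probs])
```

Because only the selected adapter's `delta` is computed, the per-input cost stays close to that of a single LoRA regardless of how many adapters are loaded; the router itself adds one small matrix-vector product.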


{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Combining Multiple LoRA Adapters with peft\n",
"\n",
"This notebook shows how to efficiently combine multiple LoRA (Low-Rank Adaptation) adapters using the [peft](https://github.com/huggingface/peft) library to enable multi-task learning in large language models.\n",
"\n",
"LoRA allows adapting pretrained models like LLaMA to new tasks by training only a small set of weights. This enables parameter-efficient fine-tuning.\n",
"\n",
"By combining multiple LoRA adapters, we can leverage knowledge from different specialized adapters to improve performance on downstream tasks that require multiple skills."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Install peft and transformers (both are imported below)\n",
"!pip install peft transformers"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from peft import PeftModel\n",
"from transformers import AutoModelForCausalLM, AutoTokenizer\n",
"\n",
"# Load base model and tokenizer\n",
"model_name = \"decapoda-research/llama-7b-hf\"\n",
"base_model = AutoModelForCausalLM.from_pretrained(model_name)\n",
"tokenizer = AutoTokenizer.from_pretrained(model_name)\n",
"\n",
"# Adapter names are placeholders for local paths or Hub repo IDs\n",
"adapter_names = [\"alpaca-lora\", \"code-lora\", \"summarize-lora\"]\n",
"\n",
"# Loading the first adapter wraps the base model in a PeftModel\n",
"model = PeftModel.from_pretrained(base_model, adapter_names[0], adapter_name=adapter_names[0])\n",
"\n",
"# Load the remaining adapters into the same PeftModel\n",
"for name in adapter_names[1:]:\n",
"    model.load_adapter(name, adapter_name=name)\n",
"\n",
"# Combine adapters into a new adapter using peft's `add_weighted_adapter` method.\n",
"# \"cat\" concatenates along the rank dimension, so the adapters may differ in rank.\n",
"model.add_weighted_adapter(\n",
"    adapters=adapter_names,\n",
"    weights=[1.0, 0.8, 0.5],\n",
"    adapter_name=\"combined\",\n",
"    combination_type=\"cat\",\n",
")\n",
"model.set_adapter(\"combined\")\n",
"\n",
"# Test the combined model\n",
"prompt = \"\"\"Summarize the following Python code:\n",
"\n",
"def fibonacci(n):\n",
"    if n <= 0:\n",
"        return 0\n",
"    elif n == 1:\n",
"        return 1\n",
"    else:\n",
"        return fibonacci(n-1) + fibonacci(n-2)\n",
"\"\"\"\n",
"\n",
"input_ids = tokenizer.encode(prompt, return_tensors=\"pt\")\n",
"output = model.generate(input_ids=input_ids, max_new_tokens=50)\n",
"print(tokenizer.decode(output[0], skip_special_tokens=True))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
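Conceptually, a weighted combination of LoRA adapters adds each adapter's low-rank update to the frozen weight, scaled by that adapter's weight. The pure-Python sketch below checks that concatenating the adapters along the rank dimension yields the same update as summing the weighted updates one by one. It illustrates the underlying linear algebra only and is not peft's internal implementation; the helper names and the tiny example matrices are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def weighted_delta(adapters, weights):
    """Sum of w_i * (B_i @ A_i), computed adapter by adapter."""
    d_out, d_in = len(adapters[0][0]), len(adapters[0][1][0])
    total = [[0.0] * d_in for _ in range(d_out)]
    for (B, A), w in zip(adapters, weights):
        delta = matmul(B, A)
        for i in range(d_out):
            for j in range(d_in):
                total[i][j] += w * delta[i][j]
    return total


def concat_delta(adapters, weights):
    """Same update via rank concatenation: B_cat = [B_1 | B_2 | ...], A_cat = [w_1*A_1; w_2*A_2; ...]."""
    d_out = len(adapters[0][0])
    B_cat = [sum((B[i] for B, _ in adapters), []) for i in range(d_out)]
    A_cat = [[w * a for a in row] for (_, A), w in zip(adapters, weights) for row in A]
    return matmul(B_cat, A_cat)


# Each adapter is a (B, A) pair: B is (d_out x rank), A is (rank x d_in).
adapters = [
    ([[1.0], [0.0]], [[1.0, 2.0]]),                          # rank 1
    ([[0.0, 1.0], [1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]]),    # rank 2
]
weights = [1.0, 0.5]

summed = weighted_delta(adapters, weights)
merged = concat_delta(adapters, weights)
print(summed)
print(merged)
```

The concatenated form has rank equal to the sum of the individual ranks, which is why this style of merge does not require the adapters to share a rank.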