# ------ install moshi ------
# git clone https://github.com/kyutai-labs/moshi.git
# cd moshi && git checkout 0395bd6c9a95e899c397a68c75f300f3b5409b2c
# uv pip install -e .
# ----------------------------
import torch
from moshi import run_inference
args = {

# TEST GREEDY FLOAT 32
# make sure to clone [email protected]:eustlb/csm.git and checkout compare-trfms
import sys
sys.path.insert(0, "./csm")
from generator import load_csm_1b, Segment
from datasets import load_dataset, Audio
from huggingface_hub import hf_hub_download

# TEST GREEDY FLOAT 32
# make sure to clone [email protected]:eustlb/csm.git and checkout compare-trfms
import sys
sys.path.insert(0, "./csm")
from generator import load_csm_1b, Segment
from huggingface_hub import hf_hub_download
import torch
import torchaudio

# TEST GREEDY FLOAT 32
# make sure to clone [email protected]:eustlb/csm.git and checkout compare-trfms
import sys
sys.path.insert(0, "./csm")
from generator import load_csm_1b, Segment
from datasets import load_dataset, Audio
from huggingface_hub import hf_hub_download
import torch

import torch
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor
from datasets import load_dataset
device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32
model_id = "openai/whisper-large-v3-turbo"
model = AutoModelForSpeechSeq2Seq.from_pretrained(

from datasets import load_dataset, Audio
from transformers import (
    CsmForConditionalGeneration,
    TrainingArguments,
    CsmProcessor,
    Trainer
)
processor = CsmProcessor.from_pretrained("eustlb/csm-1b")
model = CsmForConditionalGeneration.from_pretrained("eustlb/csm-1b")

from typing import IO
from datatrove.io import get_datafolder
from datatrove.executor import SlurmPipelineExecutor
from datatrove.pipeline.readers import ParquetReader
from datatrove.pipeline.writers import ParquetWriter
from datatrove.utils.typeshelper import StatHints
class ParquetReaderInMemory(ParquetReader):

traces-*/