https://github.com/RVC-Boss/GPT-SoVITS
Go to https://www.runpod.io/console/pods and create a new pod. I used an RTX 4000 Ada with 20 GB of VRAM. Make sure the disk volume is at least 80 GB; the PyTorch 2.1 template works fine.
After connecting to the pod, install Miniconda following https://docs.anaconda.com/miniconda/install/#quick-command-line-install:

```bash
mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm ~/miniconda3/miniconda.sh
~/miniconda3/bin/conda init bash
~/miniconda3/bin/conda init zsh
```
Then open a new terminal and run the following commands to set up the environment:

```bash
apt update
apt install git-lfs

# Clone the Cantonese fork into a GPT-SoVITS directory so later paths match,
# then pull the LFS assets
# git clone https://github.com/RVC-Boss/GPT-SoVITS
git clone https://github.com/hon9kon9ize/GPT-SoVITS-Cantonese GPT-SoVITS
cd GPT-SoVITS
git lfs install
git lfs pull

# Set up the conda environment and install the dependencies
conda create -n GPTSoVits python=3.9
conda activate GPTSoVits
bash install.sh
```
Then move on to the next step: preparing the pretrained models.

At this point you will have a GPT-SoVITS/GPT_SoVITS/pretrained_models directory, but it is empty inside. You can delete its contents first, then use the commands below to download the pretrained models from https://huggingface.co/lj1995/GPT-SoVITS into it:
```bash
cd GPT_SoVITS/pretrained_models

wget https://huggingface.co/lj1995/GPT-SoVITS/resolve/main/s1bert25hz-2kh-longer-epoch%3D68e-step%3D50232.ckpt
wget https://huggingface.co/lj1995/GPT-SoVITS/resolve/main/s2D488k.pth
wget https://huggingface.co/lj1995/GPT-SoVITS/resolve/main/s2G488k.pth

mkdir chinese-hubert-base
mkdir chinese-roberta-wwm-ext-large
mkdir gsv-v2final-pretrained

cd chinese-hubert-base
wget https://huggingface.co/lj1995/GPT-SoVITS/resolve/main/chinese-hubert-base/config.json
wget https://huggingface.co/lj1995/GPT-SoVITS/resolve/main/chinese-hubert-base/preprocessor_config.json
wget https://huggingface.co/lj1995/GPT-SoVITS/resolve/main/chinese-hubert-base/pytorch_model.bin

cd ../chinese-roberta-wwm-ext-large
# Use the Cantonese BERT below instead of the original chinese-roberta-wwm-ext-large:
# wget https://huggingface.co/lj1995/GPT-SoVITS/resolve/main/chinese-roberta-wwm-ext-large/config.json
# wget https://huggingface.co/lj1995/GPT-SoVITS/resolve/main/chinese-roberta-wwm-ext-large/pytorch_model.bin
# wget https://huggingface.co/lj1995/GPT-SoVITS/resolve/main/chinese-roberta-wwm-ext-large/tokenizer.json
wget https://huggingface.co/hon9kon9ize/bert-large-cantonese/resolve/main/config.json
wget https://huggingface.co/hon9kon9ize/bert-large-cantonese/resolve/main/pytorch_model.bin
wget https://huggingface.co/hon9kon9ize/bert-large-cantonese/resolve/main/tokenizer.json

cd ../gsv-v2final-pretrained
wget https://huggingface.co/lj1995/GPT-SoVITS/resolve/main/gsv-v2final-pretrained/s1bert25hz-5kh-longer-epoch%3D12-step%3D369668.ckpt
wget https://huggingface.co/lj1995/GPT-SoVITS/resolve/main/gsv-v2final-pretrained/s2D2333k.pth
wget https://huggingface.co/lj1995/GPT-SoVITS/resolve/main/gsv-v2final-pretrained/s2G2333k.pth
```
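As a quick sanity check (my own sketch, not part of the upstream repo), you can verify that the downloads landed where the later steps expect them; the glob pattern for the checkpoint sidesteps how wget decodes the `%3D` in its filename:

```python
# sanity_check.py: confirm the pretrained files exist (illustrative sketch)
from pathlib import Path

root = Path("/workspace/GPT-SoVITS/GPT_SoVITS/pretrained_models")
patterns = [
    "s1bert25hz-2kh-longer-epoch*step*.ckpt",
    "s2D488k.pth",
    "s2G488k.pth",
    "chinese-hubert-base/pytorch_model.bin",
    "chinese-roberta-wwm-ext-large/pytorch_model.bin",
    "gsv-v2final-pretrained/s2G2333k.pth",
]
for pat in patterns:
    status = "OK     " if any(root.glob(pat)) else "MISSING"
    print(status, pat)
```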
Next, open `webui.py` and `GPT_SoVITS/inference_webui.py` and add `share=True` to the `launch()` call at the bottom of each file, so that you can operate the web UI directly from your own browser later. Then move on to the next step: preparing the training data.
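For reference, the change looks roughly like this (a sketch only; the Blocks variable name and the other arguments differ between the two files, so keep whatever is already there and just add `share=True`):

```python
# At the bottom of webui.py / GPT_SoVITS/inference_webui.py (names illustrative)
app.queue().launch(
    server_name="0.0.0.0",
    share=True,  # prints a public *.gradio.live URL you can open in your own browser
)
```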
The dataset we use here is the 張悦楷 (zoengjyutgaai) storytelling speech dataset.
First run the following commands to pull the data:

```bash
cd /workspace

# Clone the dataset with a sparse checkout (wav files only)
git clone --filter=blob:none --sparse https://huggingface.co/datasets/laubonghaudoi/zoengjyutgaai_saamgwokjinji
cd zoengjyutgaai_saamgwokjinji
git sparse-checkout init --cone
git sparse-checkout set wav
git checkout

# Move the wav files over
mv wav/ ../GPT-SoVITS
```
- Then go back to `GPT-SoVITS` and rename the `wav/` folder to a dataset name, e.g. `mv wav zoengjyutgaai`.
- Then convert the `metadata.csv` inside it into a `metadata.list` file in the following format, where each line is `wav_path|speaker|language|text`:

```
/workspace/GPT-SoVITS/zoengjyutgaai/001/001_001.wav|zoengjyutgaai|yue|各位朋友,喺講《三國演義》之前啊,我唸一首詞畀大家聽下吓。
/workspace/GPT-SoVITS/zoengjyutgaai/001/001_002.wav|zoengjyutgaai|yue|滾滾長江東逝水,浪花淘盡英雄。
...
```

Note that the file has no header row; the data starts from the first line.
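A minimal conversion sketch follows, assuming `metadata.csv` uses the common HuggingFace audiofolder layout with a header row containing `file_name` and `transcription` columns; check the actual header and adjust the column names before running:

```python
# csv_to_list.py: convert metadata.csv into GPT-SoVITS's metadata.list format.
# The CSV column names below are assumptions, not confirmed from the dataset.
import csv

ROOT = "/workspace/GPT-SoVITS/zoengjyutgaai"

with open("metadata.csv", newline="", encoding="utf-8") as fin, \
        open("metadata.list", "w", encoding="utf-8") as fout:
    for row in csv.DictReader(fin):
        wav = row["file_name"]        # assumed: relative path like 001/001_001.wav
        text = row["transcription"]   # assumed: the Cantonese transcript
        fout.write(f"{ROOT}/{wav}|zoengjyutgaai|yue|{text}\n")
```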
Now the data is ready and you can move on to training:

```bash
cd /workspace/GPT-SoVITS
python3 webui.py
```
Then open the Gradio page and go straight to the 1-GPT-SOVITS-TTS tab, where you will see three sub-tabs.
- First enter an Experiment/model name such as `exp1`, put `zoengjyutgaai/metadata.list` in the Text labelling file field, and leave the Audio dataset folder field empty. Then click Start one-click formatting at the bottom to begin preprocessing the data. Note that the second stage takes quite a while, so be patient. You can also run the three stages with their separate buttons instead.
- Next, go to the 1B-Fine-tuned training tab, set the batch size, epochs, text model learning rate weighting, and save frequency, then click the button to start training SoVITS. For reference, the RTX 4000 Ada's 20 GB of VRAM can handle batch size=16, and epochs rarely need to exceed 20.
- After SoVITS training finishes, set the parameters below it in the same way and train the GPT model.