Last active
April 9, 2024 12:59
{ | |
"cells": [ | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Sun Jan 29 13:06:17 2023 \n", | |
"+-----------------------------------------------------------------------------+\n", | |
"| NVIDIA-SMI 470.129.06 Driver Version: 470.129.06 CUDA Version: 11.4 |\n", | |
"|-------------------------------+----------------------+----------------------+\n", | |
"| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |\n", | |
"| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |\n", | |
"| | | MIG M. |\n", | |
"|===============================+======================+======================|\n", | |
"| 0 Tesla T4 Off | 00000000:00:04.0 Off | 0 |\n", | |
"| N/A 42C P0 25W / 70W | 0MiB / 15109MiB | 5% Default |\n", | |
"| | | N/A |\n", | |
"+-------------------------------+----------------------+----------------------+\n", | |
" \n", | |
"+-----------------------------------------------------------------------------+\n", | |
"| Processes: |\n", | |
"| GPU GI CI PID Type Process name GPU Memory |\n", | |
"| ID ID Usage |\n", | |
"|=============================================================================|\n", | |
"| No running processes found |\n", | |
"+-----------------------------------------------------------------------------+\n" | |
] | |
} | |
], | |
"source": [ | |
"! nvidia-smi" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### ★Set variables" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"speaker_name = 'test' # enter speaker name here\n", | |
"dataset_url = 'https://drive.google.com/u/0/uc?id=1sya8W09n1EauPvVVLe0xuCcME9GqkPBM'\n", | |
"# enter google drive url include \"uc\"\n", | |
"ngrok_token = '1q2w3e4r5t6y7u8i9o0p' # enter your ngrok token here" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Clone and Install requirement, Initial models" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"'diff-svc'에 복제합니다...\n", | |
"remote: Enumerating objects: 741, done.\u001b[K\n", | |
"remote: Counting objects: 100% (222/222), done.\u001b[K\n", | |
"remote: Compressing objects: 100% (59/59), done.\u001b[K\n", | |
"remote: Total 741 (delta 184), reused 163 (delta 163), pack-reused 519\u001b[K\n", | |
"오브젝트를 받는 중: 100% (741/741), 62.11 MiB | 19.87 MiB/s, 완료.\n", | |
"델타를 알아내는 중: 100% (346/346), 완료.\n", | |
"받기:1 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 InRelease [1,581 B]\n", | |
"무시:2 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu2004/x86_64 InRelease\n", | |
"기존:3 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu2004/x86_64 Release\n", | |
"기존:4 http://archive.ubuntu.com/ubuntu focal InRelease \n", | |
"받기:5 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB] \n", | |
"받기:6 http://archive.ubuntu.com/ubuntu focal-updates InRelease [114 kB] \n", | |
"오류:1 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 InRelease\n", | |
" 다음 서명들은 공개키가 없기 때문에 인증할 수 없습니다: NO_PUBKEY A4B469963BF863CC\n", | |
"받기:8 http://archive.ubuntu.com/ubuntu focal-backports InRelease [108 kB] [33m\u001b[33m\u001b[33m\u001b[33m\n", | |
"받기:9 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 Packages [1,290 kB]3m\n", | |
"받기:10 http://security.ubuntu.com/ubuntu focal-security/restricted amd64 Packages [1,882 kB]\n", | |
"받기:11 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages [2,920 kB]\n", | |
"받기:12 http://archive.ubuntu.com/ubuntu focal-updates/restricted amd64 Packages [2,009 kB][33m\n", | |
"받기:13 http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages [2,442 kB]33m\u001b[33m\u001b[33m\n", | |
"받기:14 http://security.ubuntu.com/ubuntu focal-security/universe amd64 Packages [988 kB]\n", | |
"패키지 목록을 읽는 중입니다... 완료% \u001b[0m \u001b[0m \u001b[33m\u001b[33m\u001b[33m\n", | |
"\u001b[1;33mW: \u001b[0mGPG 오류: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 InRelease: 다음 서명들은 공개키가 없기 때문에 인증할 수 없습니다: NO_PUBKEY A4B469963BF863CC\u001b[0m\n", | |
"\u001b[1;31mE: \u001b[0mThe repository 'https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 InRelease' is no longer signed.\u001b[0m\n", | |
"\u001b[33mN: \u001b[0mUpdating from such a repository can't be done securely, and is therefore disabled by default.\u001b[0m\n", | |
"\u001b[33mN: \u001b[0mSee apt-secure(8) manpage for repository creation and user configuration details.\u001b[0m\n", | |
"패키지 목록을 읽는 중입니다... 완료%\n", | |
"의존성 트리를 만드는 중입니다 \n", | |
"상태 정보를 읽는 중입니다... 완료\n", | |
"패키지 zip는 이미 최신 버전입니다 (3.0-11build1).\n", | |
"패키지 build-essential는 이미 최신 버전입니다 (12.8ubuntu1.1).\n", | |
"패키지 unzip는 이미 최신 버전입니다 (6.0-25ubuntu1.1).\n", | |
"패키지 ffmpeg는 이미 최신 버전입니다 (7:4.2.7-0ubuntu0.1).\n", | |
"패키지 libpython3.9-dev는 이미 최신 버전입니다 (3.9.5-3ubuntu0~20.04.1).\n", | |
"패키지 python3.9-dev는 이미 최신 버전입니다 (3.9.5-3ubuntu0~20.04.1).\n", | |
"0개 업그레이드, 0개 새로 설치, 0개 제거, 177개 업그레이드 안 함.\n", | |
"Requirement already satisfied: gdown in /usr/local/lib/python3.8/dist-packages (4.6.0)\n", | |
"Requirement already satisfied: tensorflow in /usr/local/lib/python3.8/dist-packages (2.8.0)\n", | |
"Requirement already satisfied: pyyaml in /usr/local/lib/python3.8/dist-packages (5.4.1)\n", | |
"Requirement already satisfied: filelock in /usr/local/lib/python3.8/dist-packages (from gdown) (3.9.0)\n", | |
"Requirement already satisfied: six in /usr/lib/python3/dist-packages (from gdown) (1.14.0)\n", | |
"Requirement already satisfied: beautifulsoup4 in /usr/local/lib/python3.8/dist-packages (from gdown) (4.11.1)\n", | |
"Requirement already satisfied: requests[socks] in /usr/lib/python3/dist-packages (from gdown) (2.22.0)\n", | |
"Requirement already satisfied: tqdm in /usr/local/lib/python3.8/dist-packages (from gdown) (4.64.1)\n", | |
"Requirement already satisfied: protobuf>=3.9.2 in /usr/local/lib/python3.8/dist-packages (from tensorflow) (3.19.4)\n", | |
"Requirement already satisfied: keras-preprocessing>=1.1.1 in /usr/local/lib/python3.8/dist-packages (from tensorflow) (1.1.2)\n", | |
"Requirement already satisfied: flatbuffers>=1.12 in /usr/local/lib/python3.8/dist-packages (from tensorflow) (2.0)\n", | |
"Requirement already satisfied: keras<2.9,>=2.8.0rc0 in /usr/local/lib/python3.8/dist-packages (from tensorflow) (2.8.0)\n", | |
"Requirement already satisfied: tf-estimator-nightly==2.8.0.dev2021122109 in /usr/local/lib/python3.8/dist-packages (from tensorflow) (2.8.0.dev2021122109)\n", | |
"Requirement already satisfied: wrapt>=1.11.0 in /usr/local/lib/python3.8/dist-packages (from tensorflow) (1.13.3)\n", | |
"Requirement already satisfied: grpcio<2.0,>=1.24.3 in /usr/local/lib/python3.8/dist-packages (from tensorflow) (1.43.0)\n", | |
"Requirement already satisfied: numpy>=1.20 in /usr/local/lib/python3.8/dist-packages (from tensorflow) (1.22.2)\n", | |
"Requirement already satisfied: typing-extensions>=3.6.6 in /usr/local/lib/python3.8/dist-packages (from tensorflow) (4.1.1)\n", | |
"Requirement already satisfied: google-pasta>=0.1.1 in /usr/local/lib/python3.8/dist-packages (from tensorflow) (0.2.0)\n", | |
"Requirement already satisfied: gast>=0.2.1 in /usr/local/lib/python3.8/dist-packages (from tensorflow) (0.5.3)\n", | |
"Requirement already satisfied: tensorboard<2.9,>=2.8 in /usr/local/lib/python3.8/dist-packages (from tensorflow) (2.8.0)\n", | |
"Requirement already satisfied: termcolor>=1.1.0 in /usr/local/lib/python3.8/dist-packages (from tensorflow) (1.1.0)\n", | |
"Requirement already satisfied: setuptools in /usr/lib/python3/dist-packages (from tensorflow) (45.2.0)\n", | |
"Requirement already satisfied: h5py>=2.9.0 in /usr/local/lib/python3.8/dist-packages (from tensorflow) (3.6.0)\n", | |
"Requirement already satisfied: opt-einsum>=2.3.2 in /usr/local/lib/python3.8/dist-packages (from tensorflow) (3.3.0)\n", | |
"Requirement already satisfied: tensorflow-io-gcs-filesystem>=0.23.1 in /usr/local/lib/python3.8/dist-packages (from tensorflow) (0.24.0)\n", | |
"Requirement already satisfied: libclang>=9.0.1 in /usr/local/lib/python3.8/dist-packages (from tensorflow) (13.0.0)\n", | |
"Requirement already satisfied: absl-py>=0.4.0 in /usr/local/lib/python3.8/dist-packages (from tensorflow) (1.0.0)\n", | |
"Requirement already satisfied: astunparse>=1.6.0 in /usr/local/lib/python3.8/dist-packages (from tensorflow) (1.6.3)\n", | |
"Requirement already satisfied: wheel<1.0,>=0.23.0 in /usr/lib/python3/dist-packages (from astunparse>=1.6.0->tensorflow) (0.34.2)\n", | |
"Requirement already satisfied: werkzeug>=0.11.15 in /usr/local/lib/python3.8/dist-packages (from tensorboard<2.9,>=2.8->tensorflow) (2.0.3)\n", | |
"Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in /usr/local/lib/python3.8/dist-packages (from tensorboard<2.9,>=2.8->tensorflow) (1.8.1)\n", | |
"Requirement already satisfied: tensorboard-data-server<0.7.0,>=0.6.0 in /usr/local/lib/python3.8/dist-packages (from tensorboard<2.9,>=2.8->tensorflow) (0.6.1)\n", | |
"Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1 in /usr/local/lib/python3.8/dist-packages (from tensorboard<2.9,>=2.8->tensorflow) (0.4.6)\n", | |
"Requirement already satisfied: google-auth<3,>=1.6.3 in /usr/local/lib/python3.8/dist-packages (from tensorboard<2.9,>=2.8->tensorflow) (2.6.0)\n", | |
"Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.8/dist-packages (from tensorboard<2.9,>=2.8->tensorflow) (3.3.6)\n", | |
"Requirement already satisfied: soupsieve>1.2 in /usr/local/lib/python3.8/dist-packages (from beautifulsoup4->gdown) (2.3.2.post1)\n", | |
"Requirement already satisfied: PySocks!=1.5.7,>=1.5.6 in /usr/local/lib/python3.8/dist-packages (from requests[socks]->gdown) (1.7.1)\n", | |
"Requirement already satisfied: cachetools<6.0,>=2.0.0 in /usr/local/lib/python3.8/dist-packages (from google-auth<3,>=1.6.3->tensorboard<2.9,>=2.8->tensorflow) (5.0.0)\n", | |
"Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.8/dist-packages (from google-auth<3,>=1.6.3->tensorboard<2.9,>=2.8->tensorflow) (0.2.8)\n", | |
"Requirement already satisfied: rsa<5,>=3.1.4 in /usr/local/lib/python3.8/dist-packages (from google-auth<3,>=1.6.3->tensorboard<2.9,>=2.8->tensorflow) (4.8)\n", | |
"Requirement already satisfied: requests-oauthlib>=0.7.0 in /usr/local/lib/python3.8/dist-packages (from google-auth-oauthlib<0.5,>=0.4.1->tensorboard<2.9,>=2.8->tensorflow) (1.3.1)\n", | |
"Requirement already satisfied: importlib-metadata>=4.4 in /usr/local/lib/python3.8/dist-packages (from markdown>=2.6.8->tensorboard<2.9,>=2.8->tensorflow) (4.11.1)\n", | |
"Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.8/dist-packages (from importlib-metadata>=4.4->markdown>=2.6.8->tensorboard<2.9,>=2.8->tensorflow) (3.7.0)\n", | |
"Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /usr/local/lib/python3.8/dist-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard<2.9,>=2.8->tensorflow) (0.4.8)\n" | |
] | |
}, | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Requirement already satisfied: oauthlib>=3.0.0 in /usr/local/lib/python3.8/dist-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard<2.9,>=2.8->tensorflow) (3.2.0)\n", | |
"\u001b[33mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n", | |
"\u001b[0m\u001b[33mWARNING: You are using pip version 22.0.3; however, version 22.3.1 is available.\n", | |
"You should consider upgrading via the '/usr/bin/python3 -m pip install --upgrade pip' command.\u001b[0m\u001b[33m\n", | |
"\u001b[0mRequirement already satisfied: torchcrepe in /usr/local/lib/python3.8/dist-packages (from -r requirements_short.txt (line 1)) (0.0.17)\n", | |
"Requirement already satisfied: praat-parselmouth==0.4.1 in /usr/local/lib/python3.8/dist-packages (from -r requirements_short.txt (line 2)) (0.4.1)\n", | |
"Requirement already satisfied: scikit-image in /usr/local/lib/python3.8/dist-packages (from -r requirements_short.txt (line 3)) (0.19.3)\n", | |
"Requirement already satisfied: ipython in /usr/local/lib/python3.8/dist-packages (from -r requirements_short.txt (line 4)) (8.0.1)\n", | |
"Requirement already satisfied: ipykernel in /usr/local/lib/python3.8/dist-packages (from -r requirements_short.txt (line 5)) (6.9.1)\n", | |
"Requirement already satisfied: pyloudnorm in /usr/local/lib/python3.8/dist-packages (from -r requirements_short.txt (line 6)) (0.1.1)\n", | |
"Requirement already satisfied: webrtcvad in /usr/local/lib/python3.8/dist-packages (from -r requirements_short.txt (line 7)) (2.0.10)\n", | |
"Requirement already satisfied: h5py in /usr/local/lib/python3.8/dist-packages (from -r requirements_short.txt (line 8)) (3.6.0)\n", | |
"Requirement already satisfied: einops in /usr/local/lib/python3.8/dist-packages (from -r requirements_short.txt (line 9)) (0.6.0)\n", | |
"Requirement already satisfied: pycwt in /usr/local/lib/python3.8/dist-packages (from -r requirements_short.txt (line 10)) (0.3.0a22)\n", | |
"Requirement already satisfied: torchmetrics==0.5 in /usr/local/lib/python3.8/dist-packages (from -r requirements_short.txt (line 11)) (0.5.0)\n", | |
"Requirement already satisfied: pytorch_lightning==1.3.3 in /usr/local/lib/python3.8/dist-packages (from -r requirements_short.txt (line 12)) (1.3.3)\n", | |
"Requirement already satisfied: numpy>=1.7.0 in /usr/local/lib/python3.8/dist-packages (from praat-parselmouth==0.4.1->-r requirements_short.txt (line 2)) (1.22.2)\n", | |
"Requirement already satisfied: torch>=1.3.1 in /usr/local/lib/python3.8/dist-packages (from torchmetrics==0.5->-r requirements_short.txt (line 11)) (1.10.2+cu113)\n", | |
"Requirement already satisfied: packaging in /usr/local/lib/python3.8/dist-packages (from torchmetrics==0.5->-r requirements_short.txt (line 11)) (21.3)\n", | |
"Requirement already satisfied: tqdm>=4.41.0 in /usr/local/lib/python3.8/dist-packages (from pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (4.64.1)\n", | |
"Requirement already satisfied: tensorboard!=2.5.0,>=2.2.0 in /usr/local/lib/python3.8/dist-packages (from pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (2.8.0)\n", | |
"Requirement already satisfied: fsspec[http]>=2021.4.0 in /usr/local/lib/python3.8/dist-packages (from pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (2023.1.0)\n", | |
"Requirement already satisfied: future>=0.17.1 in /usr/local/lib/python3.8/dist-packages (from pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (0.18.3)\n", | |
"Requirement already satisfied: PyYAML<=5.4.1,>=5.1 in /usr/local/lib/python3.8/dist-packages (from pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (5.4.1)\n", | |
"Requirement already satisfied: pyDeprecate==0.3.0 in /usr/local/lib/python3.8/dist-packages (from pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (0.3.0)\n", | |
"Requirement already satisfied: scipy in /usr/local/lib/python3.8/dist-packages (from torchcrepe->-r requirements_short.txt (line 1)) (1.8.0)\n", | |
"Requirement already satisfied: librosa==0.9.1 in /usr/local/lib/python3.8/dist-packages (from torchcrepe->-r requirements_short.txt (line 1)) (0.9.1)\n", | |
"Requirement already satisfied: resampy in /usr/local/lib/python3.8/dist-packages (from torchcrepe->-r requirements_short.txt (line 1)) (0.4.2)\n", | |
"Requirement already satisfied: audioread>=2.1.5 in /usr/local/lib/python3.8/dist-packages (from librosa==0.9.1->torchcrepe->-r requirements_short.txt (line 1)) (3.0.0)\n", | |
"Requirement already satisfied: scikit-learn>=0.19.1 in /usr/local/lib/python3.8/dist-packages (from librosa==0.9.1->torchcrepe->-r requirements_short.txt (line 1)) (1.0.2)\n", | |
"Requirement already satisfied: numba>=0.45.1 in /usr/local/lib/python3.8/dist-packages (from librosa==0.9.1->torchcrepe->-r requirements_short.txt (line 1)) (0.56.4)\n", | |
"Requirement already satisfied: pooch>=1.0 in /usr/local/lib/python3.8/dist-packages (from librosa==0.9.1->torchcrepe->-r requirements_short.txt (line 1)) (1.6.0)\n", | |
"Requirement already satisfied: decorator>=4.0.10 in /usr/local/lib/python3.8/dist-packages (from librosa==0.9.1->torchcrepe->-r requirements_short.txt (line 1)) (5.1.1)\n", | |
"Requirement already satisfied: soundfile>=0.10.2 in /usr/local/lib/python3.8/dist-packages (from librosa==0.9.1->torchcrepe->-r requirements_short.txt (line 1)) (0.11.0)\n", | |
"Requirement already satisfied: joblib>=0.14 in /usr/local/lib/python3.8/dist-packages (from librosa==0.9.1->torchcrepe->-r requirements_short.txt (line 1)) (1.1.0)\n", | |
"Requirement already satisfied: pillow!=7.1.0,!=7.1.1,!=8.3.0,>=6.1.0 in /usr/local/lib/python3.8/dist-packages (from scikit-image->-r requirements_short.txt (line 3)) (9.0.1)\n", | |
"Requirement already satisfied: tifffile>=2019.7.26 in /usr/local/lib/python3.8/dist-packages (from scikit-image->-r requirements_short.txt (line 3)) (2023.1.23.1)\n", | |
"Requirement already satisfied: imageio>=2.4.1 in /usr/local/lib/python3.8/dist-packages (from scikit-image->-r requirements_short.txt (line 3)) (2.25.0)\n", | |
"Requirement already satisfied: PyWavelets>=1.1.1 in /usr/local/lib/python3.8/dist-packages (from scikit-image->-r requirements_short.txt (line 3)) (1.4.1)\n", | |
"Requirement already satisfied: networkx>=2.2 in /usr/local/lib/python3.8/dist-packages (from scikit-image->-r requirements_short.txt (line 3)) (3.0)\n", | |
"Requirement already satisfied: pygments in /usr/local/lib/python3.8/dist-packages (from ipython->-r requirements_short.txt (line 4)) (2.11.2)\n", | |
"Requirement already satisfied: black in /usr/local/lib/python3.8/dist-packages (from ipython->-r requirements_short.txt (line 4)) (22.1.0)\n", | |
"Requirement already satisfied: pickleshare in /usr/local/lib/python3.8/dist-packages (from ipython->-r requirements_short.txt (line 4)) (0.7.5)\n", | |
"Requirement already satisfied: stack-data in /usr/local/lib/python3.8/dist-packages (from ipython->-r requirements_short.txt (line 4)) (0.2.0)\n", | |
"Requirement already satisfied: setuptools>=18.5 in /usr/lib/python3/dist-packages (from ipython->-r requirements_short.txt (line 4)) (45.2.0)\n", | |
"Requirement already satisfied: traitlets>=5 in /usr/local/lib/python3.8/dist-packages (from ipython->-r requirements_short.txt (line 4)) (5.1.1)\n", | |
"Requirement already satisfied: jedi>=0.16 in /usr/local/lib/python3.8/dist-packages (from ipython->-r requirements_short.txt (line 4)) (0.18.1)\n", | |
"Requirement already satisfied: matplotlib-inline in /usr/local/lib/python3.8/dist-packages (from ipython->-r requirements_short.txt (line 4)) (0.1.3)\n", | |
"Requirement already satisfied: backcall in /usr/local/lib/python3.8/dist-packages (from ipython->-r requirements_short.txt (line 4)) (0.2.0)\n", | |
"Requirement already satisfied: pexpect>4.3 in /usr/local/lib/python3.8/dist-packages (from ipython->-r requirements_short.txt (line 4)) (4.8.0)\n", | |
"Requirement already satisfied: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in /usr/local/lib/python3.8/dist-packages (from ipython->-r requirements_short.txt (line 4)) (3.0.28)\n", | |
"Requirement already satisfied: jupyter-client<8.0 in /usr/local/lib/python3.8/dist-packages (from ipykernel->-r requirements_short.txt (line 5)) (7.1.2)\n", | |
"Requirement already satisfied: tornado<7.0,>=4.2 in /usr/local/lib/python3.8/dist-packages (from ipykernel->-r requirements_short.txt (line 5)) (6.1)\n", | |
"Requirement already satisfied: nest-asyncio in /usr/local/lib/python3.8/dist-packages (from ipykernel->-r requirements_short.txt (line 5)) (1.5.4)\n", | |
"Requirement already satisfied: debugpy<2.0,>=1.0.0 in /usr/local/lib/python3.8/dist-packages (from ipykernel->-r requirements_short.txt (line 5)) (1.5.1)\n", | |
"Requirement already satisfied: matplotlib in /usr/local/lib/python3.8/dist-packages (from pycwt->-r requirements_short.txt (line 10)) (3.5.1)\n" | |
] | |
}, | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Requirement already satisfied: aiohttp!=4.0.0a0,!=4.0.0a1 in /usr/local/lib/python3.8/dist-packages (from fsspec[http]>=2021.4.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (3.8.3)\n", | |
"Requirement already satisfied: requests in /usr/lib/python3/dist-packages (from fsspec[http]>=2021.4.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (2.22.0)\n", | |
"Requirement already satisfied: parso<0.9.0,>=0.8.0 in /usr/local/lib/python3.8/dist-packages (from jedi>=0.16->ipython->-r requirements_short.txt (line 4)) (0.8.3)\n", | |
"Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.8/dist-packages (from jupyter-client<8.0->ipykernel->-r requirements_short.txt (line 5)) (2.8.2)\n", | |
"Requirement already satisfied: entrypoints in /usr/local/lib/python3.8/dist-packages (from jupyter-client<8.0->ipykernel->-r requirements_short.txt (line 5)) (0.4)\n", | |
"Requirement already satisfied: pyzmq>=13 in /usr/local/lib/python3.8/dist-packages (from jupyter-client<8.0->ipykernel->-r requirements_short.txt (line 5)) (22.3.0)\n", | |
"Requirement already satisfied: jupyter-core>=4.6.0 in /usr/local/lib/python3.8/dist-packages (from jupyter-client<8.0->ipykernel->-r requirements_short.txt (line 5)) (4.9.2)\n", | |
"Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/local/lib/python3.8/dist-packages (from packaging->torchmetrics==0.5->-r requirements_short.txt (line 11)) (3.0.7)\n", | |
"Requirement already satisfied: ptyprocess>=0.5 in /usr/local/lib/python3.8/dist-packages (from pexpect>4.3->ipython->-r requirements_short.txt (line 4)) (0.7.0)\n", | |
"Requirement already satisfied: wcwidth in /usr/local/lib/python3.8/dist-packages (from prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0->ipython->-r requirements_short.txt (line 4)) (0.2.5)\n", | |
"Requirement already satisfied: google-auth<3,>=1.6.3 in /usr/local/lib/python3.8/dist-packages (from tensorboard!=2.5.0,>=2.2.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (2.6.0)\n", | |
"Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.8/dist-packages (from tensorboard!=2.5.0,>=2.2.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (3.3.6)\n", | |
"Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in /usr/local/lib/python3.8/dist-packages (from tensorboard!=2.5.0,>=2.2.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (1.8.1)\n", | |
"Requirement already satisfied: absl-py>=0.4 in /usr/local/lib/python3.8/dist-packages (from tensorboard!=2.5.0,>=2.2.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (1.0.0)\n", | |
"Requirement already satisfied: tensorboard-data-server<0.7.0,>=0.6.0 in /usr/local/lib/python3.8/dist-packages (from tensorboard!=2.5.0,>=2.2.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (0.6.1)\n", | |
"Requirement already satisfied: werkzeug>=0.11.15 in /usr/local/lib/python3.8/dist-packages (from tensorboard!=2.5.0,>=2.2.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (2.0.3)\n", | |
"Requirement already satisfied: protobuf>=3.6.0 in /usr/local/lib/python3.8/dist-packages (from tensorboard!=2.5.0,>=2.2.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (3.19.4)\n", | |
"Requirement already satisfied: grpcio>=1.24.3 in /usr/local/lib/python3.8/dist-packages (from tensorboard!=2.5.0,>=2.2.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (1.43.0)\n", | |
"Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1 in /usr/local/lib/python3.8/dist-packages (from tensorboard!=2.5.0,>=2.2.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (0.4.6)\n", | |
"Requirement already satisfied: wheel>=0.26 in /usr/lib/python3/dist-packages (from tensorboard!=2.5.0,>=2.2.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (0.34.2)\n", | |
"Requirement already satisfied: typing-extensions in /usr/local/lib/python3.8/dist-packages (from torch>=1.3.1->torchmetrics==0.5->-r requirements_short.txt (line 11)) (4.1.1)\n", | |
"Requirement already satisfied: click>=8.0.0 in /usr/local/lib/python3.8/dist-packages (from black->ipython->-r requirements_short.txt (line 4)) (8.0.3)\n", | |
"Requirement already satisfied: platformdirs>=2 in /usr/local/lib/python3.8/dist-packages (from black->ipython->-r requirements_short.txt (line 4)) (2.5.0)\n", | |
"Requirement already satisfied: mypy-extensions>=0.4.3 in /usr/local/lib/python3.8/dist-packages (from black->ipython->-r requirements_short.txt (line 4)) (0.4.3)\n", | |
"Requirement already satisfied: pathspec>=0.9.0 in /usr/local/lib/python3.8/dist-packages (from black->ipython->-r requirements_short.txt (line 4)) (0.9.0)\n", | |
"Requirement already satisfied: tomli>=1.1.0 in /usr/local/lib/python3.8/dist-packages (from black->ipython->-r requirements_short.txt (line 4)) (2.0.1)\n", | |
"Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.8/dist-packages (from matplotlib->pycwt->-r requirements_short.txt (line 10)) (1.3.2)\n", | |
"Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.8/dist-packages (from matplotlib->pycwt->-r requirements_short.txt (line 10)) (4.29.1)\n", | |
"Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.8/dist-packages (from matplotlib->pycwt->-r requirements_short.txt (line 10)) (0.11.0)\n", | |
"Requirement already satisfied: asttokens in /usr/local/lib/python3.8/dist-packages (from stack-data->ipython->-r requirements_short.txt (line 4)) (2.0.5)\n", | |
"Requirement already satisfied: pure-eval in /usr/local/lib/python3.8/dist-packages (from stack-data->ipython->-r requirements_short.txt (line 4)) (0.2.2)\n", | |
"Requirement already satisfied: executing in /usr/local/lib/python3.8/dist-packages (from stack-data->ipython->-r requirements_short.txt (line 4)) (0.8.2)\n", | |
"Requirement already satisfied: six in /usr/lib/python3/dist-packages (from absl-py>=0.4->tensorboard!=2.5.0,>=2.2.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (1.14.0)\n", | |
"Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.8/dist-packages (from aiohttp!=4.0.0a0,!=4.0.0a1->fsspec[http]>=2021.4.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (1.3.1)\n", | |
"Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.8/dist-packages (from aiohttp!=4.0.0a0,!=4.0.0a1->fsspec[http]>=2021.4.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (1.3.3)\n", | |
"Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.8/dist-packages (from aiohttp!=4.0.0a0,!=4.0.0a1->fsspec[http]>=2021.4.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (1.8.2)\n", | |
"Requirement already satisfied: charset-normalizer<3.0,>=2.0 in /usr/local/lib/python3.8/dist-packages (from aiohttp!=4.0.0a0,!=4.0.0a1->fsspec[http]>=2021.4.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (2.1.1)\n", | |
"Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.8/dist-packages (from aiohttp!=4.0.0a0,!=4.0.0a1->fsspec[http]>=2021.4.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (21.4.0)\n", | |
"Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.8/dist-packages (from aiohttp!=4.0.0a0,!=4.0.0a1->fsspec[http]>=2021.4.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (6.0.4)\n", | |
"Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /usr/local/lib/python3.8/dist-packages (from aiohttp!=4.0.0a0,!=4.0.0a1->fsspec[http]>=2021.4.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (4.0.2)\n", | |
"Requirement already satisfied: cachetools<6.0,>=2.0.0 in /usr/local/lib/python3.8/dist-packages (from google-auth<3,>=1.6.3->tensorboard!=2.5.0,>=2.2.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (5.0.0)\n", | |
"Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.8/dist-packages (from google-auth<3,>=1.6.3->tensorboard!=2.5.0,>=2.2.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (0.2.8)\n", | |
"Requirement already satisfied: rsa<5,>=3.1.4 in /usr/local/lib/python3.8/dist-packages (from google-auth<3,>=1.6.3->tensorboard!=2.5.0,>=2.2.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (4.8)\n", | |
"Requirement already satisfied: requests-oauthlib>=0.7.0 in /usr/local/lib/python3.8/dist-packages (from google-auth-oauthlib<0.5,>=0.4.1->tensorboard!=2.5.0,>=2.2.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (1.3.1)\n" | |
] | |
}, | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Requirement already satisfied: importlib-metadata>=4.4 in /usr/local/lib/python3.8/dist-packages (from markdown>=2.6.8->tensorboard!=2.5.0,>=2.2.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (4.11.1)\n", | |
"Requirement already satisfied: llvmlite<0.40,>=0.39.0dev0 in /usr/local/lib/python3.8/dist-packages (from numba>=0.45.1->librosa==0.9.1->torchcrepe->-r requirements_short.txt (line 1)) (0.39.1)\n", | |
"Requirement already satisfied: appdirs>=1.3.0 in /usr/local/lib/python3.8/dist-packages (from pooch>=1.0->librosa==0.9.1->torchcrepe->-r requirements_short.txt (line 1)) (1.4.4)\n", | |
"Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.8/dist-packages (from scikit-learn>=0.19.1->librosa==0.9.1->torchcrepe->-r requirements_short.txt (line 1)) (3.1.0)\n", | |
"Requirement already satisfied: cffi>=1.0 in /usr/local/lib/python3.8/dist-packages (from soundfile>=0.10.2->librosa==0.9.1->torchcrepe->-r requirements_short.txt (line 1)) (1.15.0)\n", | |
"Requirement already satisfied: pycparser in /usr/local/lib/python3.8/dist-packages (from cffi>=1.0->soundfile>=0.10.2->librosa==0.9.1->torchcrepe->-r requirements_short.txt (line 1)) (2.21)\n", | |
"Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.8/dist-packages (from importlib-metadata>=4.4->markdown>=2.6.8->tensorboard!=2.5.0,>=2.2.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (3.7.0)\n", | |
"Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /usr/local/lib/python3.8/dist-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard!=2.5.0,>=2.2.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (0.4.8)\n", | |
"Requirement already satisfied: oauthlib>=3.0.0 in /usr/local/lib/python3.8/dist-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard!=2.5.0,>=2.2.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (3.2.0)\n", | |
"Requirement already satisfied: idna>=2.0 in /usr/lib/python3/dist-packages (from yarl<2.0,>=1.0->aiohttp!=4.0.0a0,!=4.0.0a1->fsspec[http]>=2021.4.0->pytorch_lightning==1.3.3->-r requirements_short.txt (line 12)) (2.8)\n", | |
"\u001b[33mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n", | |
"\u001b[0m\u001b[33mWARNING: You are using pip version 22.0.3; however, version 22.3.1 is available.\n", | |
"You should consider upgrading via the '/usr/bin/python3 -m pip install --upgrade pip' command.\u001b[0m\u001b[33m\n", | |
"\u001b[0m" | |
] | |
} | |
], | |
"source": [ | |
"# clone repo\n", | |
"! git clone https://github.com/prophesier/diff-svc\n", | |
"\n", | |
"import os\n", | |
"\n", | |
"home_dir = os.getcwd()\n", | |
"repo_dir = os.path.join(home_dir, 'diff-svc')\n", | |
"os.chdir(repo_dir)\n", | |
"\n", | |
"# install apt packages\n", | |
"! apt update\n", | |
"! apt install build-essential python3.9-dev libpython3.9-dev zip unzip ffmpeg -y\n", | |
"\n", | |
"# install python packages\n", | |
"! pip install gdown tensorflow pyyaml\n", | |
"! pip install -r requirements_short.txt" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stderr", | |
"output_type": "stream", | |
"text": [ | |
"Downloading...\n", | |
"From: https://drive.google.com/u/0/uc?id=1qeAvXvrGWvpiozsin4nwRcdwNVbcf3La\n", | |
"To: /workspace/t4-20230125/diff-svc/checkpoint.zip\n", | |
"100%|██████████| 846M/846M [00:09<00:00, 87.8MB/s] " | |
] | |
}, | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Archive: checkpoint.zip\r\n" | |
] | |
}, | |
{ | |
"name": "stderr", | |
"output_type": "stream", | |
"text": [ | |
"\n" | |
] | |
}, | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
" creating: ./checkpoints/0102_xiaoma_pe/\n", | |
" inflating: ./checkpoints/0102_xiaoma_pe/config.yaml \n", | |
" inflating: ./checkpoints/0102_xiaoma_pe/model_ckpt_steps_60000.ckpt \n", | |
" creating: ./checkpoints/0109_hifigan_bigpopcs_hop128/\n", | |
" inflating: ./checkpoints/0109_hifigan_bigpopcs_hop128/config.yaml \n", | |
" inflating: ./checkpoints/0109_hifigan_bigpopcs_hop128/model_ckpt_steps_1512000.ckpt \n", | |
" inflating: ./checkpoints/0109_hifigan_bigpopcs_hop128/model_ckpt_steps_1512000.pth \n", | |
" creating: ./checkpoints/hubert/\n", | |
" inflating: ./checkpoints/hubert/hubert.onnx \n", | |
" inflating: ./checkpoints/hubert/hubert_soft.pt \n" | |
] | |
}, | |
{ | |
"name": "stderr", | |
"output_type": "stream", | |
"text": [ | |
"Downloading...\n", | |
"From: https://drive.google.com/u/0/uc?id=1z2qLq7DcInpF15EwhtL8v-IeBWSXnxAD\n", | |
"To: /workspace/t4-20230125/diff-svc/vocoder.zip\n", | |
"100%|██████████| 53.0M/53.0M [00:01<00:00, 49.4MB/s]" | |
] | |
}, | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Archive: vocoder.zip\r\n", | |
" inflating: ./checkpoints/nsf_hifigan/config.json \r\n", | |
" inflating: ./checkpoints/nsf_hifigan/model " | |
] | |
}, | |
{ | |
"name": "stderr", | |
"output_type": "stream", | |
"text": [ | |
"\n" | |
] | |
}, | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"\r\n", | |
" inflating: ./checkpoints/nsf_hifigan/NOTICE.txt \r\n", | |
" inflating: ./checkpoints/nsf_hifigan/NOTICE.zh-CN.txt \r\n" | |
] | |
} | |
], | |
"source": [ | |
"import gdown\n", | |
"\n", | |
"# dependent checkpoint\n", | |
"gdown.download('https://drive.google.com/u/0/uc?id=1qeAvXvrGWvpiozsin4nwRcdwNVbcf3La', 'checkpoint.zip', quiet=False)\n", | |
"! unzip checkpoint.zip -d ./checkpoints\n", | |
"os.remove('checkpoint.zip')\n", | |
"\n", | |
"# initial vocoder\n", | |
"gdown.download('https://drive.google.com/u/0/uc?id=1z2qLq7DcInpF15EwhtL8v-IeBWSXnxAD', 'vocoder.zip', quiet=False)\n", | |
"! unzip vocoder.zip -d ./checkpoints/nsf_hifigan\n", | |
"os.remove('vocoder.zip')" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Dataset" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stderr", | |
"output_type": "stream", | |
"text": [ | |
"Downloading...\n", | |
"From: https://drive.google.com/u/0/uc?id=1sya8W09n1EauPvVVLe0xuCcME9GqkPBM\n", | |
"To: /workspace/t4-20230125/diff-svc/dataset.zip\n", | |
"100%|██████████| 16.9M/16.9M [00:00<00:00, 41.4MB/s]\n" | |
] | |
}, | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Archive: dataset.zip\n", | |
" inflating: /workspace/t4-20230125/diff-svc/data/raw/test/test-0.wav \n", | |
" inflating: /workspace/t4-20230125/diff-svc/data/raw/test/test-1.wav \n", | |
" inflating: /workspace/t4-20230125/diff-svc/data/raw/test/test-2.wav \n", | |
" inflating: /workspace/t4-20230125/diff-svc/data/raw/test/test-3.wav \n", | |
" inflating: /workspace/t4-20230125/diff-svc/data/raw/test/test-4.wav \n", | |
" inflating: /workspace/t4-20230125/diff-svc/data/raw/test/test-5.wav \n", | |
" inflating: /workspace/t4-20230125/diff-svc/data/raw/test/test-6.wav \n", | |
" inflating: /workspace/t4-20230125/diff-svc/data/raw/test/test-7.wav \n", | |
" inflating: /workspace/t4-20230125/diff-svc/data/raw/test/test-8.wav \n", | |
" inflating: /workspace/t4-20230125/diff-svc/data/raw/test/test-9.wav \n" | |
] | |
} | |
], | |
"source": [ | |
"# dataset\n", | |
"# before running this cell, make sure that\n", | |
"# 1. the dataset consists of 44100 Hz wav files\n", | |
"# 2. the zip file structure is\n", | |
"# dataset.zip\n", | |
"# ├── some_wave_file_0001.wav\n", | |
"# ├── some_wave_file_0002.wav\n", | |
"# ...\n", | |
"# that is, the zip file must contain \"NO DIRECTORY\"\n", | |
"gdown.download(dataset_url, 'dataset.zip', quiet=False)\n", | |
"dataset_dir = os.path.join(repo_dir, 'data', 'raw', speaker_name)\n", | |
"os.makedirs(dataset_dir, exist_ok=True)\n", | |
"! unzip dataset.zip -d {dataset_dir}\n", | |
"os.remove('dataset.zip')" | |
] | |
}, | |
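{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Optional check (a sketch, not part of the original notebook): the dataset cell above requires 44100 Hz wav files stored flat in the zip. The next cell flags entries in `dataset_dir` that break either rule. It uses only the standard library `wave` module, which reads PCM wav files only, so other wav encodings are reported as unreadable. The helper name `check_dataset` is hypothetical." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"import os\n", | |
"import wave\n", | |
"\n", | |
"def check_dataset(directory, expected_rate=44100):\n", | |
"    # return (name, problem) pairs for entries that break the rules above\n", | |
"    problems = []\n", | |
"    for name in sorted(os.listdir(directory)):\n", | |
"        path = os.path.join(directory, name)\n", | |
"        if os.path.isdir(path):\n", | |
"            problems.append((name, 'directory found; the zip must be flat'))\n", | |
"            continue\n", | |
"        if not name.lower().endswith('.wav'):\n", | |
"            continue\n", | |
"        try:\n", | |
"            with wave.open(path, 'rb') as w:\n", | |
"                rate = w.getframerate()\n", | |
"        except (wave.Error, EOFError) as e:\n", | |
"            problems.append((name, f'not readable as PCM wav: {e}'))\n", | |
"            continue\n", | |
"        if rate != expected_rate:\n", | |
"            problems.append((name, f'sample rate is {rate} Hz, expected {expected_rate} Hz'))\n", | |
"    return problems\n", | |
"\n", | |
"# nothing is printed when every file passes\n", | |
"for name, problem in check_dataset(dataset_dir):\n", | |
"    print(name, '->', problem)" | |
] | |
}, | |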
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Config settings" | |
] | |
}, | |
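{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"import copy\n", | |
"import yaml\n", | |
"\n", | |
"# config file (included in the repo)\n", | |
"config_path = os.path.join(repo_dir, 'training', 'config_nsf.yaml')\n", | |
"# build the new file name directly so that 'config' elsewhere in the path is never replaced\n", | |
"your_config_path = os.path.join(repo_dir, 'training', 'config_' + speaker_name + '_nsf.yaml')\n", | |
"\n", | |
"with open(config_path, 'r') as f:\n", | |
"    config = yaml.safe_load(f)\n", | |
"\n", | |
"# work on a deep copy so that the original config stays untouched\n", | |
"your_config = copy.deepcopy(config)" | |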
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 7, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"K_step 1000\n", | |
"accumulate_grad_batches 1\n", | |
"audio_num_mel_bins 128\n", | |
"audio_sample_rate 44100\n", | |
"binarization_args {'shuffle': False, 'with_align': True, 'with_f0': True, 'with_hubert': True, 'with_spk_embed': False, 'with_wav': False}\n", | |
"binarizer_cls preprocessing.SVCpre.SVCBinarizer\n", | |
"binary_data_dir data/binary/nyaru\n", | |
"check_val_every_n_epoch 10\n", | |
"choose_test_manually False\n", | |
"clip_grad_norm 1\n", | |
"config_path training/config_nsf.yaml\n", | |
"content_cond_steps []\n", | |
"cwt_add_f0_loss False\n", | |
"cwt_hidden_size 128\n", | |
"cwt_layers 2\n", | |
"cwt_loss l1\n", | |
"cwt_std_scale 0.8\n", | |
"datasets ['opencpop']\n", | |
"debug False\n", | |
"dec_ffn_kernel_size 9\n", | |
"dec_layers 4\n", | |
"decay_steps 40000\n", | |
"decoder_type fft\n", | |
"dict_dir \n", | |
"diff_decoder_type wavenet\n", | |
"diff_loss_type l2\n", | |
"dilation_cycle_length 4\n", | |
"dropout 0.1\n", | |
"ds_workers 4\n", | |
"dur_enc_hidden_stride_kernel ['0,2,3', '0,2,3', '0,1,3']\n", | |
"dur_loss mse\n", | |
"dur_predictor_kernel 3\n", | |
"dur_predictor_layers 5\n", | |
"enc_ffn_kernel_size 9\n", | |
"enc_layers 4\n", | |
"encoder_K 8\n", | |
"encoder_type fft\n", | |
"endless_ds False\n", | |
"f0_bin 256\n", | |
"f0_max 1100.0\n", | |
"f0_min 40.0\n", | |
"ffn_act gelu\n", | |
"ffn_padding SAME\n", | |
"fft_size 2048\n", | |
"fmax 16000\n", | |
"fmin 40\n", | |
"fs2_ckpt \n", | |
"gaussian_start True\n", | |
"gen_dir_name \n", | |
"gen_tgt_spk_id -1\n", | |
"hidden_size 256\n", | |
"hop_size 512\n", | |
"hubert_path checkpoints/hubert/hubert_soft.pt\n", | |
"hubert_gpu True\n", | |
"infer False\n", | |
"keep_bins 128\n", | |
"lambda_commit 0.25\n", | |
"lambda_energy 0.0\n", | |
"lambda_f0 1.0\n", | |
"lambda_ph_dur 0.3\n", | |
"lambda_sent_dur 1.0\n", | |
"lambda_uv 1.0\n", | |
"lambda_word_dur 1.0\n", | |
"load_ckpt \n", | |
"log_interval 100\n", | |
"loud_norm False\n", | |
"lr 0.0008\n", | |
"max_beta 0.02\n", | |
"max_epochs 3000\n", | |
"max_eval_sentences 1\n", | |
"max_eval_tokens 60000\n", | |
"max_frames 42000\n", | |
"max_input_tokens 60000\n", | |
"max_sentences 88\n", | |
"max_tokens 128000\n", | |
"max_updates 1000000\n", | |
"mel_loss ssim:0.5|l1:0.5\n", | |
"mel_vmax 1.5\n", | |
"mel_vmin -6.0\n", | |
"min_level_db -120\n", | |
"norm_type gn\n", | |
"num_ckpt_keep 10\n", | |
"num_heads 2\n", | |
"num_sanity_val_steps 1\n", | |
"num_spk 1\n", | |
"num_test_samples 0\n", | |
"num_valid_plots 10\n", | |
"optimizer_adam_beta1 0.9\n", | |
"optimizer_adam_beta2 0.98\n", | |
"out_wav_norm False\n", | |
"pe_ckpt checkpoints/0102_xiaoma_pe/model_ckpt_steps_60000.ckpt\n", | |
"pe_enable False\n", | |
"perform_enhance True\n", | |
"pitch_ar False\n", | |
"pitch_enc_hidden_stride_kernel ['0,2,5', '0,2,5', '0,2,5']\n", | |
"pitch_extractor parselmouth\n", | |
"pitch_loss l2\n", | |
"pitch_norm log\n", | |
"pitch_type frame\n", | |
"pndm_speedup 10\n", | |
"pre_align_args {'allow_no_txt': False, 'denoise': False, 'forced_align': 'mfa', 'txt_processor': 'zh_g2pM', 'use_sox': True, 'use_tone': False}\n", | |
"pre_align_cls data_gen.singing.pre_align.SingingPreAlign\n", | |
"predictor_dropout 0.5\n", | |
"predictor_grad 0.1\n", | |
"predictor_hidden -1\n", | |
"predictor_kernel 5\n", | |
"predictor_layers 5\n", | |
"prenet_dropout 0.5\n", | |
"prenet_hidden_size 256\n", | |
"pretrain_fs_ckpt \n", | |
"processed_data_dir xxx\n", | |
"profile_infer False\n", | |
"raw_data_dir data/raw/nyaru\n", | |
"ref_norm_layer bn\n", | |
"rel_pos True\n", | |
"reset_phone_dict True\n", | |
"residual_channels 384\n", | |
"residual_layers 20\n", | |
"save_best False\n", | |
"save_ckpt True\n", | |
"save_codes ['configs', 'modules', 'src', 'utils']\n", | |
"save_f0 True\n", | |
"save_gt False\n", | |
"schedule_type linear\n", | |
"seed 1234\n", | |
"sort_by_len True\n", | |
"speaker_id nyaru\n", | |
"spec_max [0.0]\n", | |
"spec_min [-5.0]\n", | |
"spk_cond_steps []\n", | |
"stop_token_weight 5.0\n", | |
"task_cls training.task.SVC_task.SVCTask\n", | |
"test_ids []\n", | |
"test_input_dir \n", | |
"test_num 0\n", | |
"test_prefixes ['test']\n", | |
"test_set_name test\n", | |
"timesteps 1000\n", | |
"train_set_name train\n", | |
"use_crepe True\n", | |
"use_denoise False\n", | |
"use_energy_embed False\n", | |
"use_gt_dur False\n", | |
"use_gt_f0 False\n", | |
"use_midi False\n", | |
"use_nsf True\n", | |
"use_pitch_embed True\n", | |
"use_pos_embed True\n", | |
"use_spk_embed False\n", | |
"use_spk_id False\n", | |
"use_split_spk_id False\n", | |
"use_uv False\n", | |
"use_vec False\n", | |
"use_var_enc False\n", | |
"val_check_interval 2000\n", | |
"valid_num 0\n", | |
"valid_set_name valid\n", | |
"vocoder network.vocoders.nsf_hifigan.NsfHifiGAN\n", | |
"vocoder_ckpt checkpoints/nsf_hifigan/model\n", | |
"warmup_updates 2000\n", | |
"wav2spec_eps 1e-6\n", | |
"weight_decay 0\n", | |
"win_size 2048\n", | |
"work_dir checkpoints/nyaru\n", | |
"no_fs2 True\n" | |
] | |
} | |
], | |
"source": [ | |
"# show original config list\n", | |
"for k, v in config.items():\n", | |
" print(k, v)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### ★Edit config as you need" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 10, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# the following items are changed most often\n", | |
"# you can change any config item listed by the previous cell\n", | |
"# to restore your config to the original, run the Config settings cell again\n", | |
"\n", | |
"your_config['max_sentences'] = 8\n", | |
"# batch size. increase it if you have enough VRAM; decrease it if you get an out-of-memory error\n", | |
"# you can go up to 48 on an RTX 3090 (24 GB). increasing it improves model quality, but training becomes slower.\n", | |
"\n", | |
"your_config['lr'] = 0.0008\n", | |
"# initial learning rate\n", | |
"\n", | |
"your_config['decay_steps'] = 30000\n", | |
"# the learning rate is halved every decay_steps steps\n", | |
"\n", | |
"your_config['val_check_interval'] = 5000\n", | |
"# every val_check_interval steps, a checkpoint is saved and validation is performed\n", | |
"# you can check the validation results in TensorBoard\n", | |
"\n", | |
"your_config['endless_ds'] = False\n", | |
"# set to True if the dataset is shorter than 1 hour\n", | |
"\n", | |
"your_config['work_dir'] = os.path.join(repo_dir, 'checkpoints', speaker_name)\n", | |
"# checkpoint output directory\n", | |
"\n", | |
"your_config['num_ckpt_keep'] = 9999\n", | |
"# number of checkpoints to keep\n", | |
"\n", | |
"your_config['ds_workers'] = 8\n", | |
"# number of workers for the dataset\n", | |
"# if a shared memory error occurs, decrease this number\n", | |
"\n", | |
"your_config['no_fs2'] = True\n", | |
"your_config['enable_train'] = True\n", | |
"# not sure what these two do; the values are taken from a recommendation\n", | |
"\n", | |
"original_speaker_name = 'nyaru'\n", | |
"# speaker name used in the original config in the repo\n", | |
"\n", | |
"# replace the original speaker name in every string value (paths, speaker_id)\n", | |
"for k, v in your_config.items():\n", | |
"    if isinstance(v, str) and original_speaker_name in v:\n", | |
"        your_config[k] = v.replace(original_speaker_name, speaker_name)" | |
}, | |
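{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Sketch (not part of the original notebook): the comments in the cell above say that the learning rate is halved every `decay_steps`. Assuming a plain step decay with factor 0.5 and ignoring warmup (the scheduler in the repo may behave differently), the next cell prints the learning rate this would give at a few training steps. `lr_at` is a hypothetical helper." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"def lr_at(step, base_lr, decay_steps):\n", | |
"    # assumption: lr is multiplied by 0.5 once per completed block of decay_steps\n", | |
"    return base_lr * 0.5 ** (step // decay_steps)\n", | |
"\n", | |
"# show the learning rate after 0, 1, 2 and 4 decay periods\n", | |
"for periods in (0, 1, 2, 4):\n", | |
"    step = periods * your_config['decay_steps']\n", | |
"    print(step, lr_at(step, your_config['lr'], your_config['decay_steps']))" | |
] | |
}, | |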
{ | |
"cell_type": "code", | |
"execution_count": 11, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"K_step: 1000\n", | |
"accumulate_grad_batches: 1\n", | |
"audio_num_mel_bins: 128\n", | |
"audio_sample_rate: 44100\n", | |
"binarization_args: {'shuffle': False, 'with_align': True, 'with_f0': True, 'with_hubert': True, 'with_spk_embed': False, 'with_wav': False}\n", | |
"binarizer_cls: preprocessing.SVCpre.SVCBinarizer\n", | |
"binary_data_dir: data/binary/test\n", | |
"check_val_every_n_epoch: 10\n", | |
"choose_test_manually: False\n", | |
"clip_grad_norm: 1\n", | |
"config_path: training/config_nsf.yaml\n", | |
"content_cond_steps: []\n", | |
"cwt_add_f0_loss: False\n", | |
"cwt_hidden_size: 128\n", | |
"cwt_layers: 2\n", | |
"cwt_loss: l1\n", | |
"cwt_std_scale: 0.8\n", | |
"datasets: ['opencpop']\n", | |
"debug: False\n", | |
"dec_ffn_kernel_size: 9\n", | |
"dec_layers: 4\n", | |
"decay_steps: 30000\n", | |
"decoder_type: fft\n", | |
"dict_dir: \n", | |
"diff_decoder_type: wavenet\n", | |
"diff_loss_type: l2\n", | |
"dilation_cycle_length: 4\n", | |
"dropout: 0.1\n", | |
"ds_workers: 2\n", | |
"dur_enc_hidden_stride_kernel: ['0,2,3', '0,2,3', '0,1,3']\n", | |
"dur_loss: mse\n", | |
"dur_predictor_kernel: 3\n", | |
"dur_predictor_layers: 5\n", | |
"enc_ffn_kernel_size: 9\n", | |
"enc_layers: 4\n", | |
"encoder_K: 8\n", | |
"encoder_type: fft\n", | |
"endless_ds: False\n", | |
"f0_bin: 256\n", | |
"f0_max: 1100.0\n", | |
"f0_min: 40.0\n", | |
"ffn_act: gelu\n", | |
"ffn_padding: SAME\n", | |
"fft_size: 2048\n", | |
"fmax: 16000\n", | |
"fmin: 40\n", | |
"fs2_ckpt: \n", | |
"gaussian_start: True\n", | |
"gen_dir_name: \n", | |
"gen_tgt_spk_id: -1\n", | |
"hidden_size: 256\n", | |
"hop_size: 512\n", | |
"hubert_path: checkpoints/hubert/hubert_soft.pt\n", | |
"hubert_gpu: True\n", | |
"infer: False\n", | |
"keep_bins: 128\n", | |
"lambda_commit: 0.25\n", | |
"lambda_energy: 0.0\n", | |
"lambda_f0: 1.0\n", | |
"lambda_ph_dur: 0.3\n", | |
"lambda_sent_dur: 1.0\n", | |
"lambda_uv: 1.0\n", | |
"lambda_word_dur: 1.0\n", | |
"load_ckpt: \n", | |
"log_interval: 100\n", | |
"loud_norm: False\n", | |
"lr: 0.0008\n", | |
"max_beta: 0.02\n", | |
"max_epochs: 3000\n", | |
"max_eval_sentences: 1\n", | |
"max_eval_tokens: 60000\n", | |
"max_frames: 42000\n", | |
"max_input_tokens: 60000\n", | |
"max_sentences: 8\n", | |
"max_tokens: 128000\n", | |
"max_updates: 1000000\n", | |
"mel_loss: ssim:0.5|l1:0.5\n", | |
"mel_vmax: 1.5\n", | |
"mel_vmin: -6.0\n", | |
"min_level_db: -120\n", | |
"norm_type: gn\n", | |
"num_ckpt_keep: 9999\n", | |
"num_heads: 2\n", | |
"num_sanity_val_steps: 1\n", | |
"num_spk: 1\n", | |
"num_test_samples: 0\n", | |
"num_valid_plots: 10\n", | |
"optimizer_adam_beta1: 0.9\n", | |
"optimizer_adam_beta2: 0.98\n", | |
"out_wav_norm: False\n", | |
"pe_ckpt: checkpoints/0102_xiaoma_pe/model_ckpt_steps_60000.ckpt\n", | |
"pe_enable: False\n", | |
"perform_enhance: True\n", | |
"pitch_ar: False\n", | |
"pitch_enc_hidden_stride_kernel: ['0,2,5', '0,2,5', '0,2,5']\n", | |
"pitch_extractor: parselmouth\n", | |
"pitch_loss: l2\n", | |
"pitch_norm: log\n", | |
"pitch_type: frame\n", | |
"pndm_speedup: 10\n", | |
"pre_align_args: {'allow_no_txt': False, 'denoise': False, 'forced_align': 'mfa', 'txt_processor': 'zh_g2pM', 'use_sox': True, 'use_tone': False}\n", | |
"pre_align_cls: data_gen.singing.pre_align.SingingPreAlign\n", | |
"predictor_dropout: 0.5\n", | |
"predictor_grad: 0.1\n", | |
"predictor_hidden: -1\n", | |
"predictor_kernel: 5\n", | |
"predictor_layers: 5\n", | |
"prenet_dropout: 0.5\n", | |
"prenet_hidden_size: 256\n", | |
"pretrain_fs_ckpt: \n", | |
"processed_data_dir: xxx\n", | |
"profile_infer: False\n", | |
"raw_data_dir: data/raw/test\n", | |
"ref_norm_layer: bn\n", | |
"rel_pos: True\n", | |
"reset_phone_dict: True\n", | |
"residual_channels: 384\n", | |
"residual_layers: 20\n", | |
"save_best: False\n", | |
"save_ckpt: True\n", | |
"save_codes: ['configs', 'modules', 'src', 'utils']\n", | |
"save_f0: True\n", | |
"save_gt: False\n", | |
"schedule_type: linear\n", | |
"seed: 1234\n", | |
"sort_by_len: True\n", | |
"speaker_id: test\n", | |
"spec_max: [0.0]\n", | |
"spec_min: [-5.0]\n", | |
"spk_cond_steps: []\n", | |
"stop_token_weight: 5.0\n", | |
"task_cls: training.task.SVC_task.SVCTask\n", | |
"test_ids: []\n", | |
"test_input_dir: \n", | |
"test_num: 0\n", | |
"test_prefixes: ['test']\n", | |
"test_set_name: test\n", | |
"timesteps: 1000\n", | |
"train_set_name: train\n", | |
"use_crepe: True\n", | |
"use_denoise: False\n", | |
"use_energy_embed: False\n", | |
"use_gt_dur: False\n", | |
"use_gt_f0: False\n", | |
"use_midi: False\n", | |
"use_nsf: True\n", | |
"use_pitch_embed: True\n", | |
"use_pos_embed: True\n", | |
"use_spk_embed: False\n", | |
"use_spk_id: False\n", | |
"use_split_spk_id: False\n", | |
"use_uv: False\n", | |
"use_vec: False\n", | |
"use_var_enc: False\n", | |
"val_check_interval: 5000\n", | |
"valid_num: 0\n", | |
"valid_set_name: valid\n", | |
"vocoder: network.vocoders.nsf_hifigan.NsfHifiGAN\n", | |
"vocoder_ckpt: checkpoints/nsf_hifigan/model\n", | |
"warmup_updates: 2000\n", | |
"wav2spec_eps: 1e-6\n", | |
"weight_decay: 0\n", | |
"win_size: 2048\n", | |
"work_dir: /workspace/t4-20230125/diff-svc/checkpoints/test\n", | |
"no_fs2: True\n" | |
] | |
} | |
], | |
"source": [ | |
"# show your config list\n", | |
"for k, v in your_config.items():\n", | |
" print(f'{k}: {v}')" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 12, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# save config\n", | |
"with open(your_config_path, 'w') as f:\n", | |
"    yaml.dump(your_config, f)\n", | |
"\n", | |
"# create the checkpoint output directory if it does not exist yet\n", | |
"os.makedirs(your_config['work_dir'], exist_ok=True)" | |
] | |
}, | |
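{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Optional check (a sketch, not part of the original notebook): reload the config file that was just written and compare it with `your_config`, so that a bad write is noticed before preprocessing starts. It only relies on `yaml.safe_load`, which the notebook already uses; the variable names `saved_config` and `changed` are made up for this example." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"with open(your_config_path, 'r') as f:\n", | |
"    saved_config = yaml.safe_load(f)\n", | |
"\n", | |
"# collect keys whose value on disk differs from the value in memory\n", | |
"changed = [k for k in your_config if saved_config.get(k) != your_config[k]]\n", | |
"if changed:\n", | |
"    print('keys that differ after reload:', changed)\n", | |
"else:\n", | |
"    print('config saved correctly:', your_config_path)" | |
] | |
}, | |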
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Tensorboard" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 13, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Tensorboard command:\n", | |
"tensorboard --load_fast=true --reload_interval=1 --reload_multifile=true --logdir=/workspace/t4-20230125/diff-svc/checkpoints/test/lightning_logs --port=6006\n", | |
"ngrok command:\n", | |
"curl -s https://ngrok-agent.s3.amazonaws.com/ngrok.asc | tee /etc/apt/trusted.gpg.d/ngrok.asc >/dev/null && echo \"deb https://ngrok-agent.s3.amazonaws.com buster main\" | tee /etc/apt/sources.list.d/ngrok.list && apt update && apt install ngrok\n", | |
"ngrok authtoken 1q2w3e4r5t6y7u8i9o0p\n", | |
"ngrok http 6006\n" | |
] | |
} | |
], | |
"source": [ | |
"import os\n", | |
"\n", | |
"log_dir = os.path.join(repo_dir, 'checkpoints', speaker_name, 'lightning_logs')\n", | |
"print(\"Tensorboard command:\")\n", | |
"print(f\"tensorboard --load_fast=true --reload_interval=1 --reload_multifile=true --logdir={log_dir} --port=6006\")\n", | |
"\n", | |
"if ngrok_token:\n", | |
" print(\"ngrok command:\")\n", | |
" print(\"\"\"curl -s https://ngrok-agent.s3.amazonaws.com/ngrok.asc | tee /etc/apt/trusted.gpg.d/ngrok.asc >/dev/null && echo \"deb https://ngrok-agent.s3.amazonaws.com buster main\" | tee /etc/apt/sources.list.d/ngrok.list && apt update && apt install ngrok\"\"\")\n", | |
" print(f\"ngrok authtoken {ngrok_token}\")\n", | |
" print(\"ngrok http 6006\")" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Preprocess" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 14, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"| Hparams chains: ['/workspace/t4-20230125/diff-svc/training/config_test_nsf.yaml']\n", | |
"| Hparams: \n", | |
"\u001b[;33;mK_step\u001b[0m: 1000, \u001b[;33;maccumulate_grad_batches\u001b[0m: 1, \u001b[;33;maudio_num_mel_bins\u001b[0m: 128, \u001b[;33;maudio_sample_rate\u001b[0m: 44100, \u001b[;33;mbinarization_args\u001b[0m: {'shuffle': False, 'with_align': True, 'with_f0': True, 'with_hubert': True, 'with_spk_embed': False, 'with_wav': False}, \n", | |
"\u001b[;33;mbinarizer_cls\u001b[0m: preprocessing.SVCpre.SVCBinarizer, \u001b[;33;mbinary_data_dir\u001b[0m: data/binary/test, \u001b[;33;mcheck_val_every_n_epoch\u001b[0m: 10, \u001b[;33;mchoose_test_manually\u001b[0m: False, \u001b[;33;mclip_grad_norm\u001b[0m: 1, \n", | |
"\u001b[;33;mconfig_path\u001b[0m: training/config_nsf.yaml, \u001b[;33;mcontent_cond_steps\u001b[0m: [], \u001b[;33;mcwt_add_f0_loss\u001b[0m: False, \u001b[;33;mcwt_hidden_size\u001b[0m: 128, \u001b[;33;mcwt_layers\u001b[0m: 2, \n", | |
"\u001b[;33;mcwt_loss\u001b[0m: l1, \u001b[;33;mcwt_std_scale\u001b[0m: 0.8, \u001b[;33;mdatasets\u001b[0m: ['opencpop'], \u001b[;33;mdebug\u001b[0m: False, \u001b[;33;mdec_ffn_kernel_size\u001b[0m: 9, \n", | |
"\u001b[;33;mdec_layers\u001b[0m: 4, \u001b[;33;mdecay_steps\u001b[0m: 30000, \u001b[;33;mdecoder_type\u001b[0m: fft, \u001b[;33;mdict_dir\u001b[0m: , \u001b[;33;mdiff_decoder_type\u001b[0m: wavenet, \n", | |
"\u001b[;33;mdiff_loss_type\u001b[0m: l2, \u001b[;33;mdilation_cycle_length\u001b[0m: 4, \u001b[;33;mdropout\u001b[0m: 0.1, \u001b[;33;mds_workers\u001b[0m: 2, \u001b[;33;mdur_enc_hidden_stride_kernel\u001b[0m: ['0,2,3', '0,2,3', '0,1,3'], \n", | |
"\u001b[;33;mdur_loss\u001b[0m: mse, \u001b[;33;mdur_predictor_kernel\u001b[0m: 3, \u001b[;33;mdur_predictor_layers\u001b[0m: 5, \u001b[;33;menc_ffn_kernel_size\u001b[0m: 9, \u001b[;33;menc_layers\u001b[0m: 4, \n", | |
"\u001b[;33;mencoder_K\u001b[0m: 8, \u001b[;33;mencoder_type\u001b[0m: fft, \u001b[;33;mendless_ds\u001b[0m: False, \u001b[;33;mf0_bin\u001b[0m: 256, \u001b[;33;mf0_max\u001b[0m: 1100.0, \n", | |
"\u001b[;33;mf0_min\u001b[0m: 40.0, \u001b[;33;mffn_act\u001b[0m: gelu, \u001b[;33;mffn_padding\u001b[0m: SAME, \u001b[;33;mfft_size\u001b[0m: 2048, \u001b[;33;mfmax\u001b[0m: 16000, \n", | |
"\u001b[;33;mfmin\u001b[0m: 40, \u001b[;33;mfs2_ckpt\u001b[0m: , \u001b[;33;mgaussian_start\u001b[0m: True, \u001b[;33;mgen_dir_name\u001b[0m: , \u001b[;33;mgen_tgt_spk_id\u001b[0m: -1, \n", | |
"\u001b[;33;mhidden_size\u001b[0m: 256, \u001b[;33;mhop_size\u001b[0m: 512, \u001b[;33;mhubert_gpu\u001b[0m: True, \u001b[;33;mhubert_path\u001b[0m: checkpoints/hubert/hubert_soft.pt, \u001b[;33;minfer\u001b[0m: False, \n", | |
"\u001b[;33;mkeep_bins\u001b[0m: 128, \u001b[;33;mlambda_commit\u001b[0m: 0.25, \u001b[;33;mlambda_energy\u001b[0m: 0.0, \u001b[;33;mlambda_f0\u001b[0m: 1.0, \u001b[;33;mlambda_ph_dur\u001b[0m: 0.3, \n", | |
"\u001b[;33;mlambda_sent_dur\u001b[0m: 1.0, \u001b[;33;mlambda_uv\u001b[0m: 1.0, \u001b[;33;mlambda_word_dur\u001b[0m: 1.0, \u001b[;33;mload_ckpt\u001b[0m: , \u001b[;33;mlog_interval\u001b[0m: 100, \n", | |
"\u001b[;33;mloud_norm\u001b[0m: False, \u001b[;33;mlr\u001b[0m: 0.0008, \u001b[;33;mmax_beta\u001b[0m: 0.02, \u001b[;33;mmax_epochs\u001b[0m: 3000, \u001b[;33;mmax_eval_sentences\u001b[0m: 1, \n", | |
"\u001b[;33;mmax_eval_tokens\u001b[0m: 60000, \u001b[;33;mmax_frames\u001b[0m: 42000, \u001b[;33;mmax_input_tokens\u001b[0m: 60000, \u001b[;33;mmax_sentences\u001b[0m: 8, \u001b[;33;mmax_tokens\u001b[0m: 128000, \n", | |
"\u001b[;33;mmax_updates\u001b[0m: 1000000, \u001b[;33;mmel_loss\u001b[0m: ssim:0.5|l1:0.5, \u001b[;33;mmel_vmax\u001b[0m: 1.5, \u001b[;33;mmel_vmin\u001b[0m: -6.0, \u001b[;33;mmin_level_db\u001b[0m: -120, \n", | |
"\u001b[;33;mno_fs2\u001b[0m: True, \u001b[;33;mnorm_type\u001b[0m: gn, \u001b[;33;mnum_ckpt_keep\u001b[0m: 9999, \u001b[;33;mnum_heads\u001b[0m: 2, \u001b[;33;mnum_sanity_val_steps\u001b[0m: 1, \n", | |
"\u001b[;33;mnum_spk\u001b[0m: 1, \u001b[;33;mnum_test_samples\u001b[0m: 0, \u001b[;33;mnum_valid_plots\u001b[0m: 10, \u001b[;33;moptimizer_adam_beta1\u001b[0m: 0.9, \u001b[;33;moptimizer_adam_beta2\u001b[0m: 0.98, \n", | |
"\u001b[;33;mout_wav_norm\u001b[0m: False, \u001b[;33;mpe_ckpt\u001b[0m: checkpoints/0102_xiaoma_pe/model_ckpt_steps_60000.ckpt, \u001b[;33;mpe_enable\u001b[0m: False, \u001b[;33;mperform_enhance\u001b[0m: True, \u001b[;33;mpitch_ar\u001b[0m: False, \n", | |
"\u001b[;33;mpitch_enc_hidden_stride_kernel\u001b[0m: ['0,2,5', '0,2,5', '0,2,5'], \u001b[;33;mpitch_extractor\u001b[0m: parselmouth, \u001b[;33;mpitch_loss\u001b[0m: l2, \u001b[;33;mpitch_norm\u001b[0m: log, \u001b[;33;mpitch_type\u001b[0m: frame, \n", | |
"\u001b[;33;mpndm_speedup\u001b[0m: 10, \u001b[;33;mpre_align_args\u001b[0m: {'allow_no_txt': False, 'denoise': False, 'forced_align': 'mfa', 'txt_processor': 'zh_g2pM', 'use_sox': True, 'use_tone': False}, \u001b[;33;mpre_align_cls\u001b[0m: data_gen.singing.pre_align.SingingPreAlign, \u001b[;33;mpredictor_dropout\u001b[0m: 0.5, \u001b[;33;mpredictor_grad\u001b[0m: 0.1, \n", | |
"\u001b[;33;mpredictor_hidden\u001b[0m: -1, \u001b[;33;mpredictor_kernel\u001b[0m: 5, \u001b[;33;mpredictor_layers\u001b[0m: 5, \u001b[;33;mprenet_dropout\u001b[0m: 0.5, \u001b[;33;mprenet_hidden_size\u001b[0m: 256, \n", | |
"\u001b[;33;mpretrain_fs_ckpt\u001b[0m: , \u001b[;33;mprocessed_data_dir\u001b[0m: xxx, \u001b[;33;mprofile_infer\u001b[0m: False, \u001b[;33;mraw_data_dir\u001b[0m: data/raw/test, \u001b[;33;mref_norm_layer\u001b[0m: bn, \n", | |
"\u001b[;33;mrel_pos\u001b[0m: True, \u001b[;33;mreset_phone_dict\u001b[0m: True, \u001b[;33;mresidual_channels\u001b[0m: 384, \u001b[;33;mresidual_layers\u001b[0m: 20, \u001b[;33;msave_best\u001b[0m: False, \n", | |
"\u001b[;33;msave_ckpt\u001b[0m: True, \u001b[;33;msave_codes\u001b[0m: ['configs', 'modules', 'src', 'utils'], \u001b[;33;msave_f0\u001b[0m: True, \u001b[;33;msave_gt\u001b[0m: False, \u001b[;33;mschedule_type\u001b[0m: linear, \n", | |
"\u001b[;33;mseed\u001b[0m: 1234, \u001b[;33;msort_by_len\u001b[0m: True, \u001b[;33;mspeaker_id\u001b[0m: test, \u001b[;33;mspec_max\u001b[0m: [0.0], \u001b[;33;mspec_min\u001b[0m: [-5.0], \n", | |
"\u001b[;33;mspk_cond_steps\u001b[0m: [], \u001b[;33;mstop_token_weight\u001b[0m: 5.0, \u001b[;33;mtask_cls\u001b[0m: training.task.SVC_task.SVCTask, \u001b[;33;mtest_ids\u001b[0m: [], \u001b[;33;mtest_input_dir\u001b[0m: , \n", | |
"\u001b[;33;mtest_num\u001b[0m: 0, \u001b[;33;mtest_prefixes\u001b[0m: ['test'], \u001b[;33;mtest_set_name\u001b[0m: test, \u001b[;33;mtimesteps\u001b[0m: 1000, \u001b[;33;mtrain_set_name\u001b[0m: train, \n", | |
"\u001b[;33;muse_crepe\u001b[0m: True, \u001b[;33;muse_denoise\u001b[0m: False, \u001b[;33;muse_energy_embed\u001b[0m: False, \u001b[;33;muse_gt_dur\u001b[0m: False, \u001b[;33;muse_gt_f0\u001b[0m: False, \n", | |
"\u001b[;33;muse_midi\u001b[0m: False, \u001b[;33;muse_nsf\u001b[0m: True, \u001b[;33;muse_pitch_embed\u001b[0m: True, \u001b[;33;muse_pos_embed\u001b[0m: True, \u001b[;33;muse_spk_embed\u001b[0m: False, \n", | |
"\u001b[;33;muse_spk_id\u001b[0m: False, \u001b[;33;muse_split_spk_id\u001b[0m: False, \u001b[;33;muse_uv\u001b[0m: False, \u001b[;33;muse_var_enc\u001b[0m: False, \u001b[;33;muse_vec\u001b[0m: False, \n", | |
"\u001b[;33;mval_check_interval\u001b[0m: 5000, \u001b[;33;mvalid_num\u001b[0m: 0, \u001b[;33;mvalid_set_name\u001b[0m: valid, \u001b[;33;mvalidate\u001b[0m: False, \u001b[;33;mvocoder\u001b[0m: network.vocoders.nsf_hifigan.NsfHifiGAN, \n", | |
"\u001b[;33;mvocoder_ckpt\u001b[0m: checkpoints/nsf_hifigan/model, \u001b[;33;mwarmup_updates\u001b[0m: 2000, \u001b[;33;mwav2spec_eps\u001b[0m: 1e-6, \u001b[;33;mweight_decay\u001b[0m: 0, \u001b[;33;mwin_size\u001b[0m: 2048, \n", | |
"\u001b[;33;mwork_dir\u001b[0m: , \n", | |
"| Binarizer: <class 'preprocessing.SVCpre.SVCBinarizer'>\n", | |
"spkers: {'test'}\n", | |
"| spk_map: {'test': 0}\n", | |
"100%|█████████████████████████████████████████████| 5/5 [00:38<00:00, 7.63s/it]\n", | |
"| valid total duration: 69.731s\n", | |
"100%|█████████████████████████████████████████████| 5/5 [00:33<00:00, 6.65s/it]\n", | |
"| test total duration: 69.731s\n", | |
"100%|█████████████████████████████████████████████| 5/5 [00:26<00:00, 5.30s/it]\n", | |
"(128,)\n", | |
"| train total duration: 55.318s\n" | |
] | |
} | |
], | |
"source": [ | |
"os.environ['PYTHONPATH']='.'\n", | |
"binarize_py = os.path.join(repo_dir, 'preprocessing', 'binarize.py')\n", | |
"os.environ[\"CUDA_VISIBLE_DEVICES\"]=\"0\"\n", | |
"! python {binarize_py} --config {your_config_path}" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Train" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 15, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"| Hparams chains: ['/workspace/t4-20230125/diff-svc/training/config_test_nsf.yaml']\n", | |
"| Hparams: \n", | |
"\u001b[;33;mK_step\u001b[0m: 1000, \u001b[;33;maccumulate_grad_batches\u001b[0m: 1, \u001b[;33;maudio_num_mel_bins\u001b[0m: 128, \u001b[;33;maudio_sample_rate\u001b[0m: 44100, \u001b[;33;mbinarization_args\u001b[0m: {'shuffle': False, 'with_align': True, 'with_f0': True, 'with_hubert': True, 'with_spk_embed': False, 'with_wav': False}, \n", | |
"\u001b[;33;mbinarizer_cls\u001b[0m: preprocessing.SVCpre.SVCBinarizer, \u001b[;33;mbinary_data_dir\u001b[0m: data/binary/test, \u001b[;33;mcheck_val_every_n_epoch\u001b[0m: 10, \u001b[;33;mchoose_test_manually\u001b[0m: False, \u001b[;33;mclip_grad_norm\u001b[0m: 1, \n", | |
"\u001b[;33;mconfig_path\u001b[0m: training/config_nsf.yaml, \u001b[;33;mcontent_cond_steps\u001b[0m: [], \u001b[;33;mcwt_add_f0_loss\u001b[0m: False, \u001b[;33;mcwt_hidden_size\u001b[0m: 128, \u001b[;33;mcwt_layers\u001b[0m: 2, \n", | |
"\u001b[;33;mcwt_loss\u001b[0m: l1, \u001b[;33;mcwt_std_scale\u001b[0m: 0.8, \u001b[;33;mdatasets\u001b[0m: ['opencpop'], \u001b[;33;mdebug\u001b[0m: False, \u001b[;33;mdec_ffn_kernel_size\u001b[0m: 9, \n", | |
"\u001b[;33;mdec_layers\u001b[0m: 4, \u001b[;33;mdecay_steps\u001b[0m: 30000, \u001b[;33;mdecoder_type\u001b[0m: fft, \u001b[;33;mdict_dir\u001b[0m: , \u001b[;33;mdiff_decoder_type\u001b[0m: wavenet, \n", | |
"\u001b[;33;mdiff_loss_type\u001b[0m: l2, \u001b[;33;mdilation_cycle_length\u001b[0m: 4, \u001b[;33;mdropout\u001b[0m: 0.1, \u001b[;33;mds_workers\u001b[0m: 2, \u001b[;33;mdur_enc_hidden_stride_kernel\u001b[0m: ['0,2,3', '0,2,3', '0,1,3'], \n", | |
"\u001b[;33;mdur_loss\u001b[0m: mse, \u001b[;33;mdur_predictor_kernel\u001b[0m: 3, \u001b[;33;mdur_predictor_layers\u001b[0m: 5, \u001b[;33;menc_ffn_kernel_size\u001b[0m: 9, \u001b[;33;menc_layers\u001b[0m: 4, \n", | |
"\u001b[;33;mencoder_K\u001b[0m: 8, \u001b[;33;mencoder_type\u001b[0m: fft, \u001b[;33;mendless_ds\u001b[0m: False, \u001b[;33;mf0_bin\u001b[0m: 256, \u001b[;33;mf0_max\u001b[0m: 1100.0, \n", | |
"\u001b[;33;mf0_min\u001b[0m: 40.0, \u001b[;33;mffn_act\u001b[0m: gelu, \u001b[;33;mffn_padding\u001b[0m: SAME, \u001b[;33;mfft_size\u001b[0m: 2048, \u001b[;33;mfmax\u001b[0m: 16000, \n", | |
"\u001b[;33;mfmin\u001b[0m: 40, \u001b[;33;mfs2_ckpt\u001b[0m: , \u001b[;33;mgaussian_start\u001b[0m: True, \u001b[;33;mgen_dir_name\u001b[0m: , \u001b[;33;mgen_tgt_spk_id\u001b[0m: -1, \n", | |
"\u001b[;33;mhidden_size\u001b[0m: 256, \u001b[;33;mhop_size\u001b[0m: 512, \u001b[;33;mhubert_gpu\u001b[0m: True, \u001b[;33;mhubert_path\u001b[0m: checkpoints/hubert/hubert_soft.pt, \u001b[;33;minfer\u001b[0m: False, \n", | |
"\u001b[;33;mkeep_bins\u001b[0m: 128, \u001b[;33;mlambda_commit\u001b[0m: 0.25, \u001b[;33;mlambda_energy\u001b[0m: 0.0, \u001b[;33;mlambda_f0\u001b[0m: 1.0, \u001b[;33;mlambda_ph_dur\u001b[0m: 0.3, \n", | |
"\u001b[;33;mlambda_sent_dur\u001b[0m: 1.0, \u001b[;33;mlambda_uv\u001b[0m: 1.0, \u001b[;33;mlambda_word_dur\u001b[0m: 1.0, \u001b[;33;mload_ckpt\u001b[0m: , \u001b[;33;mlog_interval\u001b[0m: 100, \n", | |
"\u001b[;33;mloud_norm\u001b[0m: False, \u001b[;33;mlr\u001b[0m: 0.0008, \u001b[;33;mmax_beta\u001b[0m: 0.02, \u001b[;33;mmax_epochs\u001b[0m: 3000, \u001b[;33;mmax_eval_sentences\u001b[0m: 1, \n", | |
"\u001b[;33;mmax_eval_tokens\u001b[0m: 60000, \u001b[;33;mmax_frames\u001b[0m: 42000, \u001b[;33;mmax_input_tokens\u001b[0m: 60000, \u001b[;33;mmax_sentences\u001b[0m: 8, \u001b[;33;mmax_tokens\u001b[0m: 128000, \n", | |
"\u001b[;33;mmax_updates\u001b[0m: 1000000, \u001b[;33;mmel_loss\u001b[0m: ssim:0.5|l1:0.5, \u001b[;33;mmel_vmax\u001b[0m: 1.5, \u001b[;33;mmel_vmin\u001b[0m: -6.0, \u001b[;33;mmin_level_db\u001b[0m: -120, \n", | |
"\u001b[;33;mno_fs2\u001b[0m: True, \u001b[;33;mnorm_type\u001b[0m: gn, \u001b[;33;mnum_ckpt_keep\u001b[0m: 9999, \u001b[;33;mnum_heads\u001b[0m: 2, \u001b[;33;mnum_sanity_val_steps\u001b[0m: 1, \n", | |
"\u001b[;33;mnum_spk\u001b[0m: 1, \u001b[;33;mnum_test_samples\u001b[0m: 0, \u001b[;33;mnum_valid_plots\u001b[0m: 10, \u001b[;33;moptimizer_adam_beta1\u001b[0m: 0.9, \u001b[;33;moptimizer_adam_beta2\u001b[0m: 0.98, \n", | |
"\u001b[;33;mout_wav_norm\u001b[0m: False, \u001b[;33;mpe_ckpt\u001b[0m: checkpoints/0102_xiaoma_pe/model_ckpt_steps_60000.ckpt, \u001b[;33;mpe_enable\u001b[0m: False, \u001b[;33;mperform_enhance\u001b[0m: True, \u001b[;33;mpitch_ar\u001b[0m: False, \n", | |
"\u001b[;33;mpitch_enc_hidden_stride_kernel\u001b[0m: ['0,2,5', '0,2,5', '0,2,5'], \u001b[;33;mpitch_extractor\u001b[0m: parselmouth, \u001b[;33;mpitch_loss\u001b[0m: l2, \u001b[;33;mpitch_norm\u001b[0m: log, \u001b[;33;mpitch_type\u001b[0m: frame, \n", | |
"\u001b[;33;mpndm_speedup\u001b[0m: 10, \u001b[;33;mpre_align_args\u001b[0m: {'allow_no_txt': False, 'denoise': False, 'forced_align': 'mfa', 'txt_processor': 'zh_g2pM', 'use_sox': True, 'use_tone': False}, \u001b[;33;mpre_align_cls\u001b[0m: data_gen.singing.pre_align.SingingPreAlign, \u001b[;33;mpredictor_dropout\u001b[0m: 0.5, \u001b[;33;mpredictor_grad\u001b[0m: 0.1, \n", | |
"\u001b[;33;mpredictor_hidden\u001b[0m: -1, \u001b[;33;mpredictor_kernel\u001b[0m: 5, \u001b[;33;mpredictor_layers\u001b[0m: 5, \u001b[;33;mprenet_dropout\u001b[0m: 0.5, \u001b[;33;mprenet_hidden_size\u001b[0m: 256, \n", | |
"\u001b[;33;mpretrain_fs_ckpt\u001b[0m: , \u001b[;33;mprocessed_data_dir\u001b[0m: xxx, \u001b[;33;mprofile_infer\u001b[0m: False, \u001b[;33;mraw_data_dir\u001b[0m: data/raw/test, \u001b[;33;mref_norm_layer\u001b[0m: bn, \n", | |
"\u001b[;33;mrel_pos\u001b[0m: True, \u001b[;33;mreset_phone_dict\u001b[0m: True, \u001b[;33;mresidual_channels\u001b[0m: 384, \u001b[;33;mresidual_layers\u001b[0m: 20, \u001b[;33;msave_best\u001b[0m: False, \n", | |
"\u001b[;33;msave_ckpt\u001b[0m: True, \u001b[;33;msave_codes\u001b[0m: ['configs', 'modules', 'src', 'utils'], \u001b[;33;msave_f0\u001b[0m: True, \u001b[;33;msave_gt\u001b[0m: False, \u001b[;33;mschedule_type\u001b[0m: linear, \n", | |
"\u001b[;33;mseed\u001b[0m: 1234, \u001b[;33;msort_by_len\u001b[0m: True, \u001b[;33;mspeaker_id\u001b[0m: test, \u001b[;33;mspec_max\u001b[0m: [0.0], \u001b[;33;mspec_min\u001b[0m: [-5.0], \n", | |
"\u001b[;33;mspk_cond_steps\u001b[0m: [], \u001b[;33;mstop_token_weight\u001b[0m: 5.0, \u001b[;33;mtask_cls\u001b[0m: training.task.SVC_task.SVCTask, \u001b[;33;mtest_ids\u001b[0m: [], \u001b[;33;mtest_input_dir\u001b[0m: , \n", | |
"\u001b[;33;mtest_num\u001b[0m: 0, \u001b[;33;mtest_prefixes\u001b[0m: ['test'], \u001b[;33;mtest_set_name\u001b[0m: test, \u001b[;33;mtimesteps\u001b[0m: 1000, \u001b[;33;mtrain_set_name\u001b[0m: train, \n", | |
"\u001b[;33;muse_crepe\u001b[0m: True, \u001b[;33;muse_denoise\u001b[0m: False, \u001b[;33;muse_energy_embed\u001b[0m: False, \u001b[;33;muse_gt_dur\u001b[0m: False, \u001b[;33;muse_gt_f0\u001b[0m: False, \n", | |
"\u001b[;33;muse_midi\u001b[0m: False, \u001b[;33;muse_nsf\u001b[0m: True, \u001b[;33;muse_pitch_embed\u001b[0m: True, \u001b[;33;muse_pos_embed\u001b[0m: True, \u001b[;33;muse_spk_embed\u001b[0m: False, \n", | |
"\u001b[;33;muse_spk_id\u001b[0m: False, \u001b[;33;muse_split_spk_id\u001b[0m: False, \u001b[;33;muse_uv\u001b[0m: False, \u001b[;33;muse_var_enc\u001b[0m: False, \u001b[;33;muse_vec\u001b[0m: False, \n", | |
"\u001b[;33;mval_check_interval\u001b[0m: 5000, \u001b[;33;mvalid_num\u001b[0m: 0, \u001b[;33;mvalid_set_name\u001b[0m: valid, \u001b[;33;mvalidate\u001b[0m: False, \u001b[;33;mvocoder\u001b[0m: network.vocoders.nsf_hifigan.NsfHifiGAN, \n", | |
"\u001b[;33;mvocoder_ckpt\u001b[0m: checkpoints/nsf_hifigan/model, \u001b[;33;mwarmup_updates\u001b[0m: 2000, \u001b[;33;mwav2spec_eps\u001b[0m: 1e-6, \u001b[;33;mweight_decay\u001b[0m: 0, \u001b[;33;mwin_size\u001b[0m: 2048, \n", | |
"\u001b[;33;mwork_dir\u001b[0m: checkpoints/test, \n", | |
"| Mel losses: {'ssim': 0.5, 'l1': 0.5}\n", | |
"| Load HifiGAN: checkpoints/nsf_hifigan/model\n", | |
"Removing weight norm...\n", | |
"01/29 01:13:25 PM gpu available: True, used: True\n", | |
"| model Trainable Parameters: 33.709M\n", | |
"Validation sanity check: 0%| | 0/1 [00:00<?, ?batch/s]\n", | |
"sample time step: 0%| | 0/100 [00:00<?, ?it/s]\u001b[A\n", | |
"sample time step: 1%|▎ | 1/100 [00:00<00:14, 6.79it/s]\u001b[A\n", | |
"sample time step: 5%|█▎ | 5/100 [00:00<00:04, 21.48it/s]\u001b[A\n", | |
"sample time step: 9%|██▎ | 9/100 [00:00<00:03, 28.39it/s]\u001b[A\n", | |
"sample time step: 13%|███ | 13/100 [00:00<00:02, 32.18it/s]\u001b[A\n", | |
"sample time step: 17%|████ | 17/100 [00:00<00:02, 34.64it/s]\u001b[A\n", | |
"sample time step: 21%|█████ | 21/100 [00:00<00:02, 35.53it/s]\u001b[A\n", | |
"sample time step: 25%|██████ | 25/100 [00:00<00:02, 36.71it/s]\u001b[A\n", | |
"sample time step: 29%|██████▉ | 29/100 [00:00<00:01, 37.32it/s]\u001b[A\n", | |
"sample time step: 34%|████████▏ | 34/100 [00:01<00:01, 38.54it/s]\u001b[A\n", | |
"sample time step: 39%|█████████▎ | 39/100 [00:01<00:01, 39.62it/s]\u001b[A\n", | |
"sample time step: 43%|██████████▎ | 43/100 [00:01<00:01, 39.46it/s]\u001b[A\n", | |
"sample time step: 47%|███████████▎ | 47/100 [00:01<00:01, 39.42it/s]\u001b[A\n", | |
"sample time step: 51%|████████████▏ | 51/100 [00:01<00:01, 39.17it/s]\u001b[A\n", | |
"sample time step: 55%|█████████████▏ | 55/100 [00:01<00:01, 39.39it/s]\u001b[A\n", | |
"sample time step: 60%|██████████████▍ | 60/100 [00:01<00:01, 39.86it/s]\u001b[A\n", | |
"sample time step: 65%|███████████████▌ | 65/100 [00:01<00:00, 40.32it/s]\u001b[A\n", | |
"sample time step: 70%|████████████████▊ | 70/100 [00:01<00:00, 40.65it/s]\u001b[A\n", | |
"sample time step: 75%|██████████████████ | 75/100 [00:02<00:00, 40.33it/s]\u001b[A\n", | |
"sample time step: 80%|███████████████████▏ | 80/100 [00:02<00:00, 39.63it/s]\u001b[A\n", | |
"sample time step: 84%|████████████████████▏ | 84/100 [00:02<00:00, 39.29it/s]\u001b[A\n", | |
"sample time step: 88%|█████████████████████ | 88/100 [00:02<00:00, 38.12it/s]\u001b[A\n", | |
"sample time step: 92%|██████████████████████ | 92/100 [00:02<00:00, 38.33it/s]\u001b[A\n", | |
"sample time step: 96%|███████████████████████ | 96/100 [00:02<00:00, 35.34it/s]\u001b[A\n", | |
"sample time step: 100%|███████████████████████| 100/100 [00:02<00:00, 36.86it/s]\u001b[A\n", | |
"\n", | |
"==============\n", | |
" valid results: {'total_loss': 1.0046, 'mel': 1.0046}\n", | |
"==============\n", | |
"\n" | |
] | |
}, | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Epoch 1: : 1batch [00:02, 2.28s/batch, batch_size=5, lr=0.0008, mel=1, step=0] \n", | |
"==============\n", | |
" Epoch 0 ended. Steps: 0. {'total_loss': 1.0014, 'mel': 1.0014, 'batch_size': 5.0, 'lr': 0.0008}\n", | |
"==============\n", | |
"\n", | |
"Epoch 2: : 2batch [00:03, 1.44s/batch, batch_size=5, lr=0.0008, mel=1, step=1]\n", | |
"==============\n", | |
" Epoch 1 ended. Steps: 1. {'total_loss': 1.0013, 'mel': 1.0013, 'batch_size': 5.0, 'lr': 0.0008}\n", | |
"==============\n", | |
"\n", | |
"Epoch 3: : 3batch [00:03, 1.16s/batch, batch_size=5, lr=0.0008, mel=1, step=2]\n", | |
"==============\n", | |
" Epoch 2 ended. Steps: 2. {'total_loss': 1.0019, 'mel': 1.0019, 'batch_size': 5.0, 'lr': 0.0008}\n", | |
"==============\n", | |
"\n", | |
"Epoch 4: : 4batch [00:04, 1.02s/batch, batch_size=5, lr=0.0008, mel=0.997, step=3]\n", | |
"==============\n", | |
" Epoch 3 ended. Steps: 3. {'total_loss': 0.9974, 'mel': 0.9974, 'batch_size': 5.0, 'lr': 0.0008}\n", | |
"==============\n", | |
"\n", | |
"Epoch 5: : 5batch [00:05, 1.05batch/s, batch_size=5, lr=0.0008, mel=0.995, step=4]\n", | |
"==============\n", | |
" Epoch 4 ended. Steps: 4. {'total_loss': 0.9952, 'mel': 0.9952, 'batch_size': 5.0, 'lr': 0.0008}\n", | |
"==============\n", | |
"\n", | |
"Epoch 6: : 6batch [00:06, 1.06batch/s, batch_size=5, lr=0.0008, mel=0.993, step=5]\n", | |
"==============\n", | |
" Epoch 5 ended. Steps: 5. {'total_loss': 0.9933, 'mel': 0.9933, 'batch_size': 5.0, 'lr': 0.0008}\n", | |
"==============\n", | |
"\n", | |
"Epoch 7: : 7batch [00:07, 1.06batch/s, batch_size=5, lr=0.0008, mel=0.996, step=6]\n", | |
"==============\n", | |
" Epoch 6 ended. Steps: 6. {'total_loss': 0.9956, 'mel': 0.9956, 'batch_size': 5.0, 'lr': 0.0008}\n", | |
"==============\n", | |
"\n", | |
"Epoch 8: : 8batch [00:08, 1.10batch/s, batch_size=5, lr=0.0008, mel=0.987, step=7]\n", | |
"==============\n", | |
" Epoch 7 ended. Steps: 7. {'total_loss': 0.9865, 'mel': 0.9865, 'batch_size': 5.0, 'lr': 0.0008}\n", | |
"==============\n", | |
"\n", | |
"Epoch 9: : 8batch [00:08, 1.10batch/s, batch_size=5, lr=0.0008, mel=0.987, step=7]^C\n", | |
"Traceback (most recent call last):\n", | |
" File \"/workspace/t4-20230125/diff-svc/run.py\", line 15, in <module>\n", | |
" run_task()\n", | |
" File \"/workspace/t4-20230125/diff-svc/run.py\", line 11, in run_task\n", | |
" task_cls.start()\n", | |
" File \"/workspace/t4-20230125/diff-svc/training/task/base_task.py\", line 234, in start\n", | |
" trainer.fit(task)\n", | |
" File \"/workspace/t4-20230125/diff-svc/utils/pl_utils.py\", line 495, in fit\n", | |
" self.run_pretrain_routine(model)\n", | |
" File \"/workspace/t4-20230125/diff-svc/utils/pl_utils.py\", line 588, in run_pretrain_routine\n", | |
" self.train()\n", | |
" File \"/workspace/t4-20230125/diff-svc/utils/pl_utils.py\", line 1364, in train\n", | |
" self.run_training_epoch()\n", | |
" File \"/workspace/t4-20230125/diff-svc/utils/pl_utils.py\", line 1398, in run_training_epoch\n", | |
" output = self.run_training_batch(batch, batch_idx)\n", | |
" File \"/workspace/t4-20230125/diff-svc/utils/pl_utils.py\", line 1520, in run_training_batch\n", | |
" loss = optimizer_closure()\n", | |
" File \"/workspace/t4-20230125/diff-svc/utils/pl_utils.py\", line 1503, in optimizer_closure\n", | |
" model_ref.backward(closure_loss, optimizer)\n", | |
" File \"/workspace/t4-20230125/diff-svc/training/task/base_task.py\", line 316, in backward\n", | |
" loss.backward()\n", | |
" File \"/usr/local/lib/python3.8/dist-packages/torch/_tensor.py\", line 307, in backward\n" | |
] | |
} | |
], | |
"source": [ | |
"# If an error occurs, edit the config file and rerun this cell; binarize.py does not need to be run again.\n", | |
"os.environ['PYTHONPATH'] = '.'  # make modules in the current directory importable by run.py\n", | |
"run_path = os.path.join(repo_dir, 'run.py')\n", | |
"os.environ['CUDA_VISIBLE_DEVICES'] = '0'  # expose only GPU 0 to the training process\n", | |
"! python {run_path} --config {your_config_path} --exp_name {speaker_name} --reset" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3 (ipykernel)", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.8.10" | |
}, | |
"vscode": { | |
"interpreter": { | |
"hash": "d3355b554e33c79ba315c6a34d2d5bc309be1808e07ad4360a975b20076fde3d" | |
} | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 2 | |
} |