Created October 20, 2022 09:21
sharevox_sample_c++.ipynb
| { | |
| "nbformat": 4, | |
| "nbformat_minor": 0, | |
| "metadata": { | |
| "colab": { | |
| "provenance": [], | |
| "collapsed_sections": [], | |
| "authorship_tag": "ABX9TyPlgESfBLya/kkYUgHIJY8a", | |
| "include_colab_link": true | |
| }, | |
| "kernelspec": { | |
| "name": "python3", | |
| "display_name": "Python 3" | |
| }, | |
| "language_info": { | |
| "name": "python" | |
| }, | |
| "accelerator": "GPU" | |
| }, | |
| "cells": [ | |
| { | |
| "cell_type": "markdown", | |
| "metadata": { | |
| "id": "view-in-github", | |
| "colab_type": "text" | |
| }, | |
| "source": [ | |
| "<a href=\"https://colab.research.google.com/gist/kun432/0f18735683ceae8d51fc4822742c068d/sharevox_sample_c.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "source": [ | |
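| "## Preparation\n", | |
| "\n", | |
| "Download and extract the SHAREVOX core, the required libraries, the Japanese dictionary, and the models.\n", | |
| "- The libraries are the GPU-enabled builds, but they also work without enabling the GPU on the Google Colab side, so the GPU is probably not actually being used 😓\n", | |
| "- It runs, so I'm calling that good enough for now 😝" | |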
| ], | |
| "metadata": { | |
| "id": "MMRSQHlHMGey" | |
| } | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": null, | |
| "metadata": { | |
| "id": "Dibi1Uc-Ts09" | |
| }, | |
| "outputs": [], | |
| "source": [ | |
| "!mkdir archives\n", | |
| "%cd archives\n", | |
| "!wget https://github.com/SHAREVOX/sharevox_core/archive/refs/tags/0.1.2.zip && unzip 0.1.2.zip\n", | |
| "!wget https://github.com/SHAREVOX/sharevox_core/releases/download/0.1.2/sharevox_core-linux-x64-gpu-0.1.2.zip && unzip sharevox_core-linux-x64-gpu-0.1.2.zip\n", | |
| "!wget https://github.com/microsoft/onnxruntime/releases/download/v1.10.0/onnxruntime-linux-x64-gpu-1.10.0.tgz && tar zxvf onnxruntime-linux-x64-gpu-1.10.0.tgz\n", | |
| "!wget http://downloads.sourceforge.net/open-jtalk/open_jtalk_dic_utf_8-1.11.tar.gz && tar zxvf open_jtalk_dic_utf_8-1.11.tar.gz\n", | |
| "!wget https://github.com/SHAREVOX/sharevox_core/releases/download/0.1.0/sharevox_model-0.1.0.zip && unzip sharevox_model-0.1.0.zip\n", | |
| "%cd ..\n", | |
| "!cp -pr archives/sharevox_core-0.1.2 .\n", | |
| "%cd sharevox_core-0.1.2\n", | |
| "!cp -p ../archives/sharevox_core-linux-x64-gpu-0.1.2/libcore.so example/cpp/unix/.\n", | |
| "!cp -p ../archives/onnxruntime-linux-x64-gpu-1.10.0/lib/* example/cpp/unix/.\n", | |
| "!cp -pr ../archives/open_jtalk_dic_utf_8-1.11 example/cpp/unix/." | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "source": [ | |
| "## Speech synthesis with SHAREVOX\n", | |
| "\n", | |
| "A C++ sample is included, so let's use it to try things out." | |
| ], | |
| "metadata": { | |
| "id": "BcEyiHJNjUGt" | |
| } | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "source": [ | |
| "1. Move to the C++ sample directory" | |
| ], | |
| "metadata": { | |
| "id": "Qd50KHtvMRk3" | |
| } | |
| }, | |
| { | |
| "cell_type": "code", | |
| "source": [ | |
| "%cd example/cpp/unix/" | |
| ], | |
| "metadata": { | |
| "id": "a905ziRHLlK5" | |
| }, | |
| "execution_count": null, | |
| "outputs": [] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "source": [ | |
| "2. Build" | |
| ], | |
| "metadata": { | |
| "id": "HLlbMjhaMWHJ" | |
| } | |
| }, | |
| { | |
| "cell_type": "code", | |
| "source": [ | |
| "!cmake -S . -B build\n", | |
| "!cmake --build build" | |
| ], | |
| "metadata": { | |
| "id": "FjQZAvykZCQB" | |
| }, | |
| "execution_count": null, | |
| "outputs": [] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "source": [ | |
| "3. Generate an audio file with the built binary\n", | |
| "\n", | |
| "Enter the text you want spoken in the form below." | |
| ], | |
| "metadata": { | |
| "id": "mmNq-GmFMZUL" | |
| } | |
| }, | |
| { | |
| "cell_type": "code", | |
| "source": [ | |
| "SPEECH_TEXT=\"\\u3053\\u3093\\u306B\\u3061\\u306F\\u3002\\u30B7\\u30A7\\u30A2\\u30DC\\u30C3\\u30AF\\u30B9\\u3067\\u8A71\\u3057\\u3066\\u307F\\u307E\\u3057\\u305F\\u3002\\u805E\\u3053\\u3048\\u3066\\u3044\\u307E\\u3059\\u304B\\uFF1F\" #@param {type:\"string\"}" | |
| ], | |
| "metadata": { | |
| "id": "0IS1rwHYPTG7" | |
| }, | |
| "execution_count": null, | |
| "outputs": [] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "source": [ | |
| "!build/simple_tts \"$SPEECH_TEXT\"" | |
| ], | |
| "metadata": { | |
| "id": "M_28ZAeWdZT8" | |
| }, | |
| "execution_count": null, | |
| "outputs": [] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "source": [ | |
| "4. Play the audio file" | |
| ], | |
| "metadata": { | |
| "id": "K8mX_l9ZMiqa" | |
| } | |
| }, | |
| { | |
| "cell_type": "code", | |
| "source": [ | |
| "import IPython\n", | |
| "IPython.display.Audio(\"audio.wav\")" | |
| ], | |
| "metadata": { | |
| "id": "cPtYsOfsdeq2" | |
| }, | |
| "execution_count": null, | |
| "outputs": [] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "source": [ | |
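| "To have it say something else, enter the text in the form above and run the cells again in order from step 3.\n", | |
| "\n", | |
| "You probably heard audio, but it likely sounded unmistakably synthetic and not very good. SHAREVOX provides a set of pretrained models, so let's swap them in." | |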
| ], | |
| "metadata": { | |
| "id": "RqxmRhNai-UC" | |
| } | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "source": [ | |
| "## Swapping in the pretrained models\n", | |
| "\n", | |
| "1. Replace the models." | |
| ], | |
| "metadata": { | |
| "id": "QDpWioISSMkd" | |
| } | |
| }, | |
| { | |
| "cell_type": "code", | |
| "source": [ | |
| "%cd ../../..\n", | |
| "!mv model model-0.1.2\n", | |
| "!mv ../archives/model .\n", | |
| "%cd example/cpp/unix" | |
| ], | |
| "metadata": { | |
| "id": "6EzzdaZjSPoE" | |
| }, | |
| "execution_count": null, | |
| "outputs": [] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "source": [ | |
| "2. Change the voice. The following voices are available.\n", | |
| "\n", | |
| "* 0: 小春音アミ (Normal)\n", | |
| "* 1: 小春音アミ (Normal)\n", | |
| "* 2: 小春音アミ (Normal)\n", | |
| "* 3: 小春音アミ (Normal)\n", | |
| "* 4: つくよみちゃん\n", | |
| "* 5: 白痴一\n", | |
| "* 6: 開発者 (Developer)\n", | |
| "\n", | |
| "Select one in the form below." | |
| ], | |
| "metadata": { | |
| "id": "DM5OYMzBVjwN" | |
| } | |
| }, | |
| { | |
| "cell_type": "code", | |
| "source": [ | |
| "SPEECH_ID=\"6\" #@param [0, 1, 2, 3, 4, 5, 6]\n", | |
| "!perl -i -spe 's/^ int64_t speaker_id = \\d+;/ int64_t speaker_id = $id;/' -- -id=$SPEECH_ID simple_tts.cpp\n", | |
| "!grep \"int64_t speaker_id\" simple_tts.cpp" | |
| ], | |
| "metadata": { | |
| "id": "bASrqnXtWbUo" | |
| }, | |
| "execution_count": null, | |
| "outputs": [] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "source": [ | |
| "3. Rebuild." | |
| ], | |
| "metadata": { | |
| "id": "MHnLgVoMrEu8" | |
| } | |
| }, | |
| { | |
| "cell_type": "code", | |
| "source": [ | |
| "!cmake --build build" | |
| ], | |
| "metadata": { | |
| "id": "TyWxkN7eQTEb" | |
| }, | |
| "execution_count": null, | |
| "outputs": [] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "source": [ | |
| "4. Generate the audio file\n", | |
| "\n", | |
| "Set the text you want spoken in the form below." | |
| ], | |
| "metadata": { | |
| "id": "z7cf6-tnrQKm" | |
| } | |
| }, | |
| { | |
| "cell_type": "code", | |
| "source": [ | |
| "SPEECH_TEXT=\"\\u3053\\u3093\\u306B\\u3061\\u306F\\u3002\\u30B7\\u30A7\\u30A2\\u30DC\\u30C3\\u30AF\\u30B9\\u3067\\u9055\\u3046\\u8A00\\u8A9E\\u30E2\\u30C7\\u30EB\\u3067\\u8A71\\u3057\\u3066\\u307F\\u307E\\u3057\\u305F\\u3002\\u805E\\u3053\\u3048\\u3066\\u3044\\u307E\\u3059\\u304B\\uFF1F\" #@param {type:\"string\"}" | |
| ], | |
| "metadata": { | |
| "id": "J8sOzcUvTz1p" | |
| }, | |
| "execution_count": null, | |
| "outputs": [] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "source": [ | |
| "!build/simple_tts \"$SPEECH_TEXT\"" | |
| ], | |
| "metadata": { | |
| "id": "pdBS58OZTdNw" | |
| }, | |
| "execution_count": null, | |
| "outputs": [] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "source": [ | |
| "5. Play the audio file" | |
| ], | |
| "metadata": { | |
| "id": "pm9bkHBlryqa" | |
| } | |
| }, | |
| { | |
| "cell_type": "code", | |
| "source": [ | |
| "import IPython\n", | |
| "IPython.display.Audio(\"audio.wav\")" | |
| ], | |
| "metadata": { | |
| "id": "dzrGjkx3TiKw" | |
| }, | |
| "execution_count": null, | |
| "outputs": [] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "source": [ | |
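| "Compared with the first run, you should notice that the voice has changed and sounds a little more fluent 😀\n", | |
| "\n", | |
| "To change the voice, run the cells again in order from step 2; to change the text, run them again from step 4." | |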
| ], | |
| "metadata": { | |
| "id": "wc7RJLJEsBiR" | |
| } | |
| } | |
| ] | |
| } |
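
The voice-selection cell in the notebook rewrites `simple_tts.cpp` with a perl one-liner. As a rough sketch only, the same substitution could be done in Python. The function name `set_speaker_id` is hypothetical, the 0 to 6 range is taken from the notebook's form, and the pattern assumes the source contains a line of the form `int64_t speaker_id = <digits>;` with arbitrary leading indentation, as the perl pattern suggests.

```python
import re

# Matches an indented "int64_t speaker_id = <digits>;" at the start of a line.
# The indentation is captured so it can be written back unchanged.
_SPEAKER_LINE = re.compile(r"^([ \t]*)int64_t speaker_id = \d+;", flags=re.MULTILINE)


def set_speaker_id(cpp_source: str, speaker_id: int) -> str:
    """Return cpp_source with the speaker_id initializer set to speaker_id.

    Hypothetical helper: assumes the 0-6 ID range offered by the notebook form.
    """
    if not 0 <= speaker_id <= 6:
        raise ValueError(f"speaker_id must be between 0 and 6, got {speaker_id}")

    new_source, count = _SPEAKER_LINE.subn(
        rf"\g<1>int64_t speaker_id = {speaker_id};", cpp_source
    )
    if count == 0:
        # Fail loudly instead of silently leaving the file unchanged.
        raise ValueError("no 'int64_t speaker_id = N;' line found")
    return new_source
```

In the notebook this might be applied by reading `simple_tts.cpp` with `pathlib.Path.read_text()`, passing the text through `set_speaker_id`, and writing the result back with `write_text()` before rebuilding. Unlike the perl version, it raises an error when the line is not found, which makes a changed sample source easier to notice.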