@kun432
Created October 20, 2022 09:21
sharevox_sample_c++.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": [],
"collapsed_sections": [],
"authorship_tag": "ABX9TyPlgESfBLya/kkYUgHIJY8a",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
},
"accelerator": "GPU"
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/kun432/0f18735683ceae8d51fc4822742c068d/sharevox_sample_c.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"source": [
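"## Preparation\n",
"\n",
"Download and extract the SHAREVOX core, the required libraries, the Japanese dictionary, the language model, and so on.\n",
"- The libraries used here are the GPU-enabled builds, but they also work without enabling the GPU in Google Colab, so the GPU is probably not actually being used 😓\n",
"- It works for now, so I'm calling it good enough 😝"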
],
"metadata": {
"id": "MMRSQHlHMGey"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Dibi1Uc-Ts09"
},
"outputs": [],
"source": [
"!mkdir archives\n",
"%cd archives\n",
"!wget https://github.com/SHAREVOX/sharevox_core/archive/refs/tags/0.1.2.zip && unzip 0.1.2.zip\n",
"!wget https://github.com/SHAREVOX/sharevox_core/releases/download/0.1.2/sharevox_core-linux-x64-gpu-0.1.2.zip && unzip sharevox_core-linux-x64-gpu-0.1.2.zip\n",
"!wget https://github.com/microsoft/onnxruntime/releases/download/v1.10.0/onnxruntime-linux-x64-gpu-1.10.0.tgz && tar zxvf onnxruntime-linux-x64-gpu-1.10.0.tgz\n",
"!wget http://downloads.sourceforge.net/open-jtalk/open_jtalk_dic_utf_8-1.11.tar.gz && tar zxvf open_jtalk_dic_utf_8-1.11.tar.gz\n",
"!wget https://github.com/SHAREVOX/sharevox_core/releases/download/0.1.0/sharevox_model-0.1.0.zip && unzip sharevox_model-0.1.0.zip\n",
"%cd ..\n",
"!cp -pr archives/sharevox_core-0.1.2 .\n",
"%cd sharevox_core-0.1.2\n",
"!cp -p ../archives/sharevox_core-linux-x64-gpu-0.1.2/libcore.so example/cpp/unix/.\n",
"!cp -p ../archives/onnxruntime-linux-x64-gpu-1.10.0/lib/* example/cpp/unix/.\n",
"!cp -pr ../archives/open_jtalk_dic_utf_8-1.11 example/cpp/unix/."
]
},
{
"cell_type": "markdown",
"source": [
"## Speech synthesis with SHAREVOX\n",
"\n",
"A C++ sample is included, so let's try it out."
],
"metadata": {
"id": "BcEyiHJNjUGt"
}
},
{
"cell_type": "markdown",
"source": [
"1. Move to the C++ sample directory"
],
"metadata": {
"id": "Qd50KHtvMRk3"
}
},
{
"cell_type": "code",
"source": [
"%cd example/cpp/unix/"
],
"metadata": {
"id": "a905ziRHLlK5"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"2. Build"
],
"metadata": {
"id": "HLlbMjhaMWHJ"
}
},
{
"cell_type": "code",
"source": [
"!cmake -S . -B build\n",
"!cmake --build build"
],
"metadata": {
"id": "FjQZAvykZCQB"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"3. Generate an audio file with the built binary\n",
"\n",
"Enter the text you want spoken in the form below."
],
"metadata": {
"id": "mmNq-GmFMZUL"
}
},
{
"cell_type": "code",
"source": [
"SPEECH_TEXT=\"\\u3053\\u3093\\u306B\\u3061\\u306F\\u3002\\u30B7\\u30A7\\u30A2\\u30DC\\u30C3\\u30AF\\u30B9\\u3067\\u8A71\\u3057\\u3066\\u307F\\u307E\\u3057\\u305F\\u3002\\u805E\\u3053\\u3048\\u3066\\u3044\\u307E\\u3059\\u304B\\uFF1F\" #@param {type:\"string\"}"
],
"metadata": {
"id": "0IS1rwHYPTG7"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"!build/simple_tts \"$SPEECH_TEXT\""
],
"metadata": {
"id": "M_28ZAeWdZT8"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"4. Play the audio file"
],
"metadata": {
"id": "K8mX_l9ZMiqa"
}
},
{
"cell_type": "code",
"source": [
"import IPython\n",
"IPython.display.Audio(\"audio.wav\")"
],
"metadata": {
"id": "cPtYsOfsdeq2"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
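"To have it say something different, enter the new text in the form above and run the cells again in order, starting from step 3.\n",
"\n",
"You probably heard audio, but it likely sounds unimpressive and obviously synthetic. SHAREVOX provides a set of pre-trained language models, so let's swap them in."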
],
"metadata": {
"id": "RqxmRhNai-UC"
}
},
{
"cell_type": "markdown",
"source": [
"## Swapping in a different language model\n",
"\n",
"1. Swap the language model."
],
"metadata": {
"id": "QDpWioISSMkd"
}
},
{
"cell_type": "code",
"source": [
"%cd ../../..\n",
"!mv model model-0.1.2\n",
"!mv ../archives/model .\n",
"%cd example/cpp/unix"
],
"metadata": {
"id": "6EzzdaZjSPoE"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"2. Change the voice. The following voices are available.\n",
"\n",
"* 0: 小春音アミ (normal)\n",
"* 1: 小春音アミ (normal)\n",
"* 2: 小春音アミ (normal)\n",
"* 3: 小春音アミ (normal)\n",
"* 4: つくよみちゃん\n",
"* 5: 白痴一\n",
"* 6: 開発者\n",
"\n",
"Select one in the form below."
],
"metadata": {
"id": "DM5OYMzBVjwN"
}
},
{
"cell_type": "code",
"source": [
"SPEECH_ID=\"6\" #@param [0, 1, 2, 3, 4, 5, 6]\n",
"!perl -i -spe 's/^ int64_t speaker_id = \\d+;/ int64_t speaker_id = $id;/' -- -id=$SPEECH_ID simple_tts.cpp\n",
"!grep \"int64_t speaker_id\" simple_tts.cpp"
],
"metadata": {
"id": "bASrqnXtWbUo"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"3. Rebuild."
],
"metadata": {
"id": "MHnLgVoMrEu8"
}
},
{
"cell_type": "code",
"source": [
"!cmake --build build"
],
"metadata": {
"id": "TyWxkN7eQTEb"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"4. Generate the audio file\n",
"\n",
"Set the text you want spoken in the form below."
],
"metadata": {
"id": "z7cf6-tnrQKm"
}
},
{
"cell_type": "code",
"source": [
"SPEECH_TEXT=\"\\u3053\\u3093\\u306B\\u3061\\u306F\\u3002\\u30B7\\u30A7\\u30A2\\u30DC\\u30C3\\u30AF\\u30B9\\u3067\\u9055\\u3046\\u8A00\\u8A9E\\u30E2\\u30C7\\u30EB\\u3067\\u8A71\\u3057\\u3066\\u307F\\u307E\\u3057\\u305F\\u3002\\u805E\\u3053\\u3048\\u3066\\u3044\\u307E\\u3059\\u304B\\uFF1F\" #@param {type:\"string\"}"
],
"metadata": {
"id": "J8sOzcUvTz1p"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"!build/simple_tts \"$SPEECH_TEXT\""
],
"metadata": {
"id": "pdBS58OZTdNw"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"5. Play the audio file."
],
"metadata": {
"id": "pm9bkHBlryqa"
}
},
{
"cell_type": "code",
"source": [
"import IPython\n",
"IPython.display.Audio(\"audio.wav\")"
],
"metadata": {
"id": "dzrGjkx3TiKw"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
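"Compared with the first run, you should notice that the voice has changed and sounds a little more fluent 😀\n",
"\n",
"To change the voice, run the cells again starting from step 2; to change the spoken text, start from step 4."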
],
"metadata": {
"id": "wc7RJLJEsBiR"
}
}
]
}
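The voice-selection cell in the notebook rewrites the hard-coded `int64_t speaker_id = ...;` line in `simple_tts.cpp` with a perl one-liner. As a rough alternative, the same substitution can be sketched in Python, which avoids shell quoting inside a Colab cell. The function name `set_speaker_id` and the sample source text are hypothetical, and the sketch assumes the line in `simple_tts.cpp` has exactly the form matched by the perl pattern, with any amount of leading indentation.

```python
import re

# Matches a line such as "  int64_t speaker_id = 0;" and captures its indentation.
# Assumption: simple_tts.cpp declares the speaker ID in exactly this form.
_SPEAKER_LINE = re.compile(r"^([ \t]*)int64_t speaker_id = \d+;", flags=re.MULTILINE)


def set_speaker_id(source: str, speaker_id: int) -> str:
    """Return `source` with the hard-coded speaker_id value replaced."""
    new_source, count = _SPEAKER_LINE.subn(
        lambda m: f"{m.group(1)}int64_t speaker_id = {speaker_id};", source
    )
    if count == 0:
        # Fail loudly instead of silently rebuilding with the old voice.
        raise ValueError("no 'int64_t speaker_id = <number>;' line found")
    return new_source


# Hypothetical stand-in for the contents of simple_tts.cpp.
sample = "int main() {\n  int64_t speaker_id = 0;\n  return 0;\n}\n"
print(set_speaker_id(sample, 6))
```

In the notebook this could be applied by reading `simple_tts.cpp` with `pathlib.Path.read_text()`, passing the text through `set_speaker_id`, and writing the result back with `write_text()` before rebuilding.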