Skip to content

Instantly share code, notes, and snippets.

@shinseitaro
Created March 12, 2022 01:17
Show Gist options
  • Save shinseitaro/2729ebfd9e2b78206ef27e50e88581ae to your computer and use it in GitHub Desktop.
Save shinseitaro/2729ebfd9e2b78206ef27e50e88581ae to your computer and use it in GitHub Desktop.
insert or ignore.ipynb
Display the source blob
Display the rendered blob
Raw
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "insert or ignore.ipynb",
"provenance": [],
"collapsed_sections": [],
"toc_visible": true,
"authorship_tag": "ABX9TyOoKz8ol3Qd8L9laTowJGJt",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/shinseitaro/2729ebfd9e2b78206ef27e50e88581ae/insert-or-ignore.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"source": [
"# pandas `to_sql` メソッドで Insert or Ignore を実行したい\n",
"\n",
"簡単な `to_sql` メソッド の使い方と、今回当てたパッチについて書いています。パッチだけみたい方は、 `パッチをあてる` だけ見てください。\n"
],
"metadata": {
"id": "_GUDnNXytmfi"
}
},
{
"cell_type": "markdown",
"source": [
"## やりたいこと\n",
"\n",
"DataFrame(もしくは Series。ここではDataFrameで説明します。)がもつレコードを、`SQLite` に格納したい。その時、重複したデータは挿入したくない。\n",
"\n",
"つまりDataFrameのレコードを\n",
"\n",
"```sql\n",
"INSERT or IGNORE INTO x VALUES (...);\n",
"```\n",
"したい。\n"
],
"metadata": {
"id": "57nsaQxytpn3"
}
},
{
"cell_type": "markdown",
"source": [
"## .to_sql メソッド\n",
"`to_sql()` メソッドを使うと、DataFrame に格納されたレコードを SQL データベースに書き込むことができます。SQLAlchemy でサポートされているデータベースが対象です。テーブルの新規作成、追加、上書きが可能です。\n",
"\n",
"[pandas.DataFrame.to_sql — pandas 1.4.1 documentation](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_sql.html)\n"
],
"metadata": {
"id": "D5pXEPwAttol"
}
},
{
"cell_type": "markdown",
"source": [
"\n",
"### 簡単な例\n",
"\n",
"1. DBをPythonで操作するためのライブラリ、 `SQLAlchemy` をインストールする。\n",
"1. SQLite database を /tmp ディレクトリに作成\n",
"1. test data を `test.db` に挿入。テーブル名は `mytable` にする。\n",
"1. インサートしたレコードを読み込む\n",
"\n",
"> SQLAlchemy と接続するには、`create_engine()` を使って、データベースの URI から engine オブジェクトを生成します。ここでは詳しく説明しません。こちらを参照してください。[Engine connection examples](https://pandas.pydata.org/docs/user_guide/io.html#engine-connection-examples)"
],
"metadata": {
"id": "iszNCIl1t3I5"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "TYR2hapBtKu5",
"outputId": "942ab32c-4eb5-4cf1-a094-e5b91a31e70b"
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Requirement already satisfied: SQLAlchemy in /usr/local/lib/python3.7/dist-packages (1.4.32)\n",
"Requirement already satisfied: greenlet!=0.4.17 in /usr/local/lib/python3.7/dist-packages (from SQLAlchemy) (1.1.2)\n",
"Requirement already satisfied: importlib-metadata in /usr/local/lib/python3.7/dist-packages (from SQLAlchemy) (4.11.2)\n",
"Requirement already satisfied: typing-extensions>=3.6.4 in /usr/local/lib/python3.7/dist-packages (from importlib-metadata->SQLAlchemy) (3.10.0.2)\n",
"Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.7/dist-packages (from importlib-metadata->SQLAlchemy) (3.7.0)\n"
]
}
],
"source": [
"# sqlalchemy インストール\n",
"!pip install SQLAlchemy"
]
},
{
"cell_type": "code",
"source": [
"from sqlalchemy import create_engine\n",
"\n",
"engine = create_engine(\n",
" 'sqlite:////tmp/test.db', \n",
" echo=True # ログ出力\n",
" )\n"
],
"metadata": {
"id": "mv2wqOmbv5iY"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# テストデータ作成\n",
"import pandas as pd \n",
"df_test = pd._testing.makeTimeDataFrame(5)\n",
"df_test.index.name=\"datetime\"\n",
"df_test"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 237
},
"id": "ApN0HWUytomC",
"outputId": "8b36c327-7a83-4ced-cfe7-fedecc0462bf"
},
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" A B C D\n",
"datetime \n",
"2000-01-03 -0.123544 0.379581 -0.690869 -0.951027\n",
"2000-01-04 0.634132 -0.955994 0.452346 1.480265\n",
"2000-01-05 0.980905 1.408927 -1.257866 -1.739920\n",
"2000-01-06 0.907418 0.594829 -1.822315 2.378887\n",
"2000-01-07 1.310886 0.860470 -0.539857 -0.715786"
],
"text/html": [
"\n",
" <div id=\"df-fa3382f5-a3fc-4159-beb9-8198f6373c74\">\n",
" <div class=\"colab-df-container\">\n",
" <div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>A</th>\n",
" <th>B</th>\n",
" <th>C</th>\n",
" <th>D</th>\n",
" </tr>\n",
" <tr>\n",
" <th>datetime</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>2000-01-03</th>\n",
" <td>-0.123544</td>\n",
" <td>0.379581</td>\n",
" <td>-0.690869</td>\n",
" <td>-0.951027</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2000-01-04</th>\n",
" <td>0.634132</td>\n",
" <td>-0.955994</td>\n",
" <td>0.452346</td>\n",
" <td>1.480265</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2000-01-05</th>\n",
" <td>0.980905</td>\n",
" <td>1.408927</td>\n",
" <td>-1.257866</td>\n",
" <td>-1.739920</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2000-01-06</th>\n",
" <td>0.907418</td>\n",
" <td>0.594829</td>\n",
" <td>-1.822315</td>\n",
" <td>2.378887</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2000-01-07</th>\n",
" <td>1.310886</td>\n",
" <td>0.860470</td>\n",
" <td>-0.539857</td>\n",
" <td>-0.715786</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>\n",
" <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-fa3382f5-a3fc-4159-beb9-8198f6373c74')\"\n",
" title=\"Convert this dataframe to an interactive table.\"\n",
" style=\"display:none;\">\n",
" \n",
" <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
" width=\"24px\">\n",
" <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
" <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
" </svg>\n",
" </button>\n",
" \n",
" <style>\n",
" .colab-df-container {\n",
" display:flex;\n",
" flex-wrap:wrap;\n",
" gap: 12px;\n",
" }\n",
"\n",
" .colab-df-convert {\n",
" background-color: #E8F0FE;\n",
" border: none;\n",
" border-radius: 50%;\n",
" cursor: pointer;\n",
" display: none;\n",
" fill: #1967D2;\n",
" height: 32px;\n",
" padding: 0 0 0 0;\n",
" width: 32px;\n",
" }\n",
"\n",
" .colab-df-convert:hover {\n",
" background-color: #E2EBFA;\n",
" box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
" fill: #174EA6;\n",
" }\n",
"\n",
" [theme=dark] .colab-df-convert {\n",
" background-color: #3B4455;\n",
" fill: #D2E3FC;\n",
" }\n",
"\n",
" [theme=dark] .colab-df-convert:hover {\n",
" background-color: #434B5C;\n",
" box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
" filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
" fill: #FFFFFF;\n",
" }\n",
" </style>\n",
"\n",
" <script>\n",
" const buttonEl =\n",
" document.querySelector('#df-fa3382f5-a3fc-4159-beb9-8198f6373c74 button.colab-df-convert');\n",
" buttonEl.style.display =\n",
" google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
"\n",
" async function convertToInteractive(key) {\n",
" const element = document.querySelector('#df-fa3382f5-a3fc-4159-beb9-8198f6373c74');\n",
" const dataTable =\n",
" await google.colab.kernel.invokeFunction('convertToInteractive',\n",
" [key], {});\n",
" if (!dataTable) return;\n",
"\n",
" const docLinkHtml = 'Like what you see? Visit the ' +\n",
" '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
" + ' to learn more about interactive tables.';\n",
" element.innerHTML = '';\n",
" dataTable['output_type'] = 'display_data';\n",
" await google.colab.output.renderOutput(dataTable, element);\n",
" const docLink = document.createElement('div');\n",
" docLink.innerHTML = docLinkHtml;\n",
" element.appendChild(docLink);\n",
" }\n",
" </script>\n",
" </div>\n",
" </div>\n",
" "
]
},
"metadata": {},
"execution_count": 3
}
]
},
{
"cell_type": "code",
"source": [
"# test.db に df_test のデータを挿入。テーブル名は mytable \n",
"df_test.to_sql(\n",
" \"mytable\", \n",
" con=engine, \n",
" if_exists='append', # \n",
" index=True, # dataframe の index を index として使う。デフォルトはTrue\n",
"\n",
")\n"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "dhgBhiAUuIDB",
"outputId": "0be6dc5e-e78f-4a81-eb25-a4cc837f1733"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"2022-03-12 01:04:22,124 INFO sqlalchemy.engine.Engine PRAGMA main.table_info(\"mytable\")\n",
"2022-03-12 01:04:22,130 INFO sqlalchemy.engine.Engine [raw sql] ()\n",
"2022-03-12 01:04:22,136 INFO sqlalchemy.engine.Engine PRAGMA temp.table_info(\"mytable\")\n",
"2022-03-12 01:04:22,138 INFO sqlalchemy.engine.Engine [raw sql] ()\n",
"2022-03-12 01:04:22,143 INFO sqlalchemy.engine.Engine BEGIN (implicit)\n",
"2022-03-12 01:04:22,149 INFO sqlalchemy.engine.Engine \n",
"CREATE TABLE mytable (\n",
"\tdatetime DATETIME, \n",
"\t\"A\" FLOAT, \n",
"\t\"B\" FLOAT, \n",
"\t\"C\" FLOAT, \n",
"\t\"D\" FLOAT\n",
")\n",
"\n",
"\n",
"2022-03-12 01:04:22,155 INFO sqlalchemy.engine.Engine [no key 0.00599s] ()\n",
"2022-03-12 01:04:22,171 INFO sqlalchemy.engine.Engine CREATE INDEX ix_mytable_datetime ON mytable (datetime)\n",
"2022-03-12 01:04:22,173 INFO sqlalchemy.engine.Engine [no key 0.00186s] ()\n",
"2022-03-12 01:04:22,186 INFO sqlalchemy.engine.Engine COMMIT\n",
"2022-03-12 01:04:22,190 INFO sqlalchemy.engine.Engine BEGIN (implicit)\n",
"2022-03-12 01:04:22,194 INFO sqlalchemy.engine.Engine INSERT INTO mytable (datetime, \"A\", \"B\", \"C\", \"D\") VALUES (?, ?, ?, ?, ?)\n",
"2022-03-12 01:04:22,195 INFO sqlalchemy.engine.Engine [generated in 0.00183s] (('2000-01-03 00:00:00.000000', -0.12354442038250116, 0.37958096813029285, -0.6908686527400801, -0.9510269233054016), ('2000-01-04 00:00:00.000000', 0.634132335595418, -0.9559938161889382, 0.45234645681223346, 1.480265100928995), ('2000-01-05 00:00:00.000000', 0.9809046968804958, 1.4089271128237468, -1.2578660286058347, -1.7399200633057164), ('2000-01-06 00:00:00.000000', 0.9074178790167357, 0.5948293992272192, -1.8223152466452481, 2.378887252050187), ('2000-01-07 00:00:00.000000', 1.3108864287668685, 0.8604700423982842, -0.5398572776848934, -0.7157860057297839))\n",
"2022-03-12 01:04:22,198 INFO sqlalchemy.engine.Engine COMMIT\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"# mytable からレコードを読み込む\n",
"engine.execute(\"SELECT * FROM mytable\").fetchall()\n"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "6P8saw4_wWSG",
"outputId": "b500b53e-d105-4ef9-e139-f77043e1ae7e"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"2022-03-12 01:04:22,215 INFO sqlalchemy.engine.Engine SELECT * FROM mytable\n",
"2022-03-12 01:04:22,217 INFO sqlalchemy.engine.Engine [raw sql] ()\n"
]
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"[('2000-01-03 00:00:00.000000', -0.12354442038250116, 0.37958096813029285, -0.6908686527400801, -0.9510269233054016),\n",
" ('2000-01-04 00:00:00.000000', 0.634132335595418, -0.9559938161889382, 0.45234645681223346, 1.480265100928995),\n",
" ('2000-01-05 00:00:00.000000', 0.9809046968804958, 1.4089271128237468, -1.2578660286058347, -1.7399200633057164),\n",
" ('2000-01-06 00:00:00.000000', 0.9074178790167357, 0.5948293992272192, -1.8223152466452481, 2.378887252050187),\n",
" ('2000-01-07 00:00:00.000000', 1.3108864287668685, 0.8604700423982842, -0.5398572776848934, -0.7157860057297839)]"
]
},
"metadata": {},
"execution_count": 5
}
]
},
{
"cell_type": "markdown",
"source": [
"### 良い点\n",
"- ログからも分かるとおり、初回に `CREATE TABLE` をいい感じでやってくれる\n",
"- 二回目以降はインサートだけやってくれる。"
],
"metadata": {
"id": "MseBols_wwTg"
}
},
{
"cell_type": "markdown",
"source": [
"### 困った点\n",
"- 重複したレコードを挿入した場合に、 Replace もしくは Ignore してもらいたいが、できないが出来ない。\n",
"- `if_exists=` オプションは\n",
" - `append` 追加\n",
" - `replace` DropTableして、レコードを入れ直す\n",
" - `fail` (default) ValueErrorを吐く\n",
"\n",
"\n",
"\n"
],
"metadata": {
"id": "wk76t88v2sUV"
}
},
{
"cell_type": "code",
"source": [
"# 同じDateTimeを持つデータを作る\n",
"df_test2 = pd._testing.makeTimeDataFrame(5)\n",
"df_test2.index.name=\"datetime\"\n",
"df_test2"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 237
},
"id": "rTDnV2bt2rg_",
"outputId": "59585315-4766-474b-cd35-1f0ea97a2040"
},
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" A B C D\n",
"datetime \n",
"2000-01-03 -1.077221 -0.126767 -0.675892 2.284929\n",
"2000-01-04 -0.743914 -0.237186 0.871497 1.036184\n",
"2000-01-05 1.371556 0.859110 -0.417913 0.362411\n",
"2000-01-06 -0.560620 0.266883 -0.702033 -0.292397\n",
"2000-01-07 -1.274753 -3.464823 2.399377 0.753354"
],
"text/html": [
"\n",
" <div id=\"df-f8e4e4e3-e92f-4611-83e4-deb323dd80a2\">\n",
" <div class=\"colab-df-container\">\n",
" <div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>A</th>\n",
" <th>B</th>\n",
" <th>C</th>\n",
" <th>D</th>\n",
" </tr>\n",
" <tr>\n",
" <th>datetime</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>2000-01-03</th>\n",
" <td>-1.077221</td>\n",
" <td>-0.126767</td>\n",
" <td>-0.675892</td>\n",
" <td>2.284929</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2000-01-04</th>\n",
" <td>-0.743914</td>\n",
" <td>-0.237186</td>\n",
" <td>0.871497</td>\n",
" <td>1.036184</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2000-01-05</th>\n",
" <td>1.371556</td>\n",
" <td>0.859110</td>\n",
" <td>-0.417913</td>\n",
" <td>0.362411</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2000-01-06</th>\n",
" <td>-0.560620</td>\n",
" <td>0.266883</td>\n",
" <td>-0.702033</td>\n",
" <td>-0.292397</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2000-01-07</th>\n",
" <td>-1.274753</td>\n",
" <td>-3.464823</td>\n",
" <td>2.399377</td>\n",
" <td>0.753354</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>\n",
" <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-f8e4e4e3-e92f-4611-83e4-deb323dd80a2')\"\n",
" title=\"Convert this dataframe to an interactive table.\"\n",
" style=\"display:none;\">\n",
" \n",
" <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
" width=\"24px\">\n",
" <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
" <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
" </svg>\n",
" </button>\n",
" \n",
" <style>\n",
" .colab-df-container {\n",
" display:flex;\n",
" flex-wrap:wrap;\n",
" gap: 12px;\n",
" }\n",
"\n",
" .colab-df-convert {\n",
" background-color: #E8F0FE;\n",
" border: none;\n",
" border-radius: 50%;\n",
" cursor: pointer;\n",
" display: none;\n",
" fill: #1967D2;\n",
" height: 32px;\n",
" padding: 0 0 0 0;\n",
" width: 32px;\n",
" }\n",
"\n",
" .colab-df-convert:hover {\n",
" background-color: #E2EBFA;\n",
" box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
" fill: #174EA6;\n",
" }\n",
"\n",
" [theme=dark] .colab-df-convert {\n",
" background-color: #3B4455;\n",
" fill: #D2E3FC;\n",
" }\n",
"\n",
" [theme=dark] .colab-df-convert:hover {\n",
" background-color: #434B5C;\n",
" box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
" filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
" fill: #FFFFFF;\n",
" }\n",
" </style>\n",
"\n",
" <script>\n",
" const buttonEl =\n",
" document.querySelector('#df-f8e4e4e3-e92f-4611-83e4-deb323dd80a2 button.colab-df-convert');\n",
" buttonEl.style.display =\n",
" google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
"\n",
" async function convertToInteractive(key) {\n",
" const element = document.querySelector('#df-f8e4e4e3-e92f-4611-83e4-deb323dd80a2');\n",
" const dataTable =\n",
" await google.colab.kernel.invokeFunction('convertToInteractive',\n",
" [key], {});\n",
" if (!dataTable) return;\n",
"\n",
" const docLinkHtml = 'Like what you see? Visit the ' +\n",
" '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
" + ' to learn more about interactive tables.';\n",
" element.innerHTML = '';\n",
" dataTable['output_type'] = 'display_data';\n",
" await google.colab.output.renderOutput(dataTable, element);\n",
" const docLink = document.createElement('div');\n",
" docLink.innerHTML = docLinkHtml;\n",
" element.appendChild(docLink);\n",
" }\n",
" </script>\n",
" </div>\n",
" </div>\n",
" "
]
},
"metadata": {},
"execution_count": 6
}
]
},
{
"cell_type": "code",
"source": [
"# レコードを挿入\n",
"df_test2.to_sql(\n",
" \"mytable\", \n",
" con=engine, \n",
" if_exists='append', # append はただの追加\n",
")"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "jBXzeyDp4giO",
"outputId": "2e78a305-f5b9-4301-a24f-1dd8ee847a34"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"2022-03-12 01:04:22,276 INFO sqlalchemy.engine.Engine PRAGMA main.table_info(\"mytable\")\n",
"2022-03-12 01:04:22,279 INFO sqlalchemy.engine.Engine [raw sql] ()\n",
"2022-03-12 01:04:22,284 INFO sqlalchemy.engine.Engine BEGIN (implicit)\n",
"2022-03-12 01:04:22,287 INFO sqlalchemy.engine.Engine INSERT INTO mytable (datetime, \"A\", \"B\", \"C\", \"D\") VALUES (?, ?, ?, ?, ?)\n",
"2022-03-12 01:04:22,289 INFO sqlalchemy.engine.Engine [generated in 0.00245s] (('2000-01-03 00:00:00.000000', -1.077221091672625, -0.1267668297671788, -0.6758918940438777, 2.284928553212947), ('2000-01-04 00:00:00.000000', -0.7439139542380407, -0.2371858340936854, 0.8714974355315291, 1.0361841193180115), ('2000-01-05 00:00:00.000000', 1.3715564541679814, 0.8591099571901915, -0.4179130019099759, 0.3624109919516517), ('2000-01-06 00:00:00.000000', -0.560620198095559, 0.2668825463068234, -0.7020332515875394, -0.29239741149533993), ('2000-01-07 00:00:00.000000', -1.2747527237965657, -3.4648226991173865, 2.399377060933696, 0.7533536833893498))\n",
"2022-03-12 01:04:22,293 INFO sqlalchemy.engine.Engine COMMIT\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"engine.execute(\"SELECT * FROM mytable\").fetchall()"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "rtsCkd_E4rCT",
"outputId": "7ddf366e-82e4-43b5-bdab-8422bf1b9887"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"2022-03-12 01:04:22,312 INFO sqlalchemy.engine.Engine SELECT * FROM mytable\n",
"2022-03-12 01:04:22,316 INFO sqlalchemy.engine.Engine [raw sql] ()\n"
]
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"[('2000-01-03 00:00:00.000000', -0.12354442038250116, 0.37958096813029285, -0.6908686527400801, -0.9510269233054016),\n",
" ('2000-01-04 00:00:00.000000', 0.634132335595418, -0.9559938161889382, 0.45234645681223346, 1.480265100928995),\n",
" ('2000-01-05 00:00:00.000000', 0.9809046968804958, 1.4089271128237468, -1.2578660286058347, -1.7399200633057164),\n",
" ('2000-01-06 00:00:00.000000', 0.9074178790167357, 0.5948293992272192, -1.8223152466452481, 2.378887252050187),\n",
" ('2000-01-07 00:00:00.000000', 1.3108864287668685, 0.8604700423982842, -0.5398572776848934, -0.7157860057297839),\n",
" ('2000-01-03 00:00:00.000000', -1.077221091672625, -0.1267668297671788, -0.6758918940438777, 2.284928553212947),\n",
" ('2000-01-04 00:00:00.000000', -0.7439139542380407, -0.2371858340936854, 0.8714974355315291, 1.0361841193180115),\n",
" ('2000-01-05 00:00:00.000000', 1.3715564541679814, 0.8591099571901915, -0.4179130019099759, 0.3624109919516517),\n",
" ('2000-01-06 00:00:00.000000', -0.560620198095559, 0.2668825463068234, -0.7020332515875394, -0.29239741149533993),\n",
" ('2000-01-07 00:00:00.000000', -1.2747527237965657, -3.4648226991173865, 2.399377060933696, 0.7533536833893498)]"
]
},
"metadata": {},
"execution_count": 8
}
]
},
{
"cell_type": "code",
"source": [
"df_test2.to_sql(\n",
" \"mytable\", \n",
" con=engine, \n",
" if_exists='replace', # replace はテーブルをDropして、新しいレコードを挿入する\n",
")\n"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "XgrSLK4g3DDO",
"outputId": "1cca5bc9-d28f-40e0-f749-44553dce6207"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"2022-03-12 01:04:22,338 INFO sqlalchemy.engine.Engine PRAGMA main.table_info(\"mytable\")\n",
"2022-03-12 01:04:22,340 INFO sqlalchemy.engine.Engine [raw sql] ()\n",
"2022-03-12 01:04:22,343 INFO sqlalchemy.engine.Engine PRAGMA main.table_info(\"mytable\")\n",
"2022-03-12 01:04:22,344 INFO sqlalchemy.engine.Engine [raw sql] ()\n",
"2022-03-12 01:04:22,349 INFO sqlalchemy.engine.Engine SELECT name FROM sqlite_master WHERE type='table' ORDER BY name\n",
"2022-03-12 01:04:22,350 INFO sqlalchemy.engine.Engine [raw sql] ()\n",
"2022-03-12 01:04:22,352 INFO sqlalchemy.engine.Engine PRAGMA main.table_info(\"mytable\")\n",
"2022-03-12 01:04:22,354 INFO sqlalchemy.engine.Engine [raw sql] ()\n",
"2022-03-12 01:04:22,357 INFO sqlalchemy.engine.Engine SELECT sql FROM (SELECT * FROM sqlite_master UNION ALL SELECT * FROM sqlite_temp_master) WHERE name = ? AND type = 'table'\n",
"2022-03-12 01:04:22,359 INFO sqlalchemy.engine.Engine [raw sql] ('mytable',)\n",
"2022-03-12 01:04:22,361 INFO sqlalchemy.engine.Engine PRAGMA main.foreign_key_list(\"mytable\")\n",
"2022-03-12 01:04:22,363 INFO sqlalchemy.engine.Engine [raw sql] ()\n",
"2022-03-12 01:04:22,364 INFO sqlalchemy.engine.Engine PRAGMA temp.foreign_key_list(\"mytable\")\n",
"2022-03-12 01:04:22,366 INFO sqlalchemy.engine.Engine [raw sql] ()\n",
"2022-03-12 01:04:22,368 INFO sqlalchemy.engine.Engine SELECT sql FROM (SELECT * FROM sqlite_master UNION ALL SELECT * FROM sqlite_temp_master) WHERE name = ? AND type = 'table'\n",
"2022-03-12 01:04:22,371 INFO sqlalchemy.engine.Engine [raw sql] ('mytable',)\n",
"2022-03-12 01:04:22,375 INFO sqlalchemy.engine.Engine PRAGMA main.index_list(\"mytable\")\n",
"2022-03-12 01:04:22,378 INFO sqlalchemy.engine.Engine [raw sql] ()\n",
"2022-03-12 01:04:22,381 INFO sqlalchemy.engine.Engine PRAGMA main.index_info(\"ix_mytable_datetime\")\n",
"2022-03-12 01:04:22,383 INFO sqlalchemy.engine.Engine [raw sql] ()\n",
"2022-03-12 01:04:22,385 INFO sqlalchemy.engine.Engine PRAGMA main.index_list(\"mytable\")\n",
"2022-03-12 01:04:22,387 INFO sqlalchemy.engine.Engine [raw sql] ()\n",
"2022-03-12 01:04:22,389 INFO sqlalchemy.engine.Engine PRAGMA main.index_info(\"ix_mytable_datetime\")\n",
"2022-03-12 01:04:22,392 INFO sqlalchemy.engine.Engine [raw sql] ()\n",
"2022-03-12 01:04:22,394 INFO sqlalchemy.engine.Engine SELECT sql FROM (SELECT * FROM sqlite_master UNION ALL SELECT * FROM sqlite_temp_master) WHERE name = ? AND type = 'table'\n",
"2022-03-12 01:04:22,395 INFO sqlalchemy.engine.Engine [raw sql] ('mytable',)\n",
"2022-03-12 01:04:22,404 INFO sqlalchemy.engine.Engine BEGIN (implicit)\n",
"2022-03-12 01:04:22,406 INFO sqlalchemy.engine.Engine \n",
"DROP TABLE mytable\n",
"2022-03-12 01:04:22,408 INFO sqlalchemy.engine.Engine [no key 0.00213s] ()\n",
"2022-03-12 01:04:22,425 INFO sqlalchemy.engine.Engine COMMIT\n",
"2022-03-12 01:04:22,429 INFO sqlalchemy.engine.Engine BEGIN (implicit)\n",
"2022-03-12 01:04:22,433 INFO sqlalchemy.engine.Engine \n",
"CREATE TABLE mytable (\n",
"\tdatetime DATETIME, \n",
"\t\"A\" FLOAT, \n",
"\t\"B\" FLOAT, \n",
"\t\"C\" FLOAT, \n",
"\t\"D\" FLOAT\n",
")\n",
"\n",
"\n",
"2022-03-12 01:04:22,434 INFO sqlalchemy.engine.Engine [no key 0.00158s] ()\n",
"2022-03-12 01:04:22,447 INFO sqlalchemy.engine.Engine CREATE INDEX ix_mytable_datetime ON mytable (datetime)\n",
"2022-03-12 01:04:22,450 INFO sqlalchemy.engine.Engine [no key 0.00319s] ()\n",
"2022-03-12 01:04:22,461 INFO sqlalchemy.engine.Engine COMMIT\n",
"2022-03-12 01:04:22,466 INFO sqlalchemy.engine.Engine BEGIN (implicit)\n",
"2022-03-12 01:04:22,470 INFO sqlalchemy.engine.Engine INSERT INTO mytable (datetime, \"A\", \"B\", \"C\", \"D\") VALUES (?, ?, ?, ?, ?)\n",
"2022-03-12 01:04:22,472 INFO sqlalchemy.engine.Engine [generated in 0.00249s] (('2000-01-03 00:00:00.000000', -1.077221091672625, -0.1267668297671788, -0.6758918940438777, 2.284928553212947), ('2000-01-04 00:00:00.000000', -0.7439139542380407, -0.2371858340936854, 0.8714974355315291, 1.0361841193180115), ('2000-01-05 00:00:00.000000', 1.3715564541679814, 0.8591099571901915, -0.4179130019099759, 0.3624109919516517), ('2000-01-06 00:00:00.000000', -0.560620198095559, 0.2668825463068234, -0.7020332515875394, -0.29239741149533993), ('2000-01-07 00:00:00.000000', -1.2747527237965657, -3.4648226991173865, 2.399377060933696, 0.7533536833893498))\n",
"2022-03-12 01:04:22,475 INFO sqlalchemy.engine.Engine COMMIT\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"engine.execute(\"SELECT * FROM mytable\").fetchall()"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "tADgU-br3I1G",
"outputId": "6b1e7d19-c0d2-4a91-dc5d-603cad68bf7b"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"2022-03-12 01:04:22,494 INFO sqlalchemy.engine.Engine SELECT * FROM mytable\n",
"2022-03-12 01:04:22,496 INFO sqlalchemy.engine.Engine [raw sql] ()\n"
]
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"[('2000-01-03 00:00:00.000000', -1.077221091672625, -0.1267668297671788, -0.6758918940438777, 2.284928553212947),\n",
" ('2000-01-04 00:00:00.000000', -0.7439139542380407, -0.2371858340936854, 0.8714974355315291, 1.0361841193180115),\n",
" ('2000-01-05 00:00:00.000000', 1.3715564541679814, 0.8591099571901915, -0.4179130019099759, 0.3624109919516517),\n",
" ('2000-01-06 00:00:00.000000', -0.560620198095559, 0.2668825463068234, -0.7020332515875394, -0.29239741149533993),\n",
" ('2000-01-07 00:00:00.000000', -1.2747527237965657, -3.4648226991173865, 2.399377060933696, 0.7533536833893498)]"
]
},
"metadata": {},
"execution_count": 10
}
]
},
{
"cell_type": "code",
"source": [
"# デフォルトは ValueError の送出\n",
"df_test2.to_sql(\n",
" \"mytable\", \n",
" con=engine, \n",
")"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 411
},
"id": "R2nPDOT4cj9_",
"outputId": "8c99eece-7ac4-4936-da2c-3e193750693c"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"2022-03-12 01:04:22,514 INFO sqlalchemy.engine.Engine PRAGMA main.table_info(\"mytable\")\n",
"2022-03-12 01:04:22,516 INFO sqlalchemy.engine.Engine [raw sql] ()\n"
]
},
{
"output_type": "error",
"ename": "ValueError",
"evalue": "ignored",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-11-b9488847e125>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 2\u001b[0m df_test2.to_sql(\n\u001b[1;32m 3\u001b[0m \u001b[0;34m\"mytable\"\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 4\u001b[0;31m \u001b[0mcon\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mengine\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 5\u001b[0m )\n",
"\u001b[0;32m/usr/local/lib/python3.7/dist-packages/pandas/core/generic.py\u001b[0m in \u001b[0;36mto_sql\u001b[0;34m(self, name, con, schema, if_exists, index, index_label, chunksize, dtype, method)\u001b[0m\n\u001b[1;32m 2880\u001b[0m \u001b[0mchunksize\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mchunksize\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2881\u001b[0m \u001b[0mdtype\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mdtype\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 2882\u001b[0;31m \u001b[0mmethod\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mmethod\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2883\u001b[0m )\n\u001b[1;32m 2884\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.7/dist-packages/pandas/io/sql.py\u001b[0m in \u001b[0;36mto_sql\u001b[0;34m(frame, name, con, schema, if_exists, index, index_label, chunksize, dtype, method, engine, **engine_kwargs)\u001b[0m\n\u001b[1;32m 726\u001b[0m \u001b[0mmethod\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mmethod\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 727\u001b[0m \u001b[0mengine\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mengine\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 728\u001b[0;31m \u001b[0;34m**\u001b[0m\u001b[0mengine_kwargs\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 729\u001b[0m )\n\u001b[1;32m 730\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.7/dist-packages/pandas/io/sql.py\u001b[0m in \u001b[0;36mto_sql\u001b[0;34m(self, frame, name, if_exists, index, index_label, schema, chunksize, dtype, method, engine, **engine_kwargs)\u001b[0m\n\u001b[1;32m 1756\u001b[0m \u001b[0mindex_label\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mindex_label\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1757\u001b[0m \u001b[0mschema\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mschema\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1758\u001b[0;31m \u001b[0mdtype\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mdtype\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 1759\u001b[0m )\n\u001b[1;32m 1760\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.7/dist-packages/pandas/io/sql.py\u001b[0m in \u001b[0;36mprep_table\u001b[0;34m(self, frame, name, if_exists, index, index_label, schema, dtype)\u001b[0m\n\u001b[1;32m 1648\u001b[0m \u001b[0mdtype\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mdtype\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1649\u001b[0m )\n\u001b[0;32m-> 1650\u001b[0;31m \u001b[0mtable\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcreate\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 1651\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mtable\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1652\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m/usr/local/lib/python3.7/dist-packages/pandas/io/sql.py\u001b[0m in \u001b[0;36mcreate\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 856\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mexists\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 857\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mif_exists\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;34m\"fail\"\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 858\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mValueError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34mf\"Table '{self.name}' already exists.\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 859\u001b[0m \u001b[0;32melif\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mif_exists\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;34m\"replace\"\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 860\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpd_sql\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdrop_table\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mname\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mschema\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mValueError\u001b[0m: Table 'mytable' already exists."
]
}
]
},
{
"cell_type": "markdown",
"source": [
"## パッチをあてる\n",
"\n",
"色々探した結果、この解決方法が一番よかったので紹介します。\n",
"- [Python: Pandas.to_sql INSERT IGNORE the correct way (sqlalchemy)](https://sick.codes/python-pandas-to_sql-insert-ignore-the-correct-way-sqlalchemy/)\n",
"\n",
"```python\n",
"from sqlalchemy.ext.compiler import compiles\n",
"from sqlalchemy.sql.expression import Insert\n",
"\n",
"# adds the word IGNORE after INSERT in sqlalchemy\n",
"@compiles(Insert)\n",
"def _prefix_insert_with_ignore(insert, compiler, **kw):\n",
" return compiler.visit_insert(insert.prefix_with('IGNORE'), **kw)\n",
"```\n",
"\n",
"これは、 `.to_sql` の実行時に走るSQL文\n",
"```sql\n",
"`INSERT INTO mytable (datetime, \"A\", \"B\", \"C\", \"D\") VALUES (?, ?, ?, ?, ?)` \n",
"```\n",
"の `INSERT` の直後に `IGNORE` という文字列を入れてしまうというパッチのようです。\n",
"\n",
"よってこれを実行した後、 `.to_sql` メソッドを実行すると\n",
"```sql\n",
"`INSERT IGNORE INTO mytable (datetime, \"A\", \"B\", \"C\", \"D\") VALUES (?, ?, ?, ?, ?)` \n",
"```\n",
"として実行されるようです。\n",
"\n",
"SQLITEの場合、構文が `INSERT or IGNORE` なので (参照:[INSERT](https://www.sqlite.org/lang_insert.html))\n",
"\n",
"```python\n",
"@compiles(Insert)\n",
"def _prefix_insert_with_ignore(insert, compiler, **kw):\n",
" return compiler.visit_insert(insert.prefix_with(\"or IGNORE\"), **kw)\n",
"```\n",
"\n",
"とすれば良いです。\n",
"\n",
"\n"
],
"metadata": {
"id": "8381d9x54xvG"
}
},
{
"cell_type": "markdown",
"source": [
"### CREATE TABLE\n",
"先述した通り、to_sql メソッドは、`Create Tableしなくても、DataFrameからいい感じにテーブルを作ってくれ`ます。その時の構文は\n",
"\n",
"```sql\n",
"CREATE TABLE mytable (\n",
"\tdatetime DATETIME, \n",
"\t\"A\" FLOAT, \n",
"\t\"B\" FLOAT, \n",
"\t\"C\" FLOAT, \n",
"\t\"D\" FLOAT\n",
")\n",
"```\n",
"で、重複したデータを処理するための、`UNIQUE constraints` や `PRIMARY KEY` は入っていません。\n",
"\n",
"よってこのパッチを使う場合は **`to_sql` メソッドが自動で作成してくれるTableではなく、先にDBとTableを作成する必要があります**。\n"
],
"metadata": {
"id": "AzXqwM1vlts0"
}
},
{
"cell_type": "markdown",
"source": [
"### パッチ"
],
"metadata": {
"id": "VsBeLa2nfNrI"
}
},
{
"cell_type": "code",
"source": [
"import sqlite3 \n",
"\n",
"con = sqlite3.connect(f\"/tmp/test2.db\")\n",
"cur = con.cursor()\n",
"\n",
"sql = f\"\"\"CREATE TABLE mytable (\n",
"datetime DATETIME, \n",
" \"A\" FLOAT, \n",
" \"B\" FLOAT, \n",
" \"C\" FLOAT, \n",
" \"D\" FLOAT,\n",
"PRIMARY KEY(\"DATETIME\"));\"\"\"\n",
"cur.execute(sql)\n",
"con.commit()\n",
"con.close()\n",
"\n"
],
"metadata": {
"id": "xbmrVmGKoFY6"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"from sqlalchemy.ext.compiler import compiles\n",
"from sqlalchemy.sql.expression import Insert\n",
"@compiles(Insert)\n",
"def _prefix_insert_with_ignore(insert, compiler, **kw):\n",
" return compiler.visit_insert(insert.prefix_with(\"or IGNORE\"), **kw)"
],
"metadata": {
"id": "E7GA6q5I3Kel"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"from sqlalchemy import create_engine\n",
"\n",
"engine = create_engine(\n",
" 'sqlite:////tmp/test2.db', \n",
" echo=True \n",
" )\n",
"\n",
"df_test.to_sql(\n",
" \"mytable\", \n",
" con=engine, \n",
" if_exists='append',\n",
" index=True, \n",
")\n",
"df_test2.to_sql(\n",
" \"mytable\", \n",
" con=engine, \n",
" if_exists='append',\n",
" index=True, \n",
")\n",
"\n",
"engine.execute(\"SELECT * FROM mytable\").fetchall()\n"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "HHwsxFx1fp1i",
"outputId": "b4605846-8e56-46f0-e12f-2a6fa5887387"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"2022-03-12 01:04:54,355 INFO sqlalchemy.engine.Engine PRAGMA main.table_info(\"mytable\")\n",
"2022-03-12 01:04:54,357 INFO sqlalchemy.engine.Engine [raw sql] ()\n",
"2022-03-12 01:04:54,362 INFO sqlalchemy.engine.Engine BEGIN (implicit)\n",
"2022-03-12 01:04:54,365 INFO sqlalchemy.engine.Engine INSERT or IGNORE INTO mytable (datetime, \"A\", \"B\", \"C\", \"D\") VALUES (?, ?, ?, ?, ?)\n",
"2022-03-12 01:04:54,366 INFO sqlalchemy.engine.Engine [generated in 0.00176s] (('2000-01-03 00:00:00.000000', -0.12354442038250116, 0.37958096813029285, -0.6908686527400801, -0.9510269233054016), ('2000-01-04 00:00:00.000000', 0.634132335595418, -0.9559938161889382, 0.45234645681223346, 1.480265100928995), ('2000-01-05 00:00:00.000000', 0.9809046968804958, 1.4089271128237468, -1.2578660286058347, -1.7399200633057164), ('2000-01-06 00:00:00.000000', 0.9074178790167357, 0.5948293992272192, -1.8223152466452481, 2.378887252050187), ('2000-01-07 00:00:00.000000', 1.3108864287668685, 0.8604700423982842, -0.5398572776848934, -0.7157860057297839))\n",
"2022-03-12 01:04:54,369 INFO sqlalchemy.engine.Engine COMMIT\n",
"2022-03-12 01:04:54,384 INFO sqlalchemy.engine.Engine PRAGMA main.table_info(\"mytable\")\n",
"2022-03-12 01:04:54,387 INFO sqlalchemy.engine.Engine [raw sql] ()\n",
"2022-03-12 01:04:54,393 INFO sqlalchemy.engine.Engine BEGIN (implicit)\n",
"2022-03-12 01:04:54,396 INFO sqlalchemy.engine.Engine INSERT or IGNORE INTO mytable (datetime, \"A\", \"B\", \"C\", \"D\") VALUES (?, ?, ?, ?, ?)\n",
"2022-03-12 01:04:54,398 INFO sqlalchemy.engine.Engine [generated in 0.00216s] (('2000-01-03 00:00:00.000000', -1.077221091672625, -0.1267668297671788, -0.6758918940438777, 2.284928553212947), ('2000-01-04 00:00:00.000000', -0.7439139542380407, -0.2371858340936854, 0.8714974355315291, 1.0361841193180115), ('2000-01-05 00:00:00.000000', 1.3715564541679814, 0.8591099571901915, -0.4179130019099759, 0.3624109919516517), ('2000-01-06 00:00:00.000000', -0.560620198095559, 0.2668825463068234, -0.7020332515875394, -0.29239741149533993), ('2000-01-07 00:00:00.000000', -1.2747527237965657, -3.4648226991173865, 2.399377060933696, 0.7533536833893498))\n",
"2022-03-12 01:04:54,403 INFO sqlalchemy.engine.Engine COMMIT\n",
"2022-03-12 01:04:54,406 INFO sqlalchemy.engine.Engine SELECT * FROM mytable\n",
"2022-03-12 01:04:54,408 INFO sqlalchemy.engine.Engine [raw sql] ()\n"
]
},
{
"output_type": "execute_result",
"data": {
"text/plain": [
"[('2000-01-03 00:00:00.000000', -0.12354442038250116, 0.37958096813029285, -0.6908686527400801, -0.9510269233054016),\n",
" ('2000-01-04 00:00:00.000000', 0.634132335595418, -0.9559938161889382, 0.45234645681223346, 1.480265100928995),\n",
" ('2000-01-05 00:00:00.000000', 0.9809046968804958, 1.4089271128237468, -1.2578660286058347, -1.7399200633057164),\n",
" ('2000-01-06 00:00:00.000000', 0.9074178790167357, 0.5948293992272192, -1.8223152466452481, 2.378887252050187),\n",
" ('2000-01-07 00:00:00.000000', 1.3108864287668685, 0.8604700423982842, -0.5398572776848934, -0.7157860057297839)]"
]
},
"metadata": {},
"execution_count": 14
}
]
},
{
"cell_type": "markdown",
"source": [
"ログでも確認できるように、\n",
"\n",
"```sql\n",
"INSERT or IGNORE INTO mytable (datetime, \"A\", \"B\", \"C\", \"D\") VALUES (?, ?, ?, ?, ?)\n",
"```\n",
"\n",
"or IGNORE が追記されていますし、レコードも、df_test のレコードだけが挿入されていて、同じインデックスを持つ `df_test2` は挿入されていません。\n",
"\n"
],
"metadata": {
"id": "d6bsPmIGqGu5"
}
}
]
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment