@PandoraRiot
Created March 28, 2022 21:34
03_00_Clasificacion_BW__TF_IDF.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "03_00_Clasificacion_BW__TF_IDF.ipynb",
"provenance": [],
"collapsed_sections": [],
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/PandoraRiot/b920a56a8e4bd646a90c21d23b660c84/03_00_clasificacion_bw__tf_idf.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "code",
"metadata": {
"id": "4wCFBly4uu9c"
},
"source": [
"import pandas as pd\n",
"from sklearn.feature_extraction.text import TfidfVectorizer"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "355mRwx6uyki"
},
"source": [
"documentA = 'i love dogs'\n",
"documentB = 'i hate dogs and knitting'\n",
"documentC = 'knitting is my hobby and my passion'"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "GUlaDZXYvC6a"
},
"source": [
"bagOfWordsA = documentA.split(' ')\n",
"bagOfWordsB = documentB.split(' ')\n",
"bagOfWordsC = documentC.split(' ')"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "bTuUh7Hlw84Z"
},
"source": [
"# The same documents as a bag of words\n",
"\n",
"# Load libraries\n",
"import numpy as np\n",
"from sklearn.feature_extraction.text import CountVectorizer\n",
"# Build the array of texts\n",
"text_data = np.array([documentA, documentB, documentC])\n",
"\n",
"# Build the bag-of-words matrix (document-term counts)\n",
"# Note: the default token pattern drops one-character tokens, so 'i' gets no column\n",
"count = CountVectorizer()\n",
"bag_of_words = count.fit_transform(text_data)\n",
"\n",
"# Convert to a dense array\n",
"bag_of_words.toarray()\n",
"\n",
"\n",
"# Get the column names (get_feature_names() was removed in scikit-learn 1.2)\n",
"feature_names = count.get_feature_names_out()\n",
"\n",
"# Inspect the column names\n",
"feature_names\n",
"\n",
"# Build the data frame\n",
"df_bw = pd.DataFrame(bag_of_words.toarray(), columns=feature_names)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "ALHqkk54w_FC",
"outputId": "995bb76c-545b-4d7f-e91b-70c00f693486",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 141
}
},
"source": [
"df_bw"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>and</th>\n",
" <th>dogs</th>\n",
" <th>hate</th>\n",
" <th>hobby</th>\n",
" <th>is</th>\n",
" <th>knitting</th>\n",
" <th>love</th>\n",
" <th>my</th>\n",
" <th>passion</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" and dogs hate hobby is knitting love my passion\n",
"0 0 1 0 0 0 0 1 0 0\n",
"1 1 1 1 0 0 1 0 0 0\n",
"2 1 0 0 1 1 1 0 2 1"
]
},
"metadata": {
"tags": []
},
"execution_count": 5
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "7oDH-A3yNFQx"
},
"source": [
"TF-IDF stands for term frequency – inverse document frequency (https://es.wikipedia.org/wiki/Tf-idf)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "BIF2ywCMMzSS"
},
"source": [
"![image.png](data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZQAAAA4CAYAAADabA6BAAAgAElEQVR4nO2d/0/T1/7H+w/0F39o0h9umiwhhCyELMRgIMRrRpDIgmAwYBzoFIfjMp0yqyKulzIRVCoKlKKFuwrKlQkI63T0VoFZRBi7iG4FcXOIc2r5DqUUS9/Pzw/mnL3fpd/0g5PtnkfCD8C7755z3q/z+nbO+3VEYDAYDAZjCRC96QYwGAwG468BMygMBoPBWBKYQWEwGAzGksAMCoPBYDCWBGZQGAwGg7EkMIPCYDAYjCWBGRQGg8FgLAnMoDAYDAZjSWAGhcFgMBhLAjMoDAaDwVgSmEHxgcPhwPXr1xEdHQ2xWIzOzk7B/wwGA8LCwrB+/XrcunULHMcteRsePnyI06dPIy4uDidOnMDCwsKSf8fLwnEczGYzKioqcPHiRdjtdr8+t7CwgNzcXKxatQo//fTTa24l48/Kjz/+iJCQEBQWFr6xNoyMjKC1tRVarRanTp3yW8b/l2EGxU+OHz8OkUiEAwcOCBT67OwsPvzwQ9TW1r7W7zeZTBCLxbh8+fJr/R5/4DgO58+fh1wuh9FohFQqxdWrV/3+rFqtRnx8PH7++efX3FLGn5WhoSHExsZCo9G8UQeqv78fwcHB0Gq1b6wNfyaYQfGD6elppKamQiKRIDw8HENDQ/R/Q0NDiIuLw8DAwGttg0ajQUhIyGv/Hn8YGBjAqlWr0N7eDo7jMDs76zUy6+npwZ07d/7AFvqP1WpFc3MzbDbbm27KsmoL4wUGgwFSqRRdXV1vuimvjaWUO2ZQ/GBoaAhbtmxBdnY2RCIR/v3vf9P/mUwmfPDBB5iZmXlt32+z2ZCeno4tW7a81u/xF41Gg5iYGDx69MjntT09PYiOjsbdu3fp35xO57JIH9hsNvzzn/+EQqHA/Pw8a8syw263w+l0vrHv5zgOeXl5fsv6n5GlljtmUPzAYDAgKysLZrMZYWFheP/99zE9PQ0AUKlUKC4uph76zMwMzGYzLl68iIcPHwJ4EeHcuHEDjx49wtDQEBoaGnDo0CEUFRXh6dOnOHr0KGJjY2EymTA2NoajR48iOjoaer0ewAuDFh4eDpVKBaPRiOTkZMTHx6O1tVXQTqfTCbPZjKqqKpw+fRrDw8OYn5/Ho0ePYDQa0dDQgImJCeh0OhiNRo/95TgO9+/fR21tLUpKSnDz5k04nU6YTCbk5eUhIiICQUFBUCgUi9pAmJiYwJEjR7BixQoEBAQgNzcX33zzDXbu3AmxWAytVouRkRHo9XocOXIEBw4cgMViQXl5OWJjY/Hll1/CZrPR3zUajSAKctdX4IUSMhgMKC8vR319PS5duoTx8fFF7fv222+xdu1aiEQiJCUloaGhgaZWbDYbWltboVarUV1dDZvN5nEc6+vrX7kP/rTFFafTiZ6eHlRUVODChQv417/+hZaWFszMzOCrr76CUqnE/v37YTAYsGvXLionz58/R3t7O44cOYJdu3bBaDQiJycHsbGxuHDhgqBdHMdheHgY586dw6lTpxZFl1arFXq9HmVlZWhqakJJSQnu3bvnU9YePnyIM2fOoLKyEleuXMG33367qH+zs7OorKxEaGgoYmJiMDQ0hJ6eHpSVlWH37t3o7u7G999/j5SUFHz00Udun60/bfWE0+nE3bt3odFoUFFRgdjYWGRlZQkcIKvVCqPRiLNnz+KLL77A6Oio4B4OhwPd3d2orKxEZWUlLBYLHA4Hnj59itbWVuj1evp8zWYzuru7MTMzA5PJhLKyMqSnp+OHH35AW1sbkpOTcfDgQVitVvq7a7/dzYWRkRE0NDRAqVTiyJEjGBsbQ3FxsUCvvIzc+QszKD4gXopWq8XCwgIOHjyIgIAA9PX1wWazITMzE+3t7fT6iYkJfPLJJwKvpr6+HlKplF5nMpkgEolQUFCAnJwc6PV6BAcH47PPPsPe
vXvR0tKCyMhIyOVyzM/Pw2AwQCKRYNOmTairq8PIyAhSUlIQGhqK+/fvAwAsFgu2b9+OEydOYGxsDOnp6di3bx+sViu6u7sRHBwMpVKJAwcOIDAwEElJSZiamlrUX6vVipycHMjlcoyNjeHBgwcICwujOeT79+8jNDQURUVFPjcgPH78GH//+9+Rn59PBVWn00Emk+G7774D8CJ9FhISgv3790OhUODKlSt49913sXv3buzduxdXrlxBUlKSoL2e+vr8+XNotVrExcXh2bNn6OrqQlJSkkel4y6N2NnZiZiYGBiNRto2vV6P+fl5j+PY09Pz0n3wpy2u8Ptts9mg0+kE62qPHj1CZGQkZDIZqqurMTY2hh07dtA07dTUFJKTkyEWi3H8+HFMTk4iLy8PUqkUPT09bp+/UqkUtJs/PlarFWlpaYiIiMAvv/zidYzu3buHqKgo1NTUwG63Y9euXaivr/fYz6ioKKrIHQ4H5HI5YmJicO7cOarsZTIZent7PY6Xt7a6Y2xsDP/4xz9w/PhxWK1W6HQ6iEQi1NTUuL2n0+lEUVER3n33XTx+/BgAMDg4iMTERNTV1WFubg579uzBli1bMDExgRs3buCtt96ic2l8fBwJCQlIT0+HzWajmYjY2FicPHkS1dXVkMvliI6ORn5+PtRqNYqKigT99jYXent78dZbb0GlUiEvLw9tbW2IioqiesVfuXsZmEHxwfT0NLZv304VoNFohFgsRn5+PoaGhpCUlEQjEXJ9SkoKdu3aBbvdDqvViq1btwoMjE6ng0QiQWJiIgYHB9He3g6RSISQkBD09fXBYDAIBDk/Px9isRglJSVwOBwAAKVSSQVramoKqamp2LZtG6ampmA2mxEREYHGxkYAQF9fHwICAvDOO++go6MDMzMzblNnxECuW7eOevyjo6NYu3YtVSrt7e2QSqX45ptvfI6dyWSCTCaj187Pz2PPnj1U4QNAe3s7JBIJ1qxZg5s3b9JFUJlMhqtXr9K2E6Pkq68qlYpOkPHxcZw8edJtbthdGrGvr48uwDqdTlRXVyM6OppuHvA0ji/bB3/a4orVakVGRgZ1EgCgpqZGoAw6OzshFotRUFAgkBNiMIgzkJGRIbiHSCRCU1MTHA4HsrOz6fN/8uQJ4uLioFarsbCwALPZjNWrV6Ourg4cx9F2E4XobYxIlJ2XlweO4/DVV1/h5s2bbvva09MDqVRK5X9ychKJiYkIDw+HQqGA1WpFfn6+11SUP211N778sdHpdAgKCqLpWpPJhICAAGi1WupMkfHT6/V4+vQp1q9fj1OnTsHhcGBhYQFffvklLl68SHeE8tdjiC4hBoYY0tDQUJSWlmJychJpaWkQiUTIzc2ljiSZP77mQlNTEyQSCeLj43Hz5k36/Mn3vY5UOjMoPhgYGMCmTZuo4BKvYu3atWhoaMDOnTsFAkq8WvLQurq6IJVKqSCTKEcqlVJh12g0VLA4joNKpaLGghgovpInijkmJgbDw8M4c+YMpFIpmpubcevWLSQkJKC+vp4qFSL0ubm5XtcuLl++DIlEgvPnz9MJQxQBaf/LeDQajQZhYWF48OABgN8jFn4KQaPRQCQS0UlIjClpa1NTE8RiMfR6PTiO89nXvr4+BAUFYePGjV7TIaQtJNIikzM6OhoDAwOorKxEamqqIN3jaRxfpg/+tMUV0u+goCDcvn0bAKgc8ZWBRqMRXEPkhDwDotD47VCpVFTJtbS0QCwWQ6PR4N69e0hJSYFarYbNZqPjk5qaSqMV0m7+DihPY0SMlT87FTUajSCKJUaKzAEyJ1xTUQR/2+pufG/duiUYXyJH4+Pj2LhxI9avX4+nT58uGr/W1lYcPnwYYWFhGBwcdPsd/PUYEqXxMxekn8nJybBYLLTNpN9kLsrlctjtdq9zgbSfr1cMBoNgXH3J3avADIoPmpqaFgmuVquFSCTCO++8s0hA+V6I1WrFjh07IBKJ6HUjIyOIiYmhwk68
BCKos7OzSEtLo14IMVD8dRqShjp27BhGRkYQGxuLNWvWQKVSwWQyCQwcUSok+vHE3NwcMjIyFhkLo9EIiUSCqqoqeq+EhASfuWt33o+r50muiY6OxsOHD+mkCwsLg9lsBsdxUCgUgojDW1+BFxO3uLgYIpEIOTk5HnPCrtET8e7T0tJQWVmJgYEBwYKwp3F82T740xZXfvvtN0RFRQnGkqS3iDIg7eArUfK5zMxMzM7OIi8vj7YTAI2eExIS8OjRI/r8CwsL0dLSIkjPuXrTAHDlyhWBx+1L1n755Re6/sbfpMHHXRRbU1MDsVhMHR2iWPmpKD7+tNXf8SVRJZEPfpRJxu+9997DnTt3EBUV5TECcs1cXLt2DVKpVBBl1dTUCAy+yWQSOHgkk1FTU+NzLhA9wzeAKpVKMK6+5O5VYAbFC0QZuBoNotD51p5cT7yQ4eFh1NXVITk5GTKZDF1dXbDb7YuUKpkcxGDwvZDp6WlcvnxZMBEcDgfy8vKo10IMzp49e9zu0iBeSGZmptdtgSS1lZiYiMnJSQAvFrizsrKoUBIh5udgPcHvl9VqhdPppJ5nT08PZmdn6TVkkpLUBjHg5PekpCSMjo7Sl93c9dVms8FgMGB+fp56qBERETSqc0WlUiEiIgKDg4OYm5sTpH7c4WkcX7YPJJLy1hZXiMyoVCoAL2SgoKBAkHok7eB7myQl1t3dTdtBFBrwIt349ttvo7Gx0e3zd20jf62FpHf4CtHTGN2+fZt67SQKKikp8TrOcrkcU1NTsNvtOHjwoMAQGgwGBAUFoa+vz22E4k9b+RBjQcZuYWEBx44do+M7NzeH6urqRfLx3XffISAgAOfPn0d/f7/XecjPXDx58gT79+9HcHAwsrKyMDU1Rddb+AqfyAVZ8yGZC7IN39v3kWinoqICHMdRg6ZQKDA9PQ2n0+lT7l4FZlC8MDIygvXr1y/yaki4yn/4wO9eSGZmJhobG1FaWoqsrCy6yNbQ0IDy8nKBt0q8BP6CvUQiQWlpKT777DPs3r2bLqqSFwrj4uLQ398P4HfvKiYmBiMjI+A4Dh0dHSgoKIDVaqX385VmIB5ufHw8JiYmAADXrl3D2rVr0d3dDQCLcrDeIP2qrKyERqPB1NQUDh48iLi4OFy6dAnNzc1oaWkRhPxk0hFjS35XKpVQKpXo6Ojw2NehoSG8//779B0hpVLpMe3F9+Z1Oh06OzupV0ty/CSl0NjYCI7jPI4jiUj97YNrOsRdW1whBkWpVMLpdKKurg6bNm2iCqqqqgrXr18XpNXu3r2LqKgoNDQ0gOM4uq5DFPmvv/6KDRs2QK1Ww+FwUG+bL5v9/f3IycmBxWKBSqWCSCSCyWTC9PQ0lEol1qxZg61bt6KkpASdnZ0ex0in01FjODAwgNDQUI/ySBRheXk5SktLMTAwgJiYGGqgidMWFxeHuro6XL58eVG6xp+2uhtfhUJB1z0CAwMRFhaGpqYm6HQ6tLe3QywW040EVqsV6enpOHLkCKxWK40I+DtA7969ixMnTmBubo7Kyddffw2FQoG6ujoEBASgoKAACoUCd+7cEWxgcV3zIb8nJiaivLwclZWVXud9TU2N280v5eXlKC4uxvDwsE+5exWYQfHA5OQkDh06BJFIRBcC+RiNRuTm5gq8g6mpKSQlJUEmk6G4uBiTk5PIzs6GTCaDRqPB9PS0IA3EcRyKiooEnpNer4dEIsHHH3+MwcFB1NXVYfXq1dBoNDh8+DBUKhUsFougLZ2dnYiMjERgYCA2b96MS5cuUc+Nv0jtC7JD5fPPP0dZWRmUSqXAw29vb18UlXlCr9dDKpUiKysLFouFllwhY0NSMK674fjpkN7eXshkMmzevJl6m576OjAwgOTkZOzevRs6nQ4pKSnUELpCnlNkZCRdm3E4HDSCioyMREZGBm7fvk2VlbtxdPeegj998NUWV2w2Gw4fPkw3cjQ3N6OpqYlGLTabDSqVCmKxGIcOHcLZ
s2exb98+QfvJOs6OHTtQW1uLjIwMtLW1CdJ6g4ODSEhIgEwmw4YNG6DRaKjc//zzz3jvvfcQGBiIjz76CHfv3oVcLsfKlStx7do1uvbnTtYKCgqQkpKC8vJy7N+/H4WFhYvmE4FEoQkJCejt7UVPTw9kMhmtRMGXI7Va7TZC8aetfMjONolEgnXr1qGqqgoqlQorVqygCprIR1xcHGpra5GdnY3GxkZBxElkc+PGjVAoFDh79iztp16vh1gspv3q7e1FYGAgEhISMDg4uCj95Lq7kshJeHg4zp8/D4fD4XEu8FPTRFfwx7W/v98vuXsV/DYoxPsOCwvDoUOHIJfLIZVKERERgby8PKSlpUEsFtMQ69mzZzh58iTdvZSbm4uCggIolUpERkYKFmsB0P367pTV+Pg48vLyBG+oM/54NBqNQEgZywMyNz3t1iELtH/lF/QYywO/DcrAwADS09OpMiE7WUgo6u6dDNdrCA8ePMCHH35Ic7VWqxV79uxBW1ubx+9/+vQptm/fDrPZ/NKdZLw68/PzmJmZoSE3cRgYyweSzvC0W4ekY/jrJwzG68Bvg3L9+nUYDAb6O8lTklyx0+lEWVmZwANyvYZgsVhQVlaG+fl5usB45swZt5OBXweqr68PqampzEP+AyksLMS6devw9ddfY+fOnWzslxkcx9EFY08vCpIdRcePH/+DW8f4X+OV1lBIiO3tTVVyDT+fbLPZFr2l2tXVhdWrVwvSXwTXOlBkMZxV/vzjsFgsqKiogE6nW1RigvHmGRoaQmtrK/1xTQuTEuzkh0X4jNfJKxkUskXRdZcTH7IjiFxDdihduHCBXrOwsIDDhw8jLS0Ns7Oz9O/u6kCRLYM6nc6v9yAYDAaD8cfySgaFrI14ex+BXBMREYH9+/fj3Xffxdtvvy14oYm816BQKBalu9zVgQJAS3+42zED/J5PFolEXn+WSyl4BoPB+KvwSgbF09oIgWyn5Od1v/vuO6SnpwveviVRjFKpXHQPT29xksKKnl5AczgcGB8fx+joqNef8fFxty+ZMRgMBuPVeGmD4m5txJ9r+vv7cfbsWUEkwn/pyxXXOlAEYlA8GbOlxFeUw37YD/thP3/ln5fWmS/7AX/WT4ih8LXWQapruhoUb1UwiUHh7zjjwyIUBoPBeDO8tEEhayPeTvhqamryeQ3wu+HIyMgQ1JJxVweKf29v6x9sDYXBYDDeDC9lUEipEJHI8553d+sn3qivr6f1aAiudaD4ZRpUKhW2bt3qsXQDg8FgMN4MfhuUyclJ1NXVUe9/27ZtGBgYEKyJOBwO6PV6es2nn34qODvAHRaLBYmJiYIjaV3rQBHGx8eRlJQkeBufwWAwGMuDZVEcsq2tDSkpKV7XW8gbwfzT6P5KGAwGrFixAikpKSgoKMC6desgEomQmZkJpVKJlStXCkp4LzV2ux0mkwmnT5/Grl276JGmb5qHDx/i9OnTiIuLw4kTJ/7fZ14vBRzHwWw2o6KiAhcvXvxTlzMhxRZXrVqFn376aUnuabPZ0NDQgIyMDGzevFmQfVgKFhYW0NvbC71ej7y8PPz3v/9d0vszXp1lYVA4jsPVq1dRWFjo9swO8n+NRvOnnrye4DgOBQUF9ARHfiUCUizTZDItOh1yqSFnjns6JOhNYTKZ/Drp74+AvKArl8thNBohlUpx9erVN92sV4bjOKjVasTHx9OjjpcC8h6Zr3XUV4WcVcQKXi4vloVBIQwODrotDfHs2TP09fX9ZYsSTk9Po7i4mL6j424n3Q8//PDSW6X5ddD8gWxo8HT40ZviZY4dft0MDAxg1apVaG9vB8dxmJ2d9SiXJAX8v1DVwbWvpOz863ICXE9A/KvysnP4TbOsDArjBf5UIvCFax00f7+Xf1jUcsDbFvI3gUaj8csrdjgc0Gq1+OCDD5ZFu73hdDr/X0rZXV9ftxPAPwHxr8qrzOE3DTMoyxBflQiAFyH//fv3UVtbi5KSEty8eRNO
p9NrHTSO4zAwMACtVosvvvgCtbW1qK2txfPnz90eFsVnZGQEDQ0NUCqVOHLkCMbGxlBcXIzo6Gh6SuDMzAzMZjMuXrxIv3N6eho3btzA0NAQent7odVqkZmZievXr+POnTtIS0vDjh07YLFY6O/Jycm0yCGJ1lQqFYxGI5KTkxEfH4/W1lZB+2w2G1pbW6FWq1FdXQ2bzYb5+Xk8evQIRqMRDQ0NmJiYgE6nE2wAccXpdKK3txfV1dUoLS2l58KbTCbk5eXRM9EVCsWiNhDMZjM2bdoEkUiE6OhoVFVV0RSi0+mE2WxGVVUVTp8+jeHhYTidToyOjqK7uxvV1dX47bff0NbWhoqKCly7dg1lZWVIT0/HDz/8gLa2NiQnJ+PgwYOwWq30948++ohGB3a7HQaDAeXl5aivr8elS5fcRkl9fX3YuXMnPX/dn2fsT1/Hx8eRkZGBLVu2YHR0FJcuXUJ8fDwKCwsFhovjOAwPD+PcuXM4deqUT098amoK9fX1KC0tRUFBgdsjuN3NCdd7XLlyBRqNBs3NzfTojQcPHqChoYG2wW63o7OzE/fu3cPQ0BAaGhpw6NAhFBUV4enTpzh69ChiY2NhMpkwNjaGo0ePuh0nV7mcmZlBT08PysrKsHv3bnR3d+P7779HSkoKfYbe5vByhxmUZYY/lZzJCXNyuRxjY2OLTndzVwfN9TPt7e0QiUT0LHt3Z4670tvbi7feegsqlQp5eXloa2tDVFQUjaQmJibwySefLDrBkB/1KJVKhISEoKysDMXFxSgtLUVQUBBOnjyJ3Nxc1NbWCo6yNRgMkEgk2LRpE+rq6jAyMoKUlBSEhobi/v37AF6clBcTEwOj0Ug9V71ej/n5eXR3dyM4OBhKpRIHDhxAYGAgkpKSBCWACBaLBdu3b8eJEydgs9nQ1dUFmUyG//znPwB+LxXk6dwRPu6iPf79x8bGkJ6ejn379sFut+O3337DBx98QI94DQ8Ph0wmw61bt5Cenk6Pka6uroZcLkd0dDTy8/OhVqtRVFQkkBetVkvTpV1dXUhKSvKYdtPpdALF7OsZ+9NXIn9KpRL5+floa2vDp59+itWrV1PF6CqPSqXS43Mha6gbNmzA999/j2fPniEqKkqQEvY1JziOQ0NDAzZs2ID+/n7qqGi1WthsNhQVFQkqe3R1dUEqlVKnjrxQXVBQgJycHOj1egQHB+Ozzz7D3r170dLSgsjISME4eZJLh8MBuVyOmJgYnDt3DhqNBhUVFYJn6KmW4XKHGZRlhq9KBOQgs3Xr1tHjeUdHR7F27Vo6IV3roJEzZ5KTk+k2bJPJJFAC5CxvbymEpqYmSCQSxMfH4+bNm4vOmHfNa5NzyomBmZ2dRVpaGgICAqBQKDA9PQ2FQkG3oY+Pj+PgwYOCVEl+fj7EYjFKSkro7j6lUkknX19fH4KDg6HVauF0OlFdXY3o6Gi6wEz69c4776CjowMzMzNuU1AWiwUJCQnYtm0bVWpECezbtw/Pnz+nhUld68u54i7am5qaQmpqKr2/2WxGREQEGhsbBWMnkUhw+vRpWK1WjI+P4+nTp4iKikJoaChKS0sxOTmJtLQ0iEQi5ObmUgPLlxf+Ubzj4+M4efKk200W5KhY/md9PWN/+moymfC3v/0NkZGR6OjooBUxSNrS4XAgOzubyvCTJ08QFxcHtVrtVnm2tbUhJCSEHsBH7kcW/P2ZE21tbVi5ciU6Ojro/0tKStDT00NPtCSVPUgVdL6B0el09AjmwcFB6pCFhISgr6+PpqmJAfIml8R5Cw8Pp8eL5+fnLxpDd7UMlzvMoCwzfFUiuHz5MiQSCc6fP0+9ZGKEyO4s1zpoLS0tkMlkgnDcdS2gqanJ63nxZNKR9AjHcTAYDILPuOa1iZdH2kW8rjVr1uDevXtUiZJJSSYaUQLk/3xFQZRgTEwM+vv7kZqaiujoaAwMDKCyshKpqamC1ElNTQ1Vvp4iL47jUFxcDJlMJjg1tKen
B1KplHqJ/q4LuBpWjuNw5swZSKVSNDc349atW0hISEB9fT01kmTs+H0FfjeIxBkgY0iuI8+e7xn39fUhKCgIGzdu9LohgNyLnKjqzzP21VfghWyJRCKcOnUKDoeDjiOJhltaWiAWi6HRaHDv3j2kpKRArVa7NXq//vor1qxZA7lcTv9P+kcW/H3NiQcPHiAqKgqZmZluv4OcaEnmHIluiKEl4yKVSulOTI1GIxgnlUpFnRziPHiSS/JMyTMkY8g/2dZTLcPlDjMoywxv6ydzc3PIyMhYpNSMRiMkEgnN1fMXsV2jBODFS6qbN2+mSoBMGG/12cikW79+PX1ZVaVSCT5DUh9dXV2wWq3YsWMHRCIRNTBk+y85RpgoUfJuUX9/P4KDg6kCJ/8niggAnezHjh1DR0cHxGIx0tLSUFlZiYGBAUHOnBgfYrA8QY5RcO2/VqulKS9yL3/O4nE1rOT+a9asgUqlgslkWqTYmpqaBAqKUFNTA6lUSp0Bk8kkUJ7EU+bLCzGQIpEIOTk5HlMmRNGTz/rzjH31lcgf/50p0of29naBDBcWFqKlpcVtmoug1WoFUSHZ5kzmgD9zwt0Y8SEKnhh4UumDGBgyLqmpqZiamqJ9JONEIm8yTp2dnV7lsqamBmKxmD5DYvxI+5bbRpSXgRmUZYSvSs4kjE9MTMTk5CSAF4uHWVlZVLhd66A9ePAA4eHh2LNnD+bn5+l7FER5Ab8rkqysLExPT7uNjMikI8aAtJWkrhYWFmjqY3h4GHV1dUhOToZMJkNXVxfsdrsgFQMsjoqIUm1ubsbc3ByampqogQJAJzvx7Ej04ekoA+KBe/JMCUQpkjECXhiBjRs3IiMjg6afYmNj/dp5R/rV1dWF2dlZt/fnQwx6WFgYBgcH6d/dpaRUKhUiIiLoyafEM+7p6cHExAQMBgPm5+eplxwRESGIePhoNBr62dnZWZ/P2HWB211fifwRp4Dfh8ePH+PJkyeLZNgTxFjwPfXBwUGEhYVhy2KMtYsAAARPSURBVJYtmJiYcHs/1znhS05qamronPv222+xfft2iMViNDQ0YG5ubpHh5c8xvkGQy+WYnp6mRzK7+z7yrPkG12AwICgoCH19fbDb7V5rGS53mEFZRhDv25MXTDyX+Ph4TExMAHhxXvjatWvR3d0NYHEdNLPZTEN/q9WK1tZWbNu2DQEBASgvL8epU6dw69YtBAQEoLKyEqdOnRIoNUJNTY3b9FZ5eTmKi4vx4MEDpKSkIDMzE42NjSgtLUVWVhZdTD5//jxSUlJo+ovk3vknepJdVFqtFmq1Gvv370d4eDiGhoaoIYyLi0N/fz+AF16oWCxGXl4eOI6D3W7HmTNn0NjYSHdmSSQSn+9CEIPKT1FVV1dj48aNdC3G11oCgb+mcOHCBTQ3N+Px48eIioqiNes4jkNHRwcKCgpgtVoFBp2flnNdmCXPn4wh+Z0s5H/xxRd4//336Q45pVLpMe3Fj0ovXbqE5uZmVFdXe33G/DJInvp648YNQe6fGOLs7GwcP34c165dw9atWwWORX9/P3Jychbdnxij4OBg9Pf348mTJ9i6dSvEYjE+//xzaLVa3Llzx685QVJswIuddrW1tdDr9YL1k5aWFhQWFuLYsWN044hGoxFEROR+MpmMrj8SOSstLYVCoUB9fb1HubRYLIJnzZ8HdXV1uHz5Mh1Dd7UMlzvMoCwDOI5DX18f9uzZA5FIBJlMhgsXLrgVpMHBQSQmJuLzzz9HWVkZlEqlwAN1rYNGcv8rVqxATEwMqqqq0NnZiYCAAOzfvx9jY2P48ccfERISgvfee49OQj78dA+Z9OQzCQkJ6O/vx9TUFJKSkiCTyVBcXIzJyUlkZ2dDJpNBo9HQ6/lpoISEBJpWmJ+fh1wux9tvv43Tp09jamoKdXV1WL16NTQaDQ4fPgyVSiVQOg6Hg3rZkZGRyMjIwO3bt2nKyDUi8kZnZyeio6NRWlqKoqIiqFQqjI2N
0f+3t7d7XUvgj5VcLkdwcDCqqqroGklnZyciIyMRGBiIzZs349KlS9R4uHrABNeFWdedS2TMw8PDcf78efz4449ITk7G7t27odPpkJKS4vZ5Ar+XXOE/L1/P2J++ukZQDx8+xOrVqxEZGQm9Xg+O4zA4OIiEhATIZDJs2LDBq9Ls7OzEypUrERQUhPT0dBiNRqxevVpgMHzNCSIngYGB2Lt3L7Kzs9Ha2gqn00n7sGLFCmRnZ2NsbAxqtRpSqRQKhQIjIyOC9BMpkMtPIev1ekgkEnz88cf4+eefvcoleeGztrZ20XNQq9Ww2+0eaxn+GWAGhcHwA41GI1C2DAZjMcygMBgemJ+fx8zMDE0tkbUFBoPhHmZQGAwPFBYWYt26dfj666+xc+dOFp0wGD5gBoXB8IDFYkFFRQV0Oh1GR0ffdHMYjGUPMygMBoPBWBKYQWEwGAzGksAMCoPBYDCWBGZQGAwGg7EkMIPCYDAYjCXh/wBWuisWzW1cYAAAAABJRU5ErkJggg==)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "oZtyR1PzMzXD"
},
"source": [
"![image.png](data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZsAAAA/CAYAAAA2YmW0AAAgAElEQVR4nO2d3VNT1/rH8w/khovM5CoznckwXDBMx3FyRoaxTBlgiqPA0BM6ClhfhtaCVqiISJsSFNFzCCoQoidaUBRLBYVGKByoIFEUUyPYA4gFQWpBRF7iNgYD+/u7YNYyOy+8eE4Q/a3PDBchyV5rr7XzPOt5Wc8SgcFgMBgMLyN62x1gMBgMxvsPUzYMBoPB8DpM2TAYDAbD6zBlw2AwGAyvw5QNg8FgMLwOUzYMBoPB8DpM2TAYDAbD6zBlw2AwGAyvw5QNg8FgMLwOUzYMBoPB8DpM2TC8hsViQVxcHBQKBfbv34+0tDRIpVIEBgYiOzsb27Ztg1gsxokTJ8DzvFf6MDAwgKqqKuzZswdnzpzxShtL5dWrV7hy5Qp2796NdevWoaen5213CQDAcRwuXbqE48ePo7e3d8HPz8zMwGw2w2AwIDs7G3fu3FmGXjLeVZiyYXiNnp4eJCYmYnR0FADQ0NAAkUiE1NRU2Gw2WK1WJCUloaWlxav9qK2thVQq9Xo7S8FqtSIxMRGffvopxsfH33Z3MDIygk2bNqGpqQnJycnYuHEjLBbLgt/jeR4qlQrh4eEYGhpahp4y3lWYsmF4jV9//RUNDQ30tUajgUgkQllZGQBgdnYWRUVFSxJSHMehpqYGVqt1UZ/neR7Z2dlYu3YtBgcHl3YDXuTx48f46KOPkJOTg5mZmbfal5mZGeTk5CAxMRFWqxXT09OYnp5e1HeJ9bpz507YbDYv9/TtYTKZ0NnZ+ba78U7DlA1jWSBCSSaTwWw2v9E1rFYrvv/+e6hUqiULQyJIVwpGoxEymQy//PLL2+4KVXx6vX7J3+3p6UFAQMAbffddwWQyITQ0FPfu3XvbXXmnYcqGsSwMDAxgzZo1WL9+PZ48eeLxcxzHobGxEf/6179QUlKCsbExAMC1a9cQFhYGkUgEpVKJqqoqahFwHAeDwYCioiJUV1ejoKAA9+/fBzC/MJyZmYHJZEJRURF27dqF9vZ2/Pbbb4iLi8OXX36J8fFx2O12jIyM4OrVqzAYDLTNrq4utLe346+//oLBYMDBgweRnp6O0dFRFBcXIyIiAj/99BOsVit9rdPpaGxKo9EgMDAQN2/exKFDhxAaGoqMjAxMTU3R/vE8j0ePHuHMmTM4duwYOjs7wfM8JicnYTab8eOPP6Knpwd3796FVqulY+WOsbEx1NbWori4GBUVFeA4DlNTU9BqtUhKSoJIJMKmTZtQWFi4oFtvamoKlZWVKCwsRG5uLmQyGW7fvi3o94MHD1BeXo6CggLcuHEDs7OzLteora2FTqdDTU0NbDYbnj9/jq6uLvz444/UCrVYLGhtbcXQ0BCNv+3fvx95eXkYGRnBoUOHEBERAaPRiGfPntGxNBgMbts7duwYbW+huZ+YmMDBgwfh4+MDuVyOrKwsDA4OwmazoaGhAcXFxaisrMTFixdXhCt0pcOUDWNZIPGatLQ0j1ZJW1sbwsPD0djYiNnZWeTl5eHjjz/G48ePAQA6nQ4BAQGCgLrjdziOw7Zt2xAYGIiHDx8CAKqrqyGVSnHr1i23bdrtdqSlpSE8PBxnzpyBTqfDiRMnqAVmt9vR2tqKDz74gCqs8fFxREVFUWuJKLS9e/dCpVKhtrYWH3/8MXbt2oWUlBTU1tZCqVRCqVRiamqKWltBQUFISUnB0NAQamtrIRaLkZOTA2BOgWZmZiItLQ3Pnj2DWq2GUqnE5OQkJicnkZGRgcDAQOj1eoSHh0MsFrsIWGBO8FdVVSE6Ohpmsxk2mw07d+5EQkICnj9/DgDIycmBQqFAf3//vHPI8zzq6uoQHR2N3377DU+ePEFISIhgAeHc7/7+figUCjp2jv3p7u6mixC9Xo+JiQl8/fXXgvhPZWWlIN5mNBohEomQm5uLzMxMGAwG
+Pv747vvvkNKSgrq6+sRFBREnzPn+ycWJZnb+eYecO/u1Ov19J5v3boFpVLJlM0iYMqGsSw4x2ucMRqNkMvl0Ov1dPVfVlYGkUgEg8FAA+qOQrKrqwtr165FRUUFeJ6nnyFKYGZmBvv27Zs3eD05OYmYmBisWbMGKpUKHMchJydH8J2GhgaBwmpsbIRYLKYCtKWlBRKJBMHBwbhx4wa6u7vh7+8PmUyGuro6dHR0QC6XU4H14MEDrFq1CqGhoTTriyistLQ0vHjxAhkZGfjkk0/w6NEjDA8PY/369dBqtZiZmaH3JRaLkZ6ejsnJSWqFOVNeXg65XI66ujr6P7VaTQXqixcvsG3bNsG4eqK5uRkBAQFobm4GAIyOjiIkJIS6NUnCB+k3MGdRhYWFUUXb3NyM1atX4/r16/T9goICmEwml/gPx3HYvHmzYC5KS0shkUgQExOD3t5etLS0QCQSISAgAB0dHXRRQ56z+vp6ev92ux3Z2dlISEjA2NjYoubenbtTo9HQRc/4+Djy8/NXlIt2pcKUDcPrECHi5+fn1u89Pj6OTz/9FBs2bMDIyAj9v0ajoatassLMy8sDz/OYmppCfHw84uPjqevJOfZAhAnJfnMHUQREQJK+ku+QBAMigIhl4Lja1ul0EIlEOHbsGOx2OxV4WVlZsNlsqK6uFlgeDQ0NVFkRxWoymSCVSqHX61FfXw+xWAydTof79+8jLi4OWq2WCrSnT58iPDycClhPPHz4EIGBgdixYwc4jgMATE9PY/fu3VRYLjZR4c8//0RwcDDS0tJoPzo6OuDn54fLly8DAC5fvgyJRIJz587R+yKWS2JiIvr7+xESEoKkpCS3wtnZ5Xnr1i1IpVKXxYNUKqULDJ1OJxhLjUZDFSnpc3x8PIaGhpCdnY2dO3dSRbjQ3JO5dbb6yH2vlEzCdwWmbBheZ6F4TVtbG3UhOcZhNm/ejHXr1uHx48cuK0xn6wJ4neJMLBAiTDxZU8Cc9SQWi6mAJH0l33FebTc1NUEqlVLlQ6yp0NBQDA4OUuWkUCjQ1dVFU4OJcCfvO7r6eJ5HXl4eFAoF7t27hx07diAgIACHDx9GfX29II4DvFZMRJl54sKFCy7W5F9//YWQkBBs27YNz58/p9eqrKycbwqh1+shlUrp+PM8D61WS+/r5cuXtN+Obs7GxkZIJBKcPn2aWiGe5sPRguQ4Dtu3b4dIJKJzTJQsWWCQsSeLFGKlkefswoULEIvFSElJwfnz5/Ho0SPBfq6F5t6dNU3u/ejRoxCJRMjMzHzr2YTvCkzZMLwOWel7yiIj7rLq6mr6v9u3b0Mul1NBQALqvb29ePnyJbV6TCYTgLl9Ihs2bBC4QMrKyiCTydDe3o4XL164bBwlK2WiKEhf/fz80NHRAZvNJlhtDw8PY+/evfD390dqaiqmpqbQ39+PNWvWUEXpbE2R10qlEmNjYxgbG0NMTIwgVbirqwurVq2CXq/H06dPERYWhpiYGExOTrodT51OB5lMBqPROO+4q9VqwRgBc9aHXC6nrrALFy4smCFIFInjCr+3txcKhQIJCQmYmJjA8PCwS79tNhtSU1OpMnA3zwRHC/LRo0eoqKhAbGwsZDIZbt26RQP6UqmUKgOiHI4ePSpQFmlpabBYLMjKynK5/6XMveP1OY4Dx3FoaGjA9PQ0tawDAwOppcSYH6ZsGF6FrNpFIpHH1bPRaIRYLKbvcxyHxMREHDx4EBzH0RVmfHw8SktL0dbWRmNARqMRFosFarUawcHB2Lx5MwoKCnD9+nXs27cP69evR21tLfR6vUtMg6yUnV1m69evR0VFBS5fvoz6+npIpVJcuXIFKpUKFRUVkMvlyM3NhUqlwk8//SRwqRHlRAQiea1Wq6FWq/Hzzz9DLpdDo9EAmFOS0dHRKCwsFMQpHC2E7u5uZGZmYnR0dEmbQcvKyiCRSNDe3g4AGB4eRkxMjGAsnGMU7iCuN39/f3R3d2N4eBibN2+G
WCzGgQMHoNfr0dnZicTERERGRmJiYgIA0NTUhLCwMNo+mWedTgdgbp9VeXk5DAYDtSCTkpJw6dIlFBYWIjU1FREREcjPz0dVVRWKi4sF40KsXcfkAYlEgsLCQqhUKuTn5wssKYvFgpycHLS1tS1q7ltbWyGTyXDq1CnodDoMDAxg06ZNGBgYADCnzJkrbfEwZcPwGpOTk6ioqEBAQABEIhG2bNlCXUmO2O126HQ6rF+/HuXl5cjIyMClS5eoQJyamoJSqURQUBAMBgN4nkdfXx/WrVsHX19ffPnll7h37x7S0tKwevVqNDU1wW6305XtkSNHaMzCEZPJBJlMhvLycgBzq92srCzIZDJotVrYbDYYDAaIxWJERUXBbDbDbDbD19cXUVFRuH//viCeA8xlTznGpsxmM2QyGTZu3AiTyYTHjx9j+/bt2Lx5M3744QckJyejublZkBrc29uLqKgoyGQyREdHQ6fT0f6T1fZiNoNyHAeVSoW4uDiUl5djz549uHbtGh1/4naaL6ZFaGtrw+rVq+Hn54fExEQ0NjZi7dq1AmXS29uLmJgYHDhwAEVFRVCr1YJVP5lnX19fpKSkICMjA1evXsXs7CydY5lMhqNHj9KMO5lMBp1OB4vFInBpkUWM49gbDAZIJBIkJyejr6+P3r9UKkVISAj27t1LXZeLnXupVIrU1FSMjo6ip6cHsbGx2LVrF0pLSxEXF0fvnbEwS1Y2PM+jp6cHcXFxEIlE+PTTT3Hz5k3wPA+O41BdXU2Fy+eff47c3Fxs3LgR0dHRaG5uFgia2dlZdHV10WtFRkYiNzcXubm5+OabbyCTyVx2Jj979gx5eXno6+tz6dudO3eg0WiWNTOkq6sL586dQ1JSEn799ddlaZPneTQ1NaGkpIT5ixlvzOPHjxEcHEwFLuP95m3IR0feyLIhQU+xWEwzUQgkHdLR3J2dncXZs2chFoupf5XgmMbpfK3KykrqbgDmXA5bt25FV1eXx741Nzdj9+7dbley3qKwsNAlMOoteJ7HuXPncPjwYbeprgzGfPA8D4vFArvdDqPRiPDw8BVVxofhXd6GfCS8kbKZL5WVBPGcMziIL1osFqOtrY3+nwRQ3Qlro9FIs184jsNXX32F+vp6l/7Y7XYYDAaMj4+D53mcPHkSR44cWZZVP/Ghk30E3qa5uRnbt29flrYY7x99fX1QKBQ4efIksrKyUFlZ6bWK24yVx3LLR0feSNmQoKe74BjJOCH7IRwh+xHUajX9H0lPdVROT548wbNnzwTfraysRFRUlEt7drsder0en3/+Of1+f38/wsLC8Pvvv7/J7S2J5SyoSHauL5SmymB4YnZ2Fs3NzcjPz3dbRobx/rOc8tGRN1I2JJXVWcCSrBXHfHxHiCLavXs3TYElewHItTiOQ3p6usDKIVaRSqUSKLCuri589tlnEIlECA0NxenTp2G1Wmng053C+1/jqaCipxpfju+Tel5nz55dlFnb2NiIDz/8EN3d3f/Te2AwGP9/WE756MgbJQhkZ2fPG69x3LDmSGlpqUDZkHgNKQK4Y8cOfPjhhy4uOFLeo7S01OWaZCOY41klJKY0314FUmNpob+wsLB5CxxqNBqXHcYL1fjq6elBUFAQCgoKYLVakZaWhlWrVqGgoACvXr3y2FZOTg5CQkLo+TAMBoOxVBYjH73BkpUNide4i7F4itcAECgWYsWQXHcirGdnZ1FQUOByciPZYe6889i5lIgjarUaa9asoTnxzkxPT9NNdvP9jY+Pe3Q1kLFwvN+FanyRcidklzOZ+Pn6CrzeWLeQ8mMwGIyFWEg+eoMlKxsSr3EXECeWi3PGGfC6TIbjzmd38Zry8nJBuXLgtRXirGzmO7hJrVZ7PUPMeQfzYmp8kcKExLojcZgtW7bMG/QniQielM1irDT2x/7Y3/+/P3csh3x0kVFL/cJi4jXujt+trKyESCQSFAWcL5nAEVJF11nZzHdWiVqtxkcffURd
V878LywbZxfeYmp8zczM4MiRI4iKisLw8DBOnDiBdevWLRiHIRYQs2wYDMZ/y0Ly0RssSdkQt5W7BACSleXu+N2uri4oFApBSfWFkgkcIe42ctYHobq6mtZOevHiBf0/ufZ8pzP+tzEbx4KKfX19ePHixaJqfAFzu8qTk5PR0NAAs9nstl4YUYaO75WVlWHVqlV48ODBvOPFYDAYnliMfPQGS1I28+2JcRev4Xket2/fRmBgID777DOqaAC4xGvmg5SmcBwcx3jN+fPnUVNTQ4U56ac3j6olLrzt27ejuroaVVVVaG1tnbfGF/C69EpAQADq6upcFI3VasXRo0dx+vRpFBUVITk5mboISdHHhQowMhgMhieWQz66Y9HKZnJyEsePH4dYLIZUKsW5c+eo4O/p6aEpyDKZDPv27YNarUZkZCR27NiB+vp6gQadnJxEbm6u22t5ore3F6GhoTQ3fHp6GmlpafD398fp06cFu+lv3bqFmJiYeY8f/m8hJdAd21+oxhcwd+/79+/H5s2b4ePjAx8fH5w+fRo8z8NutyM3Nxc7d+5ET08PlEqlwCVJXHDffvstK1PDYDDeiOWQj+54ZwpxkjIt33zzzbxFAzmOQ1JSEi2hvpIYGRlBQkICbty4AWAuVfybb76hlRh6enqwatUqZGRkoLa21m3CwMjICJRKpUsSheP7Bw4cgFgsxo4dO/Dy5UvBe+TwKY1G45W0x1evXuHKlSvYvXs31q1bt6wByPngOA6XLl3C8ePHBRb2fAwMDKCxsRHHjx9HWVkZ2wD5X0AKXf7tb3/DH3/84fYz4+PjKCkpQUJCAr7++mvBs/u+sJhxWAwlJSXzHnfuibcpH98ZZQPMVQsoKSlBSUmJ27pgVqsV//znP2E0GldkCY4LFy645LZ3d3dj7dq1uHfvHoxGIy3jzvM8/vjjD5dKCsCcEExJSfGYtkjOfZfL5S4nOba0tCAqKsqre3WWUgZ/ORgZGcGmTZvQ1NSE5ORkbNy4ERaLZVHfbWlpWVRckTE/5LC1yMhIt0V0CSRDdbldPMvFYsdhIX7++WesXbvW46LTHW9bPr5TygaYm6w7d+64PbDo3r17GB4efgu9Why3b99GQEAAfUBmZ2dRWlqKvLw82Gw2jI2NYevWrYiMjMT+/fthMBjm3eNDyrM7Q7LkRCKRoJApMJeG7e3SOstZwmchZmZmkJOTQ+N909PTbhMyPOHuWOD3DZPJhM7OzrfSdl9fH65du0Zfk6SfpQhRjuNQU1Pz1qoZr5S+OI+lM29bPr5zyuZd588//8SpU6dw8uRJlJSUoKur63/untFoNMjMzERISAgiIiKodWG1WpGUlOT1VbqnEj5vA6L43mSl7OlY4PcJk8mE0NBQl4K63mB2dlbgAh8YGEBUVBR9TsjGb0/Hh7vDarXi+++/93gK7HKy2L44j8ObwPO8wM3oPJYrEaZs3jMsFgu2bt2K9vZ2ekImeQAHBgbw97//nZYSstvtGBkZwdWrV2EwGKgV0tXVhfb2djx//hxGoxFFRUVITEzE77//jubmZsTGxmLfvn3gOI6+/vLLL6lSI0c437x5E4cOHUJoaCgyMjJcYlBTU1Oora3FsWPHUFNTg1evXmFychJmsxk//vgjenp6cPfuXWi12nn3Fo2NjaG2thbFxcWoqKgAx3GYmpqCVqtFUlISRKK5ckiFhYULuvVI6X2tVkutGsdNyjzP48GDBygvL0dBQYHbYpbkvnQ6HWpqamCz2WC1WtHf34+qqipqRdhsNrS1teH+/fsYGBhAVVUV9u/fj7y8PIyMjODQoUOIiIiA0WjEs2fP6FgaDAZBe1arFVevXoVWq8XZs2fx/PlzmEwmFBUVYdeuXWhvb8dvv/2GuLg4Ok8TExM4ePAgfHx8IJfLkZWVhcHBQdhsNjQ0NKC4uBiVlZW4ePGiy5hNT0+jv78f586dE1Rwn5qaQkNDA2ZmZmCz2XD27FncvXsXHR0d+OKL
LyAWi6HX62Gz2VBcXAxfX19IJBKkpqais7MT4+PjiIiIgEqlwvj4OLRaLSIiIlBSUuLW7XPt2jWEhYVBJBJBqVSiqqqKPsPOY0Ks2qGhITQ2NqKqqgoTExMoLS1FZWUlDAYDDh48iPT0dIyOjqK4uBgRERH46aefYLVa6WudTrfkvhCcx8FisaCurg6HDx/Gvn378Ndff+H8+fPztjM4OIiMjAz4+PhApVLBYrG4HcuVCFM27xkkM3BoaIj6v9PT0zEzMwOj0YgvvviCmvh2ux2tra344IMP6MqfVDQgbieyuifH8549exZpaWkIDQ1FTk4OtFot8vLy6Dn2JCU8KCgIKSkpGBoaQm1tLd3sCswJ7KqqKkRHR8NsNlNL6M6dO/SExsDAQOj1eoSHh0MsFrsIWHfXIaWAHC2RnJycRbvBBgYGEBsbi7Nnz8Jms9FTHklGIMdxyMzMRFpaGp49e0ZjY2TsHPvT3d1NK0zo9XpYrVbk5eUJjuW4desWpFIp3axM9n7l5uYiMzMTBoMB/v7++O6775CSkoL6+noEBQUhLS2Nrpwd6/CRTc4GgwF2ux1paWkIDw/HmTNnoNPpcOLECTpPgHt3p16vp5bFrVu3oFQq3SrogYEBBAUFISMjAzMzM+B5HkePHhU8N0lJSbSvpaWlAvcYeU4ctzN0dHTAz88PxcXFyMzMxM2bNxEbGzvv8R06nc5lK4anMZmenkZ7ezv8/f2hVquRnp4OX19fKJVKmEwmBAQEYO/evVCpVKitrcXHH3+MXbt2ISUlBbW1tVAqlUvuizPO4zA0NETH8R//+Ad++eUXbNmyZd52WlpaBLUp3Y3lSoQpm/eMhoYGeswvqV5ANoJqNBoUFBS4fN4xq6WxsZGuvIDXxVVXrVqFwsJCTE5OYtu2bRCJRMjKysLTp08RFxdHBRQpmuq4gZf84Ingqa+vh1wuR11dHex2O7Kzs5GQkICxsTHBYXrp6emYnJzE+Pi424SQ8vJyeh2CWq2mApVUt12MG2xkZAQbNmxAbm4u7HY7rdhA6u4R4fnJJ5/QeCEpPUQEQ3NzM1avXo3r16/T9wsKCmAymeh9kWMyZmZm8O233wqUT2lpKSQSCWJiYtDb24uWlhaIRCIEBASgo6ODVu8gyqmjowP+/v7Q6/X0gMLQ0FD09fXRvRRr1qyBSqUCx3HIyckR1BF05+7UaDRUYI6PjyM/P9+tACOLECLgbt++DZlMJkiA+fe//43y8nK6idDRPeau+kdZWRkkEgk9cptUDvHklnLn5pxvTMj7crkcH374Ia5fv47nz5/j+fPnaGlpgUQiQXBwMG7cuEHblslkqKuro9/zFIdcjMvV3TiYTCbI5XKEhoaiqqoKY2Nj1Lrz5IpznCNPY7kSYcrmPYJsdHV86EiZoBMnTiApKcmlOrZjIVNiGTiu5smPLDY2FqOjo3Q1TIQuWb0TRdLQ0ECVFXEDkA2/er0ef/75J4KDgxEfH4+hoSFkZ2dj586dVICTzb5EwHri4cOHCAwMFJQ/Ij9m8kNcbKKC3W5HRkYGgoODqVAiq0WiuC9fvgyJRCKoBEHuPTExEf39/QgJCUFSUpJb4UzuiwgRYhURwUOUkVQqRUVFBXieh06nE4ylRqOhinRqagrx8fEIDQ1FT08PTp06hfj4eOpCIfNG5sn5fgD3yQ/Eulgok9CxcvDg4CC+//577Nq1C8HBwRgeHgbHccjLyxM8M45tOy9y3N1/dXW12+ryBHJdUu5qoTEBXpfIysrKEsRNyFlbx44dg91up4qdfI70xZ2F7a4v833GeQ4c2yHj7+meyTw6zo/zWK5UmLJ5j5icnMTGjRsFmTykAKqvry82bNggqI7tXMi0qakJUqlUsPotKyuDVCqlPzKj0SgQumT1XVZWJijhQ+JCpPqDQqFAb28vLly4ALFYjJSUFJw/fx6PHj0S/DiJYnIWBs6Qc5Ac6+WRe922bRuNW0il0gUPm/v999/h5+cn
WE3euXMHcrkcZWVltOK2s4uksbEREokEp0+fFoyDO4jwr6yspNacSCSibRJlFB8fj6mpKbpSJkVdiZVGlBOpw7dt2zacOnUKPT09gthRWVkZxGIxnSeiGEn/PK3EiTtMJBIhMzNzXiWtVquhUCiQnp6OS5cuoaysDGvWrMHDhw9RUVFBFyxkHkjb7qq1O98/+cx8bilny2yhMXFcjDguZMhYhIaGYnBwkLatUCjQ1dVFFetS+uIO53Eg7ZJ2gLlsvPnaIfNIFlDzVb5faTBl8x7R3d0NpVIpeOiIsBeJRIIVFSA0v4eHh7F37174+/sjNTUVU1NTePnypYvZT4L/RJmQ1bbJZMKTJ08QExMjqMLd1dWFVatW0dW5Wq2GVCqFyWRyew86nU5QGdwT7q5z+fJlyOVyumHtwoULghiFJ4jiIkqJuMz8/PzQ2dmJx48fIywsTOAistlsSE1NpcrAXV08R8rKyqjL7Nq1a9i6dSvEYjGqqqrw8uVLF0HkXFHc0YK0WCw4e/asx/aIlUCEJzC3+vXz80NHRwdsNpvg+hzHgeM4NDQ0YHp6mloIgYGBbrcYEEiVd2JdNjQ0QCaT4fDhw9DpdNT1SebUZDLhxYsX1MWXmpqKyclJ2O12l/snn0lMTKSfcYY8i729vXj58uWCc0AsC2fr01mAO/bPZrPR10qlEmNjY4vqizucx4H0JzMzEzabjc5bQkKCR9cx8RzU1NTgxYsXmJiYcBnLlQpTNu8RZWVlbo9bcA5EE4j5feXKFahUKlRUVEAulyM3NxcqlQqdnZ0CN5Szn568jomJQXFxMfLz8yGXy+nenpGREURHR6OwsJD2Sa/XCywAi8WCnJwctLW1LWkzKPHvt7e3AwCGh4dpvSfyg3OOUXiiuroaIpEIpaWlsNlsKCgogFQqRWRkJC5duoSLFy8iMTERkZGRmJiYAAA0NTUhLCyMtm80GiEWi6HT6QDMpbeWl5fTLD8Sr6mvr8fhw4dx5MgRBAQEoKioCKccXecAAAN4SURBVDqdDlqtVrCiJStlYh0Qi7KwsBAqlQqVlZUQi8XIzs4Gz/Ow2Ww4efIkLl26hNHRUYSHh1NhSVa/69evR0VFBS5fvozW1lbIZDKcOnUKOp0OAwMD2LRpE90orFarF5wHokBv3rxJ+ygSiZCRkUFdm47pzBcvXkRNTQ26u7sREBAAjUaDwsJCjI6OCpQx8PrARI1Gg+zsbJeqD+RZiY+PR2lpKdra2mi80d2Y8DxPx9DZReVcvZ0swsgzSl6r1Wqo1epF9cUZd+PQ2toq6A+x7rKysqDT6XD//n2X65AkhKqqKpSWluI///mPy1iuVJiyeQ/geR7Xrl2DQqGAQqFAe3u7wDXFcRx27NjhspfCYDBALBYjKioKZrMZZrMZvr6+iIqKQm9vr4trwDn7ihQVXbNmDc6dO4fBwUFs374dmzdvxg8//IDk5GQ0NzcLXBkcx9Esr5CQEOzdu5daSc4rzPkg14mLi0N5eTn27NmDa9eu0fsmbidna84do6Oj2LJlC3x8fBAVFYWqqirs2bMHvr6+OHXqFOx2O3p7exETE4MDBw6gqKgIarVasOondfF8fX2RkpKCjIwMuumW1PHz8fFBRkYGnj17Bq1WC6lUCpVKhadPnwpcWsQadVSUBoMBEokEycnJ6Ovro+3JZDIEBQVhx44duHv3Lnieh8lkgkwmQ3l5OYDXJVJkMhm0Wi1sNhsMBgOkUilSU1MxOjqKnp4exMbGYteuXSgtLUVcXBxVpJ64cOECTagA5oTyV199JRB4jm0fPXoUVqsVZrMZcrkc8fHx6Ovro+4tR+VmNpshk8mwbt06t/0gz15QUBAMBgOtLehpTADXwDrg3qVXWVkpUHykLxs3bnRrkbvrizPuxoGc8kuU18jICMLDwxEUFITr16+7vQ4pU0OeI+exXMkwZcN4L3n8+DGCg4OpwGUwGG8XpmwY7w08z8NisdCNmeHh4S5nKzEY
jLcDUzaM94a+vj4oFAqcPHkSWVlZqKysXJEFWRmM/48wZcN4b5idnUVzczPy8/PdlpFhMBhvD6ZsGAwGg+F1mLJhMBgMhtdhyobBYDAYXocpGwaDwWB4HaZsGAwGg+F1mLJhMBgMhtdhyobBYDAYXocpGwaDwWB4HaZsGAwGg+F1mLJhMBgMhtdhyobBYDAYXocpGwaDwWB4nf8DboYCh2BA0iEAAAAASUVORK5CYII=)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "5dbDF-n_Mzbl"
},
"source": [
"![image.png](data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAANcAAAAfCAYAAAB0xlhsAAAJ20lEQVR4nO2a/0sb9x/H7x+4X/bD4H4KDET6w5AxRqBSaplUWcW1pdKVLm3KCqFhOLJ17RQ2waCk2xSrBbMtg5bGtWzY1a60UxzUkdWJgnWCDYVZ+8XVWhdiew2RmHvsB7lrLl5i1Byffcb7Afklyb3zvvf7+Xx9eV8kBAKBLUj/6wkIBP9VhLkEApsQ5hIIbEKYSyCwCWEugcAmhLkEApsQ5hIIbGJd5goGgzgcDo4dO4bf76e8vBxFUThx4gSNjY2UlpZy6NAhFhcXWVxc5Pz58yiKgsPhoKGhgUAgQEtLCzU1NSiKwsjISNFvSNM0otEoLpcLSZKoq6vj999/R9M0VFWlr6+PsrIyJEniyJEjBAIBDh48yN69e7lx4waaphljpdNppqamjLF2795NIBAgEAhw/PhxHA4H9fX1JJPJot/Hv5X1aCCVSjE8PExVVZVpvQOBAB6Ph5deeom2tjbTmm8GVVVNmuvo6CAej5NOpxkfHzf2saKigpaWFo4fP05FRQXd3d0kEgnTWMXQb8HmSiQS+Hw+fvvtNwBmZ2epqKigvLycu3fvAhAOh2ltbWV5eRmAiYkJSkpKqKurIxaLGWPFYjE8Hg8zMzOF/vy60DSNpqYmZFnm8uXLps/m5+eprKykrKyMaDQKrJjo/PnzyLJMR0eHabOXl5dpaGiwHKu3t5f29nZb7uHfyEY08PTpU1wuFw6Hg9HRUWMsTdNoa2ujr6+vqHO8ffs2r7766irN6ZqQJIlwOGy8Pzk5idPpXPV92Lx+CzbXw4cPCQaDRpQeGxtDURQ8Ho/h+qGhIX7++Wfjmr6+PiRJMi02rESYzs5OFhcXC/35daFv6JYtW5icnDR9ps/78OHDPHv2zDQnt9uNLMsMDw8b78fjcfbt22cyo04kEjHd73+djWhgZmaGrVu3Ultby+PHj03jff/990xMTBR1jgMDA0iSRFNTE0tLS8b7+j5aaSIUChk6zWSz+t1wzxUMBpEkia6uLsvP9YivKIqx2KlUij///JN0Or3Rny2IaDRKWVmZZTQKh8NIkmRZjuj31NzcbLynR69MMz5+/Ji///7b1nv4f2AtDYC12O/du4eqqkWfj6Zp+P1+yypD18SePXtyZqiqqioWFhaA4uh3Q+ZaWlrC5/OhKApDQ0OW34nFYuzatcsU8W/cuMGXX35pigJ2oG9odsTJnLdVxtGN5/P5DCFcvHjRNJaqqnzyySershisiObrr7/m22+/5dq1a/z666/GZ3r/dvbsWbq7u40ySkdVVQYHB/nmm284e/asscmJRILp6WmuXr3K4OAgjx49IhgMcuvWLePaxcVFrl27xunTp7ly5QqpVGpT61cIhWgAoLW11VSKzc3N8dFHH63KYrno6urC5XIVVOXoFYtVlaFnoeyMBi+Mt3XrVqPUK4Z+N2Quvdbevn079+7ds/yOHg22bNlCQ0MDNTU1vPLKK7aXUfmil95vZfYImZw7d85kLj16SZLEu+++i9fr5bXXXltVUgL89ddfVFZWEg6HSSaT1NfX09vba/zue++9R2dnJ8lkkvb2dnbs2MGjR48AGB4eprq6msHBQdLpNG1tbbz55pvMzs6SSCS4fv06sixz5swZjh49isPh4MSJEySTSS5dusTevXsZHx8nEongcDgYHx9fdW+JRAKPx4MkSWu+MnuSXBSiAV3skiTx/vvv43a7KS0ttRR49lzHxsZIJBKEw2G8Xi/Pnz9nYmKCubm5nNflqljy9eDwok/LNFcx9Lshc1nV2tlkl18zMzO43e6cG1Es8kWvXP0WYDKSnqWePHlCdXU1TqeT6elp0uk0XV1dfPXVV6tKSr238Pv9aJrGTz/9xM2bN1FVFa/Xy8cff2yUQoODg4RCIVRVJRKJUFJSQigUMsbU1+7q1avA
i0xcXV1NNBolHo+TSCTo7++npKSE69evk0ql8Pv9HD582Mh6mWiaRjweZ2FhYc1Xrj21Wst8GsguxZLJJA0NDcZ95WJ8fJzS0lKOHTvGL7/8Qnt7O6FQiJdffpkffvgh53W5KpZ8/VbmdZl9YTH0uyFz6bV2KBSy/NyqXp2fn+eLL75Y89hav6l8Lyvj6Ogbun///lWlhJ6Zsk8E4UXmcTgcRCIRwLrfunDhgunUSyeVStHY2LgqOoZCIdOYmcRiMerq6nj77bdNEbm9vd0otzIzcaYBHz58yI4dOzh06BAPHjzA7/dTX1/P/fv3865vsVhLA7C639I0jdOnT1tWDdksLS0xMjKCx+OhtraWoaGhvNkuc52yzZvrBBFWtPrpp58iSRKff/45y8vLm9JvJus2VyG1dnbEXw+JRGLNyBqLxXL2FYX0W1bz7u3tRZIkvF6vkWHyHX5YcffuXcrLy40I+fz5c44ePWp5UgYr5aAsy6a56qeWNTU1zM7OGlH3rbfeMhnn4sWLyLLMhx9+yHfffcf9+/fzzrGYmasQDehilyTJKI8LJZlMMjAwgNvtxufz8cEHH3DkyBF6enqIx+OW1xTSb2VrAmB6ehqn04nT6WRqagrYnH4zWbe59Fq7urqaBw8eWH4nX/llJ/qGWh1Y5OsRpqamcDqd7Ny5kzt37gBrH35kcuvWLeO6/v5+ZFmmq6uLhYUFqqqq2Ldvn6UodPNmPusZHR2lpKSEnp4eNE0zsqceVXWam5tRFIWxsbGC1qaYPVchGlirFMvHH3/8wbZt27h06RL9/f00NTVx+/Zttm3blrMs1CuWbM1lPqfMzmh6ye5wOOjt7TWCU7H0u25zjYyMoChK3n8mrDfiF4t8z6SsFkzTNEZHRykvL+fAgQOGQWB90evcuXPGw+RoNMrrr7/O5cuXDUFv376d2dlZYKWca25uJhaLEYlEkGXZiOyqquLxeGhpaTFlz+wHsPDi2YxuhKdPn9La2mp6RmcXhWgg39H3emhtbc0ZnDLJlZ1ymXx+fh6v18sbb7zBlStXTMfrxdJvwebS/8py4MABJEnC6XQyMDBgWlxN07h58yY7d+5EkiRcLpdJsHYSj8fp7OxElmUURaGnp8cob6LRqDFv/a8szc3N7N69G6/XS39/v6kUisfjBAIBy7GsCAQCuFwuuru7OXnyJKdOnTLMcefOHfbs2UNVVRWfffYZHR0dzM/PAytrGgwGqa2t5cKFCzQ2NvLjjz8aJa+ePa3KSlVVaWpqQlEUKisrOXnyZEG9zGYoRAOwktl8Pp/RHw8MDGz48cDc3ByTk5M5Ra5pGmNjY+zatWuV5jL3MfPvV263m3feeYdQKGRa12LrV/xxVyCwCWEugcAmhLkEApsQ5hIIbEKYSyCwCWEugcAmhLkEApsQ5hIIbEKYSyCwCWEugcAmhLkEApv4B+O+atABd+NxAAAAAElFTkSuQmCC)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "6CD2HCqdvGSq"
},
"source": [
"# Vocabulary: the set of distinct words across the three documents\n",
"uniqueWords = set(bagOfWordsA) | set(bagOfWordsB) | set(bagOfWordsC)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "YoZVtnH8vMNv",
"outputId": "7d187131-6230-42f9-9379-fd27d7dfd490",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 188
}
},
"source": [
"uniqueWords\n"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{'and',\n",
" 'dogs',\n",
" 'hate',\n",
" 'hobby',\n",
" 'i',\n",
" 'is',\n",
" 'knitting',\n",
" 'love',\n",
" 'my',\n",
" 'passion'}"
]
},
"metadata": {
"tags": []
},
"execution_count": 7
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "SzbfGGfYvM8_"
},
"source": [
"# Word-count dictionary for each document, initialised to 0 over the full vocabulary\n",
"numOfWordsA = dict.fromkeys(uniqueWords, 0)\n",
"\n",
"for word in bagOfWordsA:\n",
"    numOfWordsA[word] += 1\n",
"\n",
"numOfWordsB = dict.fromkeys(uniqueWords, 0)\n",
"\n",
"for word in bagOfWordsB:\n",
"    numOfWordsB[word] += 1\n",
"\n",
"numOfWordsC = dict.fromkeys(uniqueWords, 0)\n",
"\n",
"for word in bagOfWordsC:\n",
"    numOfWordsC[word] += 1"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "Mrip3wk6vakQ",
"outputId": "c9afbbc7-8165-4ffc-9040-f65282fadee0",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 188
}
},
"source": [
"numOfWordsA"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{'and': 0,\n",
" 'dogs': 1,\n",
" 'hate': 0,\n",
" 'hobby': 0,\n",
" 'i': 1,\n",
" 'is': 0,\n",
" 'knitting': 0,\n",
" 'love': 1,\n",
" 'my': 0,\n",
" 'passion': 0}"
]
},
"metadata": {
"tags": []
},
"execution_count": 9
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "_WJ7qXK0vc2F",
"outputId": "33ceb23b-19e3-4b2b-a3ca-d62d43fd2cd2",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 188
}
},
"source": [
"numOfWordsB"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{'and': 1,\n",
" 'dogs': 1,\n",
" 'hate': 1,\n",
" 'hobby': 0,\n",
" 'i': 1,\n",
" 'is': 0,\n",
" 'knitting': 1,\n",
" 'love': 0,\n",
" 'my': 0,\n",
" 'passion': 0}"
]
},
"metadata": {
"tags": []
},
"execution_count": 10
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "jp0rnrh1ve4J",
"outputId": "e175570a-37e7-467f-ede2-76d7fea016cf",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 188
}
},
"source": [
"numOfWordsC"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{'and': 1,\n",
" 'dogs': 0,\n",
" 'hate': 0,\n",
" 'hobby': 1,\n",
" 'i': 0,\n",
" 'is': 1,\n",
" 'knitting': 1,\n",
" 'love': 0,\n",
" 'my': 2,\n",
" 'passion': 1}"
]
},
"metadata": {
"tags": []
},
"execution_count": 11
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "yRyW0iQZvg96"
},
"source": [
"def computeTF(wordDict, bagOfWords):\n",
"    # Term frequency: occurrences of each word divided by the document length\n",
"    tfDict = {}\n",
"    bagOfWordsCount = len(bagOfWords)\n",
"    for word, count in wordDict.items():\n",
"        tfDict[word] = count / float(bagOfWordsCount)\n",
"    return tfDict"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "OmOXBkS0vmo0"
},
"source": [
"# TF of each document\n",
"tfA = computeTF(numOfWordsA, bagOfWordsA)\n",
"tfB = computeTF(numOfWordsB, bagOfWordsB)\n",
"tfC = computeTF(numOfWordsC, bagOfWordsC)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "Y7CVv-FMvsc_",
"outputId": "d71eeabd-00c8-499b-8fe8-61568c60b7a7",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 188
}
},
"source": [
"tfA"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{'and': 0.0,\n",
" 'dogs': 0.3333333333333333,\n",
" 'hate': 0.0,\n",
" 'hobby': 0.0,\n",
" 'i': 0.3333333333333333,\n",
" 'is': 0.0,\n",
" 'knitting': 0.0,\n",
" 'love': 0.3333333333333333,\n",
" 'my': 0.0,\n",
" 'passion': 0.0}"
]
},
"metadata": {
"tags": []
},
"execution_count": 14
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "zK6VUFxSv0aG"
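},
"source": [
"import math\n",
"\n",
"def computeIDF(documents):\n",
"    # documents: list of word-count dicts that share the same vocabulary keys\n",
"    N = len(documents)  # number of documents\n",
"\n",
"    # Document frequency: number of documents in which each word appears\n",
"    idfDict = dict.fromkeys(documents[0].keys(), 0)\n",
"    for document in documents:\n",
"        for word, val in document.items():\n",
"            if val > 0:\n",
"                idfDict[word] += 1\n",
"\n",
"    # IDF = ln(N / document frequency); every vocabulary word occurs in at least one document\n",
"    for word, val in idfDict.items():\n",
"        idfDict[word] = math.log(N / float(val))\n",
"    return idfDict"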
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "8tIwR-LTv0sx"
},
"source": [
"idfs = computeIDF([numOfWordsA, numOfWordsB, numOfWordsC])"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "k5McMyILv4xw",
"outputId": "f5c48cc1-e8d4-468b-b54d-e410472ad93b",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 188
}
},
"source": [
"idfs"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"{'and': 0.4054651081081644,\n",
" 'dogs': 0.4054651081081644,\n",
" 'hate': 1.0986122886681098,\n",
" 'hobby': 1.0986122886681098,\n",
" 'i': 0.4054651081081644,\n",
" 'is': 1.0986122886681098,\n",
" 'knitting': 0.4054651081081644,\n",
" 'love': 1.0986122886681098,\n",
" 'my': 1.0986122886681098,\n",
" 'passion': 1.0986122886681098}"
]
},
"metadata": {
"tags": []
},
"execution_count": 17
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "_pORwkSowC8M"
},
"source": [
"def computeTFIDF(tfBagOfWords, idfs):\n",
"    # TF-IDF: each word's term frequency multiplied by its IDF\n",
"    tfidf = {}\n",
"    for word, val in tfBagOfWords.items():\n",
"        tfidf[word] = val * idfs[word]\n",
"    return tfidf"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "0dWBuQqKwPCT"
},
"source": [
"tfidfA = computeTFIDF(tfA, idfs)\n",
"tfidfB = computeTFIDF(tfB, idfs)\n",
"tfidfC = computeTFIDF(tfC, idfs)\n",
"\n",
"# One row per document, one column per vocabulary word\n",
"df = pd.DataFrame([tfidfA, tfidfB, tfidfC])"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "fKCLpD8cwSgn",
"outputId": "0e505d2e-ac3c-42c8-d019-2562ab263e9f",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 141
}
},
"source": [
"df"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>knitting</th>\n",
" <th>passion</th>\n",
" <th>hate</th>\n",
" <th>hobby</th>\n",
" <th>dogs</th>\n",
" <th>and</th>\n",
" <th>love</th>\n",
" <th>my</th>\n",
" <th>i</th>\n",
" <th>is</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.135155</td>\n",
" <td>0.000000</td>\n",
" <td>0.366204</td>\n",
" <td>0.000000</td>\n",
" <td>0.135155</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>0.081093</td>\n",
" <td>0.000000</td>\n",
" <td>0.219722</td>\n",
" <td>0.000000</td>\n",
" <td>0.081093</td>\n",
" <td>0.081093</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.081093</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>0.057924</td>\n",
" <td>0.156945</td>\n",
" <td>0.000000</td>\n",
" <td>0.156945</td>\n",
" <td>0.000000</td>\n",
" <td>0.057924</td>\n",
" <td>0.000000</td>\n",
" <td>0.313889</td>\n",
" <td>0.000000</td>\n",
" <td>0.156945</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"   knitting   passion      hate  ...        my         i        is\n",
"0  0.000000  0.000000  0.000000  ...  0.000000  0.135155  0.000000\n",
"1  0.081093  0.000000  0.219722  ...  0.000000  0.081093  0.000000\n",
"2  0.057924  0.156945  0.000000  ...  0.313889  0.000000  0.156945\n",
"\n",
"[3 rows x 10 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 22
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "MtJeY_QuwY32",
"outputId": "fd0a5a0e-c39c-439a-d958-d9c965bb60b8",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 141
}
},
"source": [
"df_bw"
],
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>and</th>\n",
" <th>dogs</th>\n",
" <th>hate</th>\n",
" <th>hobby</th>\n",
" <th>is</th>\n",
" <th>knitting</th>\n",
" <th>love</th>\n",
" <th>my</th>\n",
" <th>passion</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"   and  dogs  hate  hobby  is  knitting  love  my  passion\n",
"0    0     1     0      0   0         0     1   0        0\n",
"1    1     1     1      0   0         1     0   0        0\n",
"2    1     0     0      1   1         1     0   2        1"
]
},
"metadata": {
"tags": []
},
"execution_count": 21
}
]
}
]
}
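
The cells above build TF, IDF and TF-IDF by hand for three documents. As a compact cross-check, the sketch below recomputes the same quantities in plain Python and also evaluates a smoothed IDF variant for comparison. The names `documents`, `plain_tfidf` and `smoothed_idf` are illustrative and do not appear in the notebook; the smoothed formula is the one scikit-learn documents as its default for `TfidfVectorizer`, reproduced here as an assumption rather than by calling the library.

```python
import math
from collections import Counter

# Same three example sentences used in the notebook.
documents = [
    "i love dogs",
    "i hate dogs and knitting",
    "knitting is my hobby and my passion",
]


def plain_tfidf(docs):
    """TF-IDF as in the notebook: (count / document length) * ln(N / df)."""
    bags = [doc.split(" ") for doc in docs]
    n_docs = len(bags)
    # Document frequency: in how many documents each word appears.
    df = Counter(word for bag in bags for word in set(bag))
    rows = []
    for bag in bags:
        counts = Counter(bag)
        rows.append(
            {w: (counts[w] / len(bag)) * math.log(n_docs / df[w]) for w in df}
        )
    return rows


def smoothed_idf(n_docs, doc_freq):
    """Smoothed IDF variant (assumed here): ln((1 + N) / (1 + df)) + 1."""
    return math.log((1 + n_docs) / (1 + doc_freq)) + 1


rows = plain_tfidf(documents)
print(round(rows[0]["love"], 6))      # 'love' occurs only in the first document
print(round(rows[2]["my"], 6))        # 'my' occurs twice in the third document
print(round(smoothed_idf(3, 2), 6))   # a word present in 2 of the 3 documents
```

The first two printed values should match the `love` and `my` entries of the hand-built table. Results from `TfidfVectorizer` would not match that table, since by default it uses raw counts as TF, applies the smoothed IDF, L2-normalises each row, and drops single-character tokens such as `i` (which is why `df_bw` has no `i` column).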