@Macorreag
Created April 14, 2020 20:00
Damerau-Levenshtein
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"kernelspec": {
"display_name": "R",
"language": "R",
"name": "ir"
},
"language_info": {
"codemirror_mode": "r",
"file_extension": ".r",
"mimetype": "text/x-r-source",
"name": "R",
"pygments_lexer": "r",
"version": "3.3.1"
},
"colab": {
"name": "Damerau-Levenshtein",
"provenance": [],
"collapsed_sections": [],
"include_colab_link": true
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/Macorreag/e11dcc3291e0bae5d0d2e33e9cf3cd4e/damerau-levenshtein.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-JVep_P_EBcq",
"colab_type": "text"
},
"source": [
"\n",
"# Damerau-Levenshtein\n",
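"An R implementation that measures the *Damerau-Levenshtein* distance between the words of a text, and also shows a histogram and the standard deviation of those distances. Note that the `lev` function defined below handles insertions, deletions and substitutions only (plain Levenshtein); the adjacent-transposition case is not implemented.\n",
"\n",
"[Video guide](https://www.youtube.com/watch?v=MiqoA-yF-0M&t=2s)\n",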
"\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "lmmpiv2BFFEA",
"colab_type": "code",
"outputId": "1dba7827-7eb5-48c2-dac1-e52d3221774e",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"# Function to compute the edit distance (insert, remove, replace).\n",
"#\n",
"# Takes two character vectors (split1, split2) and the number of\n",
"# characters of each still to be compared (m, n).\n",
"lev <- function(split1, split2, m, n) {\n",
"  # If one string is exhausted, the rest of the other must be inserted\n",
"  if (m == 0) return(n)\n",
"  if (n == 0) return(m)\n",
"\n",
"  # Last characters match: no edit needed at this position\n",
"  if (split1[m] == split2[n])\n",
"    return(lev(split1, split2, m - 1, n - 1))\n",
"\n",
"  return(1 + min(lev(split1, split2, m, n - 1),      # Insert\n",
"                 lev(split1, split2, m - 1, n),      # Remove\n",
"                 lev(split1, split2, m - 1, n - 1))) # Replace\n",
"}\n",
"\n",
"a <- \"yyyamppaaaaaaaaaaaaaaa\"\n",
"b <- \"termoaaaaaaaaaaaaa\"\n",
"\n",
"len1 <- nchar(a)\n",
"len2 <- nchar(b)\n",
"lev(strsplit(a, \"\")[[1]], strsplit(b, \"\")[[1]], len1, len2)"
],
"execution_count": 0,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"[1] 8"
],
"text/latex": "8",
"text/markdown": "8",
"text/html": [
"8"
]
},
"metadata": {
"tags": []
}
}
]
},
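{
"cell_type": "markdown",
"metadata": {},
"source": [
"The recursive `lev` above covers insertion, removal and replacement only, and its running time grows exponentially with word length. As a minimal sketch (not the notebook's own method), the optimal string alignment variant of Damerau-Levenshtein can be computed with a dynamic-programming table that also counts one swap of adjacent characters as a single edit. The name `osa_distance` is hypothetical and introduced here for illustration.\n",
"\n",
"```r\n",
"# Optimal string alignment (restricted Damerau-Levenshtein) distance.\n",
"# osa_distance is a hypothetical helper, not part of the original notebook.\n",
"osa_distance <- function(s, t) {\n",
"  a <- strsplit(s, \"\")[[1]]\n",
"  b <- strsplit(t, \"\")[[1]]\n",
"  m <- length(a)\n",
"  n <- length(b)\n",
"  # d[i + 1, j + 1] is the distance between the first i characters of a\n",
"  # and the first j characters of b\n",
"  d <- matrix(0L, nrow = m + 1, ncol = n + 1)\n",
"  d[, 1] <- 0:m\n",
"  d[1, ] <- 0:n\n",
"  for (i in seq_len(m)) {\n",
"    for (j in seq_len(n)) {\n",
"      cost <- if (a[i] == b[j]) 0L else 1L\n",
"      d[i + 1, j + 1] <- min(\n",
"        d[i, j + 1] + 1L,  # removal\n",
"        d[i + 1, j] + 1L,  # insertion\n",
"        d[i, j] + cost     # replacement or match\n",
"      )\n",
"      # Swap of two adjacent characters counts as one edit\n",
"      if (i > 1 && j > 1 && a[i] == b[j - 1] && a[i - 1] == b[j]) {\n",
"        d[i + 1, j + 1] <- min(d[i + 1, j + 1], d[i - 1, j - 1] + 1L)\n",
"      }\n",
"    }\n",
"  }\n",
"  d[m + 1, n + 1]\n",
"}\n",
"\n",
"osa_distance(\"ca\", \"ac\")           # 1 (one transposition)\n",
"osa_distance(\"kitten\", \"sitting\")  # 3\n",
"```\n",
"\n",
"With this variant a swapped pair such as `ca` / `ac` costs one edit, whereas a plain Levenshtein distance counts two.\n"
]
},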
{
"cell_type": "markdown",
"metadata": {
"id": "peBwZWFYWEJ4",
"colab_type": "text"
},
"source": [
"# Cleaning the input text\n",
"\n",
"We start by cleaning the input text to strip unnecessary characters, then split it on spaces and store the resulting words in a vector.\n",
"\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "54hjFEwPEBcy",
"colab_type": "code",
"colab": {}
},
"source": [
"Clean_String <- function(string){\n",
"  # Convert all characters to lowercase\n",
"  temp <- tolower(string)\n",
"  # Replace everything that is not a letter or whitespace with a space\n",
"  # (digits are dropped too; you may want to keep more in real analyses)\n",
"  temp <- stringr::str_replace_all(temp,\"[^a-zA-Z\\\\s]\", \" \")\n",
"  # Collapse runs of whitespace into a single space\n",
"  temp <- stringr::str_replace_all(temp,\"[\\\\s]+\", \" \")\n",
"  # Split into words\n",
"  temp <- stringr::str_split(temp, \" \")[[1]]\n",
"  # Drop empty strings left by leading or trailing spaces\n",
"  indexes <- which(temp == \"\")\n",
"  if(length(indexes) > 0){\n",
"    temp <- temp[-indexes]\n",
"  }\n",
"  return(temp)\n",
"}\n",
"\n",
"sentence <- \"The term 'data science' (originally used interchangeably with 'datalogy') has existed for over thirty years and was used initially as a substitute for computer science by Peter Naur in 1960.\"\n",
"clean_sentence <- Clean_String(sentence)"
],
"execution_count": 0,
"outputs": []
},
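{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a hedged alternative, the same cleaning steps can be sketched with base R only, for sessions where `stringr` is not installed. `clean_string_base` is a hypothetical name used for illustration; it is meant to mirror `Clean_String`, not to replace it.\n",
"\n",
"```r\n",
"# Base-R sketch of the cleaning steps (hypothetical helper, not from the notebook)\n",
"clean_string_base <- function(string) {\n",
"  # Lowercase everything\n",
"  temp <- tolower(string)\n",
"  # Replace anything that is not a letter or whitespace with a space\n",
"  temp <- gsub(\"[^a-z[:space:]]\", \" \", temp)\n",
"  # Split on runs of whitespace\n",
"  words <- strsplit(temp, \"[[:space:]]+\")[[1]]\n",
"  # Drop empty strings produced by leading whitespace\n",
"  words[words != \"\"]\n",
"}\n",
"\n",
"clean_string_base(\"Hello, World! 123\")\n",
"# [1] \"hello\" \"world\"\n",
"```\n",
"\n",
"Digits and punctuation are dropped here as well, so the result should match `Clean_String` for plain ASCII input.\n"
]
},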
{
"cell_type": "markdown",
"metadata": {
"id": "GAerkNnMWC7J",
"colab_type": "text"
},
"source": [
""
]
},
{
"cell_type": "code",
"metadata": {
"id": "AWtcwvoIKW5q",
"colab_type": "code",
"colab": {}
},
"source": [
"# Compute the distance for every ordered pair of words\n",
"listA <- c()\n",
"\n",
"for (word1 in clean_sentence){\n",
"  for (word in clean_sentence) {\n",
"    # Append the new distance to the result vector\n",
"    listA <- c(listA, lev(strsplit(word, \"\")[[1]], strsplit(word1, \"\")[[1]], nchar(word), nchar(word1)))\n",
"  }\n",
"}"
],
"execution_count": 0,
"outputs": []
},
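{
"cell_type": "markdown",
"metadata": {},
"source": [
"The nested loop above can be slow because `lev` is recursive. A minimal sketch, assuming plain Levenshtein distance is acceptable: base R's `utils::adist` returns the full pairwise distance matrix in one call, from which the standard deviation mentioned in the introduction can be taken. The `words` vector is a small illustrative sample, and `pair_dist` is a name introduced here.\n",
"\n",
"```r\n",
"# Illustrative sample of cleaned words (assumption: any character vector works)\n",
"words <- c(\"the\", \"science\", \"by\", \"peter\", \"naur\", \"in\")\n",
"\n",
"# Pairwise Levenshtein distances (no transpositions) as a symmetric matrix\n",
"d <- utils::adist(words)\n",
"\n",
"# Keep each unordered pair once by taking the upper triangle\n",
"pair_dist <- d[upper.tri(d)]\n",
"\n",
"# Summary statistics of the distances\n",
"mean(pair_dist)\n",
"sd(pair_dist)\n",
"```\n",
"\n",
"Taking only the upper triangle leaves out the zero diagonal and the mirrored pairs, which would otherwise skew the mean and standard deviation.\n"
]
},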
{
"cell_type": "code",
"metadata": {
"id": "631DdRo7ZVD2",
"colab_type": "code",
"outputId": "342d2726-50e9-4292-a558-11a2202ab920",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 891
}
},
"source": [
"print(clean_sentence)\n",
"print(listA)\n",
"hist(listA)"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"[1] \"the\" \"science\" \"by\" \"peter\" \"naur\" \"in\" \n",
" [1] 0 0 6 3 4 4 3 6 0 7 6 7 5 3 7 0 5 4 2 4 6 5 0 4 5 4 7 4 4 0 4 3 5 2 5 4 0\n"
],
"name": "stdout"
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"Plot with title “Histogram of listA”"
]
},
"metadata": {
"tags": []
}
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "qztmtBHGJGRD",
"colab_type": "code",
"outputId": "653d9a1d-3455-4973-cdb6-35ab462202e4",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 129
}
},
"source": [
"matrix(seq_along(clean_sentence))"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "XJPzRtpghMVc",
"colab_type": "code",
"outputId": "9657e3ba-34f3-4be2-965a-61bc39a3c4ef",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"\n",
"\n",
"length(clean_sentence)"
],
"execution_count": 0,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"[1] 29"
],
"text/latex": "29",
"text/markdown": "29",
"text/html": [
"29"
]
},
"metadata": {
"tags": []
}
}
]
}
]
}