@anandology
Created September 24, 2013 13:46
Python Training Notes - June 2012
{
"metadata": {
"name": "2-working-with-data"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Working with Data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Lists"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = [1, 2, 3]"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 2
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x[0]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 3,
"text": [
"1"
]
}
],
"prompt_number": 3
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"len(x)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 4,
"text": [
"3"
]
}
],
"prompt_number": 4
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = [1, \"foo\", 1.5, [1, 2]]"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 5
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x[3]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 6,
"text": [
"[1, 2]"
]
}
],
"prompt_number": 6
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x[3][0]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 7,
"text": [
"1"
]
}
],
"prompt_number": 7
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a = [1, 2]\n",
"b = [a, 4]"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 8
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"b"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 9,
"text": [
"[[1, 2], 4]"
]
}
],
"prompt_number": 9
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a[0]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 10,
"text": [
"1"
]
}
],
"prompt_number": 10
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a[0] = 11"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 11
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 12,
"text": [
"[11, 2]"
]
}
],
"prompt_number": 12
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"b"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 13,
"text": [
"[[11, 2], 4]"
]
}
],
"prompt_number": 13
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a.append(4)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 14
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 15,
"text": [
"[11, 2, 4]"
]
}
],
"prompt_number": 15
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a.append(5)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 16
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 17,
"text": [
"[11, 2, 4, 5]"
]
}
],
"prompt_number": 17
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a.insert(2, 3)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 27
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 28,
"text": [
"[11, 2, 3, 4, 5]"
]
}
],
"prompt_number": 28
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"range(10)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 18,
"text": [
"[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]"
]
}
],
"prompt_number": 18
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"range(3, 7)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 19,
"text": [
"[3, 4, 5, 6]"
]
}
],
"prompt_number": 19
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = [1, 2, 3, 4, 5]"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 21
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"4 in x"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 22,
"text": [
"True"
]
}
],
"prompt_number": 22
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"4 not in x"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 23,
"text": [
"False"
]
}
],
"prompt_number": 23
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"9 in x"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 24,
"text": [
"False"
]
}
],
"prompt_number": 24
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"if 3 in x:\n",
"    print \"you are lucky!\""
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"you are lucky!\n"
]
}
],
"prompt_number": 25
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### For Loop"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = [1, 2, 3]\n",
"for a in x:\n",
"    print a"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1\n",
"2\n",
"3\n"
]
}
],
"prompt_number": 29
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"for i in range(5):\n",
"    print i, i*i"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"0 0\n",
"1 1\n",
"2 4\n",
"3 9\n",
"4 16\n"
]
}
],
"prompt_number": 30
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x.index(3) # given a value, it finds the index"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 32,
"text": [
"2"
]
}
],
"prompt_number": 32
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's write a function `my_sum` that works like the built-in `sum` function."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def my_sum(values):\n",
"    total = 0  # named total to avoid shadowing the built-in sum\n",
"    for v in values:\n",
"        total = total + v\n",
"    return total\n",
"\n",
"print my_sum([1, 2, 3, 4])\n",
"print my_sum(range(100))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"10\n",
"4950\n"
]
}
],
"prompt_number": 33
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Write a function `product` to compute the product of a list of numbers.\n",
"\n",
"    print product([1, 2, 3, 4, 5]) # should print 120"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Write a function `factorial` to compute the factorial of a given number. Can you use the previous implementation of `product` to compute it?\n",
"\n",
"    print factorial(5) # should print 120"
]
},
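{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of one way to approach the two problems above, reusing the accumulate-in-a-loop pattern from `my_sum`. Starting the running product at 1 (so that an empty list gives 1) and building `factorial` on `range(1, n + 1)` are choices made for this sketch, not the only possible answers. `print` is called with parentheses and a single argument so that the cell runs on both Python 2 and Python 3."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def product(values):\n",
"    result = 1  # assumption: the product of an empty list is 1\n",
"    for v in values:\n",
"        result = result * v\n",
"    return result\n",
"\n",
"def factorial(n):\n",
"    # n! is the product of the numbers 1..n, so product can be reused\n",
"    return product(range(1, n + 1))\n",
"\n",
"print(product([1, 2, 3, 4, 5]))  # prints 120\n",
"print(factorial(5))  # prints 120"
],
"language": "python",
"metadata": {},
"outputs": []
},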
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Write a function `squares` that takes a list of numbers as its argument and returns a new list containing the square of each element in the given list.\n",
"\n",
"    print squares([1, 2, 3]) # should print [1, 4, 9]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Write a function `evens` that takes a list of numbers and returns a new list containing only the even numbers from the given list.\n",
"\n",
"    print evens([2, 3, 4, 5, 6, 7]) # should print [2, 4, 6]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### List Slicing"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = [1, 2, 3, 4]"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 34
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x[3]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 35,
"text": [
"4"
]
}
],
"prompt_number": 35
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x[len(x)-1]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 36,
"text": [
"4"
]
}
],
"prompt_number": 36
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x[-1]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 37,
"text": [
"4"
]
}
],
"prompt_number": 37
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x[0]+x[-1]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 38,
"text": [
"5"
]
}
],
"prompt_number": 38
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = range(10)\n",
"x[2:7]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 40,
"text": [
"[2, 3, 4, 5, 6]"
]
}
],
"prompt_number": 40
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x[:7]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 41,
"text": [
"[0, 1, 2, 3, 4, 5, 6]"
]
}
],
"prompt_number": 41
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x[2:]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 42,
"text": [
"[2, 3, 4, 5, 6, 7, 8, 9]"
]
}
],
"prompt_number": 42
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x[:]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 43,
"text": [
"[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]"
]
}
],
"prompt_number": 43
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x[2:7:2]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 44,
"text": [
"[2, 4, 6]"
]
}
],
"prompt_number": 44
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x[2::2]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 45,
"text": [
"[2, 4, 6, 8]"
]
}
],
"prompt_number": 45
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x[::2]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 46,
"text": [
"[0, 2, 4, 6, 8]"
]
}
],
"prompt_number": 46
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x[::-1]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 47,
"text": [
"[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]"
]
}
],
"prompt_number": 47
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x[2:4][::-1]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 48,
"text": [
"[3, 2]"
]
}
],
"prompt_number": 48
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x[3:1:-1]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 49,
"text": [
"[3, 2]"
]
}
],
"prompt_number": 49
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 50,
"text": [
"[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]"
]
}
],
"prompt_number": 50
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sorting Lists"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = [2, 6, 3, 8, 4]\n",
"x.sort()\n",
"print x"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"[2, 3, 4, 6, 8]\n"
]
}
],
"prompt_number": 52
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = [2, 6, 3, 8, 4]\n",
"print sorted(x)\n",
"print x"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"[2, 3, 4, 6, 8]\n",
"[2, 6, 3, 8, 4]\n"
]
}
],
"prompt_number": 53
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"names = [\"Perl\", \"Python\", \"JAVA\", \"c\", \"c++\", \"go\"]\n",
"sorted(names)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 54,
"text": [
"['JAVA', 'Perl', 'Python', 'c', 'c++', 'go']"
]
}
],
"prompt_number": 54
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def upper(s):\n",
"    return s.upper()\n",
"\n",
"sorted(names, key=upper)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 57,
"text": [
"['c', 'c++', 'go', 'JAVA', 'Perl', 'Python']"
]
}
],
"prompt_number": 57
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"students = [\n",
"    [\"Alice\", 40],\n",
"    [\"Bob\", 64],\n",
"    [\"Charlie\", 58],\n",
"    [\"Dave\", 43]]\n",
"\n",
"def get_marks(record):\n",
"    return record[1]\n",
"\n",
"print sorted(students, key=get_marks)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"[['Alice', 40], ['Dave', 43], ['Charlie', 58], ['Bob', 64]]\n"
]
}
],
"prompt_number": 63
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Write a function `lensort` to sort the given list of strings by the length of each string. The function should return a new list without modifying the given list.\n",
"\n",
"    print lensort([\"Perl\", \"Python\", \"JAVA\", \"c\", \"c++\", \"go\", \"Haskell\"])\n",
"    # should print [\"c\", \"go\", \"c++\", \"Perl\", \"JAVA\", \"Python\", \"Haskell\"]"
]
},
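{
"cell_type": "markdown",
"metadata": {},
"source": [
"One possible sketch for the problem above: `sorted` returns a new list and accepts a `key` function, so the built-in `len` can be passed directly as the key. Python's sort is stable, so strings of equal length keep their original relative order. `print` is called with parentheses so that the cell runs on both Python 2 and Python 3."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def lensort(names):\n",
"    # sorted builds a new list, so the argument is left unmodified\n",
"    return sorted(names, key=len)\n",
"\n",
"names = [\"Perl\", \"Python\", \"JAVA\", \"c\", \"c++\", \"go\", \"Haskell\"]\n",
"print(lensort(names))\n",
"print(names)  # the original order is unchanged"
],
"language": "python",
"metadata": {},
"outputs": []
},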
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Tuples"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a = (1, 2, 3)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 64
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 65,
"text": [
"(1, 2, 3)"
]
}
],
"prompt_number": 65
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"len(a)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 66,
"text": [
"3"
]
}
],
"prompt_number": 66
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a[0]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 67,
"text": [
"1"
]
}
],
"prompt_number": 67
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a[0] = 1"
],
"language": "python",
"metadata": {},
"outputs": [
{
"ename": "TypeError",
"evalue": "'tuple' object does not support item assignment",
"output_type": "pyerr",
"traceback": [
"---------------------------------------------------------------------------\nTypeError Traceback (most recent call last)",
"<ipython-input-68-71b8ae05f2fb> in <module>()\n----> 1 a[0] = 1\n",
"TypeError: 'tuple' object does not support item assignment"
]
}
],
"prompt_number": 68
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a = 1, 2"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 69
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 70,
"text": [
"(1, 2)"
]
}
],
"prompt_number": 70
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"1,2"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 71,
"text": [
"(1, 2)"
]
}
],
"prompt_number": 71
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 72,
"text": [
"(1, 2)"
]
}
],
"prompt_number": 72
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x, y = a"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 73
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 74,
"text": [
"1"
]
}
],
"prompt_number": 74
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"y"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 75,
"text": [
"2"
]
}
],
"prompt_number": 75
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"1 + (2*3)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 76,
"text": [
"7"
]
}
],
"prompt_number": 76
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"1 + (6)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 77,
"text": [
"7"
]
}
],
"prompt_number": 77
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"(6)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 78,
"text": [
"6"
]
}
],
"prompt_number": 78
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"(6,)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 79,
"text": [
"(6,)"
]
}
],
"prompt_number": 79
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Strings"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"len(\"abrakadabra\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 80,
"text": [
"11"
]
}
],
"prompt_number": 80
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Slicing works even for strings."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = \"hello\"\n",
"x[:-1]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 81,
"text": [
"'hell'"
]
}
],
"prompt_number": 81
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x[::-1]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 82,
"text": [
"'olleh'"
]
}
],
"prompt_number": 82
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"'hell' in 'hello'"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 83,
"text": [
"True"
]
}
],
"prompt_number": 83
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x[-1]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 84,
"text": [
"'o'"
]
}
],
"prompt_number": 84
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's look at splitting strings."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = \"when in doubt, use brute force\""
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 85
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x.split()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 86,
"text": [
"['when', 'in', 'doubt,', 'use', 'brute', 'force']"
]
}
],
"prompt_number": 86
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x.split(\",\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 87,
"text": [
"['when in doubt', ' use brute force']"
]
}
],
"prompt_number": 87
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Write a function `count_words` that takes a string and returns the number of words in that string.\n",
"\n",
"    print count_words(\"hello world\") # should print 2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Write a function `getext` to get the extension from a filename.\n",
"\n",
"    print getext(\"hello.py\") # should print py\n",
"    print getext(\"a.tar.gz\") # should print gz"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Write a function `extsort` to sort a given list of filenames by extension.\n",
"\n",
"    >>> extsort(['x.c', 'a.py', 'b.py', 'bar.txt', 'foo.txt', 'a.c'])\n",
"    ['a.c', 'x.c', 'a.py', 'b.py', 'bar.txt', 'foo.txt']"
]
},
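{
"cell_type": "markdown",
"metadata": {},
"source": [
"A hedged sketch for the two problems above, combining `split` with `sorted(..., key=...)`. It assumes every filename contains at least one `.`; for a name without a dot, this `getext` returns the whole name, which may not be what you want. The helper `ext_and_name` is a hypothetical name introduced here: it returns a tuple so that files with the same extension are ordered by name, which is what the expected output in the problem statement shows (tuples compare element by element)."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def getext(filename):\n",
"    # the extension is whatever follows the last dot\n",
"    return filename.split(\".\")[-1]\n",
"\n",
"def extsort(filenames):\n",
"    # sort by extension first, then by the full name to break ties\n",
"    def ext_and_name(filename):\n",
"        return (getext(filename), filename)\n",
"    return sorted(filenames, key=ext_and_name)\n",
"\n",
"print(getext(\"a.tar.gz\"))  # prints gz\n",
"print(extsort([\"x.c\", \"a.py\", \"b.py\", \"bar.txt\", \"foo.txt\", \"a.c\"]))"
],
"language": "python",
"metadata": {},
"outputs": []
},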
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Joining Strings"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"\"hello\" + \"world\""
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 1,
"text": [
"'helloworld'"
]
}
],
"prompt_number": 1
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"\"hello\" * 3"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 2,
"text": [
"'hellohellohello'"
]
}
],
"prompt_number": 2
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print \"-\" * 20"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"--------------------\n"
]
}
],
"prompt_number": 3
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = \"a,b,c,d,e,f\"\n",
"words = x.split(\",\")"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 4
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"words"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 5,
"text": [
"['a', 'b', 'c', 'd', 'e', 'f']"
]
}
],
"prompt_number": 5
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"\",\".join(words)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 6,
"text": [
"'a,b,c,d,e,f'"
]
}
],
"prompt_number": 6
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"\"-\".join(words)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 7,
"text": [
"'a-b-c-d-e-f'"
]
}
],
"prompt_number": 7
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Write a function `pathjoin` that takes a list of names and constructs a path by joining them with `/`.\n",
"\n",
"    >>> pathjoin(['a', 'b', 'c'])\n",
"    'a/b/c'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Search and Replace in a string"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"\"hell\" in \"hello\""
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 8,
"text": [
"True"
]
}
],
"prompt_number": 8
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"\"tell\" in \"hello\""
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 9,
"text": [
"False"
]
}
],
"prompt_number": 9
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"\"tell\" not in \"hello\""
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 10,
"text": [
"True"
]
}
],
"prompt_number": 10
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = \"mathematics\""
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 11
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x.replace(\"mat\", \"rat\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 12,
"text": [
"'ratheratics'"
]
}
],
"prompt_number": 12
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x.count(\"mat\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 13,
"text": [
"2"
]
}
],
"prompt_number": 13
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Write a function `make_slug` that takes a title and creates a slug to be used in a URL by replacing all spaces with `-` characters.\n",
"\n",
"    >>> make_slug(\"hello world\")\n",
"    'hello-world'"
]
},
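{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to sketch this is with the `replace` method shown above. Lowercasing the title or stripping punctuation is deliberately left out, since the problem only asks for spaces to be replaced. `print` is called with parentheses so that the cell runs on both Python 2 and Python 3."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def make_slug(title):\n",
"    # replace every space with a hyphen\n",
"    return title.replace(\" \", \"-\")\n",
"\n",
"print(make_slug(\"hello world\"))  # prints hello-world"
],
"language": "python",
"metadata": {},
"outputs": []
},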
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Stripping trailing and leading characters"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"' hello \\n'.strip()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 15,
"text": [
"'hello'"
]
}
],
"prompt_number": 15
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"' hello \\n'.strip(\"\\n\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 16,
"text": [
"' hello '"
]
}
],
"prompt_number": 16
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"'hello'.strip(\"o\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 17,
"text": [
"'hell'"
]
}
],
"prompt_number": 17
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Working with Files"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%%file numbers.txt\n",
"one 1\n",
"two 2\n",
"three 3\n",
"four 4\n",
"five 5\n"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Overwriting numbers.txt\n"
]
}
],
"prompt_number": 22
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"open(\"numbers.txt\").read()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 21,
"text": [
"'one 1\\ntwo 2\\nthree 3\\nfour 4\\nfive 5\\n'"
]
}
],
"prompt_number": 21
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Write a program `cat.py` that takes a filename as a command-line argument and prints its contents.\n",
"\n",
"    $ python cat.py numbers.txt\n",
"    <contents of numbers.txt printed here>"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"f = open(\"numbers.txt\")"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 23
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"f.read()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 24,
"text": [
"'one 1\\ntwo 2\\nthree 3\\nfour 4\\nfive 5\\n'"
]
}
],
"prompt_number": 24
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"f.read()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 25,
"text": [
"''"
]
}
],
"prompt_number": 25
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"f = open(\"numbers.txt\")\n",
"f.read()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 26,
"text": [
"'one 1\\ntwo 2\\nthree 3\\nfour 4\\nfive 5\\n'"
]
}
],
"prompt_number": 26
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"f.seek(0)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 27
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"f.read()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 28,
"text": [
"'one 1\\ntwo 2\\nthree 3\\nfour 4\\nfive 5\\n'"
]
}
],
"prompt_number": 28
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"f.tell()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 29,
"text": [
"34"
]
}
],
"prompt_number": 29
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"open(\"numbers.txt\").readlines()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 30,
"text": [
"['one 1\\n', 'two 2\\n', 'three 3\\n', 'four 4\\n', 'five 5\\n']"
]
}
],
"prompt_number": 30
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"f = open(\"numbers.txt\")"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 31
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"f.readline()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 32,
"text": [
"'one 1\\n'"
]
}
],
"prompt_number": 32
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"f.readline()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 33,
"text": [
"'two 2\\n'"
]
}
],
"prompt_number": 33
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"f.readline()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 34,
"text": [
"'three 3\\n'"
]
}
],
"prompt_number": 34
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"f.readline()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 35,
"text": [
"'four 4\\n'"
]
}
],
"prompt_number": 35
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"f.readline()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 36,
"text": [
"'five 5\\n'"
]
}
],
"prompt_number": 36
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"f.readline()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 37,
"text": [
"''"
]
}
],
"prompt_number": 37
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Example: Word Count**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%%file wc.py\n",
"import sys\n",
"\n",
"def linecount(fname):\n",
"    return len(open(fname).readlines())\n",
"\n",
"def wordcount(fname):\n",
"    return len(open(fname).read().split())\n",
"\n",
"def charcount(fname):\n",
"    return len(open(fname).read())\n",
"\n",
"def main(fname):\n",
"    print linecount(fname), wordcount(fname), charcount(fname), fname\n",
"\n",
"main(sys.argv[1])"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Overwriting wc.py\n"
]
}
],
"prompt_number": 44
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"!python wc.py numbers.txt"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"5 10 34 numbers.txt\r\n"
]
}
],
"prompt_number": 45
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Make the above `wc.py` program accept more than one file as command-line arguments and print counts for each file.\n",
"\n",
"    $ python wc.py numbers.txt\n",
"    5 10 34 numbers.txt\n",
"\n",
"    $ python wc.py numbers.txt a.txt\n",
"    5 10 34 numbers.txt\n",
"    1 2 3 a.txt\n",
"\n",
"    $ python wc.py numbers.txt a.txt b.txt\n",
"    5 10 34 numbers.txt\n",
"    1 2 3 a.txt\n",
"    1 2 3 b.txt"
]
},
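{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of one approach to the problem above: keep the three counting functions from `wc.py` and let `main` loop over `sys.argv[1:]`, printing one line of counts per file. The `%` string formatting and the `if __name__ == \"__main__\"` guard are choices made for this sketch (they are not in the original `wc.py`); the formatted `print` call runs on both Python 2 and Python 3. Save the code as `wc.py` to try it from the command line."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import sys\n",
"\n",
"def linecount(fname):\n",
"    return len(open(fname).readlines())\n",
"\n",
"def wordcount(fname):\n",
"    return len(open(fname).read().split())\n",
"\n",
"def charcount(fname):\n",
"    return len(open(fname).read())\n",
"\n",
"def main(fnames):\n",
"    # print one line of counts for every file named on the command line\n",
"    for fname in fnames:\n",
"        print(\"%d %d %d %s\" % (linecount(fname), wordcount(fname), charcount(fname), fname))\n",
"\n",
"if __name__ == \"__main__\":\n",
"    main(sys.argv[1:])"
],
"language": "python",
"metadata": {},
"outputs": []
},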
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Write a program `reverse.py` to print the lines of a file in reverse order. The first line should come at the end and the last line should come at the beginning.\n",
"\n",
"    $ cat numbers.txt\n",
"    one 1\n",
"    ...\n",
"    five 5\n",
"\n",
"    $ python reverse.py numbers.txt\n",
"    five 5\n",
"    ...\n",
"    one 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Write a program `reverse_words.py` to print the words in each line in reverse order.\n",
"\n",
"    $ cat numbers.txt\n",
"    one 1\n",
"    two 2\n",
"    ...\n",
"\n",
"    $ python reverse_words.py numbers.txt\n",
"    1 one\n",
"    2 two\n",
"    ..."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Write a program `head.py` that takes a filename as a command-line argument and prints the first 10 lines of that file.\n",
"\n",
"    $ python head.py a.txt\n",
"    <first 10 lines of a.txt printed here>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Writing to files"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"f = open(\"a.txt\", \"w\")\n",
"f.write(\"one\\n\")\n",
"f.write(\"two\\n\")\n",
"f.write(\"three\\n\")\n",
"f.write(\"four\\n\")\n",
"f.close()"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 79
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"!cat a.txt"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"one\r\n",
"two\r\n",
"three\r\n",
"four\r\n"
]
}
],
"prompt_number": 80
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Write a program `cp.py` that takes two file names as command-line arguments and copies the contents of the first file into the second.\n",
"\n",
"WARNING: Don't name the file `copy.py`, as it would shadow the standard library module of the same name.\n",
"\n",
"    $ python cp.py a.txt b.txt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can open a file in append mode by specifying mode `'a'`."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"f = open(\"a.txt\", \"a\")\n",
"f.write(\"five\")\n",
"f.close()"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 81
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"!cat a.txt"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"one\r\n",
"two\r\n",
"three\r\n",
"four\r\n",
"five"
]
}
],
"prompt_number": 82
},
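{
"cell_type": "markdown",
"metadata": {},
"source": [
"Each example above calls `f.close()` explicitly. As an alternative sketch, a `with` block (Python 2.6 and later) closes the file automatically when the block ends, even if an error occurs inside it. The file name `with_demo.txt` is a hypothetical name, chosen so that `a.txt` is left untouched."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Write two lines; the file is closed when the with block ends.\n",
"with open(\"with_demo.txt\", \"w\") as f:\n",
"    f.write(\"one\\n\")\n",
"    f.write(\"two\\n\")\n",
"\n",
"# Reopen in append mode to add a line at the end.\n",
"with open(\"with_demo.txt\", \"a\") as f:\n",
"    f.write(\"three\\n\")\n",
"\n",
"# Read everything back to check the result.\n",
"with open(\"with_demo.txt\") as f:\n",
"    contents = f.read()\n",
"contents"
],
"language": "python",
"metadata": {},
"outputs": []
},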
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# List Comprehensions"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def square(v):\n",
"    return v*v\n",
"\n",
"def squares(values):\n",
"    result = []\n",
"    for v in values:\n",
"        result.append(square(v))\n",
"    return result"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 46
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"squares(range(10))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 48,
"text": [
"[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]"
]
}
],
"prompt_number": 48
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"[i*i for i in range(10)]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 49,
"text": [
"[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]"
]
}
],
"prompt_number": 49
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"[square(i) for i in range(10)]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 50,
"text": [
"[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]"
]
}
],
"prompt_number": 50
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"[square(i) for i in range(10) if i % 2 == 0]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 51,
"text": [
"[0, 4, 16, 36, 64]"
]
}
],
"prompt_number": 51
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def squares2(values):\n",
"    return [square(v) for v in values]"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 52
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"squares2(range(10))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 53,
"text": [
"[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]"
]
}
],
"prompt_number": 53
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Example: Parsing CSV files**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%%file a.csv\n",
"1,1,1\n",
"2,4,8\n",
"3,8,27\n",
"4,16,64\n",
"5,25,125"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Overwriting a.csv\n"
]
}
],
"prompt_number": 57
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%%file parse_csv.py\n",
"import sys\n",
"def parse_csv_using_for(filename):\n",
"    rows = []\n",
"    for line in open(filename):\n",
"        row = line.strip(\"\\n\").split(\",\")\n",
"        rows.append(row)\n",
"    return rows\n",
"\n",
"def parse_csv(filename):\n",
"    return [line.strip(\"\\n\").split(\",\") for line in open(filename)]\n",
"\n",
"print parse_csv(sys.argv[1])"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Overwriting parse_csv.py\n"
]
}
],
"prompt_number": 73
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"!python parse_csv.py a.csv"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"[['1', '1', '1'], ['2', '4', '8'], ['3', '8', '27'], ['4', '16', '64'], ['5', '25', '125']]\r\n"
]
}
],
"prompt_number": 71
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Improve the above implementation of `parse_csv` to ignore comments. A line is considered a comment if it starts with a `#` character."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"n = 25\n",
"[(x, y, z) for x in range(1, n) \n",
" for y in range(x, n) \n",
" for z in range(y, n) \n",
" if x*x+y*y == z*z]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 78,
"text": [
"[(3, 4, 5), (5, 12, 13), (6, 8, 10), (8, 15, 17), (9, 12, 15), (12, 16, 20)]"
]
}
],
"prompt_number": 78
},
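{
"cell_type": "markdown",
"metadata": {},
"source": [
"A comprehension with several `for` clauses behaves like nested loops, with the leftmost `for` as the outermost loop. As a sketch, the triplet search above can be written with explicit loops; the name `triplets` is used only for illustration."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"n = 25\n",
"triplets = []\n",
"for x in range(1, n):\n",
"    for y in range(x, n):\n",
"        for z in range(y, n):\n",
"            # same condition as the if clause of the comprehension\n",
"            if x*x + y*y == z*z:\n",
"                triplets.append((x, y, z))\n",
"triplets"
],
"language": "python",
"metadata": {},
"outputs": []
},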
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Sets"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = set([1, 2, 3, 2])\n",
"x"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 86,
"text": [
"set([1, 2, 3])"
]
}
],
"prompt_number": 86
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x.add(4)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 87
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 88,
"text": [
"set([1, 2, 3, 4])"
]
}
],
"prompt_number": 88
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x.add(3)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 89
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 90,
"text": [
"set([1, 2, 3, 4])"
]
}
],
"prompt_number": 90
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Testing whether an element is present in a set is very fast. Doing the same with a list requires going over the elements of the list and comparing against each one."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import time\n",
"\n",
"n = 10000000\n",
"x = range(n)\n",
"t0 = time.time()\n",
"'a' in x\n",
"t1 = time.time()\n",
"print 'search in list took', t1-t0\n",
"\n",
"y = set(x)\n",
"t0 = time.time()\n",
"'a' in y\n",
"t1 = time.time()\n",
"print 'search in set took', t1-t0"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"search in list took 0.593917131424\n",
"search in set took"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
" 0.000133037567139\n"
]
}
],
"prompt_number": 102
},
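{
"cell_type": "markdown",
"metadata": {},
"source": [
"A single `time.time()` measurement like the one above is noisy. As a rough sketch, the standard library's `timeit` module can repeat each membership test many times. The names `items` and `lookup` and the sizes used here are illustrative assumptions, and the code is written to run on both Python 2 and Python 3."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import timeit\n",
"\n",
"# Build a list and a set holding the same 100000 integers (sizes are arbitrary).\n",
"setup = \"items = list(range(100000)); lookup = set(items)\"\n",
"\n",
"# 'a' is in neither container, so the list scan has to check every element.\n",
"list_time = timeit.timeit(\"'a' in items\", setup=setup, number=100)\n",
"set_time = timeit.timeit(\"'a' in lookup\", setup=setup, number=100)\n",
"\n",
"print(list_time > set_time)"
],
"language": "python",
"metadata": {},
"outputs": []
},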
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Python 2.7 introduced a literal syntax for sets: list the elements inside curly braces."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"{1, 2, 3}"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 104,
"text": [
"set([1, 2, 3])"
]
}
],
"prompt_number": 104
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And you have set comprehensions!"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"names = ['x.c', 'a.py', 'b.py', 'bar.txt', 'foo.txt', 'a.c']\n",
"exts = {name.split(\".\")[-1] for name in names}\n",
"print exts"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"set(['txt', 'c', 'py'])\n"
]
}
],
"prompt_number": 105
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The older way of doing that is:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"exts = set([name.split(\".\")[-1] for name in names])\n",
"print exts"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"set(['txt', 'c', 'py'])\n"
]
}
],
"prompt_number": 106
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = {1, 2, 3}"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 107
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"for a in x:\n",
"    print a"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1\n",
"2\n",
"3\n"
]
}
],
"prompt_number": 108
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x.add(4)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 109
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x.add(0)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 110
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 111,
"text": [
"set([0, 1, 2, 3, 4])"
]
}
],
"prompt_number": 111
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x.add(123456)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 112
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 113,
"text": [
"set([0, 1, 2, 3, 4, 123456])"
]
}
],
"prompt_number": 113
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x.add(\"foo\")"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 114
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 115,
"text": [
"set([0, 1, 2, 3, 4, 123456, 'foo'])"
]
}
],
"prompt_number": 115
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x.add(-1)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 116
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 117,
"text": [
"set([0, 1, 2, 3, 4, 123456, 'foo', -1])"
]
}
],
"prompt_number": 117
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"list(x)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 118,
"text": [
"[0, 1, 2, 3, 4, 123456, 'foo', -1]"
]
}
],
"prompt_number": 118
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"[a*a for a in {1, 2, 3}]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 120,
"text": [
"[1, 4, 9]"
]
}
],
"prompt_number": 120
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Write a function `count_unique_words` to count the number of unique words in a given sentence.\n",
"\n",
"    >>> count_unique_words(\"a b c d a b e\")\n",
"    5"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Dictionaries"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = {\"type\": \"event\", \"title\": \"Python Training\"}\n",
"print x['type']\n",
"print x['title']"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"event\n",
"Python Training\n"
]
}
],
"prompt_number": 121
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"marks = {\"person1\": 10, \"person2\": 2, \"person3\": 44}"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 122
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print marks['person1']"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"10\n"
]
}
],
"prompt_number": 123
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"d = {\"x\": 1, \"y\": 2, \"z\": 3}"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 124
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"d['x'] = 2\n",
"d"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 126,
"text": [
"{'x': 2, 'y': 2, 'z': 3}"
]
}
],
"prompt_number": 126
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"d['w'] = 9\n",
"d"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 128,
"text": [
"{'w': 9, 'x': 2, 'y': 2, 'z': 3}"
]
}
],
"prompt_number": 128
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"len(d)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 129,
"text": [
"4"
]
}
],
"prompt_number": 129
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"'a' in d"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 130,
"text": [
"False"
]
}
],
"prompt_number": 130
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"'w' in d"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 131,
"text": [
"True"
]
}
],
"prompt_number": 131
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"d.keys()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 132,
"text": [
"['y', 'x', 'z', 'w']"
]
}
],
"prompt_number": 132
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"d.values()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 134,
"text": [
"[2, 2, 3, 9]"
]
}
],
"prompt_number": 134
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"d.items()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 135,
"text": [
"[('y', 2), ('x', 2), ('z', 3), ('w', 9)]"
]
}
],
"prompt_number": 135
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"for k in d.keys():\n",
"    print k"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"y\n",
"x\n",
"z\n",
"w\n"
]
}
],
"prompt_number": 136
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"for v in d.values():\n",
"    print v"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"2\n",
"2\n",
"3\n",
"9\n"
]
}
],
"prompt_number": 137
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"for k, v in d.items():\n",
"    print k, v"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"y 2\n",
"x 2\n",
"z 3\n",
"w 9\n"
]
}
],
"prompt_number": 138
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"d2 = {1: 3, 2: 4}\n",
"d3 = {1: \"a\", 3: \"b\"}"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 141
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Write a function `count_unique_values` to compute the number of unique values present in a dictionary.\n",
"\n",
"    >>> count_unique_values({\"x\": 1, \"y\": 2, \"z\": 1})\n",
"    2"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"d = {'x': 1, 'y': 2, 'z': 3}"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 142
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"d.get('x', 0)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 143,
"text": [
"1"
]
}
],
"prompt_number": 143
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"d.get('xx', 0)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 144,
"text": [
"0"
]
}
],
"prompt_number": 144
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"d"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 146,
"text": [
"{'x': 1, 'y': 2, 'z': 3}"
]
}
],
"prompt_number": 146
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"d.setdefault('x', 0)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 147,
"text": [
"1"
]
}
],
"prompt_number": 147
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"d.setdefault('xx', 0)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 148,
"text": [
"0"
]
}
],
"prompt_number": 148
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"d"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 149,
"text": [
"{'x': 1, 'xx': 0, 'y': 2, 'z': 3}"
]
}
],
"prompt_number": 149
},
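{
"cell_type": "markdown",
"metadata": {},
"source": [
"As the outputs above show, `get` only returns the default, while `setdefault` also stores it in the dictionary when the key is missing. One common use, sketched here with made-up data, is grouping values into lists. The names `words` and `groups` are illustrative."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"words = [\"apple\", \"avocado\", \"banana\", \"blueberry\", \"cherry\"]\n",
"\n",
"# Group the words by their first letter.\n",
"groups = {}\n",
"for w in words:\n",
"    # setdefault inserts an empty list the first time a letter is seen,\n",
"    # and returns the stored list either way.\n",
"    groups.setdefault(w[0], []).append(w)\n",
"groups"
],
"language": "python",
"metadata": {},
"outputs": []
},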
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Example: Word Frequency**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's write a program to compute the frequency of words in a given file."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%%file wordfreq.py\n",
"\n",
"def readwords(filename):\n",
"    return open(filename).read().split()\n",
"\n",
"def wordfreq(words):\n",
"    freq = {}\n",
"    for w in words:\n",
"        freq[w] = freq.get(w, 0) + 1\n",
"    return freq\n",
"\n",
"def printfreq(freq):\n",
"    for w, count in freq.items():\n",
"        print count, w\n",
"\n",
"def main(filename):\n",
"    words = readwords(filename)\n",
"    freq = wordfreq(words)\n",
"    printfreq(freq)\n",
"\n",
"import sys\n",
"main(sys.argv[1])"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Overwriting wordfreq.py\n"
]
}
],
"prompt_number": 157
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"!python wordfreq.py wc.py"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1 charcount(fname),\r\n",
"3 return\r\n",
"1 fname\r\n",
"1 wordcount(fname),\r\n",
"1 linecount(fname):\r\n",
"1 main(fname):\r\n",
"1 sys\r\n",
"1 print\r\n",
"1 charcount(fname):\r\n",
"1 len(open(fname).readlines())\r\n",
"1 import\r\n",
"1 wordcount(fname):\r\n",
"1 main(sys.argv[1])\r\n",
"1 linecount(fname),\r\n",
"1 len(open(fname).read())\r\n",
"4 def\r\n",
"1 len(open(fname).read().split())\r\n"
]
}
],
"prompt_number": 156
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## More about functions"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"min(1, 2)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 1,
"text": [
"1"
]
}
],
"prompt_number": 1
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"min([8, 3, 5, 2])"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 2,
"text": [
"2"
]
}
],
"prompt_number": 2
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"min(['a', 'b', 'C'])"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 3,
"text": [
"'C'"
]
}
],
"prompt_number": 3
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def upper(s): return s.upper()\n",
"min(['a', 'b', 'C'], key=upper)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 4,
"text": [
"'a'"
]
}
],
"prompt_number": 4
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"min(['a', 'b', 'C'], key=lambda s: s.upper())\n"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 7,
"text": [
"'a'"
]
}
],
"prompt_number": 7
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"f = lambda s: s.upper()\n",
"print f('foo')"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"FOO\n"
]
}
],
"prompt_number": 8
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"min(['a', 'b', 'C'], key=f)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 9,
"text": [
"'a'"
]
}
],
"prompt_number": 9
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"functions = {'add': lambda a, b: a+b, 'mul': lambda a, b: a*b}"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 10
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"functions['add'](1, 2)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 11,
"text": [
"3"
]
}
],
"prompt_number": 11
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Default arguments"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def incr(x, amount=1):\n",
"    return x+amount\n",
"\n",
"print incr(5)\n",
"print incr(5, 4)\n",
"print incr(5, amount=3)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"6\n",
"9\n",
"8\n"
]
}
],
"prompt_number": 12
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def f(x, y=1, z=0):\n",
"    return x*y+z\n",
"\n",
"print f(10)\n",
"print f(x=10, z=3)\n",
"print f(10, z=3)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"10\n",
"13\n",
"13\n"
]
}
],
"prompt_number": 23
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Implement a function `my_min` that takes a list of values as its first argument and computes their minimum. If the optional second argument `key` is specified, it should call that function on each value and use the results for comparison.\n",
"\n",
"    >>> my_min([4, 8, 3, 8])\n",
"    3\n",
"    >>> my_min([\"a\", \"b\", \"C\"])\n",
"    'C'\n",
"    >>> my_min([\"a\", \"b\", \"C\"], key=lambda s: s.upper())\n",
"    'a'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's see an implementation of `my_min` without the `key` argument."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def my_min(values):\n",
"    m = values[0]\n",
"    for v in values:\n",
"        if v < m:\n",
"            m = v\n",
"    return m"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 24
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def my_min(values, key=None):\n",
" if key is None:\n",
" key = lambda x: x\n",
" # what change do we need to make here?\n",
" m = values[0]\n",
" for v in values:\n",
" if key(v) < key(m):\n",
" m = v\n",
" return m "
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 30
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"my_min([\"a\", \"b\", \"C\"], key=lambda s: s.upper())"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 31,
"text": [
"'a'"
]
}
],
"prompt_number": 31
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"my_min([\"a\", \"b\", \"C\"])"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 29,
"text": [
"'C'"
]
}
],
"prompt_number": 29
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
}
],
"metadata": {}
}
]
}