@anandology
Created September 24, 2013 13:46
Python Training Notes - June 2012
{
"metadata": {
"name": "4-modules"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Modules"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import time\n",
"time.asctime()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 1,
"text": [
"'Sun Jun 23 14:29:36 2013'"
]
}
],
"prompt_number": 1
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"time"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 2,
"text": [
"<module 'time' from '/Users/anand/pyenvs/sandbox27/lib/python2.7/lib-dynload/time.so'>"
]
}
],
"prompt_number": 2
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x = time"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 3
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x.asctime()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 4,
"text": [
"'Sun Jun 23 14:30:08 2013'"
]
}
],
"prompt_number": 4
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Example: num module**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%%file num.py\n",
"\n",
"def square(x):\n",
"    return x*x\n",
"\n",
"def cube(x):\n",
"    return x*x*x"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Writing num.py\n"
]
}
],
"prompt_number": 6
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import num"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 7
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"num"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 8,
"text": [
"<module 'num' from 'num.py'>"
]
}
],
"prompt_number": 8
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"num.square(3)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 9,
"text": [
"9"
]
}
],
"prompt_number": 9
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"dir(num)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 10,
"text": [
"['__builtins__',\n",
" '__doc__',\n",
" '__file__',\n",
" '__name__',\n",
" '__package__',\n",
" 'cube',\n",
" 'square']"
]
}
],
"prompt_number": 10
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%%file num2.py\n",
"\"\"\"Module to demonstrate user-defined modules and docstrings.\n",
"\"\"\"\n",
"def square(x):\n",
"    \"\"\"Computes square of a number.\n",
"    \"\"\"\n",
"    return x*x\n",
"\n",
"def cube(x):\n",
"    \"\"\"Computes cube of a number.\n",
"    \n",
"    Examples:\n",
"\n",
"    >>> cube(2)\n",
"    8\n",
"    >>> cube(3)\n",
"    27\n",
"    \"\"\"\n",
"    return x*x*x\n",
"\n",
"if __name__ == \"__main__\":\n",
"    print square(2)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Overwriting num2.py\n"
]
}
],
"prompt_number": 22
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"help(\"num2\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Help on module num2:\n",
"\n",
"NAME\n",
"    num2 - Module to demonstrate user-defined modules and docstrings.\n",
"\n",
"FILE\n",
"    /Users/anand/Dropbox/Trainings/2013/python-june2013/notebook/num2.py\n",
"\n",
"FUNCTIONS\n",
"    cube(x)\n",
"        Computes cube of a number.\n",
"        \n",
"        Examples:\n",
"        \n",
"        >>> cube(2)\n",
"        8\n",
"        >>> cube(3)\n",
"        27\n",
"    \n",
"    square(x)\n",
"        Computes square of a number.\n",
"\n",
"\n"
]
}
],
"prompt_number": 18
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 17,
"text": [
"<module 'num2' from 'num2.py'>"
]
}
],
"prompt_number": 17
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Standard Library Modules"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**os module**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import os\n",
"print os.getcwd()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"/Users/anand/Dropbox/Trainings/2013/python-june2013/notebook\n"
]
}
],
"prompt_number": 24
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Write a program `pwd.py` that prints the current working directory."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"os.listdir(\"..\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 26,
"text": [
"['5835376', 'notebook', 'notebook.tgz']"
]
}
],
"prompt_number": 26
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Write a program `lspy.py` that prints all `.py` files in the current directory. If a directory is specified as an argument, it should print the `.py` files in that directory instead of the current directory."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"os.stat(\"a.txt\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 27,
"text": [
"posix.stat_result(st_mode=33188, st_ino=53498985, st_dev=234881026L, st_nlink=1, st_uid=501, st_gid=20, st_size=23, st_atime=1371979045, st_mtime=1371892575, st_ctime=1371892575)"
]
}
],
"prompt_number": 27
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# length of a file\n",
"os.stat(\"a.txt\").st_size"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 28,
"text": [
"23"
]
}
],
"prompt_number": 28
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Write a program `ls_by_size.py` that prints all the files in the given directory, ordered by file size."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"f = os.popen(\"pwd\")\n",
"print f.read()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"/Users/anand/work/trainings/2013/python-june2013/notebook\n",
"\n"
]
}
],
"prompt_number": 32
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"f = os.popen(\"pwd\")\n",
"print f.read().strip().split(\"/\")[-1]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"notebook\n"
]
}
],
"prompt_number": 33
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# using the unix sort to sort a.txt and reading \n",
"# the first line from its output\n",
"f = os.popen(\"sort a.txt\")\n",
"print f.readline()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"five\n",
"\n"
]
}
],
"prompt_number": 36
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**urllib module**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import urllib\n",
"response = urllib.urlopen(\"http://python.org/\")"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 44
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"html = response.read()"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 42
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"html[:200]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 43,
"text": [
"'<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\\n\\n\\n<html xmlns=\"http://www.w3.org/1999/xhtml\" xml:lang=\"en\" lang=\"en\">\\n\\n<head>\\n'"
]
}
],
"prompt_number": 43
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print response.headers"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Date: Sun, 23 Jun 2013 09:48:44 GMT\r\n",
"Server: Apache/2.2.16 (Debian)\r\n",
"Last-Modified: Sun, 23 Jun 2013 07:06:54 GMT\r\n",
"ETag: \"105800d-5245-4dfccf0da3f80\"\r\n",
"Accept-Ranges: bytes\r\n",
"Content-Length: 21061\r\n",
"Vary: Accept-Encoding\r\n",
"Connection: close\r\n",
"Content-Type: text/html\r\n",
"\n"
]
}
],
"prompt_number": 45
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Problem:** Write a program `wget.py` that takes a URL as a command-line argument, downloads it, and saves it to a file. The filename should be the base name of the URL. If the URL ends with a `/`, the page should be saved to `index.html`.\n",
"\n",
"    $ python wget.py http://en.wikipedia.org/wiki/Python\n",
"    saved to file Python\n",
"    $ python wget.py http://python.org/\n",
"    saved to file index.html\n",
"    $ python wget.py http://python.org/images/python-logo.gif\n",
"    saved to file python-logo.gif"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**re module**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import re"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 47
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"How do we validate phone numbers?\n",
"\n",
"Valid examples:\n",
"\n",
"* 1234567890\n",
"* 123-456-7890\n",
"* 123 456 7890\n",
"\n",
"Invalid examples:\n",
"\n",
"* 1234acd"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def is_valid_phone(number):\n",
"    m = re.match(\"^[0-9 -]*$\", number)\n",
"    return m is not None\n",
"\n",
"print is_valid_phone(\"1234567890\")\n",
"print is_valid_phone(\"123-456-7890\")\n",
"print is_valid_phone(\"123 456 7890\")\n",
"print is_valid_phone(\"12345dfkafgkd\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"True\n",
"True\n",
"True\n",
"False\n"
]
}
],
"prompt_number": 59
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def is_valid_email(email):\n",
"    m = re.match(r\"[a-zA-Z0-9]+@[a-zA-Z0-9]+\\.[a-zA-Z0-9]+$\", email)\n",
"    return m is not None\n",
"\n",
"print is_valid_email(\"[email protected]\")\n",
"print is_valid_email(\"[email protected]\")\n",
"print is_valid_email(\"[email protected]@some$foo\")\n",
"print is_valid_email(\"[email protected]\")\n",
"print is_valid_email(\"name-example.com\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"True\n",
"False\n",
"False\n",
"False\n",
"False\n"
]
}
],
"prompt_number": 68
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%%file antihtml.py\n",
"import re\n",
"def antihtml(html):\n",
"    return re.sub(\"<[^<>]+>\", \"\", html)\n",
"\n",
"print antihtml('begin <b>hello <i>world</i></b> end')"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Overwriting antihtml.py\n"
]
}
],
"prompt_number": 81
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"!python antihtml.py"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"begin hello world end\r\n"
]
}
],
"prompt_number": 82
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Third-party Modules"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**tablib module**"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import tablib\n",
"data = tablib.Dataset()\n",
"data.append([\"a\", 1])\n",
"data.append([\"b\", 2])\n",
"\n",
"with open(\"a.xls\", \"wb\") as f:\n",
"    f.write(data.xls)\n",
"print \"wrote a.xls\""
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"wrote a.xls\n"
]
}
],
"prompt_number": 7
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**beautifulsoup**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Install it with `pip install beautifulsoup4`."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from bs4 import BeautifulSoup\n",
"import urllib\n",
"\n",
"html = urllib.urlopen(\"http://news.ycombinator.com/\").read()\n",
"soup = BeautifulSoup(html)\n",
"\n",
"links = soup.select(\"td.title a\")\n",
"for a in links[:5]:\n",
"    print a.get_text()\n",
"    print a['href']\n",
"    print"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Snowden Leaves Hong Kong on Commercial Flight to Moscow\n",
"http://www.scmp.com/news/hong-kong/article/1267261/snowden-leaves-hong-kong-commercial-flight-moscow\n",
"\n",
"HKSAR Government issues statement on Edward Snowden\n",
"http://www.info.gov.hk/gia/general/201306/23/P201306230476.htm\n",
"\n",
"Snowden's destination is Venezuela through Havana\n",
"http://www.interfax.com/news.asp\n",
"\n",
"Fmr NSA worker Edward Snowden leaves Hong Kong for Cuba\n",
"http://www.rte.ie/news/2013/0623/458289-edward-snowden/\n",
"\n",
"This is a regular text file. People will read it. Maybe.\n",
"http://slopjong.de/words.txt\n",
"\n"
]
}
],
"prompt_number": 18
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 17
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
}
],
"metadata": {}
}
]
}
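
The `wget.py` problem in the notebook above states a naming rule: save to the base name of the URL, or to `index.html` when the URL ends with a `/`. A minimal sketch of just that rule follows. It assumes plain string splitting is enough, and `filename_from_url` is a hypothetical helper name that does not appear in the notebook. The download step is left out, and a URL with no path at all (such as `http://python.org`) is not handled specially.

```python
def filename_from_url(url):
    """Return the file name that wget.py would save `url` to."""
    # Everything after the last "/" is the base name of the URL.
    name = url.rsplit("/", 1)[-1]
    # A URL ending in "/" has an empty base name, so fall back to index.html.
    return name or "index.html"


# The three example URLs come from the problem statement.
for url in ["http://en.wikipedia.org/wiki/Python",
            "http://python.org/",
            "http://python.org/images/python-logo.gif"]:
    print("saved to file " + filename_from_url(url))
```

The single-argument `print(...)` form is used so that the sketch runs unchanged on the Python 2.7 interpreter used in the notebook as well as on Python 3.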