@gawbul
Last active January 4, 2016 05:39
Introduction to loops for biologists
{
"metadata": {
"name": ""
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Introduction to loops for biologists\n",
"\n",
"![Introduction to loops](http://www.kelloggs.ie/content/dam/workarea/assetpushqueue/images/web-raw-approved/eng%20IE/38/56/prod_img-253856.jpg/jcr:content/renditions/cq5dam.thumbnail.319.319.png \"Introduction to loops\")\n",
"\n",
"## Background\n",
"\n",
"* **This introduction assumes a very basic knowledge of programming, such as variable usage, assignment, and list indexing.**\n",
"* We can use programming languages to solve problems that require a defined, logical approach.\n",
"* Programming languages have different structures such as semantics, grammar and syntax, just as spoken languages do.\n",
"* In the same way spoken languages share common lexical terms and grammar, so do different programming languages.\n",
"* One type of structure shared between programming languages is the control of the flow of a program.\n",
"\n",
"## Control flow\n",
"\n",
"* Generally the instructions (code) of a program are executed in a stepwise manner from the beginning to the end.\n",
"* It is possible to add statements that change the flow of the program.\n",
"* Loops are one such way of controlling the flow of a program.\n",
"* Loops, as the name suggests, allow one to repeat a certain portion of code - generally based on certain conditions.\n",
"* There are a number of different ways of implementing loops, most of which are shared across programming languages.\n",
"\n",
"## Loop types\n",
"\n",
"* Some languages, such as Perl, have a larger repertoire of loop types, which is beyond the scope of this introduction.\n",
"* I will use Python here, as it has a very simple syntax for beginners, and only implements **for** loops and **while** loops."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### for loop"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# for loop\n",
"for i in range(1, 11):\n",
"    print(i)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1\n",
"2\n",
"3\n",
"4\n",
"5\n",
"6\n",
"7\n",
"8\n",
"9\n",
"10\n"
]
}
],
"prompt_number": 30
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### while loop"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# while loop\n",
"i = 1\n",
"while i <= 10:\n",
"    print(i)\n",
"    i += 1"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"1\n",
"2\n",
"3\n",
"4\n",
"5\n",
"6\n",
"7\n",
"8\n",
"9\n",
"10\n"
]
}
],
"prompt_number": 31
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### for loops for biologists\n",
"\n",
"* The for loop allows you to iterate over a list of items.\n",
"* This is useful, for example, for iterating over all the GenBank IDs in a list to fetch the FASTA-formatted sequence corresponding to each ID."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# import modules required\n",
"from Bio import Entrez, SeqIO\n",
"\n",
"# set email so Entrez can tell us off if we send too many queries\n",
"Entrez.email = \"[email protected]\"\n",
"\n",
"# set a list of genbank ids\n",
"genbank_ids = [\"119395733\", \"568974803\", \"110626132\", \"347446670\", \"442628803\"]\n",
"\n",
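"# iterate over GenBank ids (the loop variable is not called id, to avoid hiding the built-in id function)\n",
"for genbank_id in genbank_ids:\n",
"    print(\"Fetching data for id %s\" % genbank_id)\n",
"    print(\"\")\n",
"    # fetch the record from the nucleotide database as FASTA text\n",
"    handle = Entrez.efetch(db=\"nucleotide\", id=genbank_id, rettype=\"fasta\", retmode=\"text\")\n",
"    record = SeqIO.read(handle, \"fasta\")\n",
"    handle.close()\n",
"    print(record)\n",
"    print(\"\")"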
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"Fetching data for id 119395733\n",
"\n",
"ID: gi|119395733|ref|NM_000059.3|\n",
"Name: gi|119395733|ref|NM_000059.3|\n",
"Description: gi|119395733|ref|NM_000059.3| Homo sapiens breast cancer 2, early onset (BRCA2), mRNA\n",
"Number of features: 0\n",
"Seq('GTGGCGCGAGCTTCTGAAACTAGGCGGCAGAGGCGGAGCCGCTGTGGCACTGCT...ATT', SingleLetterAlphabet())"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"Fetching data for id 568974803\n",
"\n",
"ID: gi|568974803|ref|XM_006533728.1|\n",
"Name: gi|568974803|ref|XM_006533728.1|\n",
"Description: gi|568974803|ref|XM_006533728.1| PREDICTED: Mus musculus contactin associated protein-like 1 (Cntnap1), transcript variant X1, mRNA\n",
"Number of features: 0\n",
"Seq('TCATCGTACCCGGAGTAAAGTCCCCAGGAGACCTGGTGGCATCAAAATGAGAAG...CCC', SingleLetterAlphabet())"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"Fetching data for id 110626132\n",
"\n",
"ID: gi|110626132|ref|NM_001030280.1|\n",
"Name: gi|110626132|ref|NM_001030280.1|\n",
"Description: gi|110626132|ref|NM_001030280.1| Danio rerio solute carrier family 24, member 5 (slc24a5), mRNA\n",
"Number of features: 0\n",
"Seq('GTAAGCCGCGGCGGTGTGTGTGTGTGTGTGTGTTCTCCGTCATCTGTGTTCTGC...CCT', SingleLetterAlphabet())"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"Fetching data for id 347446670\n",
"\n",
"ID: gi|347446670|ref|NM_001244612.1|\n",
"Name: gi|347446670|ref|NM_001244612.1|\n",
"Description: gi|347446670|ref|NM_001244612.1| Bos taurus insulin-like growth factor 1 receptor (IGF1R), mRNA\n",
"Number of features: 0\n",
"Seq('GAGAAAGGGGAATTTGGTCCCAAATAAAAGGAATGAAGTCTAGCTCCGGAGGAG...AAC', SingleLetterAlphabet())"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n",
"Fetching data for id 442628803\n",
"\n",
"ID: gi|442628803|ref|NM_165390.2|\n",
"Name: gi|442628803|ref|NM_165390.2|\n",
"Description: gi|442628803|ref|NM_165390.2| Drosophila melanogaster Cullin-2 (Cul-2), transcript variant A, mRNA\n",
"Number of features: 0\n",
"Seq('CGATAGATTATATCGATATCGTCTTCGTCTGACAAACCATTTCACGATACCAAA...ATT', SingleLetterAlphabet())"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"\n"
]
}
],
"prompt_number": 32
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### while loops for biologists\n",
"\n",
"* The while loop allows you to repeat a task while a certain condition is true.\n",
"* This can be useful when you need to run something a given number of times, or until the condition stops being true."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# set a list of species names\n",
"spp_names = [\"homo_sapiens\", \"mus_musculus\", \"bos_taurus\", \"danio_rerio\", \"drosophila_melanogaster\"]\n",
"\n",
"# search through the list until the name is found or the list ends\n",
"found = False\n",
"list_index = 0\n",
"while not found and list_index < len(spp_names):\n",
"    name = spp_names[list_index]\n",
"    if name == \"danio_rerio\":\n",
"        print(\"Found %s at list position %d.\" % (name, list_index + 1))\n",
"        found = True\n",
"    else:\n",
"        print(name)\n",
"    list_index += 1"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"homo_sapiens\n",
"mus_musculus\n",
"bos_taurus\n",
"Found danio_rerio at list position 4.\n"
]
}
],
"prompt_number": 33
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Summary\n",
"\n",
"* Loops allow you to alter the flow of a program.\n",
"* These can loop (iterate) over a list of items or repeat a task based on certain conditions.\n",
"* Different types of loops evaluate their conditions at different times: a while loop checks its condition before every pass, whereas a for loop takes the next item until none remain.\n",
"* It is wise to be certain exactly how the loop is executing your code in order to minimise the need for debugging."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Further information\n",
"\n",
"* [Python for Biologists](http://pythonforbiologists.com)\n",
"* [A Primer on Python for Life Science Researchers](http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.0030199)\n",
"* [Rosalind: Python Village](http://rosalind.info/problems/list-view/?location=python-village)"
]
}
],
"metadata": {}
}
]
}
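
A minimal sketch, not part of the original notebook, of the species-name search from the "while loops for biologists" cell, written once with a while loop and once with a for loop. The function names `find_with_while` and `find_with_for` are hypothetical, and the code assumes Python 3, so `print` is called as a function.

```python
# Hypothetical helpers for illustration; they are not part of the original notebook.

def find_with_while(names, target):
    """Return the 1-based position of target in names, or None if it is absent."""
    index = 0
    # The condition is checked before every pass, so the loop stops at the end of the list.
    while index < len(names):
        if names[index] == target:
            return index + 1
        index += 1
    return None


def find_with_for(names, target):
    """Same search, letting the for loop step through the items."""
    for position, name in enumerate(names, start=1):
        if name == target:
            return position
    return None


spp_names = ["homo_sapiens", "mus_musculus", "bos_taurus", "danio_rerio", "drosophila_melanogaster"]

print(find_with_while(spp_names, "danio_rerio"))
print(find_with_for(spp_names, "danio_rerio"))
print(find_with_while(spp_names, "gallus_gallus"))
```

Both functions report position 4 for `danio_rerio`, the same position the notebook prints, and return `None` for a name that is not in the list instead of reading past the end of it.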