@MikeTrizna
Created July 13, 2016 20:16
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We'll start out with a short DNA sequence as our example string."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"dna = 'ACTAGCTACGCTCGATACGCATCG'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's check out the type of this variable, just to be sure."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'str'>\n"
]
}
],
"source": [
"print(type(dna))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's try running another **function** on the variable. Counting is something computers are good at, so we'll use len() to count the characters in the string."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"24\n"
]
}
],
"source": [
"print(len(dna))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's look at some of the ways we can change this string.\n",
"First, we'll convert it to lowercase using the lower() **method**. Notice how calling a **method** differs from calling the **functions** we've used previously: the method name comes after the variable, joined by a dot."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"actagctacgctcgatacgcatcg\n"
]
}
],
"source": [
"print(dna.lower())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It's also important to point out that applying that method did not change the \"dna\" variable. The lower() method returns a new string and leaves the original untouched."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"ACTAGCTACGCTCGATACGCATCG\n"
]
}
],
"source": [
"print(dna)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To actually change what a string variable holds, you need to assign the result back to the same variable name."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"actagctacgctcgatacgcatcg\n"
]
}
],
"source": [
"dna = dna.lower()\n",
"print(dna)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Notice that the lower() method was called with empty parentheses. We'll now use the replace() method to show how to pass **parameters** that tell the method what to do. To demonstrate this, we'll tell Python to replace each thymine base ('t') with uracil ('u'), converting the DNA sequence to an RNA sequence."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"acuagcuacgcucgauacgcaucg\n"
]
}
],
"source": [
"rna = dna.replace('t', 'u')\n",
"print(rna)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Another very useful action that Python can perform is called **slicing**. Say we realized that we only wanted to use the first 10 bases of the DNA sequence for some analysis. We use **brackets []** to do this, and then we tell Python where to start and where to stop. It's important to note that **counting in Python starts with 0**, and that the slice stops just before the end position, so [0:10] returns positions 0 through 9."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"actagctacg\n"
]
}
],
"source": [
"print(dna[0:10])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In fact, if the slice starts at the beginning of the string, we don't need to write out the 0."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"actagctacg\n"
]
}
],
"source": [
"print(dna[:10])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, what if a colleague tells us about a 5-bp stretch that is missing from our sequence? We can add it to the end of our sequence using the **+ operator**. Remember from last class that \"+\" is used to add numbers? It can join strings too. This is an example of **operator overloading**, where an operator can be programmed to do different operations based on the type of object it's working with."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"actagctacgctcgatacgcatcgagtca\n",
"29\n"
]
}
],
"source": [
"missing_stretch = 'agtca'\n",
"dna = dna + missing_stretch\n",
"print(dna)\n",
"print(len(dna))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Another example of operator overloading in strings is the **\\* operator**. The \\* will \"multiply\" a string however many times we tell it. Here we repeat the new sequence twice."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"actagctacgctcgatacgcatcgagtcaactagctacgctcgatacgcatcgagtca\n",
"58\n"
]
}
],
"source": [
"repeat_dna = dna * 2\n",
"print(repeat_dna)\n",
"print(len(repeat_dna))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.1"
}
},
"nbformat": 4,
"nbformat_minor": 0
}