Created
May 20, 2016 10:02
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Transitions and Transversions\n",
    "\n",
    "For two DNA strings ($s_1$ and $s_2$) of the same length, their transition/transversion ratio $R(s_1, s_2)$ is the ratio of the total number of transitions to the total number of transversions, where symbol substitutions are inferred from mismatched corresponding symbols, as when calculating Hamming distance.\n",
    "\n",
    "\n",
    "<img src=\"http://rosalind.info/media/problems/tran/transitions-transversions.png\" width=\"350\">\n",
    "\n",
    "\n",
    "**Given:** Two DNA strings $s_1$ and $s_2$ of equal length (at most 1 kbp)  \n",
    "**Return:** The transition/transversion ratio $R(s_1,s_2)$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "from __future__ import division\n",
    "from Bio import SeqIO\n",
    "\n",
    "# Purine <-> purine and pyrimidine <-> pyrimidine substitutions\n",
    "TRANSITIONS = {(\"A\", \"G\"), (\"G\", \"A\"), (\"C\", \"T\"), (\"T\", \"C\")}\n",
    "\n",
    "def zip_strings(path):\n",
    "    # Pair up corresponding bases of the first two FASTA records\n",
    "    fasta_list = list(SeqIO.parse(path, \"fasta\"))\n",
    "    s1 = fasta_list[0].seq\n",
    "    s2 = fasta_list[1].seq\n",
    "    return zip(s1, s2)\n",
    "\n",
    "def is_change(t):\n",
    "    # 1 = transition, 0 = transversion, None = bases match\n",
    "    if t[0] == t[1]:\n",
    "        return None\n",
    "    return 1 if (t[0], t[1]) in TRANSITIONS else 0\n",
    "\n",
    "def is_transx(x):\n",
    "    # Drop matching positions, then divide transitions by transversions\n",
    "    pre_out = map(is_change, x)\n",
    "    out = [i for i in pre_out if i is not None]\n",
    "    return sum(out) / (len(out) - sum(out))\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2.580246913580247"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "path = \"/home/scott/Dropbox/rosalind/rosalind_tran.txt\"\n",
    "is_transx(zip_strings(path))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "--------------------\n",
    "\n",
    "# More details\n",
    "\n",
    "Elements 1 to 9 of the list returned by `zip_strings` (the `[1:10]` slice skips the first pair, at index 0):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('G', 'G'),\n",
       " ('G', 'C'),\n",
       " ('A', 'A'),\n",
       " ('G', 'G'),\n",
       " ('G', 'A'),\n",
       " ('G', 'A'),\n",
       " ('A', 'A'),\n",
       " ('C', 'C'),\n",
       " ('C', 'C')]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "zip_strings(path)[1:10]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then loop through the nucleotide pairs and assign `1` if the change is a transition, `0` if it's a transversion, or `None` if the two bases match:\n",
    "\n",
    "```python\n",
    "def is_change(pair):\n",
    "    transitions = {(\"A\", \"G\"), (\"G\", \"A\"), (\"C\", \"T\"), (\"T\", \"C\")}\n",
    "    if pair[0] == pair[1]:   # identical bases: no substitution\n",
    "        return None\n",
    "    if pair in transitions:  # A<->G or C<->T\n",
    "        return 1\n",
    "    return 0                 # anything else is a transversion\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[None, 0, None, None, 1, 1, None, None, None]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "map(is_change, zip_strings(path)[1:10])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then drop the `None` entries and calculate the ratio of `1`s to `0`s in `out`:\n",
    "\n",
    "```python\n",
    "sum(out) / (len(out) - sum(out))\n",
    "```"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
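
The calculation in the notebook above can also be sketched without Biopython, which makes it easy to check by hand. This is only an illustrative sketch, not the notebook's own code: the names `classify` and `ts_tv_ratio` are hypothetical, the inputs are assumed to be plain uppercase strings over A, C, G and T, and the toy sequences are built from the nine base pairs shown in the notebook's "More details" output. The `__future__` import is kept so that `/` is true division under Python 2 as well.

```python
from __future__ import division

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(a, b):
    """Return 'transition', 'transversion', or None when the bases match."""
    if a == b:
        return None
    # A transition stays within one chemical class (purine or pyrimidine).
    same_class = ({a, b} <= PURINES) or ({a, b} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"


def ts_tv_ratio(s1, s2):
    """Transition/transversion ratio of two equal-length DNA strings."""
    if len(s1) != len(s2):
        raise ValueError("sequences must have equal length")
    kinds = [classify(a, b) for a, b in zip(s1, s2)]
    transitions = kinds.count("transition")
    transversions = kinds.count("transversion")
    if transversions == 0:
        # The ratio is undefined when there are no transversions.
        raise ValueError("no transversions found")
    return transitions / transversions


# Nine pairs: two G->A transitions and one G->C transversion.
print(ts_tv_ratio("GGAGGGACC", "GCAGAAACC"))
```

Classifying by purine/pyrimidine membership avoids listing every ordered pair, and raising an error on unequal lengths or zero transversions makes those edge cases explicit instead of silently truncating or dividing by zero.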