@radaniba
radaniba / readxml.py
Created November 29, 2012 17:08
For those working with large XML files, here is a simple Python script that may be useful.
from xml.etree import ElementTree as ET
import re
def strip_whitespace(my_string):
    r"""Removes spaces, tabs, and newline characters from a string.
    \s matches any whitespace character; for ASCII text this is equivalent to the class [ \t\n\r\f\v]."""
    return re.sub(r"\s", "", my_string)
my_xml = """
<root>
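The preview above cuts off before the XML is parsed, so the following is only a minimal sketch of how a helper like strip_whitespace might be applied to parsed element text. The sample document, the `item` tag, the `name` attribute, and the `element_texts` helper are all hypothetical and are not taken from the gist.

```python
import re
from xml.etree import ElementTree as ET


def strip_whitespace(my_string):
    r"""Remove every whitespace character (\s) from a string."""
    return re.sub(r"\s", "", my_string)


# Hypothetical sample document; the gist's own XML is not visible in the preview.
SAMPLE_XML = """
<root>
    <item name="a">  1 2
    3 </item>
    <item name="b"> x y </item>
</root>
"""


def element_texts(xml_text):
    """Map each <item>'s name attribute to its whitespace-stripped text."""
    root = ET.fromstring(xml_text)
    return {
        item.get("name"): strip_whitespace(item.text or "")
        for item in root.iter("item")
    }


cleaned = element_texts(SAMPLE_XML)
print(cleaned)
```

Note that this removes whitespace inside values as well as around them; if only the surrounding whitespace should go, `str.strip()` is the safer choice.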
@radaniba
radaniba / csv-to-mysql.py
Created November 29, 2012 17:08
Python CSV to MySQL
#!/usr/bin/env python
# Run with no args for usage instructions
#
# Notes:
# - will probably insert duplicate records if you load the same file twice
# - assumes that the number of fields in the header row is the same
# as the number of columns in the rest of the file and in the database
# - assumes the column order is the same in the file and in the database
#
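The notes above describe a loader that trusts the header row to match the table's columns. As a rough sketch of that idea, and not the gist's actual code, the example below uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server. The `load_csv` function, the `genes` table, and the sample data are assumptions made for illustration.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_file):
    """Insert every data row of csv_file into table and return the row count.

    Assumes, like the notes above, that the header has one field per table
    column and that the column order matches the table.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    placeholders = ", ".join("?" for _ in header)
    # A table name cannot be a bound parameter, so it must come from trusted code.
    sql = f"INSERT INTO {table} VALUES ({placeholders})"
    count = 0
    for line_no, row in enumerate(reader, start=2):
        if len(row) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} fields, got {len(row)}"
            )
        conn.execute(sql, row)
        count += 1
    conn.commit()
    return count


# Hypothetical table and data, used only to exercise the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE genes (id INTEGER, symbol TEXT)")
data = io.StringIO("id,symbol\n1,BRCA1\n2,TP53\n")
inserted = load_csv(conn, "genes", data)
rows = conn.execute("SELECT id, symbol FROM genes ORDER BY id").fetchall()
print(inserted, rows)
```

As the first note warns, nothing here prevents duplicates: calling `load_csv` twice with the same file inserts the rows twice unless the table carries a unique constraint.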
@radaniba
radaniba / searchmysql.pl
Created November 29, 2012 17:07
A quick and simple way to search a MySQL database.
if (!function_exists('mysql_search')) {
function mysql_search($table, $columns, $query = '', $options = Array()) {
if (empty($query)) { return Array(); }
$sql_query = Array();
$options['columns'] = isset($options['columns'])?$options['columns']:'*';
$options['method'] = isset($options['method'])?$options['method']:'OR';
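Only the option defaults are visible in this preview, so the sketch below is a guess at the general shape of such a search rather than a port of the PHP function. It assumes that `method` joins one LIKE clause per column and that the search term is passed as a bound parameter. The name `build_search_query` is hypothetical, and the `%s` placeholders follow the style used by common Python MySQL drivers.

```python
def build_search_query(table, columns, query="", options=None):
    """Return (sql, params) for a LIKE search, or (None, []) for an empty query.

    Table and column names are interpolated directly, so they must come from
    trusted code; only the search term is passed as a bound parameter.
    """
    if not query:
        return None, []
    options = options or {}
    select = options.get("columns", "*")   # same default as the PHP preview
    method = options.get("method", "OR")   # same default as the PHP preview
    if method not in ("AND", "OR"):
        raise ValueError(f"unsupported method: {method!r}")
    clauses = [f"{column} LIKE %s" for column in columns]
    sql = f"SELECT {select} FROM {table} WHERE " + f" {method} ".join(clauses)
    params = [f"%{query}%"] * len(columns)
    return sql, params


sql, params = build_search_query("genes", ["symbol", "name"], "brca")
print(sql)
print(params)
```

Keeping the search term out of the SQL string avoids the injection risk that comes with concatenating user input into a query.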
@radaniba
radaniba / backup.pl
Created November 29, 2012 17:06
Creates a backup of a MySQL database in SQL format.
if (!function_exists('mysql_dump')) {
function mysql_dump($database) {
$query = '';
$tables = @mysql_list_tables($database);
while ($row = @mysql_fetch_row($tables)) { $table_list[] = $row[0]; }
for ($i = 0; $i < @count($table_list); $i++) {
@radaniba
radaniba / getnucseq.py
Created November 29, 2012 17:06
Obtaining the actual sequence of a gene from NCBI is simple in a browser, but less so in batch. This script obtains the nucleotide sequences, mRNAs, CDSs and protein sequences associated with a list of gene IDs.
from Bio.SeqRecord import SeqRecord
from Bio.Seq import Seq
from Bio import Entrez
from Bio import SeqIO
import time
# Obtain your gene IDs from somewhere (file, paste text directly, etc.)
ids = []
ids = ",".join( lines )
@radaniba
radaniba / convert.rb
Created November 29, 2012 17:05
Converts all of the BioRuby capable sequence formats to a Tipdate file. A space is left for the tree to be included.
#!/usr/bin/ruby
# Read in a fasta file and spit it out as tipdate
### IMPORTS
require 'test/unit/assertions'
require 'pp'
require 'csv'
require 'bio'
include Test::Unit::Assertions
@radaniba
radaniba / seqlogo.rb
Created November 29, 2012 17:04
A simple script to draw sequence logos from residue frequency data. It doesn't do everything, but can at least serve as a basis for improvements. Usage is 'drawlogo.rb [options] FILE1 [FILE2 ...]'. Options can be listed with the '-h' option, but it in
#!/usr/bin/env ruby
# Draw a sequence logo in SVG.
### IMPORTS
require 'test/unit/assertions'
require 'optparse'
require 'pp'
require 'csv'
require 'ostruct'
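The preview does not show how the script turns residue frequencies into letter heights. A common convention for sequence logos is to scale each residue's frequency by the information content of its column, and the sketch below illustrates that convention only, without any small-sample correction. The `column_heights` function and its parameters are hypothetical and may not match what the Ruby script does.

```python
import math


def column_heights(freqs, alphabet_size=4):
    """Return the letter height in bits for each residue in one logo column.

    freqs maps residue -> count or frequency. The column's information
    content is log2(alphabet_size) minus its Shannon entropy, and each
    letter's height is its relative frequency times that value.
    """
    total = sum(freqs.values())
    probs = {res: value / total for res, value in freqs.items() if value > 0}
    entropy = -sum(p * math.log2(p) for p in probs.values())
    information = math.log2(alphabet_size) - entropy
    return {res: p * information for res, p in probs.items()}


# A fully conserved DNA column carries the maximum of 2 bits.
conserved = column_heights({"A": 10})
# Two equally likely residues leave 1 bit, split evenly between them.
split = column_heights({"A": 5, "C": 5})
print(conserved, split)
```

For protein logos the same function would be called with `alphabet_size=20`.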
@radaniba
radaniba / retrieveprot.java
Created November 29, 2012 17:03
Retrieving a protein by its accession and accessing some of its data
public static void main(String[] args) {
if (args.length != 1) {
System.out.println("The program expects one parameter: \n"
+ "1. Protein accession\n");
} else {
String inputSt = args[0];
Bio4jManager manager = null;
@radaniba
radaniba / rsid-diff.pl
Created November 29, 2012 17:02
Detect HugeNET-specific RS IDs in comparison with NHGRI GWAS data
#!/usr/bin/perl
use warnings;
use strict;
my $gwas_file=$ARGV[0];
my $hugenet_file=$ARGV[1];
my $num_args = $#ARGV + 1;
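The Perl preview stops after reading its two file arguments. Finding IDs that appear in one data set but not the other is essentially a set difference, which the sketch below shows in Python. It assumes only that RS IDs look like `rs` followed by digits somewhere in each line, since the real file layouts are not shown. The functions `rs_ids` and `hugenet_specific` and the sample lines are hypothetical.

```python
import re

# Assumed ID shape: "rs" followed by digits, as a standalone token.
RS_ID = re.compile(r"\brs\d+\b")


def rs_ids(lines):
    """Collect every RS ID found in an iterable of text lines."""
    found = set()
    for line in lines:
        found.update(RS_ID.findall(line))
    return found


def hugenet_specific(gwas_lines, hugenet_lines):
    """Return the sorted RS IDs present in the HugeNET data but not in the GWAS data."""
    return sorted(rs_ids(hugenet_lines) - rs_ids(gwas_lines))


# Made-up sample lines, only to exercise the functions.
gwas = ["rs123\tT2D", "rs456\tasthma"]
hugenet = ["geneA rs456", "geneB rs789", "geneC rs789 rs111"]
only_in_hugenet = hugenet_specific(gwas, hugenet)
print(only_in_hugenet)
```

With real files, each argument could be an open file handle, since iterating over a handle yields its lines.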