David Rio drio

I am trying to get my head around the sorting example. I use this as input:
$ hadoop fs -cat /input/small*
9971681
9686036
2592322
4518219
1467363
607354
....
+ wallet, DNI
+ iphone
+ passport + documents
+ laptop, power adaptor, international adaptor
+ book -- the economist
+ papers
+ hair gel
+ deodorant
+ razor + cream ?
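require 'pp'

# Insert a '*' before and after each [first, last] index pair in poss.
# i counts the asterisks inserted so far, since each one shifts every
# later position in l one place to the right.
def mark_them(l, poss)
  i = 0
  poss.each do |p|
    l.insert(i + p[0], '*')
    i = i + 1
    l.insert(i + p[1] + 1, '*')
    i = i + 1
  end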
DATA.each do |line|
  i = 0
  # For each match of "any character followed by the same character" do:
  marked = line.gsub(/(.)\1/) do |match|
    i += 1
    # return the match surrounded by asterisks from the block, which will
    # tell gsub to replace the match with that
    "*#{ match }*"
  end
  puts "{#{ i }} #{marked}"
end
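The snippet above reads from DATA, which only exists when the script has an __END__ section. A minimal self-contained sketch of the same gsub technique, wrapped in a hypothetical helper named mark_repeats (not part of the original gist) and fed from an array of sample words, might look like this:

```ruby
# Mark every pair of adjacent identical characters with asterisks and
# count the pairs. mark_repeats is a hypothetical name for illustration.
def mark_repeats(line)
  count = 0
  marked = line.gsub(/(.)\1/) do |match|
    count += 1
    # The block's return value replaces the matched pair.
    "*#{match}*"
  end
  [count, marked]
end

%w[hello bookkeeper abc].each do |word|
  count, marked = mark_repeats(word)
  puts "{#{count}} #{marked}"
end
# prints:
# {1} he*ll*o
# {3} b*oo**kk**ee*per
# {0} abc
```

Matches do not overlap, so a run of three identical characters such as "aaa" yields one marked pair followed by a single unmarked character.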
#!/usr/bin/env ruby
#
# vim: tw=80 ts=2 sw=2
#
# merge_eg_output.rb: Merges multiple egenotype outputs
#
# Usage:
#
# 1. Concat all your eg outputs in 1 file:
#
#!/usr/bin/env ruby
in_file = ARGV[0]
sqtocs = { "AA" => 0, "AC" => 1, "AG" => 2, "AT" => 3,
           "CA" => 1, "CC" => 0, "CG" => 3, "CT" => 2,
           "GA" => 2, "GC" => 3, "GG" => 0, "GT" => 1,
           "TA" => 3, "TC" => 2, "TG" => 1, "TT" => 0 }
# seq space Old/NEW format:
# 1 775852 rs2980300 A/G A GACTTCACTAACTCANAGAGACACAGTCATT
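The table above maps each pair of adjacent bases to a color code (0 to 3). As a rough sketch of how such a table could be applied, assuming sqtocs stands for "sequence to color space", the following walks a sequence two bases at a time and looks up each pair. The helper name to_color_space and the use of "." for pairs that contain an N (as in the sample line above) are assumptions made for illustration, not something the original script shows.

```ruby
# Same mapping as the sqtocs hash above, frozen as a constant.
SQ_TO_CS = {
  "AA" => 0, "AC" => 1, "AG" => 2, "AT" => 3,
  "CA" => 1, "CC" => 0, "CG" => 3, "CT" => 2,
  "GA" => 2, "GC" => 3, "GG" => 0, "GT" => 1,
  "TA" => 3, "TC" => 2, "TG" => 1, "TT" => 0
}.freeze

# Hypothetical helper: encode a base sequence as one color per adjacent pair.
# Pairs missing from the table (for example those containing N) become ".",
# which is an assumption of this sketch.
def to_color_space(seq)
  seq.upcase.chars.each_cons(2).map do |a, b|
    color = SQ_TO_CS[a + b]
    color.nil? ? "." : color.to_s
  end.join
end

puts to_color_space("GACT")  # prints 212
puts to_color_space("ANA")   # prints ..
```

A sequence of n bases produces n - 1 colors. Some color-space formats also keep the first base as a prefix so the read can be decoded back; this sketch leaves that out.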
#!/usr/bin/env ruby
#
# Author:: David Rio Deiros (mailto:[email protected])
# Copyright:: Copyright (c) 2009 David Rio Deiros
# License:: BSD
#
# vim: tw=80 ts=2 sw=2
#
# genome_hasher.rb: Hashes a whole genome and allows you to query it for k-mer
# sequences
class EgOutput
  attr_accessor :chrm, :pos, :id, :al_cs, :al_seq, :counters

  def initialize(others, counters)
    @chrm, @pos, @id, @ref, @var = others
    @counters = counters.map{|c| c.to_i }
  end

  def +(ec)
    new_obj = self.clone
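The preview of EgOutput stops right after the clone call, so the body of the + method is not visible here. One plausible way to merge two records, sketched under the assumption that merging means summing the counters position by position, is shown below. CounterRecord is a hypothetical stand-in class, not the original EgOutput.

```ruby
# Hypothetical stand-in for a record that carries an id and integer counters.
class CounterRecord
  attr_accessor :id, :counters

  def initialize(id, counters)
    @id = id
    @counters = counters.map { |c| c.to_i }
  end

  # Return a new record whose counters are the element-wise sums.
  # clone is shallow, so the copy gets a freshly built array rather than
  # mutating the one it shares with the receiver.
  def +(other)
    merged = clone
    merged.counters = counters.zip(other.counters).map { |a, b| a + b }
    merged
  end
end

a = CounterRecord.new("rs2980300", %w[1 2 3])
b = CounterRecord.new("rs2980300", [10, 20, 30])
puts (a + b).counters.inspect  # prints [11, 22, 33]
puts a.counters.inspect        # prints [1, 2, 3]
```

Building a new array matters because a shallow copy would otherwise share its counters with the original object, and an in-place update would change both.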
2009-07-06 12:45:41 -0500: 100000
2009-07-06 12:45:45 -0500: 200000
2009-07-06 12:45:50 -0500: 300000
2009-07-06 12:45:54 -0500: 400000
2009-07-06 12:46:00 -0500: 500000
2009-07-06 12:46:05 -0500: 600000
2009-07-06 12:46:10 -0500: 700000
2009-07-06 12:46:16 -0500: 800000
2009-07-06 12:46:24 -0500: 900000
2009-07-06 12:46:31 -0500: 1000000