I am trying to get my head around the sorting example. I use this as an input:
$ hadoop fs -cat /input/small*
9971681
9686036
2592322
4518219
1467363
607354
....
+ wallet, DNI (national ID card)
+ iphone
+ passport + documents
+ laptop, power adaptor, international adaptor
+ book -- the economist
+ papers
+ hair gel
+ deodorant
+ razor + shaving cream ?
require 'pp'
def mark_them(l, poss)
  i = 0
  poss.each do |p|
    l.insert(i + p[0], '*')
    i = i + 1
    l.insert(i + p[1] + 1, '*')
    i = i + 1
  end
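
The preview above is cut off before the method ends, so here is a minimal sketch of how such a marker could be completed. It assumes `poss` holds `[first, last]` index pairs, sorted by position and not overlapping, and that the method should return the marked string; working on a copy instead of mutating the argument is a choice of this sketch, not something the original shows.

```ruby
# Sketch: wrap each [first, last] index range of a string in asterisks.
# Assumption: ranges are sorted left to right and do not overlap.
def mark_them(l, poss)
  marked = l.dup # work on a copy (a choice of this sketch)
  shift = 0      # number of asterisks inserted so far
  poss.each do |first, last|
    marked.insert(shift + first, '*')    # opening asterisk before the range
    shift += 1
    marked.insert(shift + last + 1, '*') # closing asterisk after the range
    shift += 1
  end
  marked
end

puts mark_them("aabbcc", [[0, 1], [2, 3]]) # prints *aa**bb*cc
```

The running `shift` is what keeps later positions correct: every inserted asterisk moves the remaining characters one place to the right.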
DATA.each do |line|
  i = 0
  # For each match of "any character followed by the same character" do:
  marked = line.gsub(/(.)\1/) do |match|
    i += 1
    # return the match surrounded by asterisks from the block, which will
    # tell gsub to replace the match with that
    "*#{ match }*"
  end
  puts "{#{ i }} #{marked}"
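
Since the preview stops before the loop closes and the `DATA` section is not shown, the same `gsub`-with-block idea can be sketched as a standalone method. The name `mark_doubles` and the sample input are hypothetical, introduced only for illustration.

```ruby
# Sketch: count and mark every pair of identical adjacent characters.
# Returns [number_of_pairs, marked_string].
def mark_doubles(line)
  count = 0
  marked = line.gsub(/(.)\1/) do |match|
    count += 1
    "*#{match}*" # the block's value replaces the matched pair
  end
  [count, marked]
end

count, marked = mark_doubles("aabbcd")
puts "{#{count}} #{marked}" # prints {2} *aa**bb*cd
```

Matches do not overlap, so a run of three identical characters such as `aaa` yields one marked pair followed by a single leftover character.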
#!/usr/bin/env ruby
#
# vim: tw=80 ts=2 sw=2
#
# merge_eg_output.rb: Merges multiple egenotype outputs
#
# Usage:
#
# 1. Concat all your eg outputs in 1 file:
#
#!/usr/bin/env ruby
in_file = ARGV[0]
sqtocs = { "AA" => 0, "AC" => 1, "AG" => 2, "AT" => 3,
           "CA" => 1, "CC" => 0, "CG" => 3, "CT" => 2,
           "GA" => 2, "GC" => 3, "GG" => 0, "GT" => 1,
           "TA" => 3, "TC" => 2, "TG" => 1, "TT" => 0 }
# seq space Old/NEW format:
# 1 775852 rs2980300 A/G A GACTTCACTAACTCANAGAGACACAGTCATT
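
The rest of the script is not visible, so the following is only a sketch of how a lookup table like `sqtocs` might be applied. It assumes the table maps each pair of adjacent bases to a color-space code (the "seq space" comment hints at this); the method name `to_color_space` and the use of `.` for pairs missing from the table, such as those containing `N`, are assumptions of the sketch.

```ruby
# Same dinucleotide table as above, frozen as a constant.
SQ_TO_CS = { "AA" => 0, "AC" => 1, "AG" => 2, "AT" => 3,
             "CA" => 1, "CC" => 0, "CG" => 3, "CT" => 2,
             "GA" => 2, "GC" => 3, "GG" => 0, "GT" => 1,
             "TA" => 3, "TC" => 2, "TG" => 1, "TT" => 0 }.freeze

# Sketch: translate each pair of adjacent bases into its code.
# Assumption: pairs not in the table (e.g. containing N) become '.'.
def to_color_space(seq)
  seq.upcase.each_char.each_cons(2).map { |a, b| SQ_TO_CS.fetch(a + b, '.') }.join
end

puts to_color_space("GACTTCAC") # prints 2120211
```

A sequence of n bases yields n - 1 codes, because each code describes the transition between two neighbouring bases.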
#!/usr/bin/env ruby
#
# Author:: David Rio Deiros (mailto:[email protected])
# Copyright:: Copyright (c) 2009 David Rio Deiros
# License:: BSD
#
# vim: tw=80 ts=2 sw=2
#
# genome_hasher.rb: Hashes a whole genome and allows you to query it for k-mers
# sequence
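
Only the header of `genome_hasher.rb` is shown, so the script's actual approach is unknown. As a rough illustration of what hashing a genome for k-mer queries can look like, the sketch below indexes every k-mer by its 0-based start positions. The method name `hash_kmers` and the tiny example sequence are hypothetical.

```ruby
# Sketch: build a hash from each k-mer to the list of positions where it starts.
# Assumption: the genome fits in memory as one string.
def hash_kmers(genome, k)
  index = {}
  (0..genome.length - k).each do |i|
    kmer = genome[i, k]
    (index[kmer] ||= []) << i
  end
  index
end

index = hash_kmers("ACGTACGA", 3)
p index.fetch("ACG", []) # prints [0, 4]
p index.fetch("TTT", []) # prints []
```

Querying with `fetch` and an empty-array default keeps lookups of absent k-mers from raising or returning `nil`. For a real genome the memory cost of storing every position would need attention, which this sketch does not address.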
class EgOutput
  attr_accessor :chrm, :pos, :id, :al_cs, :al_seq, :counters
  def initialize(others, counters)
    @chrm, @pos, @id, @ref, @var = others
    @counters = counters.map{|c| c.to_i }
  end
  def +(ec)
    new_obj = self.clone
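
The body of `+` is cut off in the preview, so its real behaviour is not known. One plausible reading, given the integer `counters`, is that adding two records sums their counters position by position; the sketch below shows that reading only. The sample field values are hypothetical, and both records are assumed to carry the same number of counters.

```ruby
class EgOutput
  attr_reader :chrm, :pos, :id, :counters

  def initialize(others, counters)
    @chrm, @pos, @id, @ref, @var = others
    @counters = counters.map(&:to_i)
  end

  # Assumption of this sketch: "+" sums counters element by element
  # and keeps the identifying fields of the left-hand record.
  def +(other)
    summed = counters.zip(other.counters).map { |a, b| a + b }
    EgOutput.new([@chrm, @pos, @id, @ref, @var], summed)
  end
end

a = EgOutput.new(%w[chr1 100 snp1 A G], %w[1 2 3])
b = EgOutput.new(%w[chr1 100 snp1 A G], %w[10 20 30])
total = a + b
p total.counters # prints [11, 22, 33]
```

Building a fresh object avoids a pitfall of `clone`: it is a shallow copy, so the clone and the original would share the same `counters` array until one of them is reassigned.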
2009-07-06 12:45:41 -0500: 100000
2009-07-06 12:45:45 -0500: 200000
2009-07-06 12:45:50 -0500: 300000
2009-07-06 12:45:54 -0500: 400000
2009-07-06 12:46:00 -0500: 500000
2009-07-06 12:46:05 -0500: 600000
2009-07-06 12:46:10 -0500: 700000
2009-07-06 12:46:16 -0500: 800000
2009-07-06 12:46:24 -0500: 900000
2009-07-06 12:46:31 -0500: 1000000
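
A log in this shape lends itself to a quick throughput estimate. The sketch below assumes each line is a timestamp followed by a running count of processed items (what the items are is not stated in the log); the method names `parse_progress` and `rate` are hypothetical.

```ruby
require 'time'

# Sketch: split "TIMESTAMP: COUNT" into a Time and an Integer.
def parse_progress(line)
  stamp, _sep, count = line.strip.rpartition(': ')
  [Time.parse(stamp), count.to_i]
end

# Average items per second between two log lines.
def rate(first_line, last_line)
  t0, c0 = parse_progress(first_line)
  t1, c1 = parse_progress(last_line)
  (c1 - c0) / (t1 - t0)
end

puts rate("2009-07-06 12:45:41 -0500: 100000",
          "2009-07-06 12:46:31 -0500: 1000000") # prints 18000.0
```

Between the first and last lines above, 900000 items were counted in 50 seconds, which is where the 18000 per second comes from.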