@philippbayer
Created April 5, 2016 06:00
This tiny script takes bedtools (or similar) output and prints the first and last 10 rows for each contig.
# Python 2 script (uses the Queue module and print statements)
import Queue
import sys

# prints first 10 and last 10 rows for each contig
# does NOT guarantee that the first and last 10 don't overlap (use uniq)
fh = open(sys.argv[1])
header = fh.readline()  # skip the header line
q = Queue.Queue(10)  # for the last elements
current_contig = ""
for line in fh:
    ll = line.split()
    contig = ll[0]
    if contig != current_contig:
        # print last queue
        while not q.empty():
            print q.get(),
        # now make new one
        current_contig = contig
        seen_counter = 1
        print line,
        continue
    seen_counter += 1
    if seen_counter <= 10:
        print line,
    try:
        q.put(line, block=False)
    except Queue.Full:
        # throw away one
        q.get()
        q.put(line, block=False)
# print last queue
while not q.empty():
    print q.get(),
fh.close()
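
For anyone running Python 3, the same idea can be sketched with `collections.deque(maxlen=n)` instead of `Queue.Queue`. This is only one possible variant and not part of the original gist: the function name `head_tail_per_contig` and the parameter `n` are made up for illustration. It assumes whitespace-separated rows with the contig name in the first column and all rows of a contig adjacent to each other. Unlike the script above, it only puts rows past the first `n` into the tail buffer, so the head and tail never overlap and `uniq` should not be needed.

```python
from collections import deque


def head_tail_per_contig(lines, n=10):
    """Yield the first n and last n rows of each contig, without duplicates.

    Hypothetical helper: assumes the contig name is the first
    whitespace-separated field and rows of one contig are adjacent.
    """
    current_contig = None
    seen = 0
    tail = deque(maxlen=n)  # drops the oldest row once n rows are held
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        contig = fields[0]
        if contig != current_contig:
            yield from tail  # flush the previous contig's tail
            tail.clear()
            current_contig = contig
            seen = 0
        seen += 1
        if seen <= n:
            yield line  # still within the first n rows
        else:
            tail.append(line)  # only rows past the head enter the tail
    yield from tail  # flush the tail of the final contig
```

To run it on a file, open the file, skip the header with `next(fh, None)` as the original script does with `readline()`, and pass the file handle to the function, for example via `sys.stdout.writelines(head_tail_per_contig(fh))`.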