@zhoujj2013
Last active December 26, 2015 18:09
Given a sequence, split it into pieces of a fixed length, with a specified overlap between consecutive pieces.
#!/usr/bin/perl -w
use strict;
#use Data::Dumper;
# Usage:
# perl ./separate_seq.pl <piece length> <overlap> <input file *.fa>
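my ($l,$o,$f) = @ARGV;
die "Usage: perl $0 <piece length> <overlap> <input file *.fa>\n" unless(@ARGV == 3);
# Parenthesized three-argument open, so that die actually runs when the open fails.
open(IN, "<", $f) || die "Cannot open $f: $!\n";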
$/ = ">"; <IN>; $/ = "\n";
while(<IN>){
# The record ID is the first whitespace-delimited token of the header line.
my ($id) = /^(\S+)/;
$/ = ">";
my $seq = <IN>;
chomp($seq);
$/ = "\n";
$seq =~ s/\n//g;
my $total_len = length($seq);
# Consecutive pieces start $l - $o bases apart, so neighbours share $o bases.
my $step = $l - $o;
die "overlap must be smaller than piece length\n" if($step <= 0);
for(my $pos = 0; $pos < $total_len; $pos += $step){
# substr shortens the last piece when it would run past the end of the sequence.
my $sub_seq = substr($seq, $pos, $l);
# Each piece is named after its 1-based start position.
my $index_num = $pos + 1;
print ">",$id."_".$index_num,"\n",$sub_seq,"\n";
# Stop once a piece reaches the end, so no fully contained tail piece is printed.
last if($pos + $l >= $total_len);
}
#print $id,"\n";
#print $seq,"\n";
}
close IN;
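
The windowing arithmetic is easier to check on a toy input. The following is a minimal sketch, not part of the script above: `split_with_overlap` is a hypothetical helper name, and it assumes that consecutive pieces should start `<piece length> - <overlap>` bases apart and that the final piece may be shorter than the others.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (an illustration only): returns a list of
# [start, piece] pairs, where start is the 1-based position of the piece.
sub split_with_overlap {
    my ($seq, $len, $overlap) = @_;
    my $step = $len - $overlap;    # distance between consecutive piece starts
    die "overlap must be smaller than piece length\n" if $step <= 0;
    my @pieces;
    for (my $pos = 0; $pos < length($seq); $pos += $step) {
        # substr shortens the last piece if it would run past the end
        push @pieces, [$pos + 1, substr($seq, $pos, $len)];
        # stop as soon as a piece reaches the end of the sequence
        last if $pos + $len >= length($seq);
    }
    return @pieces;
}

# Toy input: 10 bases, piece length 4, overlap 2
for my $p (split_with_overlap("ACGTACGTAC", 4, 2)) {
    print ">toy_$p->[0]\n$p->[1]\n";
}
```

With these toy values the pieces start at positions 1, 3, 5 and 7 (ACGT, GTAC, ACGT, GTAC), so each piece shares its first two bases with the end of the previous one.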