@lindenb
Last active August 13, 2019 19:55
https://www.biostars.org/p/394289/ : Extracting reads mapping to CDS regions but not the first 45bp after ATG gtf cds
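stream().
    flatMap(G->G.getTranscripts().stream()).
    // keep only stranded transcripts that have both a start and a stop codon
    filter(T->T.getCodonStart().isPresent() && T.getCodonStop().isPresent() && T.hasStrand()).
    forEach(T->{
        // collect every 1-based genomic position covered by a CDS of this transcript
        // (the trimming below relies on the CDS being visited in ascending genomic order)
        List<Integer> positions = new ArrayList<>();
        for(com.github.lindenb.jvarkit.util.bio.structure.Cds cds : T.getAllCds()) {
            for(int x=cds.getStart();x<=cds.getEnd();++x) positions.add(x);
        }
        // nothing would remain after removing 45 coding bases
        if(positions.size() <= 45) return;
        if(T.isPositiveStrand()) {
            // '+' strand: the first 45 coding bases are the lowest positions
            positions = positions.subList(45,positions.size());
        }
        else {
            // '-' strand: the first 45 coding bases are the highest positions
            positions = positions.subList(0,positions.size()-45);
        }
        // merge runs of consecutive positions and print one BED line per run
        int i=0;
        while(i<positions.size()) {
            int beg =i;
            while(i+1 < positions.size() && positions.get(i)+1==positions.get(i+1)) {
                i++;
            }
            // BED: 0-based start, 1-based inclusive end
            println(T.getContig()+"\t"+(positions.get(beg)-1)+"\t"+positions.get(i)+"\t"+T.getId());
            i++;
        }
    });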
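
The trimming logic above can be tried without jvarkit or a GTF file. The following is a minimal standalone sketch, not part of the original gist: the class `CdsTrimSketch`, the method `trim` and the example coordinates are hypothetical, and the CDS intervals are assumed to be closed, 1-based, non-overlapping and sorted by genomic start.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; it does not use the jvarkit API.
class CdsTrimSketch {

    /**
     * Drops the first `skip` coding bases (in transcript orientation) and
     * returns what is left as BED lines (0-based start, 1-based inclusive end).
     * Each element of `cds` is {start, end}: closed, 1-based, sorted by start.
     */
    static List<String> trim(String contig, String id, int[][] cds,
                             boolean plusStrand, int skip) {
        // expand the CDS intervals into individual genomic positions
        List<Integer> positions = new ArrayList<>();
        for (int[] c : cds) {
            for (int x = c[0]; x <= c[1]; ++x) positions.add(x);
        }
        List<String> bed = new ArrayList<>();
        if (positions.size() <= skip) return bed; // nothing left after trimming

        // '+' strand: translation starts at the lowest positions;
        // '-' strand: translation starts at the highest positions
        positions = plusStrand
                ? positions.subList(skip, positions.size())
                : positions.subList(0, positions.size() - skip);

        // merge runs of consecutive positions into intervals
        int i = 0;
        while (i < positions.size()) {
            int beg = i;
            while (i + 1 < positions.size()
                    && positions.get(i) + 1 == positions.get(i + 1)) {
                i++;
            }
            bed.add(contig + "\t" + (positions.get(beg) - 1) + "\t"
                    + positions.get(i) + "\t" + id);
            i++;
        }
        return bed;
    }

    public static void main(String[] args) {
        // made-up transcript: two CDS exons of 30 bp and 60 bp
        int[][] cds = { {101, 130}, {201, 260} };
        trim("chr1", "tx1", cds, true, 45).forEach(System.out::println);
        trim("chr1", "tx1", cds, false, 45).forEach(System.out::println);
    }
}
```

With these made-up exons, the '+' strand call removes the whole first exon and 15 bp of the second, while the '-' strand call keeps the first exon and the first 15 bp of the second. Expanding to single positions is simple rather than efficient, which is acceptable for CDS-sized inputs.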