#!/usr/bin/env python
"""
This script processes SAM (Sequence Alignment/Map format) inputs from standard
input and extracts alignment information that is then provided in a tab-separated
table. The following fields are produced: query, target, query_length, query_start,
query_end, target_start, target_end, alignment_length, alignment_identity.
This script was designed for use with SAM files produced by minimap2. However,
it will work with any SAM data that:
from pathlib import Path
from typing import Iterator, Optional, Union

import polars as pl
from needletail import parse_fastx_file
from polars.io.plugins import register_io_source


def scan_fastx(fastx_file: Union[str, Path]) -> pl.LazyFrame:
    schema = pl.Schema(