deepbiop.fa
Classes
EncoderOption | Options for configuring the FASTA sequence encoder.
ParquetEncoder | An encoder for converting FASTA records to Parquet format.
RecordData
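Functions
convert_multiple_fas_to_one_fa(paths, result_path, parallel)
encode_fa_path_to_parquet(fa_path, bases, result_path=None)
encode_fa_path_to_parquet_chunk(fa_path, chunk_size, parallel, bases)
encode_fa_paths_to_parquet(fa_path, bases)
select_record_from_fa(selected_reads, fq, output)
select_record_from_fa_by_random(fq, number, output)
write_fa(records_data, file_path=None)
write_fa_parallel(records_data, file_path, threads)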
Module Contents
- class deepbiop.fa.EncoderOption
Options for configuring the FASTA sequence encoder.
This struct provides configuration options for encoding FASTA sequences, such as which bases to consider during encoding.
# Fields
bases - A vector of valid bases (as bytes) to use for encoding. Defaults to “ATCGN”.
# Example
```
use deepbiop_fa::encode::option::EncoderOption;
```
- class deepbiop.fa.ParquetEncoder
An encoder for converting FASTA records to Parquet format.
This struct encodes FASTA sequence data into Parquet files. Parquet is an efficient columnar storage format.
# Fields
option - Configuration options for the encoder, including which bases to consider.
# Example
```
use deepbiop_fa::encode::{option::EncoderOption, parquet::ParquetEncoder};

let options = EncoderOption::default();
let encoder = ParquetEncoder::new(options);
```
- class deepbiop.fa.RecordData
- deepbiop.fa.convert_multiple_fas_to_one_fa(paths, result_path, parallel)
- Parameters:
paths (Sequence[str | os.PathLike | pathlib.Path])
result_path (str | os.PathLike | pathlib.Path)
parallel (bool)
- Return type:
None
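Only the signature is documented here. Assuming the function concatenates the records of every input file into `result_path` (an inference from its name, not confirmed above), the standard-library sketch below shows that behavior. `merge_fasta_files` is a hypothetical helper, not part of deepbiop, and it has no counterpart to the `parallel` flag.

```python
from pathlib import Path
from typing import Iterable


def merge_fasta_files(paths: Iterable[Path], result_path: Path) -> int:
    """Concatenate FASTA files into one file and return the record count.

    Hypothetical stand-in for the assumed behavior of
    convert_multiple_fas_to_one_fa; not the library's implementation.
    """
    n_records = 0
    with Path(result_path).open("w") as out:
        for path in paths:
            with Path(path).open() as src:
                for line in src:
                    line = line.rstrip("\n")
                    if not line:
                        continue  # drop blank lines between records
                    if line.startswith(">"):
                        n_records += 1  # each header starts a new record
                    # Re-add the newline so a file lacking a trailing
                    # newline cannot glue two records together.
                    out.write(line + "\n")
    return n_records
```

With deepbiop installed, the documented call would take the same paths: `deepbiop.fa.convert_multiple_fas_to_one_fa(paths, result_path, parallel)`, where `parallel` is a bool.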
- deepbiop.fa.encode_fa_path_to_parquet(fa_path, bases, result_path=None)
- Parameters:
fa_path (str | os.PathLike | pathlib.Path)
bases (str)
result_path (str | os.PathLike | pathlib.Path | None)
- Return type:
None
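A minimal usage sketch based on the signature above. It assumes the module is importable as `from deepbiop import fa` and that `bases` takes the allowed alphabet as a string such as "ATCGN" (the default named in the EncoderOption description). `write_demo_fasta` and `encode_demo` are hypothetical helpers, and the sequences are made-up data.

```python
from pathlib import Path


def write_demo_fasta(path: Path) -> Path:
    """Write a two-record FASTA file to encode (made-up sequences)."""
    path.write_text(">read1\nACGTN\n>read2\nTTGCA\n")
    return path


def encode_demo(fa_path: Path, result_path: Path) -> None:
    """Encode one FASTA file to Parquet; requires the deepbiop package."""
    # Assumption: the submodule is exposed as `deepbiop.fa`.
    from deepbiop import fa

    # Argument order follows the documented signature:
    # (fa_path, bases, result_path=None).
    fa.encode_fa_path_to_parquet(fa_path, "ATCGN", result_path)
```

Passing `result_path` explicitly avoids relying on the `None` default, whose behavior is not described here.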
- deepbiop.fa.encode_fa_path_to_parquet_chunk(fa_path, chunk_size, parallel, bases)
- Parameters:
fa_path (str | os.PathLike | pathlib.Path)
chunk_size (int)
parallel (bool)
bases (str)
- Return type:
None
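The signature does not say what `chunk_size` counts. Assuming it is the number of records handled per batch, the chunking step can be sketched as below. `chunked` is a hypothetical helper, not a deepbiop function.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most chunk_size items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    iterator = iter(items)
    while True:
        # islice pulls the next chunk_size items without loading the rest.
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk
```

Under this assumption, five records with `chunk_size=2` would be processed as three batches, the last holding a single record.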
- deepbiop.fa.encode_fa_paths_to_parquet(fa_path, bases)
- Parameters:
fa_path (Sequence[str | os.PathLike | pathlib.Path])
bases (str)
- Return type:
None
- deepbiop.fa.select_record_from_fa(selected_reads, fq, output)
- Parameters:
selected_reads (Sequence[str])
fq (str | os.PathLike | pathlib.Path)
output (str | os.PathLike | pathlib.Path)
- Return type:
None
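The parameters suggest that the function copies the records whose names appear in `selected_reads` from the input file to `output`. The standard-library sketch below illustrates that assumed behavior. `read_fasta` and `select_records` are hypothetical helpers, and matching on the first whitespace-delimited token of the header is an assumption, not something stated above.

```python
from pathlib import Path
from typing import Iterator, List, Optional, Sequence, Tuple


def read_fasta(path: Path) -> Iterator[Tuple[str, str]]:
    """Yield (header without '>', sequence) pairs from a FASTA file."""
    header: Optional[str] = None
    chunks: List[str] = []
    with Path(path).open() as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)  # sequences may span several lines
    if header is not None:
        yield header, "".join(chunks)


def select_records(selected_reads: Sequence[str], fa_path: Path, output: Path) -> int:
    """Write the selected records to output and return how many were kept."""
    wanted = set(selected_reads)
    kept = 0
    with Path(output).open("w") as out:
        for header, seq in read_fasta(fa_path):
            # Assumption: the record ID is the first token of the header.
            tokens = header.split()
            record_id = tokens[0] if tokens else ""
            if record_id in wanted:
                out.write(f">{header}\n{seq}\n")
                kept += 1
    return kept
```

Records are written in input-file order, each sequence on a single line.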
- deepbiop.fa.select_record_from_fa_by_random(fq, number, output)
- Parameters:
fq (str | os.PathLike | pathlib.Path)
number (int)
output (str | os.PathLike | pathlib.Path)
- Return type:
None
- deepbiop.fa.write_fa(records_data, file_path=None)
- Parameters:
records_data (Sequence[RecordData])
file_path (str | os.PathLike | pathlib.Path | None)
- Return type:
None
- deepbiop.fa.write_fa_parallel(records_data, file_path, threads)
- Parameters:
records_data (Sequence[RecordData])
file_path (str | os.PathLike | pathlib.Path)
threads (int)
- Return type:
None