genome_entropy.orf.finder

Wrapper around the external get_orfs binary for finding open reading frames (ORFs).

Functions

find_orfs(sequences[, table_id, ...])

Find ORFs in DNA sequences using the get_orfs binary.

reverse_complement(seq)

Return the reverse complement of a DNA sequence.

genome_entropy.orf.finder.find_orfs(sequences, table_id=11, min_nt_length=90, binary_path='get_orfs')[source]

Find ORFs in DNA sequences using the get_orfs binary.

This function wraps the external get_orfs binary (https://github.com/linsalrob/get_orfs). The binary must be installed and either available on PATH or specified explicitly via binary_path.

Parameters:
  • sequences (Dict[str, str]) – Dictionary mapping sequence IDs to DNA sequences

  • table_id (int) – NCBI genetic code table ID (default: 11, bacterial)

  • min_nt_length (int) – Minimum ORF length in nucleotides (default: 90, i.e. 30 codons)

  • binary_path (str) – Path to or command name of the get_orfs binary (default: 'get_orfs', as resolved from config/environment)

Returns:

List of OrfRecord objects

Raises:

OrfFinderError – If the get_orfs binary is not found or fails

Return type:

List[OrfRecord]
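
Because find_orfs depends on an external binary, it can help to check the environment before calling it. The following is a minimal sketch, not part of the library: binary_available, short_sequences, and run_find_orfs are hypothetical helper names, and it is an assumption that sequences shorter than min_nt_length cannot yield a reportable ORF. The call to find_orfs uses only the parameters documented above and is imported lazily, so the helpers run even where genome_entropy is not installed.

```python
import shutil
from typing import Dict, List


def binary_available(binary_path: str = "get_orfs") -> bool:
    """Return True if the given command or path resolves to an executable."""
    return shutil.which(binary_path) is not None


def short_sequences(sequences: Dict[str, str], min_nt_length: int = 90) -> List[str]:
    """Return IDs of sequences shorter than min_nt_length nucleotides.

    Assumption: such sequences cannot contain an ORF of the minimum length.
    """
    return [seq_id for seq_id, seq in sequences.items() if len(seq) < min_nt_length]


def run_find_orfs(sequences: Dict[str, str], binary_path: str = "get_orfs"):
    """Hypothetical wrapper: verify the binary exists, then call find_orfs."""
    if not binary_available(binary_path):
        raise FileNotFoundError(f"{binary_path!r} was not found on PATH")
    # Imported here so the helpers above work without the package installed.
    from genome_entropy.orf.finder import find_orfs

    return find_orfs(
        sequences,
        table_id=11,        # NCBI genetic code table, documented default
        min_nt_length=90,   # documented default
        binary_path=binary_path,
    )
```

With the package and binary installed, run_find_orfs({"contig_1": dna}) would return the same list of OrfRecord objects as calling find_orfs directly. OrfFinderError may still be raised by find_orfs itself if the binary fails.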

genome_entropy.orf.finder.reverse_complement(seq)[source]

Return the reverse complement of a DNA sequence.

Parameters:

seq (str) – DNA sequence to reverse-complement

Return type:

str
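
To illustrate what a reverse complement is, here is a minimal sketch of the operation. It is not the library's implementation: reverse_complement_sketch is a hypothetical name, and how the real function treats lowercase input or ambiguity codes such as N is not documented above. This sketch preserves case and leaves any character other than A, C, G, T unchanged.

```python
# Translation table mapping each base to its complement (case preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement_sketch(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement_sketch("ATGC"))  # prints GCAT
```

Applying the function twice returns the original sequence, which is a convenient sanity check for any reverse-complement routine.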