ComputationalGenomicsManual

Rob's manual for the computational genomics and bioinformatics class.

Kraken2

Kraken2 uses k-mers to identify the taxonomy of the microbes in your sample. In essence, the authors took all complete genomes and recorded, for each k-mer, the lowest taxonomic level shared by every genome that contains it (the lowest common ancestor). Through some nifty computing and compact data structures, they have figured out how to search this index very efficiently.
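To make the idea of a k-mer concrete, here is a toy sketch that lists every overlapping k-mer in a short sequence. The function name `kmers` is made up for this illustration, and this is not how Kraken2 itself is implemented; it only shows what the units being looked up are.

```shell
# Toy illustration only: print every overlapping k-mer of a sequence.
# "kmers" is a hypothetical helper, not part of Kraken2.
# Usage: kmers SEQUENCE K
kmers() {
    awk -v s="$1" -v k="$2" 'BEGIN {
        # Slide a window of width k along the sequence, one base at a time.
        for (i = 1; i <= length(s) - k + 1; i++)
            print substr(s, i, k)
    }'
}
```

For example, `kmers ACGTA 3` lists the three 3-mers in that sequence: ACG, CGT and GTA.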

There is a wide range of pre-built Kraken2 databases that you can download, so you do not need to go to the effort of building them yourself.

When installing Kraken2, I recommend setting the KRAKEN2_DB_PATH and KRAKEN2_DEFAULT_DB environment variables, so that you do not need to specify the database on the command line.
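As a minimal sketch, the two variables could be set in your shell startup file (for example `~/.bashrc`) like this. The directory `$HOME/kraken2_databases` and the database name `standard` are placeholders I have assumed for the example, so substitute wherever you unpacked your own database.

```shell
# Directory (or colon-separated list of directories) that Kraken2 searches for databases.
# ASSUMPTION: this path is a placeholder, use your own download location.
export KRAKEN2_DB_PATH="$HOME/kraken2_databases"

# Name of the database to use when --db is not given on the command line.
# ASSUMPTION: "standard" is a placeholder for the name of your database directory.
export KRAKEN2_DEFAULT_DB="standard"
```

With these set, the database is picked up from the environment and does not need to be named with `--db` each time.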

To run Kraken2, use this incantation:

kraken2 --paired --threads 4 --report kraken_taxonomy.txt --output kraken_output.txt \
	fastq/reads_1.fastq fastq/reads_2.fastq

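This will output two files: kraken_output.txt, which has the classification of each read, and kraken_taxonomy.txt, a report that summarises how many reads were assigned to each taxon.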
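The file written by `--report` (kraken_taxonomy.txt in the command above) is a tab-separated table, so it is easy to pick apart with standard tools. Here is a rough sketch, assuming the standard six-column report layout (percentage, reads in the clade, reads assigned directly, rank code, NCBI taxonomy ID, indented name). The function name `top_species` is my own invention for this example.

```shell
# Sketch: list the most abundant species in a Kraken2 report.
# "top_species" is a hypothetical helper, not part of Kraken2.
# ASSUMPTION: the report has the standard six tab-separated columns.
# Usage: top_species REPORT [HOW_MANY]
top_species() {
    report="$1"
    limit="${2:-10}"
    # Keep species-level rows (rank code "S" in column 4), strip the
    # indentation from the name, and print the clade read count and name.
    awk -F'\t' '$4 == "S" { name = $6; sub(/^ +/, "", name); print $2 "\t" name }' "$report" \
        | sort -k1,1nr \
        | head -n "$limit"
}
```

Running `top_species kraken_taxonomy.txt 5` would then print the five species with the most reads, one per line, with the read count first.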

For more information about Kraken2, see the wiki page.

If you are using the HPC at Flinders University, the details on this page will show you how to install and use Kraken2.