ComputationalGenomicsManual

Rob's manual for the computational genomics and bioinformatics class.

Coral and Algae Data Set

These are coral, algae, crustose coralline algae (CCA), and water (control) samples from Kevin Walsh in Liz Dinsdale's lab.

This is a random community metagenome data set.

There are 50,000 reads in these data sets.

Abstract:

Coral reefs are undergoing global microbialization as carbon and energy from higher trophic levels shift into the microbial food web. An increase in labile carbon resources has altered microbial community composition from phototrophs to copiotrophs and super-heterotrophs. Even in oligotrophic systems, super-heterotrophs persist in the rare biosphere. These rare organisms are likely to become abundant with microbialization, but their scarcity inhibits deep sequencing and metabolic analysis with metagenomics. Therefore, we utilized enrichment after pre-exposure of the water column microbiome to benthic organisms: 1) a water control, 2) the alga Stypopodium zonale, 3) crustose coralline algae, and 4) the coral Mussismilia hartti, to identify super-heterotrophs in the coral reef environment. We compared enriched communities to metagenomes from coral reef water and the coral Mussismilia braziliensis to compare the presence of dominant genera as population genomes in these native microbial communities. Enrichment selected for super-heterotrophs in the genera Vibrio, Pseudoalteromonas and Arcobacter, with greater sequence coverage of Arcobacter in coral exposures. We assembled two Vibrio, a Pseudoalteromonas and an Arcobacter, which identified previously unannotated sequences. To determine genes that define the ecotypes of the environmental microbes, we compared the population genomes to genomes of related organisms sequenced from cultured isolates. We found that the coral reef-associated Vibrio population had a higher proportion of genes in the following metabolic pathways: type IV secretion and conjugative transfer, maltose utilization, and urea decomposition. The Pseudoalteromonas population had more genes involved in sugar utilization pathways, while the Arcobacter population was distinguished by genes contributing to type VI secretion systems and utilization of alkylphosphates and aromatic compounds.
By assembling population genomes, we identified novel genes defining the ecotype relative to the pangenome of culture isolates. Novel gene identification informs the ecology of super-heterotrophs on coral reefs independent of the limitations of reference genomes.

Data

We have made all the data sets available as separate tarballs. Note: when you download these, you will most likely need to rename the file and then use the command:

tar xf Algae.tgz

to extract the data. (Of course, changing Algae to CCA, Control or Coral as appropriate.)
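
If you have downloaded all four tarballs, the extraction step can be run once for each sample in a loop. The following is a minimal sketch, not part of the original instructions. It assumes the downloads have already been renamed to Algae.tgz, CCA.tgz, Control.tgz and Coral.tgz and sit in the current directory. The function name extract_samples and the choice to unpack each archive into its own directory are illustrative only; if a tarball already contains a top-level directory, you will get one extra level of nesting.

```shell
#!/bin/sh
# Sketch: extract each sample tarball into a directory named after the sample.
# Assumption: the files were renamed to <Sample>.tgz after download.
extract_samples() {
    for sample in Algae CCA Control Coral; do
        archive="$sample.tgz"
        if [ ! -f "$archive" ]; then
            # Skip samples that have not been downloaded (or not renamed) yet.
            echo "skipping $sample: $archive not found" >&2
            continue
        fi
        mkdir -p "$sample"
        # Same command as above, with -C to unpack into the sample directory.
        tar xf "$archive" -C "$sample"
        echo "extracted $archive into $sample/"
    done
}

extract_samples
```

Samples whose tarball is missing are reported and skipped, so the loop can be rerun safely after downloading the remaining files.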

Algae

CCA

Control

Coral