Expanding the ATCC Genome Portal With Mycology

We teased a few months (blogs) back that we were working on fungal data. We are excited to finally announce the addition of reference-quality genomes from our mycology collection to the ATCC Genome Portal.

The addition of fungal genomes to the ATCC Genome Portal was a significant step for us. Extracting high-quality, concentrated gDNA from fungi is often more complicated than extracting similar-quality gDNA from bacteria or viruses. Fungal genomes are also significantly larger, which makes them less compatible with higher-throughput sequencing technologies than the smaller viral genomes. Nevertheless, we are excited to provide researchers with our small but growing collection of ATCC mycology genomes.

Over the last six months, One Codex and ATCC worked together to research and build a genome assembly pipeline for fungi. After much testing, we identified MaSuRCA [1] as the best assembly tool to build into this pipeline, following our read QC. MaSuRCA is built around a selection of assemblers; of these, we chose Flye [2] because it generated the highest-quality hybrid assemblies in excellent time.

Many researchers in the field of mycology rely on BUSCO [3] to estimate the completeness of their assemblies. To stay aligned with the field, we provide a BUSCO completeness score for each of our assemblies. To accompany these assemblies, we’ve also leveraged the annotations generated by BUSCO, which relies on Augustus [4] for gene calling and provides OrthoDB [5] assignments of up to approximately 700 single-copy orthologs per genome. Using these annotations and a selection of other QC metrics, we publish only the highest-quality assemblies for each of our mycology products.
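For readers who want to work with BUSCO scores programmatically, here is a minimal Python sketch that parses the one-line summary BUSCO prints (C = complete, S = single-copy, D = duplicated, F = fragmented, M = missing, n = total orthologs searched). It is an illustration only, not the code used in our pipeline: the function names, the example values, and the 95% cutoff are all hypothetical, and it assumes the summary follows the usual `C:..%[S:..%,D:..%],F:..%,M:..%,n:..` layout.

```python
import re

# Assumed layout of BUSCO's one-line summary, e.g.
# "C:98.5%[S:98.0%,D:0.5%],F:0.5%,M:1.0%,n:758"
BUSCO_SUMMARY = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco_summary(line: str) -> dict:
    """Return the percentages (C, S, D, F, M) and ortholog count (n)."""
    match = BUSCO_SUMMARY.search(line)
    if match is None:
        raise ValueError(f"not a BUSCO summary line: {line!r}")
    scores = {key: float(value) for key, value in match.groupdict().items()}
    scores["n"] = int(scores["n"])  # n is a count, not a percentage
    return scores


def passes_threshold(scores: dict, min_complete: float = 95.0) -> bool:
    """Check completeness against a cutoff (95.0 is a hypothetical default)."""
    return scores["C"] >= min_complete


if __name__ == "__main__":
    # Made-up example values, not results from an ATCC assembly.
    example = "C:98.5%[S:98.0%,D:0.5%],F:0.5%,M:1.0%,n:758"
    scores = parse_busco_summary(example)
    print(scores, passes_threshold(scores))
```

The completeness score (C) is the sum of the single-copy (S) and duplicated (D) percentages, so a high D value alongside a high C can still point to an assembly worth a closer look.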

As part of ATCC’s Enhanced Authentication Initiative, we are committed to providing reference-quality genomes to the scientific community. In late January, we published the complete genomes for three newly accessioned strains of multidrug-resistant Candida auris including the type strain (ATCC® MYA-5001™). To date, we have published the complete, hybrid assemblies for 74 ATCC mycology genomes and will continue to publish additional genomes as soon as they become available.

MYA-5002, 4 days at 37 °C on MD-200 (YM agar)

Over the next few months, we will continue to add new features to our mycology collection and improve the available annotations to help you browse fungal genomes in more detail. We’ll continue to add new fungal genomes to the portal as they become available, so stay tuned for updates and new accessions!

References

  1. Zimin AV, Marçais G, Puiu D, Roberts M, Salzberg SL, Yorke JA. The MaSuRCA genome assembler. Bioinformatics. 2013; 29(21): 2669-2677.
  2. Kolmogorov M, Yuan J, Lin Y, Pevzner PA. Assembly of long, error-prone reads using repeat graphs. Nat. Biotechnol. 2019; 37(5): 540-546.
  3. Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015; 31(19): 3210-3212.
  4. Stanke M, Waack S. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics. 2003; 19(Suppl 2): ii215-ii225.
  5. Waterhouse RM, Tegenfeldt F, Li J, Zdobnov EM, Kriventseva EV. OrthoDB: a hierarchical catalog of animal, fungal and bacterial orthologs. Nucleic Acids Res. 2013; 41(Database issue): D358-D365.