Improved database, new features

We’ve just released a new and improved reference database! Here are a few of the highlights we’re proud of.

Spanning the tree of life

We’ve increased our coverage of eukaryotes, with 874 fungal genomes and 133 protist genomes, including 8 strains of Toxoplasma gondii and our first reference for Geotrichum candidum. We’ve also added a number of fungal and protozoan pathogens: we now have twice as many Plasmodium falciparum genomes and three times as many Trichophyton rubrum genomes.

Finding the species in the haystack

By improving the quality of our reference genomes, we are able to classify more nucleotide sequences to their species of origin (rather than only to the genus, family, or a higher rank). Of the ~40 billion k-mers in our reference database, an additional 5% are now classified at the species level. This means that we’re doing an even better job of sensitively and accurately detecting microbes in your data.
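
To make the idea of species-level versus genus-level k-mers concrete, here is a toy sketch of one common approach to k-mer classification: each k-mer is labeled with the lowest common ancestor (LCA) of every reference genome that contains it. The taxonomy, the sequences, and all function names below are hypothetical and chosen for illustration; this is an assumption about how such a classifier can work, not a description of the One Codex pipeline.

```python
# Toy taxonomy (child -> parent). Hypothetical and heavily simplified.
PARENT = {
    "Plasmodium falciparum": "Plasmodium",
    "Plasmodium vivax": "Plasmodium",
    "Plasmodium": "Apicomplexa",
    "Toxoplasma gondii": "Toxoplasma",
    "Toxoplasma": "Apicomplexa",
    "Apicomplexa": "root",
}


def lineage(taxon):
    """Return the path from a taxon up to the root, most specific first."""
    path = [taxon]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def lowest_common_ancestor(taxa):
    """Return the most specific taxon shared by the lineages of all taxa."""
    taxa = list(taxa)
    common = lineage(taxa[0])
    for taxon in taxa[1:]:
        other = set(lineage(taxon))
        # Keep only shared ancestors, preserving most-specific-first order.
        common = [node for node in common if node in other]
    return common[0]


def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(genomes, k):
    """Map each k-mer to the LCA of all reference taxa that contain it."""
    owners = {}
    for taxon, seq in genomes.items():
        for kmer in kmers(seq, k):
            owners.setdefault(kmer, set()).add(taxon)
    return {kmer: lowest_common_ancestor(t) for kmer, t in owners.items()}


# Made-up sequences, far shorter than any real genome.
genomes = {
    "Plasmodium falciparum": "ACGTAC",
    "Plasmodium vivax": "CGTACG",
}
index = build_index(genomes, k=4)
```

In this sketch, a k-mer found in only one species keeps its species label, while a k-mer shared by both species can only be assigned to the genus. Under this model, better reference genomes mean fewer spurious shared k-mers, so more of the database resolves to the species level.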

Organism abundance and sequencing depth

In addition to the new reference genomes, we’re rolling out an added layer of statistical analysis. The summary table now shows you the relative abundance and sequencing depth of the high-confidence species in your sample.

screenshot

By using sequencing depth and genome coverage data, we’re able to give you a high-fidelity picture of the collection of organisms in your sample. More details on this to follow.
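
For readers unfamiliar with these terms, here is a minimal sketch using textbook definitions: mean sequencing depth as total assigned bases divided by genome length, genome coverage as the fraction of genome positions hit by at least one read, and relative abundance as each organism's share of the total depth. These definitions, the function names, and the numbers are assumptions for illustration and may differ from how the summary table is actually computed.

```python
def mean_depth(read_lengths, genome_length):
    """Average number of times each genome position is sequenced."""
    return sum(read_lengths) / genome_length


def covered_fraction(intervals, genome_length):
    """Fraction of the genome covered by at least one read.

    intervals are half-open (start, end) alignment positions.
    """
    covered = 0
    current_end = 0
    for start, end in sorted(intervals):
        # Count only the part of this read not already covered.
        start = max(start, current_end)
        if end > start:
            covered += end - start
            current_end = end
    return covered / genome_length


def relative_abundance(depths):
    """Normalize per-organism depths so that they sum to 1."""
    total = sum(depths.values())
    return {name: depth / total for name, depth in depths.items()}


# Hypothetical sample: 100 reads of 150 bp assigned to each organism.
depths = {
    "organism A": mean_depth([150] * 100, genome_length=5_000),
    "organism B": mean_depth([150] * 100, genome_length=15_000),
}
abundance = relative_abundance(depths)
coverage = covered_fraction([(0, 100), (50, 150), (300, 400)], genome_length=1_000)
```

Normalizing by depth rather than by raw read count matters because a larger genome attracts more reads per cell. In the example, both organisms receive the same number of reads, but organism A has a genome one third the size, so its depth and estimated abundance are three times higher.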

You can probably tell that we think this is pretty exciting. We’d love to hear what you think or answer any questions, so don’t hesitate to get in touch (sam@onecodex.com).

– One Codex Team
