With 85 species, the Euphausiidae is the largest family in the order Euphausiacea (86
species). Euphausiids, commonly called "krill", are the main prey of many seabirds, fish, and
marine mammals; they are both a pillar of the food chain and a direct link between its
lowest and highest levels. Given their geographical diversity and wide distribution across
ocean latitudes, they are also a good model for studying the diversity of physiological
adaptations to temperature.
Indeed, the geographic distribution of ectothermic species is determined in part by
temperature, as evidenced by the distribution of species along the latitudinal thermal
gradient and the bathymetric sinking of boreal marine species along this gradient [1, 2].
Alongside species-specific evolutionary history, understanding the physiological
mechanisms underlying this temperature-dependent distribution should help explain the
current distribution of these species and predict how global warming is likely to affect
them where they occur and how well they will be able to cope [3, 4].
The EuphausiiDB portal lets users explore the database through “simple” and “advanced”
search functions targeting a specific taxonomic level, a specific geographic location or
project of origin, and soon, a specific annotation. Each search returns interactive
statistical charts, readset location maps and tables, and the resulting list of datasets.
For each selected dataset, the user can access short summary pages for the readset and
the assembly, with cross-references to external databases (EBI SRA, NCBI taxID, WoRMS)
that improve traceability and consistency across databases, and can download all
resulting files.
All datasets were assembled and analyzed using the same workflows dedicated to de novo
assembly and functional annotation. The assembly workflow includes evaluation, filtering,
and trimming of raw data as well as de novo assembly and evaluation of assembled
transcripts. Assemblies were first performed in parallel with Trinity and rnaSPAdes; the
assembly retained for EuphausiiDB was then selected based on remapping rate (Salmon),
BUSCO completeness (against the eukaryota_odb10 and arthropoda_odb10 databases), and
average contig length. The annotation workflow was therefore performed primarily on
rnaSPAdes transcriptomes. We predicted protein-coding regions with TransDecoder and
assigned functional annotations with InterProScan and DIAMOND (vs. UniProt/Swiss-Prot
and UniRef90). We also performed an rRNA search with barrnap and a signal peptide search
with SignalP.
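The selection step described above, which compares candidate assemblies on remapping rate, BUSCO completeness, and average contig length, can be sketched as follows. This is a minimal illustration and not the EuphausiiDB workflow itself: the `AssemblyMetrics` class, the `select_assembly` function, and the rule used to combine the three criteria (keep the assembly that is best on the most criteria, breaking ties by remapping rate) are all assumptions, since the abstract does not state how the criteria were weighted.

```python
from dataclasses import dataclass
from typing import Iterable, List


def mean_contig_length(fasta_lines: Iterable[str]) -> float:
    """Return the average sequence length over all records in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # length of the record being read, None before the first header
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return sum(lengths) / len(lengths) if lengths else 0.0


@dataclass(frozen=True)
class AssemblyMetrics:
    """Hypothetical container for the three criteria named in the text."""
    assembler: str
    remapping_rate: float      # percent of reads mapping back (e.g. reported by Salmon)
    busco_complete: float      # percent of complete BUSCOs
    mean_contig_length: float  # average contig length in bases


def select_assembly(candidates: Iterable[AssemblyMetrics]) -> AssemblyMetrics:
    """Pick the candidate that is best on the most criteria.

    Assumed rule (not from the source): one point per criterion on which a
    candidate reaches the maximum; ties are broken by remapping rate.
    """
    pool = list(candidates)
    if not pool:
        raise ValueError("no candidate assemblies given")
    criteria = ("remapping_rate", "busco_complete", "mean_contig_length")

    def wins(candidate: AssemblyMetrics) -> int:
        return sum(
            getattr(candidate, name) == max(getattr(other, name) for other in pool)
            for name in criteria
        )

    return max(pool, key=lambda c: (wins(c), c.remapping_rate))
```

In practice the three values would be read from the Salmon and BUSCO reports and from the assembled FASTA file, one `AssemblyMetrics` per assembler, before calling `select_assembly`.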
Erwan CORRE
Mark HOEBEKE
Fatoumata BARRY
Loraine GUEGUEN
Jean-Yves TOULLEC