A³Cat

Current A³Cat version: 2025-03-01.

Assembly metrics and BUSCO results for all assemblies in the taxon Arthropoda (NCBI TaxID: 6656).

BUSCO version: 5.4.0

Assemblies without BUSCO results for Arthropoda were excluded because of their small size or low N50 values.
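
For readers unfamiliar with the N50 metric mentioned above, here is a minimal sketch of how it is conventionally computed: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name `n50` and the example lengths are illustrative assumptions, not part of A³Cat, and the size and N50 cut-offs actually used for exclusion are not stated here.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Illustrative helper only (hypothetical name, not from A3Cat).
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Made-up contig lengths for demonstration.
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example_lengths))
```

A low N50 means the assembly is split into many short pieces, which makes it harder for gene prediction to recover full-length genes.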

If you use the A³Cat in your work, please cite this paper (2022).

Interpreting BUSCO results

BUSCO attempts to provide a quantitative assessment of the completeness of an assembly in terms of its expected gene content. The results are simplified into four categories of BUSCOs: 'Complete and single-copy', 'Complete and duplicated', 'Fragmented', and 'Missing'. Each label reflects the most likely scenario. These scenarios are described below, along with other, less likely but still theoretically possible, interpretations.
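
As a rough illustration of how the four categories relate to one another, the sketch below turns per-category counts into percentages of the total number of BUSCOs searched. The function name `summarise_busco_counts` and the counts in the example are made up for demonstration and are not taken from any A³Cat entry.

```python
def summarise_busco_counts(single, duplicated, fragmented, missing):
    """Convert BUSCO category counts into percentages of all BUSCOs searched.

    Illustrative helper only (hypothetical name, not part of BUSCO or A3Cat).
    """
    total = single + duplicated + fragmented + missing
    if total == 0:
        raise ValueError("at least one BUSCO must have been searched")

    def pct(count):
        return round(100 * count / total, 1)

    return {
        # 'Complete' is the sum of the single-copy and duplicated categories.
        "complete": pct(single + duplicated),
        "single_copy": pct(single),
        "duplicated": pct(duplicated),
        "fragmented": pct(fragmented),
        "missing": pct(missing),
        "total": total,
    }


# Made-up counts for demonstration.
print(summarise_busco_counts(single=970, duplicated=15, fragmented=5, missing=10))
```

The complete percentage is computed from the raw counts rather than by adding two rounded percentages, which avoids small rounding discrepancies.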

Complete

If found to be complete, whether single-copy or duplicated, the BUSCO matches have scored within the expected range of scores and within the expected range of alignment lengths for the BUSCO profile. If the orthologue is in fact absent from the input dataset, or is only partially present (highly fragmented), but a high-identity full-length homologue is present, this homologue could be mistakenly identified as the complete BUSCO. The score thresholds are optimised to minimise this possibility, but it can still occur.

Fragmented

If found to be fragmented, the BUSCO matches have scored within the expected range of scores but not within the expected range of alignment lengths for the BUSCO profile. For genome assemblies this could indicate either that the gene is only partially present, or that the sequence search and gene prediction steps failed to produce a full-length gene model even though the full gene is present in the assembly. Matches that produce such fragmented results are given a 'second chance': a second round of sequence searches and gene predictions, with parameters trained on those BUSCOs that were found to be complete. This can still fail to recover the whole gene. Some fragmented BUSCOs from genome assembly assessments could therefore be complete genes that are simply too divergent, or that have very complex gene structures, making them very hard to locate and predict in full.

Missing

If found to be missing, there were either no significant matches at all, or the BUSCO matches scored below the expected range of scores for the BUSCO profile. For genome assemblies this could indicate that these orthologues are indeed missing, that the sequence search step failed to identify any significant matches, or that the gene prediction step failed to produce even a partial gene model that might have been recognised as a fragmented BUSCO match. As with fragmented BUSCOs, those missing after the first round are given a 'second chance': a second round of sequence searches and gene predictions, with parameters trained on those BUSCOs that were found to be complete. This can still fail to recover the gene. Some missing BUSCOs from genome assembly assessments could therefore be partially present, or even (though this is unlikely) complete, but are simply too divergent or have very complex gene structures, making them very hard to locate and predict correctly or even partially.
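
The decision logic described in the three sections above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not BUSCO's actual implementation. The names `Match`, `classify_match` and `classify_busco` are hypothetical. The single `min_score` cut-off and the `length_range` tuple stand in for the profile-specific score and length ranges, and the 'second chance' search round is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Match:
    """One candidate match to a BUSCO profile (hypothetical structure)."""
    score: float
    aligned_length: int


def classify_match(match, min_score, length_range):
    """Label a single match as 'Complete', 'Fragmented' or 'Missing'."""
    if match.score < min_score:
        # Scored below the expected range: treated as no usable match.
        return "Missing"
    shortest, longest = length_range
    if shortest <= match.aligned_length <= longest:
        return "Complete"
    # Score is acceptable but the aligned length is outside the expected range.
    return "Fragmented"


def classify_busco(matches, min_score, length_range):
    """Combine all matches for one BUSCO into a single category."""
    labels = [classify_match(m, min_score, length_range) for m in matches]
    n_complete = labels.count("Complete")
    if n_complete > 1:
        return "Complete and duplicated"
    if n_complete == 1:
        return "Complete and single-copy"
    if "Fragmented" in labels:
        return "Fragmented"
    # No matches at all, or every match scored too low.
    return "Missing"


# Made-up thresholds and matches for demonstration.
print(classify_busco([Match(310.0, 405)], min_score=250.0, length_range=(380, 430)))
print(classify_busco([Match(310.0, 150)], min_score=250.0, length_range=(380, 430)))
print(classify_busco([], min_score=250.0, length_range=(380, 430)))
```

The caveats above still apply to a sketch like this: a 'Complete' label can come from a high-scoring homologue rather than the true orthologue, and 'Fragmented' or 'Missing' labels can result from failed searches or gene predictions rather than from genuinely absent sequence.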