Handle co-features in annotation files for add_annotations
GFF3 files are notoriously difficult to parse, which is why we switched to htsjdk for parsing them. However, it turned out that co-features, which we did not handle previously, occur frequently in organellar genomes in the form of trans-spliced genes. Therefore, this merge request adds code to handle these co-features:
- When creating all feature nodes in the pangenome database, it adds all co-features to a list.
- After creating all feature nodes but before creating protein sequences, we merge all co-features into one feature with an updated address that contains all locations (needed for later sequence extraction).
- Per co-feature, we keep only one such node and connect the children (CDS and exon) of all other (to be deleted) co-features to it. Also, we create one mRNA for them.
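
The steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the code in this merge request: `FeatureNode`, `merge_co_features`, the tuple layout of a location, and the `.mRNA` ID suffix are all hypothetical names chosen for the example, and it assumes that co-features can be recognised by a shared feature ID. Where the mRNA is attached in the real database is not shown here.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FeatureNode:
    """Hypothetical stand-in for a feature node in the pangenome database."""
    feature_id: str
    feature_type: str
    locations: list  # assumed layout: (sequence, start, end, strand) tuples
    children: list = field(default_factory=list)


def merge_co_features(co_features):
    """Collapse co-features that share an ID into one node per ID.

    Returns a list of (kept_node, mrna_node) pairs. The other co-features
    are left without children so the caller can delete them.
    """
    # Group the collected co-features by their shared ID (assumption).
    groups = defaultdict(list)
    for node in co_features:
        groups[node.feature_id].append(node)

    merged = []
    for feature_id, nodes in groups.items():
        keeper, others = nodes[0], nodes[1:]
        for other in others:
            # The kept node's address must cover every location.
            keeper.locations.extend(other.locations)
            # Reconnect CDS and exon children to the kept node.
            keeper.children.extend(other.children)
            other.children = []
        # Create a single mRNA for the merged feature.
        mrna = FeatureNode(feature_id + ".mRNA", "mRNA", list(keeper.locations))
        merged.append((keeper, mrna))
    return merged


# Usage: a trans-spliced gene annotated as two co-features on opposite strands.
part_a = FeatureNode("gene-nad1", "gene", [("mito", 100, 200, "+")],
                     [FeatureNode("cds-1", "CDS", [("mito", 100, 200, "+")])])
part_b = FeatureNode("gene-nad1", "gene", [("mito", 5000, 5100, "-")],
                     [FeatureNode("cds-2", "CDS", [("mito", 5000, 5100, "-")])])
merged = merge_co_features([part_a, part_b])
```

Keeping the first node of each group and emptying the others mirrors the "keep one, delete the rest" approach described above. The sketch creates exactly one mRNA per merged gene, so it shares the limitation that genes with multiple mRNAs are not covered.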
The only case not yet handled is co-features (genes) that have multiple mRNAs. This could occur when a trans-spliced gene undergoes alternative splicing.