Add (sub)graph export (AKA region of interest)

Workum, Dirk-Jan van requested to merge add_gfa_export into develop

NB: Still under active development (this branch is a small side project of mine).

This merge request adds a subcommand that exports the compacted de Bruijn graph (cDBG) from PanTools in GFA format (and possibly other formats).

TODO:

  • Check accuracy of GFA v1 output
  • Whole pangenome export
    • Decide on subcommand name
    • Decide which output formats should be supported (currently only GFA, which is slow)
    • Check speed on large pangenomes
  • Add subcommand for building the nucleotide layer from an existing graph (GFA v1 format)
    • => edit: to be done with !198
  • Add subcommand for extracting a subgraph in GFA format, including annotations for Bandage
    • Add a separate subcommand for regions only
    • Define outputs for region (see below for implementation status)
  • Write all output formats
    • GFA v1
    • Include Bandage annotation CSV for outputs
    • FASTA for each genome
    • GFF3 for each genome
    • PAV for each homology group
    • PAV for each k-mer/node
    • Collinearity file (/visualization)
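
For reference while checking the output, here is a minimal sketch of what a GFA v1 file plus a Bandage annotation CSV could look like for a tiny graph. This is not the PanTools implementation (PanTools is Java; Python is used here only for brevity). The function names `to_gfa1` and `to_bandage_csv`, the toy sequences, the `Gene` column, and the `2M` overlap (k - 1 for an assumed k = 3) are all hypothetical. The CSV layout assumes Bandage matches rows to nodes via the first column.

```python
import csv
import io


def to_gfa1(segments, links, paths=None):
    """Render a graph as GFA v1 text (hypothetical helper, not the PanTools API).

    segments: dict of node name -> sequence
    links:    list of (from, from_orient, to, to_orient, overlap_cigar)
    paths:    dict of path name -> list of (node, orient); one path per genome
    """
    lines = ["H\tVN:Z:1.0"]  # header line declaring GFA version 1.0
    for name, seq in segments.items():
        lines.append(f"S\t{name}\t{seq}")
    for src, src_orient, dst, dst_orient, overlap in links:
        lines.append(f"L\t{src}\t{src_orient}\t{dst}\t{dst_orient}\t{overlap}")
    for path_name, steps in (paths or {}).items():
        walk = ",".join(f"{node}{orient}" for node, orient in steps)
        lines.append(f"P\t{path_name}\t{walk}\t*")  # '*' = overlaps not given
    return "\n".join(lines) + "\n"


def to_bandage_csv(annotations, columns):
    """Render per-node labels as CSV; the first column holds the node name."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Node", *columns])
    for node, values in annotations.items():
        writer.writerow([node, *(values.get(col, "") for col in columns)])
    return buf.getvalue()


# Toy graph: node 1 branches to nodes 2 and 3 (assumed k = 3, so overlap 2M).
segments = {"1": "ACGT", "2": "GTCA", "3": "GTTA"}
links = [("1", "+", "2", "+", "2M"), ("1", "+", "3", "+", "2M")]
paths = {"genome_1": [("1", "+"), ("2", "+")],
         "genome_2": [("1", "+"), ("3", "+")]}
annotations = {"1": {"Gene": "geneA"}, "3": {"Gene": "geneB"}}

gfa_text = to_gfa1(segments, links, paths)
csv_text = to_bandage_csv(annotations, ["Gene"])
```

Writing `gfa_text` to a `.gfa` file and `csv_text` to a `.csv` file would give a pair that can be loaded together for a visual sanity check. Encoding each genome as a `P` line is one possible design choice; whether the export should use paths this way is still open.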
