Draft: construct pangenome database from GFA
1 unresolved thread
Related to !147, I would like to go not only from pantools_DB -> GFA but also vice versa. This merge request collects my attempts at doing that. It's marked as draft for now because I'm not sure whether I'll use this code in the end.
TODO:
- Fix kmer index
- Make it work on cyclic graphs
- Find out what happens with degenerate nodes
- Don't require genome locations file if GFA is provided (if GFA contains sequences)
- Get it to run on any valid GFA
- Ensure neo4j -> GFA -> neo4j gives identical database
- Ensure GFA -> neo4j -> GFA gives identical GFA file
- Update documentation accordingly
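One way the GFA -> neo4j -> GFA round trip from the TODO list could be checked is with a small standalone script that compares two GFA files while ignoring record order and the order of optional tags. This is only a sketch and not part of the merge request: the function names `canonical_records` and `same_gfa` are made up for illustration, and Python is used here purely as a convenient scripting language.

```python
from collections import Counter

# Number of required tab-separated fields per GFA 1 record type
# (record type letter included); everything after that is optional tags.
REQUIRED_FIELDS = {"H": 1, "S": 3, "L": 6, "C": 7, "P": 4, "W": 7}


def canonical_records(gfa_text):
    """Return a multiset of records with optional tags in sorted order."""
    records = Counter()
    for line in gfa_text.splitlines():
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Unknown record types are compared field by field, unchanged.
        n = REQUIRED_FIELDS.get(fields[0], len(fields))
        records[tuple(fields[:n]) + tuple(sorted(fields[n:]))] += 1
    return records


def same_gfa(first_text, second_text):
    """True if both GFA texts hold the same records, in any order."""
    return canonical_records(first_text) == canonical_records(second_text)
```

This comparison is deliberately strict: a link written from the reverse strand (`L 2 - 1 -` instead of `L 1 + 2 +`) counts as a difference, so it would need extending if the export is allowed to flip links.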
Edited by Workum, Dirk-Jan van
Activity
assigned to @worku005
added 1 commit
- 840a99ef - remove sequence retrieval from GFA and add tag to pangenome node about type
added 447 commits
- ab0e004c...11c9c2e3 - 446 commits from branch add_gfa_export
- 926702fe - Merge branch 'add_gfa_export' into add_gfa_build_pangenome
added 36 commits
- 926702fe...a725713c - 35 commits from branch add_gfa_export
- c4323698 - Merge branch 'add_gfa_export' into add_gfa_build_pangenome
added 1 commit
- 05140b51 - put graph reading back that went missing after merging in develop
added 1 commit
- 724a27ec - start replacing duplicated graph classes by better ones
added 34 commits
- 624cc62b...8bc7cd67 - 33 commits from branch add_gfa_export
- bf066516 - Merge branch 'add_gfa_export' into add_gfa_build_pangenome
added 20 commits
- bf066516...5292abba - 19 commits from branch add_gfa_export
- 36c14157 - Merge branch 'add_gfa_export' into add_gfa_build_pangenome
mentioned in merge request !147
added 116 commits
- 36c14157...72deb51e - 114 commits from branch add_gfa_export
- 7ed7d2a0 - fix typo
- 8906de56 - Merge branch 'add_gfa_export' into add_gfa_build_pangenome
added 1 commit
- 46dff639 - fix Vertex.java compilation error introduced in merge
added 22 commits
- 46dff639...67d94b64 - 21 commits from branch add_gfa_export
- 68c98ae6 - Merge branch 'add_gfa_export' into add_gfa_build_pangenome
added 33 commits
- 68c98ae6...be05107b - 2 commits from branch add_gfa_export
- be05107b...40d2f39f - 21 earlier commits
- d9b8f602 - Revert "updated changelog."
- 3d970397 - changed exception to runtimexception
- 400982fc - Merge branch 'develop' of https://git.wur.nl/bioinformatics/pantools into phased_pangenomics_bugfix
- 6234a12e - Merge branch 'bugfix_grouping' into 'develop'
- 776221c9 - Merge branch 'develop' of https://git.wur.nl/bioinformatics/pantools into phased_pangenomics_bugfix
- df0c998c - Merge branch 'phased_pangenomics_bugfix' into 'develop'
- 4a38b0ac - added exception with error message to transaction in alleleStatistics.
- 2655b1b2 - Merge branch 'gene_classification_error_message' into 'develop'
- 40096575 - Merge remote-tracking branch 'origin/develop' into add_gfa_export
- 42c12741 - Merge branch 'add_gfa_export' into add_gfa_build_pangenome
added 23 commits
- 42c12741...3237ea61 - 21 commits from branch add_gfa_export
- 633a4dd4 - correct placement current changes in changelog
- 2d88eefb - Merge branch 'add_gfa_export' into add_gfa_build_pangenome
added 52 commits
- 2d88eefb...5f80b023 - 51 commits from branch add_gfa_export
- 46f59af1 - Merge branch 'add_gfa_export' into add_gfa_build_pangenome
added 1 commit
- 56e18e0b - make GFA backbone build_pangenome work with localisation
added 1 commit
- b79cfe2f - implement first and last kmer for build_pangenome with gfa
added 1 commit
- 619e3219 - add check for validity of GFA files to build_pangenome
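To illustrate what a basic GFA validity check might cover, here is a minimal sketch that verifies that every link and path step refers to a segment defined somewhere in the file. It is not the check implemented in 619e3219, whose details are not shown here; the function name `find_gfa_problems` and the exact rules are assumptions made for illustration.

```python
def find_gfa_problems(gfa_text):
    """Return a list of human-readable problems found in GFA 1 text."""
    segments = set()
    references = []  # (line number, referenced segment name)
    problems = []
    for number, line in enumerate(gfa_text.splitlines(), start=1):
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        kind = fields[0]
        if kind == "S":
            if len(fields) < 3:
                problems.append(f"line {number}: S record needs a name and a sequence")
                continue
            if fields[1] in segments:
                problems.append(f"line {number}: duplicate segment {fields[1]}")
            segments.add(fields[1])
        elif kind == "L":
            if len(fields) < 6:
                problems.append(f"line {number}: L record needs 6 fields")
                continue
            if fields[2] not in ("+", "-") or fields[4] not in ("+", "-"):
                problems.append(f"line {number}: orientation must be + or -")
            references.append((number, fields[1]))
            references.append((number, fields[3]))
        elif kind == "P":
            if len(fields) < 4:
                problems.append(f"line {number}: P record needs 4 fields")
                continue
            for step in fields[2].split(","):
                if step[-1:] not in ("+", "-"):
                    problems.append(f"line {number}: path step '{step}' lacks an orientation")
                else:
                    references.append((number, step[:-1]))
    # Links and paths may precede the segments they use, so check at the end.
    for number, name in references:
        if name not in segments:
            problems.append(f"line {number}: unknown segment {name}")
    return problems
```

An empty result means the file passed these checks only. A segment whose sequence is `*` is still accepted here, although it matters for the TODO item about building without a genome locations file.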
added 12 commits
- 434ef60e...8aa7e189 - 2 earlier commits
- 6f4f343d - turning off trace logging in the code for build_pangenome to speed it up a bit
- cf15da6c - update changelog
- bc77b565 - fix build_pangenome for the last k-1 bases
- 282093c8 - implement more clever way of finding links in a GFA file
- 9021c920 - try implementing proper parsing of path names from both gfa to and from neo4j
- 738d1561 - let build_pangenome be able to deal with multiple sequences per genome
- 374c0433 - implement loosely matching the links in the GFAgit status
- 1947a48f - try improving speed a lot by disallowing double degenerate nodes
- 6377ba31 - implement cache for nodes when building graph from GFA
- a2580484 - Merge branch 'rewrite_legacy_code_node_localization_for_gfa_build_pangenome'...
This merge request still needs the following fixed in order to be ready for merging back into add_gfa_export:
- Fix localization for generic graphs (see !239 (merged) for more information).
- Check for easy speed optimizations
- Add property to pangenome node with the type of pangenome and use this in if/else statements to not negatively impact cDBG creation (especially relevant for localization process)
added 1 commit
- bf3ab0fd - add some notes to the code so future me knows what went on
added 56 commits
- bf3ab0fd...a60ef561 - 55 commits from branch add_gfa_export
- f4ea67d4 - Merge branch 'add_gfa_export' into add_gfa_build_pangenome
added 1 commit
- 58710984 - build separate localization for build_pangenome from GFA
added 7 commits
- 9c50b5ad...aa4366e0 - 5 commits from branch add_gfa_export
- f538e807 - removing temporary genomes fasta directory if it exists
- ea2936ff - Merge branch 'add_gfa_export' into add_gfa_build_pangenome
added 1 commit
- f2765635 - also update frequency property on nucleotide nodes