Workum, Dirk-Jan van authored
To find the state of this project's repository at the time of any of these versions, check out the tags.
All notable changes to PanTools will be documented in this file.
[UNRELEASED]
Added
- Available resources are now added to the log files and validated for some processes (!224 (merged)).
- Added error catching for a set of pangenome node parameters that can cause issues down the line (!225 (merged)).
- Added check for whether MCScanX is installed before running `pantools calculate_synteny` (!244 (merged)).
- Added parsing of location information from functional annotation GFF files and writing it to the graph (!245 (merged)).
Changed
- No longer accept tab characters in fasta input files (!232 (merged)).
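As a rough illustration of this kind of input check, the sketch below scans FASTA lines for tab characters and returns the offending line numbers. It is written in Python for brevity and is not PanTools' own validation code; the function name `find_tab_lines` is hypothetical, and how PanTools reports the problem is not shown here.

```python
def find_tab_lines(lines):
    """Return the 1-based numbers of all lines that contain a tab character.

    `lines` can be any iterable of strings, for example an open file handle.
    """
    bad_lines = []
    for number, line in enumerate(lines, start=1):
        if "\t" in line:
            bad_lines.append(number)
    return bad_lines
```

Passing an open file handle (`with open(path) as handle: find_tab_lines(handle)`) checks a file without loading it into memory at once.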
Fixed
- `pantools blast` no longer crashes if not all mRNA nodes contain protein IDs (!226 (merged)).
- `pantools calculate_synteny` now gives an exception if MCScanX runs out of memory (!226 (merged)).
- `pantools sequence_visualization` now requires the "sequence" rule to be set for unphased pangenomes (!226 (merged)).
- Fixed a number of bugs causing grouping results to slightly differ based on the order of genome or proteome input files (!223 (merged)).
- Fixed `pantools blast` for TBLASTN and TBLASTX where the wrong output file was written (!241 (merged)).
- Fixed `pantools add_functions` when protein names have special characters (!242 (merged)).
- Fixed the KMC command needed for `pantools add_genomes` (!243 (merged)).
[4.3.1] - 08-12-2023
Changed
- `pantools pangenome_structure` received a huge speed improvement and restored randomization (!216 (merged)).
Fixed
- Fixed a bug where variable/informative positions were incomplete, affecting PanVA instances (!212 (merged)).
- Fixed a bug that caused the pangenome growth curves to not show any randomization (!216 (merged)).
[4.3.0] - 23-11-2023
Added
- `pantools add_phasing` for adding phasing information about a genome to the pangenome database (!148 (merged)).
- `pantools add_repeats` and `pantools repeat_overview` for incorporating repeats in the pangenome database (!148 (merged)).
- `pantools calculate_synteny`, `pantools add_synteny`, `pantools synteny_overview` for incorporating synteny in the pangenome database (!148 (merged)).
- `pantools calculate_dn_ds` for calculating dn/ds values in a pangenome database (!148 (merged)).
- `pantools gene_retention` and `pantools sequence_visualization` for visualizing gene retention and other visualizations of sequences in the pangenome (!148 (merged)).
- `pantools blast` for doing a BLAST search against a pangenome database (!148 (merged)).
- Added pal2nal, paml and r-cowplot dependencies to `conda_linux.yaml` and `conda_macos.yaml` (!148 (merged)).
Changed
- Updated `pantools busco_protein`, `pantools gene_classification`, `pantools kmer_classification`, `pantools group_info`, `pantools rename_phylogeny`, `pantools create_tree_template`, `pantools core_phylogeny`, `pantools rename_matrix` to be compatible with possible added phasing information in a pangenome database (!148 (merged)).
- `busco_protein` is no longer functional with BUSCO v3 and odb9 datasets (!146 (merged)).
- `conda_linux.yaml` and `conda_macos.yaml` are now unified in one `conda.yaml` file (!203 (merged)).
- Updated KMC version to >=3.1.0 (!203 (merged)).
- Increased performance for calculating var/inf positions in `pantools msa` (!202 (merged)).
- `move_grouping` is now named `deactivate_grouping` to better reflect its function (!201 (merged)).
- Reorganized the documentation for Read the Docs (!201 (merged)).
Fixed
- `pantools core_phylogeny` and `pantools consensus_tree` now ignore MSAs that could not be trimmed (!197 (merged)).
- The mode of `pantools ani` is now case-insensitive (!200 (merged)).
- Fixed a bug where single occurrence (variable) positions in an alignment were counted as informative (!202 (merged)).
- Fixed the seed option for `pantools pangenome_structure` when variants are included (!205 (merged)).
- Fixed the order of the phenotype overview for consistency (!206 (merged)).
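The distinction behind the variable/informative fix in this list can be sketched as follows. This is a minimal Python illustration, not the PanTools implementation. It assumes that "informative" means parsimony-informative (at least two states that each occur in at least two sequences) and that gap characters are ignored; both are assumptions, and the function names are hypothetical.

```python
from collections import Counter

def classify_column(column, gap_chars="-"):
    """Classify one alignment column as 'conserved', 'variable' or 'informative'."""
    # Count each residue, ignoring gaps (an assumption of this sketch).
    counts = Counter(ch.upper() for ch in column if ch not in gap_chars)
    if len(counts) < 2:
        return "conserved"
    # Only states seen in at least two sequences can support a grouping.
    shared_states = sum(1 for n in counts.values() if n >= 2)
    return "informative" if shared_states >= 2 else "variable"

def classify_alignment(sequences):
    """Classify every column of a list of equal-length aligned sequences."""
    return [classify_column(col) for col in zip(*sequences)]
```

Under this definition a column such as `CCCT`, where the deviating state occurs only once, is variable but not informative.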
[4.2.3] - 22-09-2023
Added
- `pantools pangenome_structure` now has a `--seed` option to override the random seed (!179 (merged)).
- `pantools msa` now has a `--trim-using-proteins` option for correct phylogenetics using nucleotide alignments (!195 (merged)).
Changed
- Pinned python to version 3.7 for CI/CD pipeline and highlighted this in developer docs (!183 (merged)).
- `pantools remove_functions` now has --mode as an optional argument instead of a positional argument (!189 (merged)).
- `pantools core_phylogeny` and `pantools consensus_tree` don't run MSA under the hood anymore to prevent unwanted behaviour (!195 (merged)).
- `pantools msa --align-nucleotide` now aligns CDS sequences instead of full (unspliced) mRNA sequences (!195 (merged)).
Fixed
- `pantools add_pavs` and `pantools add_variants` are now consistent in their creation of variant nodes (!176 (merged)).
- `pantools msa` now writes non-redundant headers in `sequences.info` files (!180 (merged)).
- Renamed remaining mentions of `core_snp_tree` to `core_phylogeny`, except the command alias (!181 (merged)).
- Fixed the trimming length of protein sequences with `pantools msa` (!186 (merged)).
- `pantools group_info` now correctly interprets SignalP information (!190 (merged)).
- Fixed KMC running on # cores regardless of given threads; added `--threads` option to `pantools add_genomes` (!191 (merged)).
- Restored the possibility of trimming nucleotide sequences based on protein alignment (!195 (merged)).
- Fixed distinct kmer count for a subset of genomes using `pantools kmer_classification` (!196 (merged)).
[4.2.2] - 23-06-2023
Changed
- `pantools group_info` defaults to all homology groups (!166 (merged)).
- `pantools map` no longer accepts a genome numbers file and uses `--include`/`--exclude` instead (!170 (merged)).
Fixed
- `pantools consensus_tree` now properly checks if groups were excluded based on trimming (!168 (merged)).
[4.2.1] - 05-06-2023
Added
- Check if `bcftools` and `tabix` are installed before running them (!159 (merged)).
- Developer guide in the documentation (!160 (merged)).
- Update to CI/CD pipeline and pre-commit hooks to make sure no warnings or errors are thrown for the documentation (!160 (merged)).
Changed
- Simplified user guide for installing PanTools (!160 (merged)).
- `pantools group` now always requires relaxation settings to be specified (!161 (merged)).
Fixed
- Fixed issue where FastTree was told protein sequences are nucleotide sequences (!157 (merged)).
[4.2.0] - 08-05-2023
Added
- New flag `--ignore-invalid-features` for `add_annotations` to ignore GFF features that do not match the fasta (!133 (merged)).
- Subcommands for handling VCF and PAV information for genes in a pangenome: `add_variants`, `add_pavs`, `remove_variants`, `remove_pavs` and `variation_overview` (!128 (merged)).
- Flag `--variants` to `msa`, `core_phylogeny` and `consensus_tree` for including consensus gene sequences for MSA (!128 (merged)).
- Flag `--pavs` to `gene_classification`, `pangenome_structure`, `msa`, `core_phylogeny` and `consensus_tree` for indicating presence/absence of genes (!128 (merged)).
Changed
- Parameter `-D` for `add_functions` is changed to `-F` because of conflict with JVM settings (!129 (merged)).
- Removed no longer used functional databases from code base (!129 (merged)).
- `add_annotations` now validates all input files before adding annotations (!133 (merged)).
- Changed the default behavior for interpreting GFF files without mRNA features for `add_annotations` (!132 (merged)).
- Changed default behavior for `msa`: only one type of nucleotide/protein is aligned by default, depending on the database type (!142 (merged)).
- The `--phenotype` argument in `msa` no longer requires an included value. Output is generated for all phenotype properties if the argument is included (!145 (merged)).
Fixed
- `msa` is now backwards compatible with pangenome databases from before !60 (merged) (!134 (merged)).
- The provided conda YAML files now work even when `channel_priority: strict` was set by the user (!135 (merged)).
- Issue #54 (closed): functional annotations are now correctly linked with mRNA nodes for `add_functions` command (!136 (merged)).
- Resolved issue in `add_phenotype` where all phenotype values were recognized as a String (!145 (merged)).
- Fixed issue where the iTOL tree templates from `msa --method=multiple-groups` did not match the gene trees (!145 (merged)).
Removed
- `MLSA` no longer finds phenotype specific positions. This can still be done via `msa` (!145 (merged)).
- Removed support for GenBank input files from `add_annotations` (!133 (merged)).
[4.1.1] - 30-01-2023
Added
- Added `sphinx-lint` to both pre-commit hooks and CI/CD pipeline to ensure correctness of the documentation (!112 (merged) !116 (merged)).
- Option to add read group to alignment files produced by `pantools map` (!102 (merged)).
- Added more BLOSUM and PAM protein scoring matrix options (!123 (merged)).
Changed
- `add_functions` can now specify a directory where functional databases are stored (!117 (merged)).
- Parameter `-v` is now required for `change_grouping` (!124 (merged)).
- Scoring matrices are now stored in the resources directory as readable files (!123 (merged)).
Fixed
- `msa` now works correctly with 'alt_id' properties of GO nodes (!118 (merged)).
[4.1.0] - 23-12-2022
Added
- New log4j2 logger with console and file appender (!85 (merged)).
- New globally accessible flags to regulate console logging output: silent, quiet, debug and trace (!85 (merged)).
- New feature, `export_pangenome`, to export a pangenome to a number of files for comparison with other pangenomes (!108 (merged)).
- Added CI validation test for build_pangenome and map_reads (!108 (merged)).
Changed
- Optimized localization for `build_pangenome` by making `localize_nodes()` parallel; this code was sanity-checked with two small yeast datasets (!95 (merged)).
- `remove_phenotype` was renamed to `remove_phenotypes` to be consistent with `add_phenotypes` (!111 (merged)).
Fixed
- `group_info` now retrieves the correct homology groups with -H/-G (!110 (merged)).
- Resolved issue where `add_phenotypes` incorrectly binned columns when not every value was numeric (!111 (merged)).
[4.0.0] - 21-12-2022
Added
- CI/CD pipeline that for now only runs `mvn test` (merge request !41 (merged)).
- It is now possible to only add a specific functional annotation with `add_functions --label` (merge request !51 (merged)).
- Versioned documentation is included with readthedocs (merge request !50 (merged)).
- New function `remove_phenotype` allows for removal of phenotype nodes or properties on these nodes (merge request !58 (merged)).
- New package `cli` containing command line interface classes for each pantools subcommand (merge request !43 (merged)).
- Picocli argument parsing in command line interface classes (merge request !43 (merged)).
- New package `cli.mixins` for reusable picocli options `--threads`, `--include` and `--exclude` (merge request !43 (merged)).
- Bean hibernate validation for command line arguments (merge request !43 (merged)).
- Custom Bean validation constraints, validators and payloads in `cli.validation`, `cli.validation.validators` and `cli.validation.payloads` respectively (merge request !43 (merged)).
- Renamed many option flags to more conventional naming practices (merge request !43 (merged)).
- Required file parameters are now positional parameters and no longer have command line flags (merge request !43 (merged)).
- New global option `--manual` opens the read the docs manual on local browser (merge request !43 (merged)).
- New global options `--force` and `--no-input` to ignore user prompts (merge request !43 (merged)).
- New function `remove_functions` allows removal of all function nodes and properties (merge request !43 (merged)).
- New package for Junit5 unit tests (pantools.src.tests.java) (merge request !62 (merged)).
- New Junit5 test class ProteomeLayerTest to test functions in ProteomeLayer.java (merge request !62 (merged)).
- New custom assertion assertExits in ExitAssertions.java overriding System.exit for test purposes (merge request !62 (merged)).
- New `--node` argument included for `group_info` (merge request !66 (merged), !69 (merged)).
- New `--node` argument included for `group_info` (merge request !66 (merged)).
- Added log4j2 dependencies and configuration options (merge request !68 (merged)).
- New flags --debug and --quiet to manage console log levels (merge request !68 (merged)).
- New Junit5 test class ConstraintTest to test custom bean constraints made for argument validation (merge request !74 (merged)).
Changed
- htsjdk version has been updated to 2.24.1.
- `add_annotations` uses htsjdk instead of a custom implementation for parsing gff3 (merge request !38 (merged)).
- Updated ASTER commit to ASTER v1.3 and adjusted code accordingly (merge request !54 (merged)).
- Improved recognition of file types in `add_functions` (merge request !55 (merged)).
- Argument parsing and initial validation moved from Pantools.java to `cli` package (merge request !43 (merged)).
- The default number of threads for functions that allow `--threads` is now the number of cores or 8, whichever is lower (merge request !43 (merged)).
- Versioned documentation now matches the new command line interface (merge request !43 (merged)).
- Function `remove_nodes` no longer removes function nodes in groups (see Added section) (merge request !43 (merged)).
- Function `remove_nodes` now allows removal of all nodes; dangerous nodes require user confirmation (merge request !43 (merged)).
- Split `conda.yaml` file into `conda_linux.yml` and `conda_macos.yml` (merge request !77 (merged)).
- Removed ASTER submodule and added ASTER v1.3 to conda yml files (merge request !77 (merged)).
- Grouping now requires setting the relaxation or all its sub-values (commit id 72bc5a8d).
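The thread default described in this list (the number of cores or 8, whichever is lower) comes down to a single `min`. The snippet below is a Python sketch of that rule for illustration only; PanTools itself is a Java program, and the names `MAX_DEFAULT_THREADS` and `default_thread_count` are hypothetical.

```python
import os

MAX_DEFAULT_THREADS = 8  # upper bound taken from the changelog entry

def default_thread_count(available_cores=None):
    """Return the default thread count: the core count, capped at 8."""
    if available_cores is None:
        # os.cpu_count() may return None, so fall back to a single thread.
        available_cores = os.cpu_count() or 1
    return min(available_cores, MAX_DEFAULT_THREADS)
```

On a 4-core machine this yields 4, and on a 32-core machine it yields 8.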
Fixed
- `mlsa` functionalities have been updated to work with the updated `add_annotations` implementation (merge request !44 (merged)).
- Header names of kmer_classification_overview.csv are now correct (merge request !45 (merged)).
- Translation of mRNA to protein is now correct (merge request !46 (merged)).
- Inconsistency between the header and body of matrices generated in kmer_classification (merge request !61 (merged)).
- `add_annotations` has an updated and more informative output (and runs faster on fragmented genomes) (merge request !60 (merged)).
- `add_antismash` no longer crashes when identifiers of antiSMASH output do not match in the database (merge request !66 (merged)).
- Resolved an issue where `go_enrichment` crashed if COG functions were included in the pangenome (merge request !66 (merged)).
- Replaced method for Fisher exact test to correctly deal with larger values (merge request !66 (merged)).
- `go_enrichment` no longer crashes when antiSMASH gene clusters were added to the pangenome (merge request !78 (merged)).
- `add_annotations` now handles co-features correctly (merge request !79 (merged)).
- Distinguish between homology groups as file or as list for `msa`, `core_snp_tree` and `consensus_tree` (merge request !86 (merged)).
- `build_panproteome` no longer creates inconsistent 'header' and 'protein_ID' properties (merge request !89 (merged)).
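The changelog does not say which method replaced the old Fisher exact test. One common way to keep the test stable for large counts is to work with log-factorials, which avoids overflowing intermediate factorials. The Python sketch below illustrates that idea for a 2x2 table; it is an assumption about the general technique, not the code used in PanTools, and the function names are hypothetical.

```python
from math import exp, lgamma

def log_binom(n, k):
    """Natural log of the binomial coefficient C(n, k), via log-gamma."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def log_prob(x):
        # Hypergeometric log-probability of seeing x in the top-left cell.
        return log_binom(row1, x) + log_binom(row2, col1 - x) - log_binom(total, col1)

    observed = log_prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at most as likely as the observed one;
    # the small tolerance guards against floating-point ties.
    p_value = sum(
        exp(lp)
        for lp in (log_prob(x) for x in range(low, high + 1))
        if lp <= observed + 1e-7
    )
    return min(1.0, p_value)
```

Because every intermediate value stays in log space, tables with counts in the thousands do not overflow the way a direct product of factorials would.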
[3.4.0] - 2022-05-04
Added
- Version and commit ID are reported when PanTools is initialized.
- `--allow-polytomies` argument for `consensus_tree`.
- Included option to bin numerical values in `add_phenotypes`.
Changed
- `msa_group`, `msa_of_multiple_groups`, `msa_of_regions` are reorganised in new function `msa`.
- `pangenome_structure` uses a colorblind-friendly palette.
- `create_tree_templates` uses a colorblind-friendly palette when using 8 phenotypes or less.
- `add_antismash` now only works with antiSMASH versions >= 6.0.
Fixed
- Changed the orientation of the 'has_busco' relationship
[3.3.0] - 2021-12-23
Changed
- Migrate to Maven
- Executable .jar file moved from `pantools/dist` to `pantools/target`
Fixed
- Reading gzip-compressed input files
[3.2.0] - 2021-11-25
Added
- busco_protein can now also use busco v5.
- add_functions can now read custom functional annotation files
- add_functions can now also read SignalP 5.0 output
- consensus_tree new method to create a consensus tree by combining gene trees with ASTRAL-properties
- reroot_phylogeny, new function that can reroot trees using the Ape package
Changed
- The interpro.xml file must now first be downloaded manually when adding InterProScan output to the pangenome
- Improved the k-mer classification method to scale to a higher number of genomes
[3.1.0] - 2021-03-31
Added
- core_snp_tree can now be run on protein sequences.
- group can be run using only the longest transcript of a gene
- add_annotations is able to use GFF files that only have 'CDS' properties
Changed
- `--version` argument instead of `--reference` in change_grouping and remove_grouping.
- File names and extension of several output files
Fixed
- Including `-raf` with map no longer results in a crash
- Single-end read mapping with alignment mode of 0 or higher no longer results in a crash
- 'accessory_combinations.csv' no longer misses the first group for a genome combination
- Several issues that caused incorrect SAM output in map
- Removed code and tools that were used for development
[3.0.0] - 2021-03-08
Added
- Gene classification
- Functional annotations
- Phylogenetic methods
- Optimal homology grouping using BUSCO
Changed
Fixed
- Improved similarity calculation in homology grouping
[2.0.0] - 2019-10-11
Added
- Read mapping functionality