Grouping inconsistency

Discovered during merge request !223 (merged). When testing two Arabidopsis pangenomes built from identical input files supplied in different orders, the homology groups differed in the allocation of a single protein: ahah_7_AT3G28550.2.

The input proteomes for database 1:

image.png

And for database 2:

image.png

I compared the MCL input files of the two databases and found a difference in the truncation of the adjusted similarity scores. My suggested course of follow-up investigation:

  1. Truncate both files to the same number of digits and compare again. If there is still a difference, the proteins with different scores can be investigated.
  2. Run the truncated files through MCL to see whether the difference in order or the difference in rounding is what matters.
  3. Contact the MCL developers if the difference in order turns out to be the culprit.
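
Step 1 could be sketched roughly as follows. This is only an illustration under assumptions: it supposes the MCL input files are in the three-column label format (`protein_a protein_b score`, whitespace-separated) and that edges are undirected, so `a b` and `b a` are treated as the same pair. The function names (`truncate`, `parse_abc`, `diff_scores`) are hypothetical and not part of the existing code base.

```python
from decimal import Decimal, ROUND_DOWN


def truncate(score: str, digits: int) -> Decimal:
    """Cut a score down to `digits` decimal places without rounding."""
    quantum = Decimal(1).scaleb(-digits)  # e.g. digits=4 -> 0.0001
    return Decimal(score).quantize(quantum, rounding=ROUND_DOWN)


def parse_abc(lines, digits):
    """Map each unordered protein pair to its truncated score.

    Assumes one 'protein_a protein_b score' record per line.
    """
    scores = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        a, b, score = line.split()
        # Sort the pair so that input order does not affect the key.
        scores[tuple(sorted((a, b)))] = truncate(score, digits)
    return scores


def diff_scores(lines1, lines2, digits):
    """Return the pairs whose truncated scores differ between two inputs.

    A pair missing from one input is reported with None on that side.
    """
    s1 = parse_abc(lines1, digits)
    s2 = parse_abc(lines2, digits)
    diffs = {}
    for key in s1.keys() | s2.keys():
        v1, v2 = s1.get(key), s2.get(key)
        if v1 != v2:
            diffs[key] = (v1, v2)
    return diffs
```

With real files, the two line sequences would come from something like `open(path)` for each database. An empty result at a given number of digits would suggest that only the truncation differs, which leaves input order as the remaining suspect for step 2.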