Upgrade Neo4j to 3.5.30
In very specific circumstances PanTools crashes during pangenome construction because incorrect relationships are returned in this section of split_node().
PanTools then fails with the following error:
sequence 1/23260 of genome 2 length=259277713 current position=118850021Exception in thread "main" org.neo4j.graphdb.NotFoundException: No such property, 'offset'.
at org.neo4j.kernel.impl.core.RelationshipProxy.getProperty(RelationshipProxy.java:354)
at nl.wur.pangenome.GenomeLayer.split(GenomeLayer.java:2561)
at nl.wur.pangenome.GenomeLayer.follow_reverse(GenomeLayer.java:2845)
at nl.wur.pangenome.GenomeLayer.construct_pangenome(GenomeLayer.java:2954)
at nl.wur.pangenome.GenomeLayer.initializePangenomeParallel(GenomeLayer.java:3005)
at nl.wur.pantools.Pantools.main(Pantools.java:969)
After a fix in 47da47e6, PanTools instead crashes during localisation:
Processing sequence 55/4619 of genome 28 length=257848
Processing sequence 55/4619 of genome 28 length=257848
Processing sequence 56/4619 of genome 28 length=253580
Processing sequence 56/4619 of genome 28 length=253580
Processing sequence 57/4619 of genome 28 length=248169 This is NOT a 'starts' relationship
PanTools uses Neo4j 3.5.3, and upgrading to Neo4j 3.5.30 solves this issue. This Neo4j issue might be the one we are hitting. I've also done a test run where information about the relationship and its start and end nodes is dumped for troubleshooting purposes. This is the result:
==== RELATIONSHIP
(2503173)-[FR,12115928]->(4639371)
FR
==== START NODE
Node[2503173]
number -> 2349
genome -> 4
identifier -> 4_2349
offset -> 54600288
length -> 649
title -> BPHDCC0000588_2349
sequence
==== END NODE
Node[4639371]
address -> [I@c6df8a2
last_kmer -> 31026935
length -> 78
first_kmer -> 44986548
nucleotide
Neo4j returns an FR relationship even though we ask for starts relationship types only. The delete() call is not the cause: I verified this with a version of the code where the relationships to be deleted are deleted outside of the for loop.
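For anyone who cannot upgrade straight away, a defensive check before reading the property is one possible stopgap. The sketch below is only an illustration and not the code in split_node(): `StartsGuard`, `Rel` and `onlyStarts` are hypothetical names, and `Rel` is a plain stand-in for a Neo4j relationship so that the example runs without a database. It skips any relationship that is not of type starts or that lacks an offset property.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class StartsGuard {

    /** Hypothetical stand-in for a Neo4j relationship: a type name plus properties. */
    static final class Rel {
        final String type;
        final Map<String, Object> props = new HashMap<>();

        Rel(String type) {
            this.type = type;
        }

        /** Sets a property and returns this object so calls can be chained. */
        Rel with(String key, Object value) {
            props.put(key, value);
            return this;
        }
    }

    /**
     * Keeps only relationships whose type is "starts" and that carry an
     * "offset" property; anything else (for example a stray FR) is skipped.
     */
    static List<Rel> onlyStarts(Iterable<Rel> candidates) {
        List<Rel> result = new ArrayList<>();
        for (Rel rel : candidates) {
            if (!"starts".equals(rel.type)) {
                continue; // wrong type handed back by the lookup
            }
            if (!rel.props.containsKey("offset")) {
                continue; // reading "offset" here would throw
            }
            result.add(rel);
        }
        return result;
    }
}
```

With the real Neo4j API the equivalent checks would presumably be `rel.isType(...)` and `rel.hasProperty("offset")` on `org.neo4j.graphdb.Relationship`. Note that such a guard only hides the symptom; the upgrade to 3.5.30 is what actually makes the wrong relationships go away.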