[Dxspider-support] Cluster Lock-Up
Dirk Koopman
djk at tobit.co.uk
Tue Oct 9 22:32:41 CEST 2007
Danilo Brelih wrote:
> Anthony (N2KI) wrote:
>
>> Unfortunately this is an intermittent problem. The only good lead is that
>> it happens at the same time (2351Z) when it does.
>
> We crashed tonight at the same time; it seems to be the same problem!
>
> http://s50clx.infrax.si:41115/statistika/msg.html
>
> DX Spider Cluster version 1.54 (build 0.164) on Linux
> Copyright (c) 1998-2007 Dirk Koopman G1TLH
> S50U de S50CLX 9-Oct-2007 1225Z 2.3522 dxspider >
>
I am embarrassed to say that there appears to be a long-standing error
in the PC9x sentence deduping code that caused an enormous loop last
night. This caused a huge spike in the amount of traffic, something that,
somewhat surprisingly, I have not seen before.
As some of you may know, I am using the number of seconds in a day as a
deduping mechanism. This means that at midnight, the number rolls over
from 86399 -> 0. Last night we had a loop because we were getting PC92s
from an (Italian) station whose clock was / is wildly out: PC92 C
records with a time of 85967 and an A record with a time of 518. The
time checking that was being done only worked for clocks that were
nearer UTC than this station's.
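
To illustrate the rollover problem with the numbers above, here is a
minimal Python sketch. It is not the DXSpider code (which is Perl); the
function names and the half-day window are assumptions made purely for
illustration. It shows why a plain comparison treats 518 as older than
85967, and how a modular comparison treats the counter as a circle.

```python
SECONDS_PER_DAY = 86400  # the counter runs 0..86399 and then wraps to 0


def naive_is_newer(t_new: int, t_last: int) -> bool:
    """Plain comparison: gives the wrong answer across the 86399 -> 0 rollover."""
    return t_new > t_last


def circular_is_newer(t_new: int, t_last: int) -> bool:
    """Hypothetical wrap-aware check: t_new counts as newer if it lies
    less than half a day ahead of t_last, moving forwards round the circle."""
    ahead = (t_new - t_last) % SECONDS_PER_DAY
    return 0 < ahead < SECONDS_PER_DAY // 2


# The two times from last night's loop:
print(naive_is_newer(518, 85967))     # plain comparison
print(circular_is_newer(518, 85967))  # wrap-aware comparison
```

With these values the plain comparison rejects 518 as old, while the
wrap-aware one sees it as 951 seconds later than 85967.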
I believe that I have fixed this so I don't rely on other nodes being
synchronised to UTC (although you should really all be).
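
One way to avoid depending on the sender's clock being near UTC is to
compare each incoming time only against the last time seen from the same
origin. The Python sketch below is an assumption about how that could
look, not a description of the actual fix; the class name, the callsign
"NODE-A" and the half-day window are all hypothetical.

```python
SECONDS_PER_DAY = 86400  # seconds-in-day counter wraps at this value


class OriginDeduper:
    """Hypothetical deduper: remembers the last accepted time per origin
    and never consults the local clock."""

    def __init__(self) -> None:
        self.last_seen: dict[str, int] = {}

    def accept(self, origin: str, t: int) -> bool:
        """Return True if t is newer than the last time seen from origin."""
        last = self.last_seen.get(origin)
        if last is not None:
            # Distance forwards round the circle from last to t.
            ahead = (t - last) % SECONDS_PER_DAY
            if not (0 < ahead < SECONDS_PER_DAY // 2):
                return False  # duplicate or older record: drop it
        self.last_seen[origin] = t
        return True


dedupe = OriginDeduper()
print(dedupe.accept("NODE-A", 85967))  # first record from this origin
print(dedupe.accept("NODE-A", 85967))  # same record looping back
print(dedupe.accept("NODE-A", 518))    # later record, after the rollover
print(dedupe.accept("NODE-A", 85967))  # old record looping back again
```

Because the comparison is relative to the origin's own previous time, it
does not matter how far that station's clock is from UTC.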
This means, sadly, and rather embarrassingly, I have to ask people to
upgrade again. However I would wait until tomorrow (or at least 02:00Z)
to see whether, this time, I really have fixed it...
Mea culpa
Dirk
PS Please just upgrade using CVS or CVSlatest.zip/tgz silently. You
don't need to tell the list that you have done it. I will know (or at
least I can find out by looking at the PC92s you send) anyway.