Message boards : Number crunching : More persistant problems, not just gerald

JStateson
Joined: 31 Oct 08
Posts: 154
Credit: 2,552,228,028
RAC: 2,731,312
Message 41573 - Posted: 28 Jul 2015 | 11:55:12 UTC

I have four rigs running this app and serious problems on one.

The rig that seems to have the fewest problems (19 valid out of 22) runs driver 347.52 on three NVIDIA cards (gtx770, gtx650ti and gtx570), and sometimes a work unit is passed among all three.

The others all run 353.30. Looking at many of the work units, I see that sometimes 4 other users have the same error that I do, and sometimes I see the message "too many errors, may have bug" and no validations by any user.


The 1Quad has only 5 valid out of 120 and a pair of gtx670, no SLI.

The 2Quad has 8 valid out of 21 with a pair of gtx570, no SLI.

The PC-7 system has 19 valid out of 25 total and a pair of gtx670, SLI disabled.
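
To make the four hosts easier to compare, the valid/total counts quoted above can be turned into percentages. This is only a quick sketch: the host labels are shorthand made up for this example, and the numbers are just the ones listed in this post.

```python
# Valid/total task counts as quoted in this post.
# The labels are shorthand for this example, not official host names.
hosts = {
    "347.52 rig (gtx770/gtx650ti/gtx570)": (19, 22),
    "1Quad (2x gtx670)": (5, 120),
    "2Quad (2x gtx570)": (8, 21),
    "PC-7 (2x gtx670)": (19, 25),
}


def valid_rate(valid, total):
    """Return the share of valid tasks as a percentage."""
    return 100.0 * valid / total


rates = {name: valid_rate(v, t) for name, (v, t) in hosts.items()}

# Print the worst host first.
for name, rate in sorted(rates.items(), key=lambda kv: kv[1]):
    print(f"{name}: {rate:.1f}% valid")
```

Sorted this way, 1Quad stands out as the host with the serious problem, with 2Quad a distant second.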

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2048
Credit: 14,826,285,069
RAC: 2,412,335
Message 41578 - Posted: 28 Jul 2015 | 15:27:39 UTC - in response to Message 41573.

The 1Quad has only 5 valid out of 120 and a pair of gtx670, no SLI.

The 2Quad has 8 valid out of 21 with a pair of gtx570, no SLI.

The errors ("the simulation became unstable" and "file force.cpp line 513: TCL evaluation of [calcforces]") on these two hosts are definitely caused by the "summer overclock bug": your cards can't sustain these high clocks (GPU and/or RAM) at these high temperatures.
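
One way to check whether heat is a factor is to log temperature and clocks while a task runs. Below is a rough sketch that parses the CSV output of `nvidia-smi --query-gpu=... --format=csv,noheader,nounits`. The helper names, the 80 C threshold, and the sample readings are assumptions for illustration only, not measurements from these hosts, and older cards may report some fields as unsupported.

```python
import subprocess

# Fields requested from nvidia-smi, in column order.
QUERY = "index,name,temperature.gpu,clocks.sm,clocks.mem"


def to_int(field):
    """Convert a CSV field to int, or None if the card does not report it."""
    try:
        return int(field)
    except ValueError:
        return None  # e.g. "[Not Supported]" or "N/A" on older cards


def parse_gpu_csv(text):
    """Parse noheader/nounits CSV output into one dict per GPU."""
    rows = []
    for line in text.strip().splitlines():
        index, name, temp, sm, mem = [f.strip() for f in line.split(",")]
        rows.append({
            "index": int(index),
            "name": name,
            "temp_c": to_int(temp),
            "sm_mhz": to_int(sm),
            "mem_mhz": to_int(mem),
        })
    return rows


def hot_gpus(rows, limit_c=80):
    """Return GPUs at or above limit_c (an arbitrary example threshold)."""
    return [r for r in rows if r["temp_c"] is not None and r["temp_c"] >= limit_c]


def read_gpus():
    """Query the local driver; requires nvidia-smi on the PATH."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_csv(result.stdout)


# Hypothetical sample output, used here so the parser can be tried offline.
SAMPLE = (
    "0, GeForce GTX 670, 83, 1058, 3004\n"
    "1, GeForce GTX 670, 71, 1058, 3004\n"
)

for gpu in hot_gpus(parse_gpu_csv(SAMPLE)):
    print(f"GPU {gpu['index']} ({gpu['name']}) is at {gpu['temp_c']} C")
```

On a real host, replace `parse_gpu_csv(SAMPLE)` with `read_gpus()` and call it periodically. If failures line up with the hottest readings, lowering clocks or improving airflow is the next thing to try.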

Robby1959
Joined: 11 May 13
Posts: 34
Credit: 412,672,178
RAC: 69,715
Message 41596 - Posted: 31 Jul 2015 | 2:53:29 UTC

My machine (650 ti) has been erroring out for 10 days. I had a 340 driver, just moved it to a 350, and reseated the card.

Robby1959
Joined: 11 May 13
Posts: 34
Credit: 412,672,178
RAC: 69,715
Message 41606 - Posted: 2 Aug 2015 | 16:20:50 UTC - in response to Message 41596.

FYI, I pointed a fan into the area as well (no overclocking, btw), and so far I have run several tasks with no problems.
