Message boards : Graphics cards (GPUs) : Anomalies in the speed of processing WU

TomaszPawel
Joined: 18 Aug 08
Posts: 121
Credit: 59,836,411
RAC: 0
Message 10179 - Posted: 26 May 2009 | 8:01:51 UTC
Last modified: 26 May 2009 | 8:03:28 UTC

Can somebody explain why this sample was computed by a GTX 280 in 25145 s but took 80785 s on a GTX 260?

Where does such a big difference in computation time come from?
____________
POLISH NATIONAL TEAM - Join! Crunch! Win!
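
As a quick sanity check on the two run times quoted in the opening post, the arithmetic can be sketched in Python. The function name `slowdown` is only illustrative and is not part of BOINC or GPUGRID.

```python
def slowdown(fast_seconds: float, slow_seconds: float) -> float:
    """Return how many times longer the slow run took than the fast one."""
    if fast_seconds <= 0:
        raise ValueError("fast_seconds must be positive")
    return slow_seconds / fast_seconds


# Run times taken from the opening post.
gtx280_s = 25145
gtx260_s = 80785

print(f"GTX 280: {gtx280_s / 3600:.2f} h")
print(f"GTX 260: {gtx260_s / 3600:.2f} h")
print(f"Slowdown: {slowdown(gtx280_s, gtx260_s):.2f}x")
```

The GTX 260 run took roughly 3.2 times as long, which is far more than the hardware difference between the two cards would normally explain.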

Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Message 10188 - Posted: 26 May 2009 | 13:28:09 UTC

There is a bug I have been chasing that can hit many versions of the BOINC client but seemed to be especially bad in 6.6.20 ... this bug causes tasks to run long, up to 2 to 4 times as long as expected. One participant was running AP tasks on his CPU and was hit ... I get it with GPUGRID tasks ... and that may be what happened here ...

TomaszPawel
Joined: 18 Aug 08
Posts: 121
Credit: 59,836,411
RAC: 0
Message 10194 - Posted: 26 May 2009 | 18:08:21 UTC - in response to Message 10188.
Last modified: 26 May 2009 | 18:14:30 UTC

I don't know what is going on....

I am now processing 2-KASHIF_HIVPR_n1_for_ba3-9-100-RND0099_0; it has been crunching for 15h40m and it says 7h remaining....

I see another task waiting, and it says it will take 15h.....

Usually on the GTX 260 it takes 8h per WU.

Are the new WUs more complicated or what?

Has a longer runtime been set for them?

2009-05-26 07:01:42 Starting BOINC client version 6.6.28 for windows_intelx86
2009-05-26 07:01:42 log flags: task, file_xfer, sched_ops
2009-05-26 07:01:42 Libraries: libcurl/7.19.4 OpenSSL/0.9.8j zlib/1.2.3
2009-05-26 07:01:42 Running as a daemon
2009-05-26 07:01:42 Running under account boinc_master
2009-05-26 07:01:43 Processor: 4 GenuineIntel Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz [x86 Family 6 Model 23 Stepping 10]
2009-05-26 07:01:43 Processor features: fpu tsc pae nx sse sse2 mmx
2009-05-26 07:01:43 OS: Microsoft Windows XP: Professional x86 Edition, Dodatek Service Pack 2, (05.01.2600.00)
2009-05-26 07:01:43 Memory: 3.25 GB physical, 7.07 GB virtual
2009-05-26 07:01:43 Disk: 32.00 GB total, 4.94 GB free
2009-05-26 07:01:43 Local time is UTC +2 hours
2009-05-26 07:01:43 CUDA device: GeForce GTX 260 (driver version 18250, compute capability 1.3, 896MB, est. 85GFLOPS)
2009-05-26 07:01:43 Not using a proxy

Is it correct that it uses CUDA 1.3?
____________
POLISH NATIONAL TEAM - Join! Crunch! Win!

Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Message 10198 - Posted: 26 May 2009 | 18:41:51 UTC - in response to Message 10194.
Last modified: 26 May 2009 | 18:44:17 UTC

I don't know what is going on....

I am now processing 2-KASHIF_HIVPR_n1_for_ba3-9-100-RND0099_0; it has been crunching for 15h40m and it says 7h remaining....

I see another task waiting, and it says it will take 15h.....

Usually on the GTX 260 it takes 8h per WU.

Are the new WUs more complicated or what?

Has a longer runtime been set for them?

...

Is it correct that it uses CUDA 1.3?

Try stopping and restarting BOINC: use the Advanced menu to stop the connected client, stop the Manager, then restart and see if the completion speed picks back up.

{edit} length and to add:

We may be seeing the evidence of my contention that the long running task issue of 6.6.20 may not be fully dead... I have no real hint as to what is happening and forwarded my notes such as they are to BOINC Alpha where I saw this happen to me on 6.6.28 just the other day ... only for me it was a pair of tasks running for 24 hours ... each ... normal time is like yours, about 6 hours and change ... ugh!

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 10206 - Posted: 26 May 2009 | 20:57:27 UTC - in response to Message 10198.

Problem: as far as I know, in the case of the "6.6.20 bug" the runtime and time per step are displayed as normal, which is not the case here.

It could be that the user with the GTX 260 was putting some heavy graphical load onto the system or the CPU.. sadly we can't know that.

Is it correct that it uses CUDA 1.3?


It doesn't say so. It says your card has hardware capability level 1.3, which is indeed correct.

MrS
____________
Scanning for our furry friends since Jan 2002
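
To make the distinction above concrete, here is a minimal Python sketch that pulls the fields out of the `CUDA device:` log line TomaszPawel posted. The regular expression is an assumption that matches only the format of that one line; other BOINC client versions may print the device differently, and `parse_cuda_device` is a hypothetical helper, not a BOINC API.

```python
import re

# Log line copied from the BOINC startup messages earlier in the thread.
LOG_LINE = (
    "2009-05-26 07:01:43 CUDA device: GeForce GTX 260 "
    "(driver version 18250, compute capability 1.3, 896MB, est. 85GFLOPS)"
)

# Assumed pattern, written to fit the single line above.
PATTERN = re.compile(
    r"CUDA device: (?P<name>.+?) \(driver version (?P<driver>\d+), "
    r"compute capability (?P<cc>\d+\.\d+), (?P<mem_mb>\d+)MB, "
    r"est\. (?P<gflops>\d+)GFLOPS\)"
)


def parse_cuda_device(line):
    """Return the device fields as a dict, or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return {
        "name": match["name"],
        "driver": match["driver"],
        # Compute capability describes the hardware, not a CUDA toolkit version.
        "compute_capability": match["cc"],
        "memory_mb": int(match["mem_mb"]),
        "est_gflops": int(match["gflops"]),
    }


print(parse_cuda_device(LOG_LINE))
```

The `1.3` in the log ends up in `compute_capability`, a property of the card itself, which is the point made in the reply above.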

TomaszPawel
Joined: 18 Aug 08
Posts: 121
Credit: 59,836,411
RAC: 0
Message 10208 - Posted: 26 May 2009 | 21:06:13 UTC - in response to Message 10206.
Last modified: 26 May 2009 | 21:33:12 UTC

Hi,

On the system with the GTX 260, only GPUGRID and 4 Rosetta@Home tasks on the CPU were running. Nothing else was running...

P.S.

This WU ran for 19h, but it says it ran for 9h.... strange....
____________
POLISH NATIONAL TEAM - Join! Crunch! Win!

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 10211 - Posted: 26 May 2009 | 21:29:40 UTC - in response to Message 10208.

Oh, so we do know what the user was doing :)

Here it seems like your card switched into 2D mode. This is not shown in the task output of the very long one, but it could have switched into 2D after the task was started. That would be enough to explain the speed difference.

MrS
____________
Scanning for our furry friends since Jan 2002

TomaszPawel
Joined: 18 Aug 08
Posts: 121
Credit: 59,836,411
RAC: 0
Message 10213 - Posted: 26 May 2009 | 21:35:58 UTC - in response to Message 10211.
Last modified: 26 May 2009 | 22:05:44 UTC

After the last WU finished I restarted the project....

This computer is mine, so I know what the user was doing :)

OK, let's assume that 2D is the reason.
How do I prevent the GTX 260 from going into 2D mode while GPUGRID is running?

I installed BOINC as a service; the computer crunches without anyone logging into an account....
____________
POLISH NATIONAL TEAM - Join! Crunch! Win!

Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Message 10218 - Posted: 27 May 2009 | 1:10:23 UTC

Gotta remember to check for that lower clock mode the next time I see a slow task ...

The question *I* have is why stopping and restarting BOINC is sufficient to kick it back to higher speeds...

Vid Vidmar*
Joined: 27 Aug 08
Posts: 18
Credit: 1,146,374
RAC: 0
Message 10227 - Posted: 27 May 2009 | 8:25:52 UTC - in response to Message 10218.

It happened to me too with 6.6.28.
Yesterday, before I went to work, I noticed a GPUGRID task sitting @ 72% done. When I got back home 12h later, it was still sitting on that 72% mark and still showing as running (checked the logs and indeed it was "running" all that time). Restarting BOINC kicked it back into motion.
Before this event I had a couple of SETI CUDA MB tasks that showed similar symptoms, however, those were stuck @ 0% and I don't believe that even restarting BOINC did any good - aborting them did. ;)
So, if I hadn't been experimenting with how BOINC handles multiple GPU projects, I'd still be running 6.5.0 on that machine as I do on others. And honestly, I wouldn't recommend any 6.6 version yet.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 10234 - Posted: 27 May 2009 | 21:45:19 UTC - in response to Message 10227.

I don't think Tomasz's problem is hanging WUs, it looks like they're just rather slow. Trying 6.5.0 wouldn't hurt, though.. just to exclude any 6.6.x strangeness.

MrS
____________
Scanning for our furry friends since Jan 2002

TomaszPawel
Joined: 18 Aug 08
Posts: 121
Credit: 59,836,411
RAC: 0
Message 10247 - Posted: 28 May 2009 | 7:43:42 UTC - in response to Message 10234.

WTF...

I installed a new GTX260 216 yesterday in another computer.

This WU was calculated.

It took 14h..... but the log says it took about 8h.....

So what is happening... are the WUs longer or what?

XP SP3 32-bit, 6.6.28 and 182.50...


____________
POLISH NATIONAL TEAM - Join! Crunch! Win!

uBronan
Joined: 1 Feb 09
Posts: 139
Credit: 575,023
RAC: 0
Message 10250 - Posted: 28 May 2009 | 10:43:04 UTC

Can you test whether BOINC 6.5.0 runs more smoothly than the one you're using now? Just to remind you, this older version will not show the actual time but only the CPU time.
But I would like to know if units will complete normally or have the same issues.
I am also wondering if the problems are related to the screensaver, since I guess all of us have set up some protection for our screens in some way.
So could it be that when the screensaver kicks in, the cards switch to the lower 2D mode? Even though I have turned off the power-saving mode, I have my machine set to show a black screen after 10 minutes.

Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Message 10263 - Posted: 28 May 2009 | 17:01:42 UTC - in response to Message 10247.

WTF...

I installed a new GTX260 216 yesterday in another computer.

This WU was calculated.

It took 14h..... but the log says it took about 8h.....

So what is happening... are the WUs longer or what?

XP SP3 32-bit, 6.6.28 and 182.50...

This is one of the problems I am chasing... ETA and I are having a debate (sometimes on the side so you don't see it) about how real this is, how common, and on and on and on ...

6.6.20 was REAL bad with this error (it also affected CPU only tasks with .20 but I was never able to pin down a cause) ... it seems less common with other 6.6.x versions, but it does not appear to be gone... What *I* had suspected it was did not pan out the last time I caught a pair of tasks doing this ... now I need to look at something else the next time I get one ...

Question, was this a single GPU system, or did it have multiple GPUs in the box?

Two things, next time it happens, see if the GPU has dropped back into a slower clock regime and secondly if it strikes again, usually stopping and restarting BOINC will clear this up... you have to shut down the client, not just close the BOINC Manager ... a reboot will do it too ...

As always, if you see one, report as much as you can so we can pin this one down ...

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 10273 - Posted: 28 May 2009 | 21:28:41 UTC

It's not the WUs. I second trying 6.5.0 and watching the clock speed. Another idea: is someone using the machine? If so it might just be the "run GPU while computer is in use" setting, which is disabled by default in newer BOINC clients.

MrS
____________
Scanning for our furry friends since Jan 2002

TomaszPawel
Joined: 18 Aug 08
Posts: 121
Credit: 59,836,411
RAC: 0
Message 10284 - Posted: 29 May 2009 | 6:29:40 UTC - in response to Message 10273.

hmmm

2D?!

On my second system the clock drops to 2D....

I don't know how to prevent this....

I have enabled using the GPU while the computer is in use....
____________
POLISH NATIONAL TEAM - Join! Crunch! Win!

Andrew
Joined: 9 Dec 08
Posts: 29
Credit: 18,754,468
RAC: 0
Message 10301 - Posted: 29 May 2009 | 14:54:29 UTC - in response to Message 10284.

When I last updated Boinc (6.6.20 seems to be working fine at the mo and with all the discussion probs best not to update!), it said in the installation wizard that GPU computation wouldn't work if you installed Boinc as a service.

I don't know whether that has changed. If only you use your computer, you could just set it to run at startup. Running as a service only really matters if your computer is going to be left on the login screen for a long time. You can always lock your computer if you're logged in but away.

Hope that helps.

p.s. is your card overclocked? If overclocked then surely it'll have to be in 3D otherwise it might not have high enough voltage to run the high clock rates (2D runs your card at lower voltage)

TomaszPawel
Joined: 18 Aug 08
Posts: 121
Credit: 59,836,411
RAC: 0
Message 10321 - Posted: 30 May 2009 | 8:30:25 UTC - in response to Message 10273.
Last modified: 30 May 2009 | 9:14:49 UTC

ExtraTerrestrial Apes,

you are 100% right.....

The card goes into 2D mode....

I left the computer yesterday without touching it....

It completed 1 WU and started another.

And now it has completed the second WU.

This WU

So 2D !!!!

Now the question is: what causes the card to go into 2D mode? An NVIDIA bug? A BOINC bug?

It is not the service.... On XP the service works quite well.

Maybe I will try the 182.06 drivers.... I don't know...


P.S.

I have a PALIT 896MB GTX 260 625/2200 Sonic 55nm :)

And a standard Gigabyte GTX260 SP192


I also noticed that Remote Desktop causes an error in acemd 6.64.... it crashes...
____________
POLISH NATIONAL TEAM - Join! Crunch! Win!

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 10501 - Posted: 12 Jun 2009 | 20:19:34 UTC - in response to Message 10321.

I also noticed that Remote Desktop causes an error in acemd 6.64.... it crashes...


That's a known problem, and unfortunately inherent to the way most remote software works (VNC is OK though).

Otherwise.. did you fix the 2D clock speed issue? I never had a NV card with power-saving lower 2D clocks, so I can't tell. Try switching the windows power profile?

MrS
____________
Scanning for our furry friends since Jan 2002

popandbob
Joined: 18 Jul 07
Posts: 67
Credit: 40,277,822
RAC: 0
Message 10524 - Posted: 13 Jun 2009 | 4:48:11 UTC

The easiest way to stop 2D mode is to install ATItool and set all the clocks at the 3D rate. (Yes, ATItool works on nVidia.)
Bob

MarkJ
Volunteer moderator
Volunteer tester
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Message 10525 - Posted: 13 Jun 2009 | 8:22:12 UTC - in response to Message 10301.

When I last updated Boinc (6.6.20 seems to be working fine at the mo and with all the discussion probs best not to update!), it said in the installation wizard that GPU computation wouldn't work if you installed Boinc as a service.


Running as a service (running as a daemon or protected application mode) is a security issue with Vista. As far as I know it works fine under XP and Linux.

As for BOINC upgrades, there are a few CUDA-related fixes after 6.6.30, so if you are running Windows it might be worthwhile upgrading to 6.6.36.
____________
BOINC blog

Johnny Maddux
Joined: 17 May 09
Posts: 5
Credit: 63,518,568
RAC: 0
Message 10614 - Posted: 16 Jun 2009 | 20:54:51 UTC

Here is another case: I have 2 GTX 295s that usually run a task in 10-12 hours. I was surprised to see this yesterday, when both had run for 22 hours and still had 10-20% to go. I'm running 6.6.31 on Vista 32-bit with 4 GB RAM.

I have heard of this but this is the first time I have seen it.

Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Message 10619 - Posted: 17 Jun 2009 | 4:48:59 UTC - in response to Message 10614.

Here is another case: I have 2 GTX 295s that usually run a task in 10-12 hours. I was surprised to see this yesterday, when both had run for 22 hours and still had 10-20% to go. I'm running 6.6.31 on Vista 32-bit with 4 GB RAM.

I have heard of this but this is the first time I have seen it.

Stop and restart BOINC, make sure the science apps stop too ... simplest way is to reboot...

This is the dreaded 6.6.20 issue which though less common on later versions is still there. Worse, a friend actually had it hit him on CPU tasks ... so, it is a core problem that we have no clear idea why it is happening.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 10638 - Posted: 17 Jun 2009 | 20:56:01 UTC - in response to Message 10614.

If restarting BOINC doesn't help you could / should check if your GPU dropped to 2D clocks (e.g. with GPU-Z).

MrS
____________
Scanning for our furry friends since Jan 2002
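
The 2D clock check suggested above can be reduced to a one-line comparison. This is only a sketch: it assumes the clock values are read manually (for example from GPU-Z), the function name and the 0.9 tolerance are hypothetical, the 625 figure from the card description earlier in the thread is assumed to be the core clock in MHz, and the 300 MHz idle reading is made up for illustration.

```python
def dropped_to_2d(current_mhz: float, full_3d_mhz: float,
                  tolerance: float = 0.9) -> bool:
    """Return True when the observed core clock is well below the full 3D clock."""
    if full_3d_mhz <= 0:
        raise ValueError("full_3d_mhz must be positive")
    return current_mhz < tolerance * full_3d_mhz


# 625 MHz: assumed full 3D core clock of the Palit GTX 260 mentioned earlier.
# 300 MHz: hypothetical reading taken while a task is running slowly.
print(dropped_to_2d(300, 625))  # True
print(dropped_to_2d(625, 625))  # False
```

A `True` result while a GPUGRID task is running would point at the power-saving clocks rather than at the work unit or the BOINC client.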

TomaszPawel
Joined: 18 Aug 08
Posts: 121
Credit: 59,836,411
RAC: 0
Message 10852 - Posted: 25 Jun 2009 | 17:37:08 UTC - in response to Message 10638.

Hi!

Can anyone explain this error to me:

WU

Thnx
____________
POLISH NATIONAL TEAM - Join! Crunch! Win!

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 10858 - Posted: 25 Jun 2009 | 20:57:29 UTC - in response to Message 10852.

"incorrect function. (0x1) - exit code 1 (0x1)" with the app being terminated at a random code line. It's the most common error and mostly related to GPU-OC or defective hardware, but there can be other reasons as well.

MrS
____________
Scanning for our furry friends since Jan 2002
