Message boards : Number crunching : 0.973 CPUs + 1 GPU

loki126
Joined: 18 Nov 08
Posts: 14
Credit: 30,687,791
RAC: 0
Message 41311 - Posted: 12 Jun 2015 | 16:21:04 UTC

Are there any drawbacks to running 1 GPUGrid WU and 4 CPU tasks (Intel i5) at the same time?

skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,991,617,060
RAC: 146,649
Message 41312 - Posted: 12 Jun 2015 | 16:28:19 UTC - in response to Message 41311.

Yes, the GPUGrid task will run a bit slower on your GTX970.
If you configure BOINC to run 3 CPU tasks instead of 4, the GPUGrid task will run faster.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

loki126
Joined: 18 Nov 08
Posts: 14
Credit: 30,687,791
RAC: 0
Message 41314 - Posted: 12 Jun 2015 | 16:52:26 UTC

Wouldn't setting the core usage to 75% give the same result? That is, 1 GPU task and 3 CPU tasks running, plus 1 idle core?

Or do I have to make changes to my app_config.xml?

skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,991,617,060
RAC: 146,649
Message 41315 - Posted: 12 Jun 2015 | 17:08:27 UTC - in response to Message 41314.

Yes, it's sufficient to set CPU usage to 75% in BOINC Manager.
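
As a rough illustration of why 75% on a four-core i5 leaves one core free for the GPU task, here is a minimal sketch. The function name `usable_cpus` is hypothetical, and the rounding rule (truncate, with a floor of one CPU) is an assumption about how the percentage setting is applied, not something stated in this thread.

```python
def usable_cpus(total_cpus: int, max_pct: float) -> int:
    """CPUs available to CPU tasks for a given 'use at most N% of the CPUs' value.

    Assumption: the product is truncated to a whole number and never drops below 1.
    """
    return max(1, int(total_cpus * max_pct / 100))


# A 4-core i5 at a few different percentage settings.
for pct in (100, 75, 50):
    print(f"{pct}% of 4 cores -> {usable_cpus(4, pct)} CPU tasks")
```

With 75% on four cores, this gives 3 CPU tasks, matching the "3 CPU tasks + 1 idle core" layout discussed above.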

loki126
Joined: 18 Nov 08
Posts: 14
Credit: 30,687,791
RAC: 0
Message 41316 - Posted: 12 Jun 2015 | 17:12:20 UTC

Ok. Thanks very much.

Jim1348
Joined: 28 Jul 12
Posts: 695
Credit: 1,371,992,468
RAC: 3
Message 41317 - Posted: 12 Jun 2015 | 21:19:24 UTC
Last modified: 12 Jun 2015 | 21:38:38 UTC

I just tried out a new GTX 970 to answer that very question, and saw only a minor difference, though it will depend on your CPU, etc. In my case, I have an i5-3550, which has four real cores running CPDN work units, and I did not initially reserve a core for the GPU.

It ran the e2s105_e1s274f90-NOELIA_ETQunboundx1-1-2-RND6708_0 work unit in 4 hours 11 minutes 10 seconds.

Then I reserved a core, and it ran e5s56_e1s534f55-NOELIA_ETQunboundx2-0-2-RND1360_0 in 4 hours 5 minutes 49 seconds.

I ran only one work unit in each case, so it is not much of a sample, but I think the differences look rather small. (Win7 64-bit, 352.86 drivers).
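
To put a number on the two runtimes quoted above, here is a small sketch that converts them to seconds and compares them. The helper name `to_seconds` is made up for this example, and since the two runs used different work units, the result is only a rough indication.

```python
def to_seconds(hours: int, minutes: int, seconds: int) -> int:
    """Convert an h:m:s runtime to seconds."""
    return hours * 3600 + minutes * 60 + seconds


no_core_reserved = to_seconds(4, 11, 10)  # all 4 cores busy with CPU tasks
core_reserved = to_seconds(4, 5, 49)      # one core left free for the GPU task

saved = no_core_reserved - core_reserved
time_reduction_pct = 100 * saved / no_core_reserved
throughput_gain_pct = 100 * (no_core_reserved / core_reserved - 1)

print(f"Saved {saved} s per work unit")
print(f"Runtime reduction: {time_reduction_pct:.2f}%")
print(f"Throughput gain:   {throughput_gain_pct:.2f}%")
```

Both ways of expressing the difference come out at a little over 2%.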

skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,991,617,060
RAC: 146,649
Message 41319 - Posted: 13 Jun 2015 | 6:01:32 UTC - in response to Message 41317.
Last modified: 13 Jun 2015 | 6:16:46 UTC

That is a bit faster, albeit only ~2.5%. This is because the latest apps are quite good at utilizing the GPU without much CPU reliance, and you have 4 real cores with 4 true threads. On CPUs with HT the difference tends to be higher.
Using SWAN_SYNC=1 might squeeze out a bit more GPU performance, but again only a few percent.
Increasing the GPU's GDDR5 clock up to 3500MHz (on the GM204s it typically drops to 3005MHz) also helps, but only by 0.5% to 3% IIRC. However, if you are running more than one task at a time on your GPU, it's likely to be more important, as the MCU tends to be higher (varies by task type).
You might also be able to OC slightly.
These improvements multiply, and together they can make a significant overall improvement:
Using 3 of 4 CPU cores = 102.5%
Using SWAN_SYNC = 102.5%
3500MHz GDDR5 = 101%
Running 2 tasks = 128% (varies by task type)
GPU OC of 3% = 103%
Overall = 1.025*1.025*1.01*1.28*1.03*100% = 139.899% (~40% more work).
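
The multiplication above can be checked with a few lines of Python. This is just a sketch of the arithmetic; the dictionary keys are labels chosen for this example, and the factors are the estimates quoted in the list.

```python
import math

# Estimated throughput factors from the list above (1.025 means +2.5%).
factors = {
    "3 of 4 CPU cores": 1.025,
    "SWAN_SYNC": 1.025,
    "3500MHz GDDR5": 1.01,
    "2 tasks per GPU": 1.28,
    "3% GPU overclock": 1.03,
}

combined = math.prod(factors.values())
added = 1 + sum(f - 1 for f in factors.values())

print(f"Multiplied: {combined * 100:.3f}%")
print(f"Simply added: {added * 100:.1f}%")
```

Multiplying gives about 139.9%, while naively adding the individual gains would give 137%, so the compounding is worth roughly three extra percentage points here.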

The climate models tax the CPU more heavily than other CPU tasks, but with 4 cores/4 threads it's not a big problem. On an i7 (4 cores/8 threads) it is a problem, as each task competes for the same resources (cores). There is almost zero difference between running 7 climate models and running 8 on an i7.

Jim1348
Joined: 28 Jul 12
Posts: 695
Credit: 1,371,992,468
RAC: 3
Message 41323 - Posted: 13 Jun 2015 | 9:24:05 UTC - in response to Message 41319.

Undoubtedly correct, but my main takeaway is that the GTX 970, though very good, is not quite as great as the GTX 750 Ti for efficiency. But I am spoiled by the latter, so I think I will add a couple more of them to GPUGrid, and use the GTX 970 for Folding, where its extra speed is rewarded by the quick-return bonus. It all follows along pretty well to the performance results you gave earlier (https://www.gpugrid.net/forum_thread.php?id=1150&nowrap=true#41294), so it is not really a surprise.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2048
Credit: 14,826,576,669
RAC: 2,426,205
Message 41324 - Posted: 13 Jun 2015 | 12:21:37 UTC - in response to Message 41323.

... and use the GTX 970 for Folding, where its extra speed is rewarded by the quick-return bonus.

There is a quick-return bonus in GPUGrid also: it's 50% for less than 24h, and 25% for less than 48h. Your GTX750Ti finishes the longest Gerards just under 24h, so if the workunits get any longer (I'm sure they will as GPUs get faster) your GTX750Tis might miss the 50% bonus. I don't recommend buying two or more lesser cards instead of one bigger one if they are from the same generation.
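
The bonus tiers described above can be written down as a small function. This is a sketch only: the name `bonus_multiplier` is hypothetical, and it assumes the clock runs from when the work unit is sent to when the result is reported, which the thread does not spell out.

```python
def bonus_multiplier(turnaround_hours: float) -> float:
    """Credit multiplier for a given turnaround time.

    Tiers as quoted in the thread: +50% under 24h, +25% under 48h, otherwise none.
    """
    if turnaround_hours < 24:
        return 1.5
    if turnaround_hours < 48:
        return 1.25
    return 1.0


# A card finishing just under 24h keeps the full bonus; a slightly longer
# work unit pushes the same card into the 25% tier.
for hours in (23.5, 24.5, 50):
    print(f"{hours} h -> x{bonus_multiplier(hours)}")
```

This is why a card that finishes "just under 24h" is exposed: a small increase in work unit length cuts its credit per task from 1.5x to 1.25x.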

Jim1348
Joined: 28 Jul 12
Posts: 695
Credit: 1,371,992,468
RAC: 3
Message 41325 - Posted: 13 Jun 2015 | 12:30:10 UTC - in response to Message 41324.

I am well aware of the 24 hour bonus, and the GTX 750 Tis can do it. The work units might get longer, but also GPUGrid might come up with improved software. And the new Noelia ETQunboundx show that they might go down again too. Moving cards around is just part of the process, and the GTX 750 Tis are the most efficient for any project I have used them on, so they will find a home somewhere.

loki126
Joined: 18 Nov 08
Posts: 14
Credit: 30,687,791
RAC: 0
Message 41418 - Posted: 27 Jun 2015 | 7:31:43 UTC
Last modified: 27 Jun 2015 | 7:34:47 UTC

I didn't want to open a new thread, so here is another question.

How do I limit the interval at which WUs are sent? If I get 2 Gerards in a row, the second one has a high probability of not getting done within 24 hours, and I'd like to avoid that.

Also, has anyone figured out what the best clocks for a 970 are when underclocking to get the best performance per Watt?

Jacob Klein
Joined: 11 Oct 08
Posts: 1111
Credit: 1,813,587,539
RAC: 893,726
Message 41432 - Posted: 28 Jun 2015 | 5:35:42 UTC - in response to Message 41418.
Last modified: 28 Jun 2015 | 5:37:48 UTC

If you don't want work to be cached, then set your cache settings to not get it!

Open BOINC Manager, find the "Computing preferences...", and look for the following 2 settings:
- Store at least x days of work
- Store up to an additional y days of work

(x + y) is, essentially, the maximum that BOINC tries to cache for a resource.
You are in control, and if you want, you can set both to 0, for maximum bonus credits, I believe.

skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,991,617,060
RAC: 146,649
Message 41436 - Posted: 28 Jun 2015 | 9:27:33 UTC - in response to Message 41418.
Last modified: 28 Jun 2015 | 9:47:03 UTC

Also. Has someone figured out what the best clocks for a 970 are when underclocking to get best performance per Watt?

Performance/Watt needs to be taken at the system level (not the GPU level), and measured at the wall.
Your i5-2500K (95W TDP) and supporting hardware's power usage would be as much of a concern as the 970's clocks. You really need to tune the system, not just the GPU (though that in itself helps a bit).

The simplest way to control performance on the Maxwells is just to throttle the power. You could set it to 90%, 80% or less. The issue then would be that its P-state might drop from P2 to P5 - assuming you don't want to run a 970 at 405MHz! This can be controlled using NVIDIA Inspector (overclocking), but it's quite confusing when you first start to use it.

My i7-3770K @3.7GHz and 2 GTX970's (W7):
4 GPUGrid tasks running on 2 GTX970's + 2 climate models uses 423W (SWAN_SYNC in use). Estimated throughput gain is >15% from running 2tasks/GPU.

Note that in this setup GPUGrid is using 4 CPU threads to support 4 GPU tasks.

Without the 2 climate models running, the power drops to 415W (only 8W less, which is typical given that the CPU is already being well used).

If I then suspend one GPUGrid task, the system power at the wall drops to 409W (another 6W less), and with only 1 task running per GPU it drops to 390W (19W less). My GPUs' power is capped at 100%, so they are using 145W each (290W), and the rest of the system is therefore using 100W (but it is an i7).

In the same configuration, the 390W power usage drops to 370W when I reduce the CPU's clock from 3.7GHz to 3.1GHz - indicating a system (excluding GPUs) power usage of 80W.
With 4 GPUGrid tasks and 2 climate models running and the CPU at 3.1GHz the system power draw is 397W (18W less simply due to reducing the CPU clocks). If I tweaked the CPU voltage that might be reduced further, and I could drop the CPU clocks more. While that would probably not be productive in terms of system performance/Watt it might allow the system to run cooler and more quietly (if that was the desire).
If you go through the above procedure (adapted to your system and configuration) and run a task with different test setups you can work out what configuration brings the best Performance/Watt based on the runtimes.
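
One way to organise that comparison is sketched below. The wall-power readings are the ones quoted in this post; the function names and the runtimes in the final loop are hypothetical placeholders, so substitute the runtimes you actually measure. The rest-of-system figure assumes, as the post does, that each GPU draws its full 145W cap.

```python
# Wall-power readings (watts) quoted in the post above.
WALL_WATTS = {
    "4 GPU tasks + 2 climate models @ 3.7GHz": 423,
    "4 GPU tasks @ 3.7GHz": 415,
    "3 GPU tasks @ 3.7GHz": 409,
    "2 GPU tasks @ 3.7GHz": 390,
    "2 GPU tasks @ 3.1GHz": 370,
}
GPU_WATTS = 2 * 145  # two GTX 970s, assumed to run at their 145W cap


def rest_of_system(wall_watts: float, gpu_watts: float = GPU_WATTS) -> float:
    """Power drawn by everything except the GPUs."""
    return wall_watts - gpu_watts


def tasks_per_kwh(runtime_hours: float, concurrent_tasks: int, wall_watts: float) -> float:
    """GPU tasks completed per kWh drawn at the wall."""
    energy_kwh = wall_watts / 1000 * runtime_hours
    return concurrent_tasks / energy_kwh


print(rest_of_system(WALL_WATTS["2 GPU tasks @ 3.7GHz"]))  # rest of system at 3.7GHz
print(rest_of_system(WALL_WATTS["2 GPU tasks @ 3.1GHz"]))  # rest of system at 3.1GHz

# Hypothetical runtimes (hours per task); replace with your own measurements.
for label, runtime_h, n_tasks in [
    ("2 GPU tasks @ 3.7GHz", 4.0, 2),
    ("2 GPU tasks @ 3.1GHz", 4.1, 2),
]:
    score = tasks_per_kwh(runtime_h, n_tasks, WALL_WATTS[label])
    print(f"{label}: {score:.3f} tasks/kWh")
```

The configuration with the highest tasks/kWh is the most efficient at the system level, which is the figure that matters when measuring at the wall.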
