Message boards : Number crunching : SWAN_SYNC, To Half or Half Not

Aurum
Joined: 12 Jul 17
Posts: 399
Credit: 13,024,100,382
RAC: 766,238
Message 51158 - Posted: 31 Dec 2018 | 18:57:53 UTC

I built a new machine, Rig-27, last night around a Xeon E5-2603 v4 with 6c/6t (no hyperthreading). The MSI X99 motherboard has 3 slots with an EVGA 1080 Ti in each. Without thinking about it, I set it up the way I did before I used SWAN_SYNC (two tasks per GPU), but with SWAN_SYNC enabled:
<app_config>
  <app>
    <name>acemdlong</name>
    <gpu_versions>
      <cpu_usage>1.0</cpu_usage>
      <gpu_usage>0.5</gpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>acemdshort</name>
    <gpu_versions>
      <cpu_usage>1.0</cpu_usage>
      <gpu_usage>0.5</gpu_usage>
    </gpu_versions>
  </app>
</app_config>

The System Monitor shows all 6 CPUs pegged at 100%. I looked at all their stderr files and see no signs of problems.

Am I wasting 3 CPUs???

Is there a way to tell if each CPU is synced (or mapped) to a single WU???

Or are 3 CPUs mapped to the 3 GPUs while the other 3 just sit there repeating, "Is anyone home? Are we anywhere yet?", i.e. wasted?
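
For what it's worth, the arithmetic behind the six busy cores can be sketched as below. This is only an illustration: the function names are made up, and it assumes that a gpu_usage of 0.5 means two tasks share each GPU and that cpu_usage is the number of cores the client budgets per task (a scheduling budget, not a cap on what the task actually uses).

```python
# Sketch: how many CPU cores are budgeted for GPU tasks under an app_config
# like the one above. Function names are illustrative, not part of BOINC.

def concurrent_gpu_tasks(num_gpus: int, gpu_usage: float) -> int:
    """Tasks running at once when each task claims gpu_usage of one GPU."""
    tasks_per_gpu = int(1.0 / gpu_usage + 1e-9)  # 0.5 -> 2 tasks per GPU
    return num_gpus * tasks_per_gpu


def cpus_budgeted(num_gpus: int, gpu_usage: float, cpu_usage: float) -> float:
    """CPU cores budgeted for those tasks (cpu_usage cores per task)."""
    return concurrent_gpu_tasks(num_gpus, gpu_usage) * cpu_usage


if __name__ == "__main__":
    # Rig-27: 3 GPUs, <gpu_usage>0.5</gpu_usage>, <cpu_usage>1.0</cpu_usage>
    print(concurrent_gpu_tasks(3, 0.5))  # 6
    print(cpus_budgeted(3, 0.5, 1.0))    # 6.0
```

With one task per GPU (gpu_usage of 1.0) the same sketch gives 3 tasks and 3.0 cores, which is the other configuration being weighed in this thread.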

Aurum
Message 51159 - Posted: 31 Dec 2018 | 19:44:28 UTC

aurum@Rig-27:~$ nvidia-smi -a
==============NVSMI LOG==============
Timestamp : Mon Dec 31 11:01:30 2018
Driver Version : 415.25
CUDA Version : 10.0
Attached GPUs : 3
GPU 00000000:01:00.0
    Product Name : GeForce GTX 1080 Ti
    PCI
        Bus : 0x01
        Device : 0x00
        Domain : 0x0000
        Device Id : 0x1B0610DE
        Bus Id : 00000000:01:00.0
        Sub System Id : 0x63913842
        GPU Link Info
            PCIe Generation
                Max : 3
                Current : 3
            Link Width
                Max : 16x
                Current : 8x
    Processes
        Process ID : 2950
            Type : C
            Name : ../../projects/www.gpugrid.net/acemd.919-80.bin
            Used GPU Memory : 663 MiB
        Process ID : 24944
            Type : C
            Name : ../../projects/www.gpugrid.net/acemd.919-80.bin
            Used GPU Memory : 665 MiB

GPU 00000000:02:00.0
    Product Name : GeForce GTX 1080 Ti
    PCI
        Bus : 0x02
        Device : 0x00
        Domain : 0x0000
        Device Id : 0x1B0610DE
        Bus Id : 00000000:02:00.0
        Sub System Id : 0x65933842
        GPU Link Info
            PCIe Generation
                Max : 3
                Current : 3
            Link Width
                Max : 16x
                Current : 16x
    Processes
        Process ID : 28280
            Type : C
            Name : ../../projects/www.gpugrid.net/acemd.919-80.bin
            Used GPU Memory : 665 MiB
        Process ID : 28304
            Type : C
            Name : ../../projects/www.gpugrid.net/acemd.919-80.bin
            Used GPU Memory : 665 MiB

GPU 00000000:03:00.0
    Product Name : GeForce GTX 1080 Ti
    PCI
        Bus : 0x03
        Device : 0x00
        Domain : 0x0000
        Device Id : 0x1B0610DE
        Bus Id : 00000000:03:00.0
        Sub System Id : 0x63933842
        GPU Link Info
            PCIe Generation
                Max : 3
                Current : 3
            Link Width
                Max : 16x
                Current : 16x
    Processes
        Process ID : 999
            Type : G
            Name : /usr/lib/xorg/Xorg
            Used GPU Memory : 41 MiB
        Process ID : 1283
            Type : C
            Name : ../../projects/www.gpugrid.net/acemd.919-80.bin
            Used GPU Memory : 665 MiB
        Process ID : 1573
            Type : G
            Name : cinnamon
            Used GPU Memory : 13 MiB
        Process ID : 2086
            Type : C
            Name : ../../projects/www.gpugrid.net/acemd.919-80.bin
            Used GPU Memory : 673 MiB
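
Picking the compute PIDs out of a log like this by eye gets tedious, so here is a minimal sketch of doing it in Python. It assumes the text layout shown above (a "GPU <bus id>" header followed by "Process ID" / "Type" pairs); the function name is hypothetical and the parser has only been reasoned through against this layout, not against every nvidia-smi version.

```python
# Sketch: group compute (Type C) process IDs by GPU from "nvidia-smi -a" text.
# Assumes the layout shown in this post; compute_pids_by_gpu is a made-up name.

def compute_pids_by_gpu(log: str) -> dict:
    """Map each GPU bus ID to the PIDs of its compute (Type C) processes."""
    result, gpu, pid = {}, None, None
    for raw in log.splitlines():
        line = raw.strip()
        if line.startswith("GPU ") and ":" in line and " : " not in line:
            # Header such as "GPU 00000000:01:00.0" ("GPU Link Info" has no colon)
            gpu = line.split(None, 1)[1]
            result[gpu] = []
        elif line.startswith("Process ID"):
            pid = int(line.split(":", 1)[1])
        elif line.startswith("Type") and pid is not None and gpu is not None:
            if line.split(":", 1)[1].strip() == "C":  # C = compute, G = graphics
                result[gpu].append(pid)
            pid = None
    return result


if __name__ == "__main__":
    sample = """GPU 00000000:03:00.0
Product Name : GeForce GTX 1080 Ti
GPU Link Info
Processes
Process ID : 999
Type : G
Process ID : 1283
Type : C
Process ID : 2086
Type : C
"""
    print(compute_pids_by_gpu(sample))
```

Feeding it the full log (for example via subprocess) should give two acemd PIDs per card here, with the Xorg and cinnamon entries on the third card filtered out as graphics processes.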

Aurum
Message 51160 - Posted: 31 Dec 2018 | 19:44:35 UTC
Last modified: 31 Dec 2018 | 19:52:44 UTC

I'm looking for a Linux command analogous to nvidia-smi that gives a CPU report.
I tried to install CPU-X, but it didn't run.

GPU 1 Process ID: 2950 and 24944
GPU 2 Process ID: 28280 and 28304
GPU 3 Process ID: 1283 and 2086

Now if I knew a way to map these processes to their CPUs...
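
One way to get at this on Linux, sketched in Python: the "processor" field of /proc/<pid>/stat (field 39 in proc(5)) records the CPU a process last ran on. The helper names below are hypothetical, and the value is only a snapshot, since it can change from one read to the next.

```python
# Sketch: read the CPU a process last ran on from /proc/<pid>/stat (Linux only).
# Helper names are illustrative; field numbering follows the proc(5) man page.
import os
from pathlib import Path


def last_cpu_from_stat(stat_line: str) -> int:
    """Return the 'processor' field (CPU last executed on) from a stat line."""
    # The command name sits in parentheses and may itself contain spaces or
    # parentheses, so split after the LAST ')'.
    after_comm = stat_line.rsplit(")", 1)[1].split()
    # after_comm[0] is field 3 (state), so field 39 is at index 39 - 3.
    return int(after_comm[39 - 3])


def last_cpu(pid: int) -> int:
    """Return the CPU number that the given PID last ran on."""
    return last_cpu_from_stat(Path(f"/proc/{pid}/stat").read_text())


if __name__ == "__main__":
    if Path("/proc/self/stat").exists():
        print(os.getpid(), last_cpu(os.getpid()))
```

Calling last_cpu() in a loop for the six PIDs listed above (2950, 24944, 28280, 28304, 1283, 2086) would show which core each one was on at that instant.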

Aurum
Message 51161 - Posted: 31 Dec 2018 | 20:02:35 UTC

aurum@Rig-27:~$ inxi -t cm10
Processes: CPU: % used - top 10 active
1: cpu: 96.0% command: ..acemd.919-80.bin pid: 28280
2: cpu: 95.9% command: ..acemd.919-80.bin pid: 24944
3: cpu: 95.9% command: ..acemd.919-80.bin pid: 1283
4: cpu: 95.8% command: ..acemd.919-80.bin pid: 28304
5: cpu: 95.8% command: ..acemd.919-80.bin pid: 2086
6: cpu: 94.3% command: ..acemd.919-80.bin pid: 2950
7: cpu: 8.8% command: Xorg pid: 999
8: cpu: 8.2% daemon: ~kworker/3:2~ pid: 2961
9: cpu: 3.1% command: nxnode.bin pid: 1711
10: cpu: 2.5% daemon: ~kworker/3:0~ pid: 24148
Memory: MB / % used - Used/Total: 3703.6/7876.0MB - top 10 active
1: mem: 425.96MB (5.4%) command: ..acemd.919-80.bin pid: 2086
2: mem: 417.73MB (5.3%) command: ..acemd.919-80.bin pid: 2950
3: mem: 411.29MB (5.2%) command: ..acemd.919-80.bin pid: 1283
4: mem: 404.81MB (5.1%) command: ..acemd.919-80.bin pid: 24944
5: mem: 404.81MB (5.1%) command: ..acemd.919-80.bin pid: 28304
6: mem: 404.75MB (5.1%) command: ..acemd.919-80.bin pid: 28280
7: mem: 315.56MB (4.0%) command: nxnode.bin pid: 1711
8: mem: 234.98MB (2.9%) command: firefox pid: 3943
9: mem: 214.26MB (2.7%) command: cinnamon pid: 1573
10: mem: 175.70MB (2.2%) command: firefox pid: 4035

Aurum
Message 51162 - Posted: 31 Dec 2018 | 20:11:36 UTC

aurum@Rig-27:~$ inxi -v 3
System: Host: Rig-27 Kernel: 4.15.0-43-generic x86_64
bits: 64 gcc: 7.3.0
Desktop: Cinnamon 3.8.9 (Gtk 3.22.30-1ubuntu1)
Distro: Linux Mint 19 Tara
Machine: Device: desktop Mobo: MSI model: X99S MPOWER (MS-7885) v: 4.0 serial: N/A
UEFI: American Megatrends v: M.C0 date: 06/14/2018
CPU: 6 core Intel Xeon E5-2603 v4 (-MT-MCP-)
arch: Broadwell rev.1 cache: 15360 KB

flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx) bmips: 20399
clock speeds: max: 1700 MHz 1: 1699 MHz 2: 1699 MHz 3: 1699 MHz
4: 1699 MHz 5: 1699 MHz 6: 1699 MHz
Graphics: Card-1: NVIDIA GP102 [GeForce GTX 1080 Ti] bus-ID: 01:00.0
Card-2: NVIDIA GP102 [GeForce GTX 1080 Ti] bus-ID: 02:00.0
Card-3: NVIDIA GP102 [GeForce GTX 1080 Ti] bus-ID: 03:00.0
Display Server: x11 (X.Org 1.19.6 )
drivers: modesetting,nvidia,nouveau (unloaded: fbdev,vesa)
Resolution: 640x480
OpenGL: renderer: GeForce GTX 1080 Ti/PCIe/SSE2
version: 4.6.0 NVIDIA 415.25 Direct Render: Yes
Network: Card: Intel I210 Gigabit Network Connection
driver: igb v: 5.4.0-k port: b000 bus-ID: 05:00.0
IF: enp5s0 state: up speed: 1000 Mbps duplex: full
mac: d8:cb:8a:1c:62:79
Drives: HDD Total Size: 500.1GB (2.2% used)
ID-1: model: WDC_WDS500G1B0B
Info: Processes: 231 Uptime: 3:47 Memory: 3708.8/7876.0MB
Init: systemd runlevel: 5 Gcc sys: 7.3.0
Client: Shell (bash 4.4.191) inxi: 2.3.56

kksplace
Joined: 4 Mar 18
Posts: 53
Credit: 1,400,776,749
RAC: 3,597,733
Message 51163 - Posted: 31 Dec 2018 | 21:03:18 UTC - in response to Message 51158.

I'm a newbie at this, but here's a shot at a reply:

1. With your setup, all 6 cores at 100% makes sense to me. You have two WUs running per GPU. To my knowledge, SWAN_SYNC doesn't just 'reserve' a CPU core for a GPU; it dedicates a core to each job running on a GPU, so the job doesn't have to interrupt the core every time it needs it. I saw this on a Windows machine running Milkyway@Home with SWAN_SYNC enabled: two cores were in use but only one GPU.

2. I am not sure of the best way to see which CPU is doing which job, but try running "top" in the terminal. It will show what is being executed and the CPU% each process is using. I would expect you to see "acemd.919-80." on all six cores.

3. Regarding hyperthreading: one of my hosts has an i7-7820X with 8 cores. When I first used it, I did not enable H-T because of posts I had read on this and other BOINC-related forums. However, out of curiosity, I enabled H-T after a couple of weeks but limited BOINC core usage to 50% to see what happened. It seemed to help a little. My theory is that the Linux scheduler is able to use the H-T 'cores' for the little extra stuff going on without interrupting the BOINC tasks as much. (Not being a techy guy, I am standing by for the critiques of that statement!)

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 18,783,925
Message 51164 - Posted: 31 Dec 2018 | 21:18:57 UTC - in response to Message 51163.

3. Regarding hyperthreading: one of my hosts has an i7-7820x with 8 cores. When I first used it, I did not enable H-T due to reading posts on this and other BOINC related forums. However, out of curiosity, I enabled H-T after a couple of weeks, but limited BOINC core usage to 50% to see what happened. It seemed to help a little.

My experience on a Windows 10 PC is similar: my CPU has 6 cores, and 6 BOINC tasks (2 GPUGRID + 4 LHC) seem to run somewhat faster with HT on than with HT off.

Aurum
Message 51165 - Posted: 31 Dec 2018 | 22:54:46 UTC

The Xeon E5-2603 v4 has no hyperthreading capability. It's only 6c/6t. I just wanted to mention that to eliminate a variable.

I'm trying to figure out how to define my own columns in top to show %cpu, psr and pid.

top -u boinc -H

This is interesting but not quite enough:
ps -u boinc -o pid,%cpu,sgi_p,psr,fname

If I watch them running, it appears that acemd.91 uses about 15% of a CPU. It seems to me that a single core could service 2, 3, 4 or even 5 acemd WUs. That would free up other CPUs to do CPU WUs. My guess is that this would have to be programmed in, or would require babysitting each WU as it starts.

rod4x4
Joined: 4 Aug 14
Posts: 266
Credit: 2,219,935,054
RAC: 0
Message 51167 - Posted: 1 Jan 2019 | 1:35:48 UTC - in response to Message 51165.
Last modified: 1 Jan 2019 | 1:46:20 UTC

To select and order columns in top:
Press 'f' (without quotes) whilst top is running.
Then use the arrow keys and space bar to select, move and display the desired columns.
Press q when done.

If you display the 'Last used cpu' column, you will see that a CPU is not dedicated to a process. This is by design. In Linux, processor affinity can be set using taskset. Use at your own risk.

Whilst top is running, press h for more options.
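
As a rough Python counterpart to taskset, here is a sketch using os.sched_setaffinity, which exists only on Linux. The wrapper name pin_to_cpus is made up for illustration; the demo pins the script itself and then restores its original mask. Pinning another user's process (such as a task owned by boinc) generally needs matching ownership or extra privileges, and the same at-your-own-risk caveat applies.

```python
# Sketch: set and read back CPU affinity from Python (Linux only).
# pin_to_cpus is an illustrative wrapper, not an existing tool.
import os
from typing import Iterable, Set


def pin_to_cpus(pid: int, cpus: Iterable[int]) -> Set[int]:
    """Restrict pid to the given CPU numbers and return the resulting mask."""
    os.sched_setaffinity(pid, cpus)
    return os.sched_getaffinity(pid)


if __name__ == "__main__" and hasattr(os, "sched_setaffinity"):
    allowed = os.sched_getaffinity(0)   # pid 0 means "this process"
    one_cpu = {min(allowed)}
    print("pinned to:", pin_to_cpus(0, one_cpu))
    print("restored :", pin_to_cpus(0, allowed))
```

Whether pinning helps GPUGrid throughput at all is a separate question; the sketch only shows the mechanism.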

rod4x4
Message 51169 - Posted: 1 Jan 2019 | 5:02:55 UTC - in response to Message 51158.
Last modified: 1 Jan 2019 | 5:11:38 UTC

The System Monitor shows all 6 CPUs pegged at 100%. I looked at all their stderr files and see no signs of problems

I suspect that SWAN_SYNC and running multiple jobs per GPU should not be combined. Using both together would result in 6 CPUs at 100%. Inspecting your task run times will indicate whether you are wasting 3 CPUs.

My understanding of SWAN_SYNC is that the CPU core is SPINning on its one GPU task process, waiting for any CPU work.
The difference between BLOCKING and SPIN for CUDA is described at this link:
https://www.cs.cmu.edu/afs/cs/academic/class/15668-s11/www/cuda-doc/html/group__CUDART__DEVICE_g18074e885b4d89f5a0fe1beab589e0c8.html

It is also discussed here, on the Nvidia Dev forum:
https://devtalk.nvidia.com/default/topic/794833/100-cpu-usage-when-running-cuda-code/

An Nvidia moderator stated:
busy in a polling loop inside the driver function associated with `cudaDeviceSynchronize()`, waiting for the GPU to finish

In your case, either turn off SWAN_SYNC or run only 1 task per GPU, depending on your preferences.
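
To make the spin versus blocking distinction concrete, here is a small Python analogy. It is not acemd or CUDA code: a sleeping thread stands in for the GPU, all names are invented, and the point is only that busy-polling burns CPU time for the whole wait while a blocking wait burns almost none.

```python
# Analogy only: compare CPU time spent waiting by spinning vs. blocking.
# A sleeping thread stands in for a GPU kernel; no CUDA is involved.
import threading
import time


def cpu_seconds_while_waiting(spin: bool, work_seconds: float = 0.2) -> float:
    """CPU time this process uses while waiting for the simulated GPU."""
    done = threading.Event()

    def fake_gpu() -> None:
        time.sleep(work_seconds)  # the "kernel" runs without using our CPU
        done.set()

    worker = threading.Thread(target=fake_gpu)
    start = time.process_time()
    worker.start()
    if spin:
        while not done.is_set():  # busy-poll: keeps a core close to 100%
            pass
    else:
        done.wait()               # block: the OS parks us until signalled
    worker.join()
    return time.process_time() - start


if __name__ == "__main__":
    print("spin :", cpu_seconds_while_waiting(spin=True))
    print("block:", cpu_seconds_while_waiting(spin=False))
```

On a typical machine the spin figure should come out near the full wait time and the block figure near zero, which matches the picture of six polling tasks keeping six cores at 100%.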

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 6,169
Message 51183 - Posted: 2 Jan 2019 | 20:59:59 UTC - in response to Message 51158.

Am I wasting 3 CPUs???
Yes. There's no point in running two GPUGrid workunits simultaneously per GPU while using SWAN_SYNC under a non-WDDM OS.
To be more specific: I would rather use SWAN_SYNC than run two GPUGrid workunits per GPU.
