Message boards : Graphics cards (GPUs) : Cuda 101 / 1121 Problem

ZUSE
Message 57564 - Posted: 10 Oct 2021 | 9:33:17 UTC

Hello,
Is there already a solution for the cuda 101 / cuda 1121 problem?

Two Nvidia T600s run in my system. One only gets cuda1121 tasks and runs at 73 degrees (100%) (PCIe 3.0 x16):
Sent: 7 Oct 2021 | 16:52:50 UTC; Reported: 8 Oct 2021 | 8:52:52 UTC; Status: Done and Confirmed; Run time: 55,535.07 s; CPU time: 55,421.92 s; Credit: 145,759.50; Application: New version of ACEMD v2.18 (cuda1121)

The other gets cuda101 tasks, which take much longer, and runs at 41 degrees (100%) (PCIe 3.0 x4):
Sent: 7 Oct 2021 | 16:35:31 UTC; Reported: 9 Oct 2021 | 19:04:05 UTC; Status: Done and Confirmed; Run time: 177,537.28 s; CPU time: 177,318.80 s; Credit: 145,759.50; Application: New version of ACEMD v2.18 (cuda101)
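
To put a number on "much longer", here is a minimal Python sketch that compares the two tasks quoted above. It assumes the first figure after the status in each row is the run time in seconds and the third is the credit; the function name `credit_per_hour` is only illustrative. One task per card is a very small sample, so treat the result as a rough indication only.

```python
# Figures copied from the two task rows above (assumed: run time in seconds, credit).
FAST_RUN_S = 55_535.07    # cuda1121 task on the PCIe 3.0 x16 card
SLOW_RUN_S = 177_537.28   # cuda101 task on the PCIe 3.0 x4 card
CREDIT = 145_759.50       # both tasks were granted the same credit


def credit_per_hour(credit: float, run_time_s: float) -> float:
    """Credit earned per hour of wall-clock run time."""
    return credit / (run_time_s / 3600.0)


slowdown = SLOW_RUN_S / FAST_RUN_S

if __name__ == "__main__":
    print(f"slow task took {slowdown:.2f}x as long as the fast one")
    print(f"fast card: {credit_per_hour(CREDIT, FAST_RUN_S):,.0f} credit/hour")
    print(f"slow card: {credit_per_hour(CREDIT, SLOW_RUN_S):,.0f} credit/hour")
```

Because both tasks earned the same credit, the credit-per-hour gap is the same factor as the run-time gap.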

System:
Win10
i7 8700T
2x 8GB
2x T600
HP Z2 G4 SFF

Ian&Steve C.
Message 57565 - Posted: 10 Oct 2021 | 12:55:20 UTC - in response to Message 57564.

How many PCIe lanes are connecting the “slow” card vs. the fast card? Are they both running at PCIe 3.0? GPUGRID is sensitive to PCIe bandwidth to each card.
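
One way to check what each card has actually negotiated is to ask `nvidia-smi` for its PCIe link fields. The following is a minimal Python sketch, not a definitive tool: the helper names are illustrative, and the `SAMPLE` text is a hypothetical output made up to match the setup described in this thread, not a real capture. Note that the reported link generation can read lower while a card is idle, so query it while a task is running.

```python
import csv
import io
import subprocess

# Query fields understood by `nvidia-smi --query-gpu`.
QUERY = "index,name,pcie.link.gen.current,pcie.link.width.current"


def parse_link_report(csv_text: str) -> list[dict]:
    """Parse `--format=csv,noheader` output into one dict per GPU."""
    gpus = []
    for rec in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(rec) != 4:
            continue  # skip blank or malformed lines
        index, name, gen, width = (field.strip() for field in rec)
        gpus.append({"index": int(index), "name": name,
                     "gen": int(gen), "width": int(width)})
    return gpus


def query_links() -> list[dict]:
    """Run nvidia-smi (must be on PATH) and parse its report."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return parse_link_report(result.stdout)


# Hypothetical sample output, for illustration only.
SAMPLE = "0, NVIDIA T600, 3, 16\n1, NVIDIA T600, 3, 4\n"

if __name__ == "__main__":
    for gpu in query_links():
        print(f"GPU {gpu['index']} ({gpu['name']}): "
              f"PCIe gen {gpu['gen']} x{gpu['width']}")
```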



ZUSE
Message 57566 - Posted: 10 Oct 2021 | 13:43:04 UTC - in response to Message 57565.

The fast card runs on PCIe 3.0 with 16 lanes.
The slower one runs on PCIe 3.0 with 4 lanes.

Before that, two Quadro P620s were installed without any problems.

Is it because of the 4 lanes that I don't get any Cuda 1121 tasks for the second T600?

Ian&Steve C.
Message 57567 - Posted: 10 Oct 2021 | 16:39:35 UTC - in response to Message 57566.

Probably just luck of the draw so far. You only have a handful of tasks to draw your conclusions from. Over time I would expect both cards to receive both apps.

Apart from task-to-task variation in the Cryptic Scout tasks you’ve run, the slow speed could very well be due to the lower bandwidth of the card hooked up via a PCIe 3.0 x4 link.

If that link were dedicated, you might not see much impact, as I’ve measured the PCIe requirements to be around that of a PCIe 3.0 x4 link. However, your second GPU is evidently hooked up via the chipset, since your CPU has only 16 dedicated lanes and they are all occupied by your primary GPU. The chipset is bottlenecked by the DMI 3.0 link between the chipset and the CPU. All peripheral devices (NVMe, SATA SSDs/HDDs, networking, USB, etc.) share this link, so the bandwidth available to the GPU will likely be less than the PCIe 3.0 x4 theoretical maximum.
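
For a sense of scale, here is a rough Python sketch of the theoretical per-direction link bandwidth. It uses the published PCIe signalling rates and line encodings, and treats DMI 3.0 as equivalent to a PCIe 3.0 x4 link. The function name is illustrative, and real-world throughput is lower because of protocol overhead.

```python
# Raw signalling rate per lane, in gigatransfers per second.
GT_PER_S = {1: 2.5, 2: 5.0, 3: 8.0, 4: 16.0}

# Line-encoding efficiency: 8b/10b for gen 1-2, 128b/130b for gen 3-4.
ENCODING = {1: 8 / 10, 2: 8 / 10, 3: 128 / 130, 4: 128 / 130}


def pcie_gb_per_s(gen: int, lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    return GT_PER_S[gen] * ENCODING[gen] * lanes / 8  # 8 bits per byte


if __name__ == "__main__":
    for label, lanes in (("x16 (CPU lanes)", 16),
                         ("x4 (chipset slot)", 4),
                         ("x1", 1)):
        print(f"PCIe 3.0 {label}: {pcie_gb_per_s(3, lanes):.2f} GB/s")
    # Assumption: DMI 3.0 modelled as a PCIe 3.0 x4 link shared by all chipset devices.
    print(f"DMI 3.0 uplink (shared): {pcie_gb_per_s(3, 4):.2f} GB/s")
```

So the x4 slot can at best match the DMI uplink on its own, and any simultaneous disk, network, or USB traffic comes out of the same budget.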

I’d give it more time though. The Cryptic Scout tasks aren’t nearly as homogeneous as the ADRIA tasks are.

ZUSE
Message 57588 - Posted: 12 Oct 2021 | 8:14:38 UTC - in response to Message 57567.

I grant you the chipset and CPU bottleneck, but I ran four Quadro P620s and a P600 on the system:
one at x16, one at x4, and three at PCIe 3.0 x1, and there were hardly any problems.

Could it also be related to the graphics card BIOS? The BIOS versions on the two cards are not identical!
Or to the drivers? R465 U2 (466.11) (NFB) is installed. Would R470 U4 (472.12) (PB) be better?
