
Message boards : News : Beta testing starting soon

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level: Gly
Message 28114 - Posted: 22 Jan 2013 | 14:35:20 UTC

We will start some beta testing today or tomorrow with the new app for cuda4.2.
If it works we plan to update acemdlong.

gdf

Profile GDF
Message 28119 - Posted: 22 Jan 2013 | 15:26:08 UTC - in response to Message 28114.

The new app is now in beta. Nate will be submitting some workunits soon.
Windows only for now.

gdf

Profile nate
Joined: 6 Jun 11
Posts: 124
Credit: 2,928,865
RAC: 0
Level: Ala
Message 28120 - Posted: 22 Jan 2013 | 17:07:14 UTC

There are 50 new simulations in the beta queue. These are very simple test workunits, but if everything goes well I will submit a whole batch to the long queue tomorrow. This will help test the new app.

This new app also features smaller upload sizes, so it would be good if we could deploy it for all long tasks ASAP.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level: His
Message 28124 - Posted: 22 Jan 2013 | 18:03:30 UTC - in response to Message 28120.
Last modified: 22 Jan 2013 | 18:32:03 UTC

Outcome Computation error
Client state Compute error
Exit status -1 (0xffffffffffffffff)
Computer ID 139859
Report deadline 27 Jan 2013 | 17:58:26 UTC
Run time 9.31
CPU time 1.78
Validate state Invalid
Credit 0.00
Application version ACEMD beta version v6.48 (cuda42)
Stderr output

I13R1-NATHAN_tstdhfr-0-1-RND6731_5

<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
- exit code -1 (0xffffffff)
</message>
<stderr_txt>
MDIO: cannot open file "output.restart.coor"

</stderr_txt>
]]>


22/01/2013 17:55:32 | GPUGRID | update requested by user
22/01/2013 17:55:38 | GPUGRID | Sending scheduler request: Requested by user.
22/01/2013 17:55:38 | GPUGRID | Requesting new tasks for NVIDIA
22/01/2013 17:55:40 | GPUGRID | Scheduler request completed: got 1 new tasks
22/01/2013 17:55:43 | GPUGRID | Started download of acemd.2764.cuda42.exe
22/01/2013 17:55:43 | GPUGRID | Started download of I13R1-NATHAN_tstdhfr-0-LICENSE
22/01/2013 17:55:45 | GPUGRID | Finished download of I13R1-NATHAN_tstdhfr-0-LICENSE
22/01/2013 17:55:45 | GPUGRID | Started download of I13R1-NATHAN_tstdhfr-0-COPYRIGHT
22/01/2013 17:55:46 | GPUGRID | Finished download of I13R1-NATHAN_tstdhfr-0-COPYRIGHT
22/01/2013 17:55:46 | GPUGRID | Started download of I13R1-NATHAN_tstdhfr-0-coor_file
22/01/2013 17:55:49 | GPUGRID | Finished download of I13R1-NATHAN_tstdhfr-0-coor_file
22/01/2013 17:55:49 | GPUGRID | Started download of I13R1-NATHAN_tstdhfr-0-vel_file
22/01/2013 17:55:50 | GPUGRID | Finished download of acemd.2764.cuda42.exe
22/01/2013 17:55:50 | GPUGRID | Started download of I13R1-NATHAN_tstdhfr-0-idx_file
22/01/2013 17:55:52 | GPUGRID | Finished download of I13R1-NATHAN_tstdhfr-0-idx_file
22/01/2013 17:55:52 | GPUGRID | Started download of I13R1-NATHAN_tstdhfr-0-pdb_file
22/01/2013 17:55:53 | GPUGRID | Finished download of I13R1-NATHAN_tstdhfr-0-vel_file
22/01/2013 17:55:53 | GPUGRID | Started download of I13R1-NATHAN_tstdhfr-0-psf_file
22/01/2013 17:55:56 | GPUGRID | Finished download of I13R1-NATHAN_tstdhfr-0-pdb_file
22/01/2013 17:55:56 | GPUGRID | Started download of I13R1-NATHAN_tstdhfr-0-par_file
22/01/2013 17:55:58 | GPUGRID | Finished download of I13R1-NATHAN_tstdhfr-0-par_file
22/01/2013 17:55:58 | GPUGRID | Started download of I13R1-NATHAN_tstdhfr-0-conf_file_enc
22/01/2013 17:56:00 | GPUGRID | Finished download of I13R1-NATHAN_tstdhfr-0-conf_file_enc
22/01/2013 17:56:00 | GPUGRID | Started download of I13R1-NATHAN_tstdhfr-0-metainp_file
22/01/2013 17:56:01 | GPUGRID | Finished download of I13R1-NATHAN_tstdhfr-0-metainp_file
22/01/2013 17:56:01 | GPUGRID | Started download of I13R1-NATHAN_tstdhfr-0-hills_file
22/01/2013 17:56:02 | GPUGRID | Finished download of I13R1-NATHAN_tstdhfr-0-hills_file
22/01/2013 17:56:07 | GPUGRID | Finished download of I13R1-NATHAN_tstdhfr-0-psf_file
22/01/2013 17:56:07 | GPUGRID | Starting task I13R1-NATHAN_tstdhfr-0-1-RND6731_5 using acemdbeta version 648 (cuda42) in slot 1
22/01/2013 17:56:20 | GPUGRID | Computation for task I13R1-NATHAN_tstdhfr-0-1-RND6731_5 finished
22/01/2013 17:56:20 | GPUGRID | Output file I13R1-NATHAN_tstdhfr-0-1-RND6731_5_1 for task I13R1-NATHAN_tstdhfr-0-1-RND6731_5 absent
22/01/2013 17:56:20 | GPUGRID | Output file I13R1-NATHAN_tstdhfr-0-1-RND6731_5_2 for task I13R1-NATHAN_tstdhfr-0-1-RND6731_5 absent
22/01/2013 17:56:20 | GPUGRID | Output file I13R1-NATHAN_tstdhfr-0-1-RND6731_5_3 for task I13R1-NATHAN_tstdhfr-0-1-RND6731_5 absent
22/01/2013 17:56:32 | GPUGRID | Started upload of I13R1-NATHAN_tstdhfr-0-1-RND6731_5_0
22/01/2013 17:56:32 | GPUGRID | Started upload of I13R1-NATHAN_tstdhfr-0-1-RND6731_5_7
22/01/2013 17:56:33 | GPUGRID | Finished upload of I13R1-NATHAN_tstdhfr-0-1-RND6731_5_0
22/01/2013 17:56:33 | GPUGRID | Finished upload of I13R1-NATHAN_tstdhfr-0-1-RND6731_5_7
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

RaymondFO*
Joined: 22 Nov 12
Posts: 72
Credit: 14,040,706,346
RAC: 0
Level: Trp
Message 28125 - Posted: 22 Jan 2013 | 18:36:00 UTC - in response to Message 28124.

I already had about 26 beta work units with similar results from one computer. From what I saw of the received beta work units that resulted in errors, others had similar results.

Profile nenym
Joined: 31 Mar 09
Posts: 137
Credit: 1,308,230,581
RAC: 0
Level: Met
Message 28126 - Posted: 22 Jan 2013 | 18:54:48 UTC - in response to Message 28124.
Last modified: 22 Jan 2013 | 18:55:26 UTC

The same here: I18R1-NATHAN_tstdhfr-0-1-RND4913_3. W7 64bit GTX560Ti CC 2.1.

Profile Gattorantolo [Ticino]
Joined: 29 Dec 11
Posts: 44
Credit: 251,211,525
RAC: 0
Level: Asn
Message 28129 - Posted: 22 Jan 2013 | 21:25:05 UTC
Last modified: 22 Jan 2013 | 21:25:38 UTC

????
Project has no tasks???

22.01.2013 22:23:23 | GPUGRID | Requesting new tasks for NVIDIA
22.01.2013 22:23:23 | GPUGRID | update requested by user
22.01.2013 22:23:25 | GPUGRID | Scheduler request completed: got 0 new tasks
22.01.2013 22:23:25 | GPUGRID | No tasks sent
22.01.2013 22:23:25 | GPUGRID | No tasks are available for ACEMD beta version
22.01.2013 22:23:25 | GPUGRID | No tasks are available for Long runs (8-12 hours on fastest card)
22.01.2013 22:23:25 | GPUGRID | Project has no tasks available

____________
Member of Boinc Italy.

Profile skgiven
Message 28130 - Posted: 22 Jan 2013 | 21:27:51 UTC - in response to Message 28129.

Select Normal length tasks too - there are a few being generated, possibly resends.

Profile Gattorantolo [Ticino]
Message 28131 - Posted: 22 Jan 2013 | 21:29:38 UTC - in response to Message 28130.

I have now selected ALL... ACEMD standard, beta and long but nothing is coming in :-(

Profile GDF
Message 28133 - Posted: 22 Jan 2013 | 22:19:25 UTC - in response to Message 28131.

We will have more tasks as soon as we have finished testing the beta.

gdf

Profile nate
Message 28134 - Posted: 22 Jan 2013 | 22:20:07 UTC
Last modified: 22 Jan 2013 | 22:20:30 UTC

Yes, we are a little "dry" right now (slight shortage of tasks). Bear with us for the next 24 or 48 hours. We are trying to change to an updated application, which we are testing in the beta queue. As soon as the new app is installed, we will have plenty of new WUs to send.

As for the WUs in the beta queue, the first 50 seem to be erroring. We are going to cancel them and send new ones very soon, which should be fixed.

Profile skgiven
Message 28137 - Posted: 22 Jan 2013 | 23:57:00 UTC - in response to Message 28134.
Last modified: 23 Jan 2013 | 0:09:55 UTC

Still getting many failures on two systems, but did get one that finished:

I1R1-NATHAN_tstdhfr3-0-10-RND6104_0 completed and reported

Task properties (some of):
0.686 CPU's + 1 NVIDIA GPU (device 1) (GTX 470)
5000000 GFLOPs
CPU time at last checkpoint 16:50
system memory 191/164

I1R1-NATHAN_tstdhfr3-0-10-RND6104_0 4061241 139265 22 Jan 2013 | 23:32:12 UTC 22 Jan 2013 | 23:52:50 UTC Completed and validated 1,122.68 1,122.68 7,500.00 ACEMD beta version v6.48 (cuda42)


http://www.gpugrid.net/result.php?resultid=6361095
Name I1R1-NATHAN_tstdhfr3-0-10-RND6104_0
Workunit 4061241
Created 22 Jan 2013 | 23:30:31 UTC
Sent 22 Jan 2013 | 23:32:12 UTC
Received 22 Jan 2013 | 23:52:50 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 139265
Report deadline 27 Jan 2013 | 23:32:12 UTC
Run time 1,122.68
CPU time 1,122.68
Validate state Valid
Credit 7,500.00
Application version ACEMD beta version v6.48 (cuda42)
Stderr output

<core_client_version>7.0.44</core_client_version>
<![CDATA[
<stderr_txt>
MDIO: cannot open file "output.restart.coor"
# Time per step (avg over 250000 steps): 4.503 ms
# Approximate elapsed time for entire WU: 1125.769 s
called boinc_finish

</stderr_txt>
]]>
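
As a quick sanity check on the stderr figures above, the average step time multiplied by the number of steps should roughly reproduce the reported elapsed time. A minimal sketch, using only the two numbers printed in the log (the variable names are illustrative, not anything the app itself uses):

```python
# Values copied from the stderr output above.
steps = 250_000
ms_per_step = 4.503

# Estimated run time in seconds from the average step time.
estimated_s = steps * ms_per_step / 1000.0

# "Approximate elapsed time for entire WU" as reported by the app.
reported_s = 1125.769

print(f"estimated {estimated_s:.2f} s, reported {reported_s:.3f} s")
```

The estimate comes out at about 1125.75 s, which agrees with the reported figure to within a few hundredths of a second, so the step timing and the run time are consistent for this task.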


I noted this about the tasks that fail:
The Virtual Memory and the Working set memory remain at 0 for the few seconds that the tasks run (6 or 7).
If I suspend a Beta task I get a "communicating with the BOINC client" message, and the tasks still error out.

Profile skgiven
Message 28139 - Posted: 23 Jan 2013 | 1:05:28 UTC - in response to Message 28137.
Last modified: 23 Jan 2013 | 1:14:49 UTC

The only Beta task I completed was called I1R1-NATHAN_tstdhfr3-0-10-RND6104
It was the first task generated.
My guess would be that the other tasks were not built correctly.

Profile nenym
Message 28144 - Posted: 23 Jan 2013 | 2:40:02 UTC
Last modified: 23 Jan 2013 | 2:45:39 UTC

15 errors, 1 success, and now out of daily quota.
Finished task: I1R1-NATHAN_tstdhfr3-4-10-RND6104_0. Every task other than the I1R ones failed, as skgiven mentioned.

Dylan
Joined: 16 Jul 12
Posts: 98
Credit: 386,043,752
RAC: 0
Level: Asp
Message 28147 - Posted: 23 Jan 2013 | 4:19:41 UTC

I just destroyed something like 20 of the new beta tasks by getting a compute error on them only 2 seconds after starting them. I don't overclock my cards, and they run as cool as I can get them, at 70 °C or less, so I don't know why they failed.

Profile nate
Message 28150 - Posted: 23 Jan 2013 | 10:27:56 UTC
Last modified: 23 Jan 2013 | 10:28:25 UTC

We're aware of the continued problems. We believe it is an issue with the application, but are checking the simulations as well. We'll update as soon as we know something.

You can see why we beta test every change!

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 6,169
Level: Trp
Message 28159 - Posted: 23 Jan 2013 | 17:57:12 UTC - in response to Message 28150.

The NATHAN_tstdhfr4 batch seems to be working fine on all of my hosts.

Profile Retvari Zoltan
Message 28160 - Posted: 23 Jan 2013 | 18:05:04 UTC - in response to Message 28159.

However, there's still no detailed info in the stderr output file about the GPU used to process the workunit.... (which would be very useful for figuring out the source of some errors)

Profile nenym
Message 28163 - Posted: 23 Jan 2013 | 18:55:11 UTC - in response to Message 28159.

The NATHAN_tstdhfr4 batch seems to be working fine on all of my hosts.
The same here.

Profile Retvari Zoltan
Message 28167 - Posted: 23 Jan 2013 | 20:45:46 UTC

BTW I've received some Ann***_r*-TONI_AGGd8 workunits from the long queue, so there is some new batch also.

Profile nate
Message 28168 - Posted: 23 Jan 2013 | 20:52:03 UTC

There was an issue with the simulations that has been corrected, and most seem to be finishing successfully now. We are having some other issues with the application that need to be addressed before we can deploy the app on Long. Sometime in the next few days, hopefully.

RaymondFO*
Message 28169 - Posted: 23 Jan 2013 | 21:21:56 UTC - in response to Message 28168.

Recently received two Beta's. One ran successfully, the other did not.

Profile Gattorantolo [Ticino]
Message 28206 - Posted: 25 Jan 2013 | 9:54:37 UTC - in response to Message 28169.
Last modified: 25 Jan 2013 | 9:54:51 UTC

Project has no tasks available
...since yesterday...sob..

Profile dskagcommunity
Joined: 28 Apr 11
Posts: 456
Credit: 817,865,789
RAC: 0
Level: Glu
Message 28209 - Posted: 25 Jan 2013 | 15:51:17 UTC

And I still wonder why everybody in Austria except me gets work units and makes up to half a million points per day O.o
____________
DSKAG Austria Research Team: http://www.research.dskag.at



Profile dskagcommunity
Message 28232 - Posted: 27 Jan 2013 | 8:33:16 UTC

Do the new workunits now have smaller upload files? The app version didn't change, so I wonder ;)



Profile skgiven
Message 28277 - Posted: 29 Jan 2013 | 11:17:07 UTC - in response to Message 28232.

I'm running another Beta now and the GPU utilization is only 40% (W7x64).
As a result the GPU downclocks a bit (from 1111 FOC to 1032, and from 1250OC to 1032). The fans are at 40% and the GPU temperature is 39°C. Reducing CPU usage for other projects makes no difference. Memory controller load is only 3%.

Profile nenym
Message 28279 - Posted: 29 Jan 2013 | 11:59:07 UTC
Last modified: 29 Jan 2013 | 12:08:11 UTC

trypsin_lig_1_-NOELIA_RL_equ-0-1-RND1972_0 finished in 2202 s on W7 64bit GTX560Ti CC2.1/872MHz, driver 310.90. Low GPU load (began at 40%, finished at 30%), low VRAM load (119 MB), high CPU load (0.7 CPU core of an HT 3770K) => low credit.

Profile skgiven
Message 28281 - Posted: 29 Jan 2013 | 13:27:27 UTC - in response to Message 28279.

My GPU utilization also ended up around 36% for several tasks.

GPU power usage is also extremely low at 34% (only shown for Kepler cards)
The memory controller load of 3% is less than 1/10th of normal GPUGrid tasks, and the 83MB of virtual system memory is also very low.
I expect we are just running very limited tasks to test the app. That sort of credit would mean a 45K daily return, as opposed to a >300K for a GTX660Ti, but again it's an app test.

werdwerdus
Joined: 15 Apr 10
Posts: 123
Credit: 1,004,473,861
RAC: 0
Level: Met
Message 28282 - Posted: 29 Jan 2013 | 18:48:36 UTC

getting around 36-39% usage on GTX 470 in win7
____________
XtremeSystems.org - #1 Team in GPUGrid

Rantanplan
Joined: 22 Jul 11
Posts: 166
Credit: 138,629,987
RAC: 0
Level: Cys
Message 28283 - Posted: 29 Jan 2013 | 19:10:50 UTC - in response to Message 28282.

I cannot get the project to send me beta WUs. I excluded everything but beta, but nothing has been sent.

werdwerdus
Message 28284 - Posted: 29 Jan 2013 | 19:51:58 UTC - in response to Message 28283.
Last modified: 29 Jan 2013 | 19:52:28 UTC

must select "Run test applications?"

also, currently seeing 16-19% usage... not a good sign :(

Rantanplan
Message 28285 - Posted: 29 Jan 2013 | 19:54:11 UTC - in response to Message 28284.
Last modified: 29 Jan 2013 | 19:54:42 UTC

must select "Run test applications?"

also, currently seeing 16-19% usage... not a good sign :(


I have selected this, but nothing happens; only Long runs are still running.

I'll stop trying; too many stopped WUs.

Rantanplan
Message 28286 - Posted: 29 Jan 2013 | 19:56:08 UTC - in response to Message 28284.

also, currently seeing 16-19% usage... not a good sign :(


lol read this:

http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=3118

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level: His
Message 28288 - Posted: 29 Jan 2013 | 22:41:10 UTC
Last modified: 29 Jan 2013 | 22:44:15 UTC

I am running a "acemdbeta version 648 (cuda42)" task called "trypsin_lig_1205_run3-NOELIA_RL2_equ-0-1-RND5309_0"
... and, monitoring GPU Load using GPU-Z
... the task appears to only be using 33% of my EVGA GeForce GTX 660 Ti FTW (clocked at 1045.2 MHz), using 313.95 (beta) drivers.

This means the application is under-utilizing the GPU, right?

Rantanplan
Message 28289 - Posted: 29 Jan 2013 | 22:48:28 UTC - in response to Message 28288.
Last modified: 29 Jan 2013 | 22:50:50 UTC

Right, you could run 3 WUs at the same time with an "app_config.xml" file and BOINC version 7.0.44. Specify the app name "acemdbeta".
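
For reference, a minimal sketch of what such a file might contain, generated here with Python's standard library. The element names follow BOINC's documented app_config.xml layout; the gpu_usage of 0.33 (roughly three tasks per GPU) and the cpu_usage of 1.0 are illustrative assumptions rather than project recommendations, and the helper name build_app_config is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, gpu_usage, cpu_usage):
    """Return app_config.xml text for a single application (illustrative)."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu_versions = ET.SubElement(app, "gpu_versions")
    # Fraction of one GPU each task claims: 0.33 lets about 3 tasks share a GPU.
    ET.SubElement(gpu_versions, "gpu_usage").text = str(gpu_usage)
    # CPU cores budgeted per task (assumed value).
    ET.SubElement(gpu_versions, "cpu_usage").text = str(cpu_usage)
    if hasattr(ET, "indent"):  # pretty-printing is available on Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


xml_text = build_app_config("acemdbeta", gpu_usage=0.33, cpu_usage=1.0)
print(xml_text)
```

The printed text would be saved as app_config.xml in the GPUGRID project folder inside the BOINC data directory, after which the client has to re-read its config files or be restarted for the change to take effect.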

Jacob Klein
Message 28290 - Posted: 29 Jan 2013 | 22:49:11 UTC - in response to Message 28289.

... or is there a bug with the app that leads to poor GPU load?

Rantanplan
Message 28291 - Posted: 29 Jan 2013 | 22:52:06 UTC - in response to Message 28290.

Couldn't run any beta app right now.

Profile nate
Message 28310 - Posted: 30 Jan 2013 | 16:27:24 UTC
Last modified: 30 Jan 2013 | 16:31:01 UTC

Rantanplan, please try to post these kinds of issues in "Number Crunching" or "Server and website". Also make sure you don't have specific location preferences set for your computer under "GPUGRID preferences". This is explained in more detail here: http://boinc.berkeley.edu/wiki/Preferences#Location-specific_preferences


As for those of you noting the low GPU usage on these WUs (Noelia's WUs, to be specific), you are correct. They are experimental and for this reason we put them on the beta queue. We must run a few of them as preparation for many, many more simulations that will come later on the long or short queue (where they will no longer have the low % utilization issue). They shouldn't result in low credit, and typically we compensate heavily for beta WUs. I will look into that.

Jacob Klein
Message 28311 - Posted: 30 Jan 2013 | 16:56:36 UTC - in response to Message 28310.

Thanks for the confirmation about low GPU usage not being a bug, Nate.

For future beta tests, for beta aspects that function differently than the final version but are still expected behavior, it might be wise to announce them up front. As it was, I (and other users) had no idea that low GPU usage was expected for this beta run.

Thanks,
Jacob

Profile Stoneageman
Joined: 25 May 09
Posts: 224
Credit: 34,057,224,498
RAC: 190
Level: Trp
Message 28316 - Posted: 31 Jan 2013 | 17:35:24 UTC

Pulling out of BETA testing as several Noelia ones on different machines are stalling. This one got to 20 hours before I could abort it!

Profile skgiven
Message 28320 - Posted: 31 Jan 2013 | 18:45:20 UTC - in response to Message 28316.
Last modified: 31 Jan 2013 | 20:33:34 UTC

Columns: task ID, workunit ID, sent, time reported, status, run time (sec), CPU time (sec), credit, application
6425256 4112331 30 Jan 2013 | 2:46:46 UTC 30 Jan 2013 | 4:18:27 UTC Completed and validated 3,695.83 2,613.81 1,050.00 ACEMD beta version v6.48 (cuda42)
6424748 4114336 30 Jan 2013 | 1:54:08 UTC 30 Jan 2013 | 3:16:39 UTC Completed and validated 1,780.12 188.09 7,500.00 ACEMD beta version v6.48 (cuda42)
6423051 4116566 31 Jan 2013 | 14:38:52 UTC 31 Jan 2013 | 16:17:40 UTC Completed and validated 3,434.75 2,419.86 1,050.00 ACEMD beta version v6.48 (cuda42)
6422891 4116409 31 Jan 2013 | 13:49:52 UTC 31 Jan 2013 | 15:20:13 UTC Completed and validated 2,472.68 1,485.57 1,050.00 ACEMD beta version v6.48 (cuda42)

Credit is still wonky too. The highlighted one (the task that paid 7,500) would be about right for Long tasks, but the others are still paying too low.
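
To put numbers on that, here is a rough sketch that turns the run time and credit of each of the four tasks listed above into a credit-per-day rate. It assumes the GPU runs one such task at a time, back to back, with no gaps, which is a simplification; the task IDs, run times and credits are copied from the list, and everything else is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# (task ID, run time in seconds, credit) copied from the four tasks above.
tasks = [
    (6425256, 3695.83, 1050.00),
    (6424748, 1780.12, 7500.00),
    (6423051, 3434.75, 1050.00),
    (6422891, 2472.68, 1050.00),
]


def credit_per_day(run_time_s, credit):
    """Daily credit if identical tasks ran back to back on one GPU."""
    return SECONDS_PER_DAY / run_time_s * credit


rates = [credit_per_day(run_time, credit) for _, run_time, credit in tasks]

for (task_id, _, _), rate in zip(tasks, rates):
    print(f"task {task_id}: about {rate:,.0f} credits/day")
```

Under that assumption the 7,500-credit task works out to well over 300K per day, while the three 1,050-credit tasks all land below 50K per day, which is the gap described above.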

- So the Nathan tasks are paying, but Noelia's aren't, just yet...
I84R1-NATHAN_tstdhfr5-0-1-RND5000_1

Bedrich Hajek
Joined: 28 Mar 09
Posts: 467
Credit: 8,192,321,966
RAC: 10,573,188
Level: Tyr
Message 28328 - Posted: 1 Feb 2013 | 3:44:04 UTC

These Noelia beta units are entirely too dependent on the CPU. I had to free up another CPU to get them to run faster. They also have a habit of starting out fast and then slow down toward the end. I observed GPU usage can be as high as 77% at the beginning of the task, and drop to 13% in the end.

werdwerdus
Message 28329 - Posted: 1 Feb 2013 | 4:51:18 UTC

beta units were working ok, albeit very low gpu usage, and then yesterday they started locking up my system. Had to do a hard reset. Now system locks up on boot when BOINC starts. Host id 125227, win7 x64 driver 310.70. Will try to update driver.

Jacob Klein
Message 28333 - Posted: 1 Feb 2013 | 13:49:12 UTC

Seeing LOTS of errors on the Beta tasks right now.
They say:
ERROR: file mdioload.cpp line 207: Error reading parmtop file

Did I do something wrong? (I recently added the WUProp project)

Profile skgiven
Message 28334 - Posted: 1 Feb 2013 | 14:28:11 UTC - in response to Message 28333.

Note that different types of tasks have to be tested on the new Beta app.
Expect performance variations and problems - this is real β testing!

Thanks for reporting problems,

Rick A. Sponholz
Joined: 20 Jan 09
Posts: 52
Credit: 2,518,707,115
RAC: 0
Level: Phe
Message 28335 - Posted: 1 Feb 2013 | 14:34:39 UTC

Yesterday, I opted out of the beta testing because of the problems it was causing on my computers. Yet, I still kept getting beta WU's. I had to abort them, then set GPUGRID to "No new tasks". Forcing beta WU's on us is NOT COOL! You are jeopardizing the integrity of BOINC. Please let us know when it's SAFE to VOLUNTEER on this project again.
Regards (but disappointed),
Rick

Profile skgiven
Message 28336 - Posted: 1 Feb 2013 | 15:12:04 UTC - in response to Message 28335.
Last modified: 1 Feb 2013 | 15:43:49 UTC

To opt out of these Beta tests you need to go to GPUGRID preferences,
Unselect "Run test applications?"
unselect "ACEMD beta"
and unselect "If no work for selected applications is available, accept work from other applications?"

You may need to do this for more than one Profile:

    Primary (default) preferences
    Separate preferences for home
    Separate preferences for work



Most if not all of the trypsin_lig_xxx_run1-NOELIA tasks failed on two of my systems. I also experienced a restart on one system (310.70 drivers).

All my Beta's fail except those ending in 1:

trypsin_lig_306_run4-NOELIA_RL2_equ-0-1-RND7002_1 4114936 1 Feb 2013 | 14:31:26 UTC 1 Feb 2013 | 15:06:00 UTC Completed and validated 1,929.98 1,431.43 1,050.00 ACEMD beta version v6.48 (cuda42)
trypsin_lig_543_run4-NOELIA_RL2_equ-0-1-RND9858_1 4116019 1 Feb 2013 | 1:46:55 UTC 1 Feb 2013 | 3:05:17 UTC Completed and validated 2,022.81 1,466.24 1,050.00 ACEMD beta version v6.48 (cuda42)
trypsin_lig_796_run3-NOELIA_RL2_equ-0-1-RND9844_1 4117167 1 Feb 2013 | 0:26:30 UTC 1 Feb 2013 | 1:46:55 UTC Completed and validated 3,715.23 3,286.65 1,050.00 ACEMD beta version v6.48 (cuda42)

It's Groundhog Day, again.

Rick A. Sponholz
Message 28338 - Posted: 1 Feb 2013 | 16:26:55 UTC - in response to Message 28336.

When I attempted to opt out of the GPUGRID Beta, I did:
"go to GPUGRID preferences,
Unselect "Run test applications?"
unselect "ACEMD beta"
and unselect "If no work for selected applications is available, accept work from other applications?""

I also did it for all my profiles:
Primary (default) preferences
Separate preferences for home
Separate preferences for work
Separate preferences for school

After I did all the above, why did I continue to get beta WU's?

Rick

Profile skgiven
Message 28340 - Posted: 1 Feb 2013 | 17:10:05 UTC - in response to Message 28338.

Hi Rick,
If you opt out of everything, the settings will automatically reset to include everything (all applications)! This prevents people being attached to the project and doing nothing (unless they manually suspend the project from Boinc).
So you need to select at least one type of task, other than Beta tasks.
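
The fallback described here can be sketched as follows. This is only an illustration of the behaviour as explained in this post, not the actual server code; the application names and the function name are hypothetical.

```python
# Hypothetical application names, used for illustration only.
ALL_APPS = frozenset({"ACEMD short", "ACEMD long", "ACEMD beta"})


def effective_selection(selected):
    """Return the applications a host would be eligible for.

    Mirrors the rule described above: an empty selection is treated
    as "all applications" rather than "none".
    """
    chosen = frozenset(selected) & ALL_APPS
    return chosen if chosen else ALL_APPS


# Opting out of everything re-enables everything, beta included.
print(sorted(effective_selection(set())))

# Keeping at least one non-beta application selected keeps beta out.
print(sorted(effective_selection({"ACEMD long"})))
```

This is why deselecting every application does not stop beta work: at least one non-beta application has to remain selected for the beta opt-out to stick.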

Rick A. Sponholz
Message 28342 - Posted: 1 Feb 2013 | 17:43:51 UTC - in response to Message 28340.

I have Short & Long projects selected for the default, and all 3 other profiles. Only the beta is unchecked. This was the situation when I continued to get beta WU's and had to click "No new tasks" in the BOINC Client in order to keep my computers from freezing up and rebooting. I would love to resume GPUGRID work, but I'm concerned I'll still get more beta WU's, like I did yesterday. Is there a long lag between unchecking beta in the profile, and the scheduler getting the message? Rick
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 28343 - Posted: 1 Feb 2013 | 18:44:41 UTC - in response to Message 28342.

Unless there is some server side work being performed (which there might have been) it shouldn't be long at all. If you suspend the project, make the online changes and then click update, that should be enough to get the new profile settings into Boinc. Then when you resume the project you should be using your new profile. I couldn't replicate your problem, but I guess the scheduler might ask for new tasks before checking your profile settings. Most people just forget to deselect the "Run test applications" option, but the scheduler's logic has taken some criticism of late.
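
The suspend, update, resume sequence described above can also be driven from the command line. What follows is only a minimal sketch: it assumes `boinccmd` is on the PATH and that the client is attached under the URL `http://www.gpugrid.net/` (the host used in links in this thread; `boinccmd --get_project_status` shows the exact URL). The helper names `preference_refresh_commands` and `refresh_preferences` are made up for illustration.

```python
import subprocess

# Assumed project URL; it must match the URL the client is attached with.
PROJECT_URL = "http://www.gpugrid.net/"


def preference_refresh_commands(project_url=PROJECT_URL):
    """Return the boinccmd calls for: suspend, contact the project, resume."""
    return [
        ["boinccmd", "--project", project_url, "suspend"],
        # "update" makes the client contact the project, which is when it
        # picks up preferences changed on the website.
        ["boinccmd", "--project", project_url, "update"],
        ["boinccmd", "--project", project_url, "resume"],
    ]


def refresh_preferences(project_url=PROJECT_URL):
    """Run the three commands in order, stopping at the first failure."""
    for command in preference_refresh_commands(project_url):
        subprocess.run(command, check=True)
```

Change the preferences on the website first, then call `refresh_preferences()`. The update request is asynchronous, so it may be worth checking the event log for the scheduler reply before assuming the new settings are in place.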

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Level
Ile
Message 28344 - Posted: 1 Feb 2013 | 20:15:08 UTC - in response to Message 28336.

Regarding Sponholz's inability to turn the betas off.. I haven't tried it myself so I can't verify that it works or does not work but I know this feature fails at 2 other projects and they claim it's a known bug in the server code. I have no idea if they are correct in saying it's a known bug, whether they have the feature misconfigured, whether they're using the same server code as GPUgrid or whatever.

The reason I have not tried turning beta off to test the feature is that I could never be certain whether I am receiving no betas because the setting works or simply because they have none to send, so it's pointless trying to test it.

skgiven wrote:
All my Beta's fail except those ending in 1:


I believe the 1 is referring to the iteration count rather than a particular type of task, if that's what you were thinking.

Many but not all of my beta NOELIA are failing too. I don't see any correlation between failure and iteration count. The failed ones all have "run2" in the name, for example:

trypsin_lig_965_run2-NOELIA_RL2_equ-0-1-RND9201


all "run1" NOELIAs seem to have crunched error free.

One "run2" completed successfully

All of my failed ones say:

Stderr output

<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
process exited with code 98 (0x62, -158)
</message>
<stderr_txt>
ERROR: file mdioload.cpp line 207: Error reading parmtop file
21:06:01 (6125): called boinc_finish

</stderr_txt>
]]>


I don't seem to be getting any more "run2" NOELIA so I assume the failure has been noted and the tap turned off.
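
The "which run tag fails" comparison above can be tallied mechanically. This is a rough sketch, assuming task names follow the pattern of the ones quoted in this thread (a `_runN-` tag in the middle and an optional `_N` resend suffix after the `RND` number). The function names are hypothetical, and the two sample results are simply the two tasks whose outcome is stated in the thread.

```python
import re
from collections import Counter


def run_tag(task_name):
    """Return the 'runN' tag from a task name, or None if there is none."""
    match = re.search(r"_(run\d+)-", task_name)
    return match.group(1) if match else None


def resend_index(task_name):
    """Return the trailing _N resend counter as an int, or None if absent."""
    match = re.search(r"-RND\d+_(\d+)$", task_name)
    return int(match.group(1)) if match else None


def failures_by_run(results):
    """Map each run tag to (failed, total) for (task_name, succeeded) pairs."""
    failed, total = Counter(), Counter()
    for name, succeeded in results:
        tag = run_tag(name)
        total[tag] += 1
        if not succeeded:
            failed[tag] += 1
    return {tag: (failed[tag], total[tag]) for tag in total}


# Only the two tasks whose outcome is stated in this thread.
sample = [
    ("trypsin_lig_965_run2-NOELIA_RL2_equ-0-1-RND9201", False),
    ("trypsin_lig_796_run3-NOELIA_RL2_equ-0-1-RND9844_1", True),
]
print(failures_by_run(sample))
```

Names without a run tag (some later beta tasks in the thread have none) are grouped under `None`, so they do not get mixed into a real run group.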


____________
BOINC <<--- credit whores, pedants, alien hunters

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 28345 - Posted: 1 Feb 2013 | 20:39:47 UTC - in response to Message 28344.
Last modified: 1 Feb 2013 | 20:46:58 UTC

I know it's just an iteration count, but thought there might have been something in its name; for example, the app tried to write to a file whose name was generated presuming the iteration _1. Anyway, it was just coincidence that every other task was failing. When you think about it you are most likely to have successful tasks ending in _0, then _1 and then _2... If a task has already failed, then the more often it has failed the less likely it is to succeed. So if you get a task ending in _5 or _6 the chances of a successful run are relatively poor.
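
The reasoning about _5 and _6 tasks can be made concrete with a toy model. This is only an illustration under made-up assumptions: some fraction of workunits is broken and always fails, while the rest fail independently with a fixed probability per attempt. The function name and both rates are hypothetical, not project statistics.

```python
def success_chance(previous_failures, bad_fraction, good_failure_rate):
    """Chance that the next attempt succeeds, given earlier failures.

    Toy model: a fraction `bad_fraction` of workunits always fails; the
    others fail with probability `good_failure_rate` on each attempt.
    """
    if not 0.0 < bad_fraction < 1.0:
        raise ValueError("bad_fraction must be strictly between 0 and 1")
    if not 0.0 <= good_failure_rate < 1.0:
        raise ValueError("good_failure_rate must be in [0, 1)")
    # Bayes' rule: weight of "good workunit" after seeing the failures.
    good_weight = (1.0 - bad_fraction) * good_failure_rate ** previous_failures
    p_good = good_weight / (good_weight + bad_fraction)
    return p_good * (1.0 - good_failure_rate)


# Hypothetical rates: 10% broken workunits, 20% random failures otherwise.
for suffix in range(7):
    print(f"_{suffix}: {success_chance(suffix, 0.10, 0.20):.3f}")
```

Each extra failure shifts the odds towards "this workunit is broken", so the chance of success falls as the suffix grows, whatever the exact rates are.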

Thanks for the 'undebugable' server code bug tip. Might explain some of the scheduler quirks, or not.

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Level
Ile
Message 28347 - Posted: 1 Feb 2013 | 22:55:55 UTC - in response to Message 28345.

Thanks for the 'undebugable' server code bug tip. Might explain some of the scheduler quirks, or not.


hehe, it's debugable, but only by someone who knows for certain whether there are betas in the queue or not.
____________
BOINC <<--- credit whores, pedants, alien hunters

TJ
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Level
Met
Message 28363 - Posted: 2 Feb 2013 | 11:02:20 UTC

Yesterday I got 3 Beta's and 2 errored out very quickly. On the third, the system froze after a few seconds of running the Beta and I had to reboot to get any response. After the reboot I got the message that the graphics driver had recovered after failing.
Windows Vista Ultimate x64, i7 12GB RAM, BOINC 7.0.28, nVidia GTX285 driver 310.90

Long runs run fine on this machine, but usually take around 30 hours.
____________
Greetings from TJ

Profile [AF>WildWildWest] Al Tarf
Joined: 22 Oct 10
Posts: 6
Credit: 10,043,483
RAC: 0
Level
Pro
Message 28366 - Posted: 2 Feb 2013 | 13:05:47 UTC - in response to Message 28363.

Yesterday I got 3 Beta's and 2 errored out very quickly. On the third, the system froze after a few seconds of running the Beta and I had to reboot to get any response. After the reboot I got the message that the graphics driver had recovered after failing.
Windows Vista Ultimate x64, i7 12GB RAM, BOINC 7.0.28, nVidia GTX285 driver 310.90

Long runs run fine on this machine, but usually take around 30 hours.


Same thing for me :/

SJC_Steve
Joined: 31 Oct 12
Posts: 19
Credit: 184,741,704
RAC: 0
Level
Ile
Message 28373 - Posted: 2 Feb 2013 | 15:41:06 UTC

Looks like the GPUGRID server is hung up; I've had this message in my Activity Log:

Sat 02 Feb 2013 08:27:17 AM MST | GPUGRID | Server can't open database

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 28383 - Posted: 3 Feb 2013 | 0:33:46 UTC

Beta 6.48: 4 of 4 successful.

GTX660Ti, i7 3770K, Win 7 64, Driver 310.90, BOINC 7.0.44.
GPU utilization is only ~40%, though. Fan speed, temperature and power consumption are really low. Credits per time aren't good either - seems like it's not working very well.

MrS
____________
Scanning for our furry friends since Jan 2002

Dylan
Joined: 16 Jul 12
Posts: 98
Credit: 386,043,752
RAC: 0
Level
Asp
Message 28385 - Posted: 3 Feb 2013 | 1:48:28 UTC

I was under the impression that fan speed should be kept at max during crunching; for me, that goes for everything I do.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 28391 - Posted: 3 Feb 2013 | 9:09:24 UTC - in response to Message 28385.

Not if GPU temperature is only 44°C.. and I value my ears ;)

MrS
____________
Scanning for our furry friends since Jan 2002

Bedrich Hajek
Joined: 28 Mar 09
Posts: 467
Credit: 8,192,321,966
RAC: 10,573,188
Level
Tyr
Message 28419 - Posted: 5 Feb 2013 | 23:25:47 UTC

My observation, on running the NOELIA_RC3 beta units, is that they run faster and are less CPU dependent than the NOELIA_RL2 units, but they still need work. On Windows XP, the gpu usage is about 60% to 80%, and by freeing another cpu, this increased by a few points. On Windows 7, the gpu usage is about 30% to 50%, with no noticeable increase in gpu usage when I free up a cpu. Also on Windows 7, gpu usage decreases as the unit crunches, though on XP, I didn't notice the drop.

I just finished crunching an I47R1-NATHAN_tstdhfr6-0-1-RND5218_0 on a Windows XP machine. This had gpu usage at 97%. See link:


http://www.gpugrid.net/result.php?resultid=6463597

Bedrich Hajek
Joined: 28 Mar 09
Posts: 467
Credit: 8,192,321,966
RAC: 10,573,188
Level
Tyr
Message 28421 - Posted: 6 Feb 2013 | 2:32:00 UTC - in response to Message 28419.

My observation, on running the NOELIA_RC3 beta units, is that they run faster and are less CPU dependent than the NOELIA_RL2 units, but they still need work. On Windows XP, the gpu usage is about 60% to 80%, and by freeing another cpu, this increased by a few points. On Windows 7, the gpu usage is about 30% to 50%, with no noticeable increase in gpu usage when I free up a cpu. Also on Windows 7, gpu usage decreases as the unit crunches, though on XP, I didn't notice the drop.

I just finished crunching an I47R1-NATHAN_tstdhfr6-0-1-RND5218_0 on a Windows XP machine. This had gpu usage at 97%. See link:


http://www.gpugrid.net/result.php?resultid=6463597



I have a couple more things to point out.

WU I45R1-NATHAN_tstdhfr6-0-1-RND9717_0 finished crunching on a Windows 7 machine. The gpu usage was 88%. There was no decrease in gpu usage from beginning to end.

http://www.gpugrid.net/result.php?resultid=6463595

The next thing is when the computers run these NOELIA_RC3 beta units, one after the other over several hours, the gpu usage drops for the later units compared to the previous ones, and the computers (both XP and 7) need to be rebooted, quite frequently (every few hours) to get the gpu usage level back up.

Profile nenym
Joined: 31 Mar 09
Posts: 137
Credit: 1,308,230,581
RAC: 0
Level
Met
Message 28422 - Posted: 6 Feb 2013 | 13:11:31 UTC - in response to Message 28421.
Last modified: 6 Feb 2013 | 13:19:31 UTC

The next thing is when the computers run these NOELIA_RC3 beta units......
The same here (W7 64bit, GTX560Ti driver 310.90, core 6.12.34, i7-3770K one thread free for GPU, CPU process tamed to high). My system doesn't need to be rebooted; suspending and then re-enabling GPU computing via the GUI is enough. I have noticed spontaneous/autonomic (not sure about the right expression) restarts of the acemd.2764.cuda42.exe process; suspend/enable GPU is necessary just after the process restarts. I can see the restarts via the Balloon Message system of Process Tamer.

Rick A. Sponholz
Joined: 20 Jan 09
Posts: 52
Credit: 2,518,707,115
RAC: 0
Level
Phe
Message 28423 - Posted: 6 Feb 2013 | 14:53:03 UTC - in response to Message 28343.

6 days after changing all my profiles to NOT accept beta WU, they are still being forced on me. This is disgusting behavior, causing my machines to lock up and requiring a reboot. I've had to place a "No new tasks" embargo on GPUGRID until you get your act together. Moderator, please notify us when it's SAFE to resume volunteering for GPUGRID. Disappointed, Rick
____________

Profile nenym
Joined: 31 Mar 09
Posts: 137
Credit: 1,308,230,581
RAC: 0
Level
Met
Message 28424 - Posted: 6 Feb 2013 | 15:06:10 UTC - in response to Message 28423.
Last modified: 6 Feb 2013 | 15:06:56 UTC

Do you have "yes" set in the "run test apps" check-box? If you do, un-check it.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 28425 - Posted: 6 Feb 2013 | 15:09:54 UTC - in response to Message 28422.
Last modified: 6 Feb 2013 | 16:02:22 UTC

Basically I was seeing the same issues on my systems:
My W7x64 system's GTX660Ti dropped to 1032MHz as the GPU utilization was not high enough (it's normally between 1175 and 1201MHz). Tasks ran at around 40 to 45% GPU Utilization. Temperatures were cool, memory usage was low, and I also noticed the 'autonomic' driver restarts, which required me to suspend and resume tasks.

Presently things are slightly better:
The GPU is at 1124MHz, utilization is at 53% and I have not experienced a driver crash for a while. I also noticed a very small increase in GPU utilization by 'freeing-up' another CPU thread (~2%), but this is normal and to be expected.

Rick, I haven't been able to replicate your problem, and I can't check your settings. All I can do is suggest the settings to use. If you used a manager such as BAM! or GridRepublic (and I can't tell that either) then that might be the source of the issue, and you might get help from them. However, if this is a bug in the server code it's down to the Boinc developers to resolve. They might be interested in some Boinc log files.

Rick A. Sponholz
Joined: 20 Jan 09
Posts: 52
Credit: 2,518,707,115
RAC: 0
Level
Phe
Message 28426 - Posted: 6 Feb 2013 | 19:16:25 UTC - in response to Message 28425.

I've had the "ACEMD Beta" UNCHECKED for 6 days. I'm NOT using ANY manager. The only project sending me unwanted beta WU's is GPUGRID. However, I would really appreciate you letting me (us) know when the beta testing has stopped, so I can resume accepting ANY GPUGRID tasks. Thanks in advance, Rick
____________

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Level
Ile
Message 28428 - Posted: 6 Feb 2013 | 20:41:05 UTC - in response to Message 28426.

I've had the "ACEMD Beta" UNCHECKED for 6 days.


If skgiven says he can't replicate your problem then that's a pretty good indication you've got the settings wrong. On the other hand, there is a chance he turned off beta tasks just when there were no beta tasks in the queue and mistakenly figured he received no betas because the settings work as intended.

Or he didn't even bother trying to replicate your problem.

Here's what I'm gonna do... I'll turn off betas and then watch some other hosts' task lists. If I do not get betas while they do then that means there is no bug in the server code in use here and you screwed up.

Just to make this interesting and to attempt to force you to double-check your settings.... if it turns out you've screwed up then you have to attach your fastest host to my account via my weak account key and crunch 500,000 credits for me. I'll PM you and skgiven my password, you can verify my settings for yourselves.

If it turns out there is a bug then skgiven's gonna attach his fastest host to your account via the weak account key and crunch 500,000 credits for you. (Actually this is just a proposal, he hasn't agreed to this so far.)

So who's willing to put their money... errmmm credits.... where their mouth is? Do we have a deal, gentlemen?

btw, you're right, I bear no risk and stand only to gain, because I'm the one proposing a way to break the deadlock and I should be duly rewarded for my magnanimous effort

____________
BOINC <<--- credit whores, pedants, alien hunters

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 6,169
Level
Trp
Message 28429 - Posted: 6 Feb 2013 | 22:58:31 UTC - in response to Message 28426.
Last modified: 6 Feb 2013 | 22:58:58 UTC

I've had the "ACEMD Beta" UNCHECKED for 6 days.

Do you have "yes" set in the "run test apps" check-box? If you do, un-check it.

These two are separate settings.
Unchecking "ACEMD Beta" won't stop the server sending you beta workunits.
You have to uncheck the "Run test apps" checkbox also.
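
The opt-out advice scattered through this thread can be written down as a small checklist. The sketch below is modelled only on what posters report here (two separate settings, four preference sets, and the selection resetting to everything if nothing is selected); it is not taken from the BOINC scheduler source, and the class and field names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class VenuePrefs:
    """One preference set (default, home, work or school); names are made up."""
    run_test_apps: bool
    acemd_beta: bool
    acemd_short: bool
    acemd_long: bool


def beta_opt_out_problems(prefs_by_venue):
    """List the settings that, per this thread, can still let betas through."""
    problems = []
    for venue, prefs in prefs_by_venue.items():
        if prefs.run_test_apps:
            problems.append(f'{venue}: "Run test applications?" is still checked')
        if prefs.acemd_beta:
            problems.append(f'{venue}: "ACEMD beta" is still selected')
        if not (prefs.acemd_short or prefs.acemd_long):
            problems.append(
                f"{venue}: no non-beta application selected, "
                "so the selection may reset to all applications"
            )
    return problems


# The situation described above: beta unchecked, test apps still enabled.
prefs = {"default": VenuePrefs(True, False, True, True)}
for line in beta_opt_out_problems(prefs):
    print(line)
```

An empty list for all four preference sets corresponds to the configuration the posters above describe as safe.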

Rick A. Sponholz
Joined: 20 Jan 09
Posts: 52
Credit: 2,518,707,115
RAC: 0
Level
Phe
Message 28430 - Posted: 6 Feb 2013 | 23:30:43 UTC - in response to Message 28429.

Retvari, you are my hero! I also feel real dumb, because I did not see the run test apps box above the other application choices. Thanks all of you for helping me get it right. I'll begin accepting WU's again, and contribute to the cause. Regards, (and embarrassed) Rick
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 28431 - Posted: 6 Feb 2013 | 23:35:50 UTC - in response to Message 28429.

Dagorath & Rick, I have nothing to lose or gain either, other than fun so it's fine by me and I'm quite prepared to 'up the ante', significantly!

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Level
Ile
Message 28432 - Posted: 6 Feb 2013 | 23:54:39 UTC - in response to Message 28431.

Dagorath & Rick, I have nothing to lose or gain either, other than fun so it's fine by me and I'm quite prepared to 'up the ante', significantly!


Lol! I had a hunch you had the confidence to up the ante :)

I apologize for doubting everybody's word and now that Rick has solved the problem I hope y'all can understand why I am a doubter. In one of Rick's earlier posts (message 28338) in this thread he assures everybody he unchecked the "Run test apps?" setting. Now it turns out he did not.

Not saying I'm any better, I've done much the same or even dumber on many occasions. First the knees go, then the eyes, before you know it you have to carry a map with your home marked with a big red X so you can find your way home.
____________
BOINC <<--- credit whores, pedants, alien hunters

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 28433 - Posted: 7 Feb 2013 | 0:24:42 UTC - in response to Message 28432.

Sometimes you need to double, triple, quadruple, 'and the next-one' check :)
We all make mistakes, and sometimes repeatedly - It's the nature of the Boinc.
Seen this one a few times, and I'm sure we will see it again...

BTW. Being wrong once in a while isn't a bad thing. I appreciate being corrected, not because I like being wrong, but because I like being right. If and when I'm wrong tell me so I can correct myself and so others can benefit. Ultimately this isn't about me or you, it's about the research, and for us the journey. If there is a problem, there is a solution, and when it gets fixed it's fixed for everyone.
Being a bit of a pious get

Profile AdamYusko
Joined: 29 Jun 12
Posts: 26
Credit: 21,540,800
RAC: 0
Level
Pro
Message 28434 - Posted: 7 Feb 2013 | 0:45:58 UTC

It does go both ways. I wanted to run some Beta Apps on one of my machines, and it turns out that for nearly two weeks I forgot to check the "Send Test Applications" box. Hopefully I start getting some soon, but it's my slowest machine, and since the only tasks have been the Huge Long Runs, it only searches for work every two days or so.
____________

Bedrich Hajek
Joined: 28 Mar 09
Posts: 467
Credit: 8,192,321,966
RAC: 10,573,188
Level
Tyr
Message 28435 - Posted: 7 Feb 2013 | 2:27:24 UTC

Besides testing a new application, are we doing any other scientific work with these betas?

noelia
Joined: 5 Jul 12
Posts: 35
Credit: 393,375
RAC: 0
Level

Message 28436 - Posted: 7 Feb 2013 | 9:31:19 UTC - in response to Message 28435.
Last modified: 7 Feb 2013 | 9:31:34 UTC

Actually yes, WUs in the beta queue right now are the first step of the simulations which will be sent to the short queue. So this is already the real thing.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 28437 - Posted: 7 Feb 2013 | 10:26:32 UTC - in response to Message 28436.
Last modified: 7 Feb 2013 | 10:28:42 UTC

I suspended a queued GPUGrid Beta this morning, allowed a running GPUGrid task to finish and then resumed the GPUGrid Beta, trypsin_lig_127_1-NOELIA_RC3_equ-0-1-RND4222_0 (still running).
It started at 74% GPU utilization, rose to about 76% and then suddenly dropped to 66%. The power usage was about 65%, GPU temp 46°C with the fans fixed at 75%. Core clock was 1176MHz, which is normal. GPU utilization (W7x64) has subsequently dropped to around 55% and is fairly erratic; varies between 50 and 60% (two CPU threads free).

Thanks for fixing the credits ;)

TJ
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Level
Met
Message 28438 - Posted: 7 Feb 2013 | 10:28:33 UTC

I have had a beta WU that completed without error:
6 Feb 2013 | 21:58:39 UTC 7 Feb 2013 | 2:17:47 UTC Completed and validated 12,202.02 2,548.72 13,500.00 ACEMD beta version v6.48 (cuda42).
It's about half time faster on my pc (i7, vista ultimate x64, GTX285) but uses approximately 800 seconds more CPU time. Temperature of the card was 60°C while non-beta it is 70-74°C.
____________
Greetings from TJ

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Level
Ile
Message 28439 - Posted: 7 Feb 2013 | 13:19:12 UTC - in response to Message 28437.

I suspended a queued GPUGrid Beta this morning, allowed a running GPUGrid task to finish and then resumed the GPUGrid Beta, trypsin_lig_127_1-NOELIA_RC3_equ-0-1-RND4222_0 (still running).
It started at 74% GPU utilization, rose to about 76% and then suddenly dropped to 66%. The power usage was about 65%, GPU temp 46°C with the fans fixed at 75%. Core clock was 1176MHz, which is normal. GPU utilization (W7x64) has subsequently dropped to around 55% and is fairly erratic; varies between 50 and 60% (two CPU threads free).


Is this much detail useful for the developers? If it is then I would be willing to create a script that would poll the card for this info periodically, save it to disk and ftp the file to some address. It would have the task name of course and each entry would include % completion, fan speed, load, clocks, whatever info would be helpful. I have a library of Python functions that implement the GUI RPC calls and together with the nvidia-settings and nvcontrol apps anything is possible. Would probably run on Windows too.
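
A polling script of the kind proposed above could start out roughly like this. Note the swap: this sketch calls `nvidia-smi` rather than the nvidia-settings/nvcontrol route mentioned in the post, and it assumes `nvidia-smi` is installed and the card reports these fields (older cards may answer "[Not Supported]" for some). The function names are hypothetical; task name, % completion and the ftp upload are left out.

```python
import csv
import subprocess
import time

# Fields understood by nvidia-smi's --query-gpu option.
FIELDS = ["utilization.gpu", "temperature.gpu", "fan.speed", "clocks.sm", "power.draw"]


def query_command(fields=FIELDS):
    """Build the nvidia-smi call that prints one CSV line per GPU."""
    return ["nvidia-smi", "--query-gpu=" + ",".join(fields),
            "--format=csv,noheader,nounits"]


def parse_sample(line, fields=FIELDS):
    """Turn one CSV line into {field: float or None}."""
    values = [value.strip() for value in line.split(",")]
    if len(values) != len(fields):
        raise ValueError(f"expected {len(fields)} values, got {len(values)}")
    sample = {}
    for field, value in zip(fields, values):
        try:
            sample[field] = float(value)
        except ValueError:
            sample[field] = None  # e.g. "[Not Supported]"
    return sample


def poll(path, interval_s=60, samples=10):
    """Append timestamped readings for the first GPU to a CSV file."""
    with open(path, "a", newline="") as out:
        writer = csv.writer(out)
        for _ in range(samples):
            result = subprocess.run(query_command(), capture_output=True,
                                    text=True, check=True)
            sample = parse_sample(result.stdout.splitlines()[0])
            writer.writerow([time.strftime("%Y-%m-%d %H:%M:%S")]
                            + [sample[field] for field in FIELDS])
            out.flush()
            time.sleep(interval_s)
```

Something like `poll("gpu_log.csv", interval_s=30, samples=120)` would log an hour of readings. Only the first GPU is recorded, which would need changing on a multi-card host.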


____________
BOINC <<--- credit whores, pedants, alien hunters

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Message 28450 - Posted: 8 Feb 2013 | 19:04:18 UTC - in response to Message 28439.

Hi,
we cannot test until the beta queue is cleared.

There is a problem with the new app and it has been difficult to test it out.

gdf

TJ
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Level
Met
Message 28454 - Posted: 8 Feb 2013 | 23:46:26 UTC - in response to Message 28450.

Hi,
we cannot test until the beta queue is cleared.

There is a problem with the new app and it has been difficult to test it out.

gdf


Well I will help to clear the beta queue, but I don't get them often. I have checked "run test applications" and " beta".
____________
Greetings from TJ

Bedrich Hajek
Joined: 28 Mar 09
Posts: 467
Credit: 8,192,321,966
RAC: 10,573,188
Level
Tyr
Message 28466 - Posted: 10 Feb 2013 | 2:14:32 UTC

I had a particularly bad experience with a beta unit. Most bad WUs simply give you a computation error message when they crash, and you go on to the next WU, without any reboot or computer crash. This unit ran for a few seconds, froze up the computer, then gave a blue screen, and the computer rebooted. It did this a few times before I aborted the unit. It also caused another perfectly good WU to crash as well.

Here is the link:

http://www.gpugrid.net/workunit.php?wuid=4137081

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 28467 - Posted: 10 Feb 2013 | 13:37:53 UTC - in response to Message 28466.

This task appeared to do something similar; cause a system reboot somehow.

6486709 4137080 139265 10 Feb 2013 | 5:57:04 UTC 10 Feb 2013 | 13:11:46 UTC Error while computing 2.15 0.03 --- ACEMD beta version v6.48 (cuda42)

http://www.gpugrid.net/workunit.php?wuid=4137080

Thanks,

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1576
Credit: 5,603,811,851
RAC: 8,786,588
Level
Tyr
Message 28476 - Posted: 11 Feb 2013 | 23:53:07 UTC

I think this one belongs in the list too:

http://www.gpugrid.net/workunit.php?wuid=4138484

It gave me a BSOD after 7 seconds with

The problem seems to be caused by the following file: dxgkrnl.sys

STOP: 0x00000116 (0xfffffa801b5de010, 0xfffff88006dc4404, 0x0000000000000000,
0x000000000000000d)

That's from BlueScreenView: the original had a line about an NVIDIA driver crash and not restarting within the time allowed. BSV doesn't seem able to retrieve that information - I'll record it manually (and accurately) if it happens again.

Profile nate
Joined: 6 Jun 11
Posts: 124
Credit: 2,928,865
RAC: 0
Level
Ala
Message 28482 - Posted: 12 Feb 2013 | 15:17:04 UTC

Thanks for bringing it to our attention. There was an issue building a small number of the simulations, which our checks didn't catch before they were sent out. We have cancelled the work units that were crashing machines, but it is possible that there are others so let us know if it happens again. Crashing your machines is obviously the last thing we want to do. In the future we can avoid this with additional checks we'll be doing for this type of work unit.

nate

TJ
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Level
Met
Message 28483 - Posted: 12 Feb 2013 | 15:29:39 UTC

This WU: trypsin_lig_904_3-NOELIA_RC3_equ-0-1-RND0962, resulted in the nVidia driver stopping. However it recovered automatically without rebooting the system.
____________
Greetings from TJ

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 28486 - Posted: 13 Feb 2013 | 9:31:45 UTC
Last modified: 13 Feb 2013 | 9:38:16 UTC

Been running a few betas now. GPU load varies between WUs in the range of 2x - 3x% (GTX660Ti). Accordingly, power consumption, temperature, fan speed and memory controller load are really low. Runtimes for 1500 credit-WUs vary between 1700 and 4000s.

Edit: another observation.. GPU Grid beta 6.48 gets an entire core of my i7. However, GPU load decreased considerably as soon as I ran one Einstein CPU task alongside. Running 6 Einsteins reduces GPU load by ~25%. I don't think the regular apps have been this fragile.

MrS
____________
Scanning for our furry friends since Jan 2002

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1576
Credit: 5,603,811,851
RAC: 8,786,588
Level
Tyr
Message 28494 - Posted: 13 Feb 2013 | 16:22:27 UTC - in response to Message 28476.

Had a repeat of my BSOD, though this time it seemed to be another project which triggered it.

The exact phrase on screen is:

"Attempt to reset the display driver and recover from timeout failed" (stop 116)

There was also a reference to nvlddmkm.sys

Host is an i7 3770K with dual Gainward Phantom GTX 670 - driver 310.90 WHQL

Bedrich Hajek
Joined: 28 Mar 09
Posts: 467
Credit: 8,192,321,966
RAC: 10,573,188
Level
Tyr
Message 28502 - Posted: 13 Feb 2013 | 23:14:18 UTC

Here is a beta unit that ran rather slowly.

trypsin_lig_491_run4-NOELIA_RL3_equ-0-1-RND5688

13 Feb 2013 | 11:57:24 UTC 13 Feb 2013 | 19:50:06 UTC Completed and validated 26,096.32 25,685.34 1,500.00 ACEMD beta version v6.48 (cuda42)


See link:

http://www.gpugrid.net/workunit.php?wuid=4142465
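
To put "rather slowly" into numbers, the result row can be turned into credits per hour. This is a small sketch that assumes the three numbers before the application name are run time (s), CPU time (s) and credit, as in the task rows pasted in this thread; the function names are made up.

```python
import re

# Three decimal numbers (thousands separators allowed) right before "ACEMD".
ROW = re.compile(
    r"(?P<run>[\d,]+\.\d+)\s+(?P<cpu>[\d,]+\.\d+)\s+(?P<credit>[\d,]+\.\d+)\s+ACEMD"
)


def parse_result_row(row):
    """Return (run_time_s, cpu_time_s, credit), or None if the row has no credit."""
    match = ROW.search(row)
    if match is None:
        return None  # e.g. failed tasks show "---" instead of a credit value
    return tuple(float(match.group(key).replace(",", ""))
                 for key in ("run", "cpu", "credit"))


def credits_per_hour(run_time_s, credit):
    """Credit earned per hour of wall-clock run time."""
    return credit * 3600.0 / run_time_s


row = ("13 Feb 2013 | 11:57:24 UTC 13 Feb 2013 | 19:50:06 UTC "
       "Completed and validated 26,096.32 25,685.34 1,500.00 "
       "ACEMD beta version v6.48 (cuda42)")
run_time, cpu_time, credit = parse_result_row(row)
print(f"{credits_per_hour(run_time, credit):.1f} credits/hour")
```

The same helper applied to other rows pasted in the thread makes tasks of different lengths and credit values directly comparable.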

TJ
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Level
Met
Message 28517 - Posted: 14 Feb 2013 | 15:40:08 UTC

I have set my system to accept only beta WU's to help clear the queue. However today I got 9 WU's that error out quickly. All are Noelia's run 2 and run 3 and only one run 4. All the run4-Noelia from yesterday and this morning (11 and 4) finished correctly.

This is the error message:
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
- exit code 98 (0x62)
</message>
<stderr_txt>
ERROR: file mdioload.cpp line 207: Error reading parmtop file
called boinc_finish

</stderr_txt>
]]>

Wing(wo)men had errors for the same WU's.
____________
Greetings from TJ

TJ
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Level
Met
Message 28531 - Posted: 15 Feb 2013 | 15:33:56 UTC
Last modified: 15 Feb 2013 | 15:34:33 UTC

This one: trypsin_lig_1259_run2-NOELIA_RL3_equ-0-1-RND7950 and 2 more (1 run1) were resulting in an unresponsive system. The mouse pointer was movable but not clickable. All windows froze for a few minutes, then the screen went blank and came back with a notification that the display driver had recovered, but again all windows froze immediately. I had to abort these step by step in the seconds the system was responsive.
WinVista x64 ultimate, i7, 12GB, GTX285, driver 310.90 CUDA version 5, BOINC 7.0.28

The system is now running a short run (cuda31) without problems so far.
____________
Greetings from TJ

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1576
Credit: 5,603,811,851
RAC: 8,786,588
Level
Tyr
Message 28564 - Posted: 17 Feb 2013 | 15:11:21 UTC - in response to Message 28450.

Hi,
we cannot test until the beta queue is cleared.

OK, we seem to be done. The server status page says there are no tasks in the Beta queue, and my log just got these messages:

17/02/2013 15:07:56 | GPUGRID | Reporting 1 completed tasks
17/02/2013 15:07:56 | GPUGRID | Requesting new tasks for NVIDIA
17/02/2013 15:07:58 | GPUGRID | No tasks sent
17/02/2013 15:07:58 | GPUGRID | No tasks are available for ACEMD beta version
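
A minimal sketch of how log lines like these could be checked automatically for the "queue is empty" condition. It assumes the `date time | project | message` layout and the day/month/year date format seen in the lines above; `parse_log_line` and `queue_is_empty` are hypothetical helper names, not BOINC functions.

```python
from datetime import datetime
from typing import NamedTuple

# Sample lines copied from the log excerpt above.
SAMPLE_LOG = [
    "17/02/2013 15:07:56 | GPUGRID | Reporting 1 completed tasks",
    "17/02/2013 15:07:56 | GPUGRID | Requesting new tasks for NVIDIA",
    "17/02/2013 15:07:58 | GPUGRID | No tasks sent",
    "17/02/2013 15:07:58 | GPUGRID | No tasks are available for ACEMD beta version",
]


class LogEntry(NamedTuple):
    timestamp: datetime
    project: str
    message: str


def parse_log_line(line):
    """Split one log line into timestamp, project and message (assumed layout)."""
    stamp, project, message = (part.strip() for part in line.split("|", 2))
    # Assumption: dates are day/month/year, as in the excerpt above.
    return LogEntry(datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"), project, message)


def queue_is_empty(lines, project="GPUGRID"):
    """True if any line for the project reports that no tasks are available."""
    entries = (parse_log_line(line) for line in lines)
    return any(
        e.project == project and "No tasks are available" in e.message
        for e in entries
    )


if __name__ == "__main__":
    print(queue_is_empty(SAMPLE_LOG))
```

The date format depends on the regional settings of the machine running the client, so the `strptime` pattern would need adjusting on other hosts.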

So, fastening my seat belt and holding on tight for the next twist in the roller-coaster ride that is beta testing... :-)

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 28565 - Posted: 17 Feb 2013 | 15:28:28 UTC - in response to Message 28564.
Last modified: 17 Feb 2013 | 15:57:42 UTC

ACEMD beta version 0 248 0.57 (0.17 - 1.26) 111

There are 248 in progress.
If they fail, they return to the queue and are resent repeatedly until the x_7th failure.
If they succeed, do they auto-generate a new task?

- Just got a couple of Betas. The first one basically killed my system!
Lots of GPU driver restarts, then a blue screen/crash. It recovered to Windows and BOINC started, but no GPU was detected. I closed and reopened BOINC: still no GPU detected. I restarted, and Windows wouldn't start up; it just keeps restarting.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1576
Credit: 5,603,811,851
RAC: 8,786,588
Level
Tyr
Message 28569 - Posted: 17 Feb 2013 | 18:26:31 UTC - in response to Message 28565.

Well, if a task crashes, a replacement is generated. I just got

http://www.gpugrid.net/workunit.php?wuid=4144581

courtesy of somebody who hoarded it for three days and then crashed it (shouldn't really be doing Beta testing on an anonymous host, and certainly not for this project on a host with a three-day turnround)

skgiven, if you're warning us about rogue tasks which are likely to blue-screen our machines, can you identify the tasks, please? I'd like to know whether my resend is from the same batch, so I can make appropriate decisions about how to leave my test machine when I go out later this evening.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 28570 - Posted: 17 Feb 2013 | 19:20:39 UTC - in response to Message 28569.

Probably this one, trypsin_lig_904_4-NOELIA_RC3_equ-0-1-RND8427_4

I would suggest you run a nice Long WU :)

____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help
