Message boards : News : WARNING/CHALLENGE: VERY LONG WU (VERYLONG_CXCL12_confAna)
Author | Message |
---|---|
We just launched 400 very long WU (they will take about 24h in a 780GTX) named VERYLONG_CXCL12_confAna whose results we need as soon as possible (we are in a hurry). They come with a credit+bonus of 400K. Please, if you don't have a good graphic card, reject them. For the brave ones, take it as a challenge and see you on the performance tab ;) | |
ID: 39575 | Rating: 0 | rate: / Reply Quote | |
Please, if you don't have a good credit card, reject them. Would love to do as you ask but its not always possible as some computers are remote. I think he means graphic card. :-) | |
ID: 39578 | Rating: 0 | rate: / Reply Quote | |
Nothing here ! Too bad ! Snif, snif ;) | |
ID: 39579 | Rating: 0 | rate: / Reply Quote | |
hahaha sorry. what was i thinking about...? | |
ID: 39580 | Rating: 0 | rate: / Reply Quote | |
If it was possible to select the WU's, I would take some of the VERY LONG. | |
ID: 39581 | Rating: 0 | rate: / Reply Quote | |
If it was possible to select the WU's, I would take some of the VERY LONG. I don't think that's possible, Phil... It's the luck of the draw, IMHO... Otherwise I'd gladly take some too ;) Hello to l'Alliance :) ____________ [CSF] Thomas H.V. Dupont Founder of the team CRUNCHERS SANS FRONTIERES 2.0 www.crunchersansfrontieres | |
ID: 39583 | Rating: 0 | rate: / Reply Quote | |
Just checked my tasks. No joy on the high value target. | |
ID: 39585 | Rating: 0 | rate: / Reply Quote | |
Hello Thomas, | |
ID: 39586 | Rating: 0 | rate: / Reply Quote | |
Received 2 : GERARD_VERYLONG_CXCL12 | |
ID: 39587 | Rating: 0 | rate: / Reply Quote | |
This may be the happiest day of my life. | |
ID: 39588 | Rating: 0 | rate: / Reply Quote | |
Received 2 : GERARD_VERYLONG_CXCL12 Lucky you! :) GG! ____________ [CSF] Thomas H.V. Dupont Founder of the team CRUNCHERS SANS FRONTIERES 2.0 www.crunchersansfrontieres | |
ID: 39590 | Rating: 0 | rate: / Reply Quote | |
Looking forward to compute some of them on my GTX 970 :3 | |
ID: 39591 | Rating: 0 | rate: / Reply Quote | |
Hi, I've received one very long GERARD on my 560 Ti with 448 shaders. Does it make sense to crunch it to the end? | |
ID: 39592 | Rating: 0 | rate: / Reply Quote | |
I got another short run-.- | |
ID: 39593 | Rating: 0 | rate: / Reply Quote | |
Got one of these tasks on a computer with a NVIDIA GTX 680 factory OC'd video card that rarely has a processing error. I will run it unless you need the work unit assigned to another computer with a stronger/faster video card. | |
ID: 39595 | Rating: 0 | rate: / Reply Quote | |
Saw the call for the challenge and was able to download 2 VL's while on lunch. I've got 15 hours from the time this message is being typed until my Nvidia Titan's attention can be turned to them. Excited to see if these use the full processing power of the card or if I have to run them two(or more) at a time like most GPU WU's. | |
ID: 39596 | Rating: 0 | rate: / Reply Quote | |
These are on the, uh, normally long queue right? | |
ID: 39598 | Rating: 0 | rate: / Reply Quote | |
These are on the, uh, normally long queue right? Dayle, I did switch my settings from accepting shorts and beta work to just long WU's. During my lunch break was able to get two WU's before being told to fly a kite by Boinc when requesting more. ;) | |
ID: 39599 | Rating: 0 | rate: / Reply Quote | |
We just launched 400 very long WU (they will take about 24h in a 780GTX) named VERYLONG_CXCL12_confAna whose results we need as soon as possible (we are in a hurry). I've got two of them on my GTX 780Ti host. According to my linear approximation made at 12.3% progress, the total computing time will be about 18 hours and 15 minutes. They come with a credit+bonus of 400K. That's nice. Please, if you don't have a good graphic card, reject them. That's an inappropriate way to arrange such batches. You should set up a third queue for these purposes. I couldn't receive one of these workunits on my GTX980 host. I'm sure I'm not alone with that. EDIT: I had to abort 10 other workunits to receive one of these on my GTX980 host. That's why your method is dangerous: it propagates failed workunits (by encouraging user intervention) For the brave ones, take it as a challenge and see you on the performance tab ;) Challenge accepted. :) | |
ID: 39600 | Rating: 0 | rate: / Reply Quote | |
Found one already downloaded and running on a GTX 670 - 10% done, and the figures suggest ~36 hours total, or mid-morning Saturday UTC for completion. Plus whatever it takes to upload the result file - most tasks generate ~5MB per hour on this card, so I'm expecting a 180 MB final upload! | |
ID: 39601 | Rating: 0 | rate: / Reply Quote | |
I have two of these running on my GTX 680 host, it will take 28 hours and 40 minutes for the faster (1200MHz) card to finish. | |
ID: 39603 | Rating: 0 | rate: / Reply Quote | |
Is an Intel i3 with a GTX 660 fast enough? | |
ID: 39604 | Rating: 0 | rate: / Reply Quote | |
Frustrating. I have THREE GTX 780ti SC and not even one very long WU. Aborted several long run WU to see if I could get just one and nothing. | |
ID: 39605 | Rating: 0 | rate: / Reply Quote | |
aborted about 30 - no luck in getting the rare ones for 970. | |
ID: 39606 | Rating: 0 | rate: / Reply Quote | |
Is GTX 970M fast enough to compute this project? | |
ID: 39608 | Rating: 0 | rate: / Reply Quote | |
Looks like 16.5 hours here! YAY! | |
ID: 39610 | Rating: 0 | rate: / Reply Quote | |
Well, I don't care if I don't get these WU's.. It's ok when I can spend the little things too, it's for Science. :) | |
ID: 39613 | Rating: 0 | rate: / Reply Quote | |
Please don't cancel any more WUs to get them! I checked and they are already all taken at this point so only if they fail on some machines will they get added to the queue again. They are only a single step so they are not coming back. | |
ID: 39616 | Rating: 0 | rate: / Reply Quote | |
It looks like my GTX 980 will finish this very long workunit in 15 hours and 40 minutes. That's 2.5 hours shorter than the estimated running time on the GTX 780Ti. These workunits have a low memory controller load (29-31% on my GTX980@3.5GHz, 11% on my GTX780Ti@3.5GHz, 24% on my GTX780Ti@2.9GHz). | |
ID: 39617 | Rating: 0 | rate: / Reply Quote | |
Is an Intel i3 with a GTX 660 fast enough? According to my estimate, your host will finish in 2 days 5 hours and 20 minutes, so it is fast enough to finish before the deadline, but not fast enough to earn the bonus for returning the result within 2 days. | |
ID: 39618 | Rating: 0 | rate: / Reply Quote | |
I have one on a GTX980. 43% complete after 11 hours. | |
ID: 39619 | Rating: 0 | rate: / Reply Quote | |
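The runtime figures quoted in this thread come from a simple linear extrapolation: elapsed time divided by the fraction completed. A minimal sketch of that arithmetic, assuming progress really is linear in time (the function name is made up for illustration and is not part of BOINC):

```python
def estimate_total_hours(elapsed_hours: float, progress_percent: float) -> float:
    """Linearly extrapolate total runtime from elapsed time and progress."""
    if not 0 < progress_percent <= 100:
        raise ValueError("progress_percent must be in (0, 100]")
    return elapsed_hours * 100.0 / progress_percent


# Figures from the post above: 43% complete after 11 hours.
total = estimate_total_hours(11.0, 43.0)
remaining = total - 11.0
print(f"estimated total: {total:.1f} h, remaining: {remaining:.1f} h")
```

The estimate is only as good as the linearity assumption; later posts in the thread show the BOINC countdown jumping around mid-run.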
Please don't cancel any more WUs to get them! I checked and they are already all taken at this point so only if they fail on some machines will they get added to the queue again. They are only a single step so they are not coming back. Gerard never asked anyone to ABORT WU's to get their hands on one of these long WU's. While you may agree with Retvari, he was the first to post that he had ABORTED other WU's for no other reason than to get one of these on a particular card. His warning was a "Self Fulfilling Prophecy" and others followed. Maybe Retvari decided that these units gave more credits and a place on the "Performance" tab and were more important than other scientists' WU's. In addition, he might have thought he deserved more of them than anyone else. | |
ID: 39620 | Rating: 0 | rate: / Reply Quote | |
Gerard never asked anyone to ABORT WU's to get their hands on one of these long WU's. It is quite insulting to others actually, that you are saying they couldn't figure it out without my "advice". Maybe Retvari decided that these units gave more credits and a place on the "Performance" tab and were more important than other scientists' WU's. In addition, he might have thought he deserved more of them than anyone else. I'm just a human. Sorry. I made mistakes. But I really like to help other people learn from my mistakes, especially those which were induced by their mistakes. | |
ID: 39621 | Rating: 0 | rate: / Reply Quote | |
Started the VL WU's early this morning while grabbing a cup of coffee. One WU only used around 60% of the Titan(May have misread, it was early :)). Currently running them concurrently and achieved 75-80 percent utilization. | |
ID: 39622 | Rating: 0 | rate: / Reply Quote | |
... so I'm expecting a 180 MB final upload! My task has now reached 47%, and I'm beginning to get a little bit worried by this. File 2x23-GERARD_VERYLONG_CXCL12_confAna-0-1-RND3250_0_9 has already reached 84,510 KB, and if it continues to grow linearly with progress, the final size will be very close to the 180 MB I predicted. But the upload file specification says <file>
<name>2x23-GERARD_VERYLONG_CXCL12_confAna-0-1-RND3250_0_9</name>
<nbytes>0.000000</nbytes>
<max_nbytes>128000000.000000</max_nbytes>
<status>0</status>
<upload_url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</upload_url>
</file> <max_nbytes> 128,000,000 is 128 MB, no? And we are going to exceed that by nearly 50%? Fortunately, Retvari's GTX 980 mine-canary should finish in a couple of hours, and the rest of us tortoises will find out whether the hare has crashed and burned - or not. | |
ID: 39623 | Rating: 0 | rate: / Reply Quote | |
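The worry in the post above is a projection of the output file's final size against its `<max_nbytes>` limit. A rough sketch of that calculation, assuming the file grows linearly with progress and that the 84,510 KB figure uses 1 KB = 1024 bytes (both are assumptions; the helper name is hypothetical):

```python
def projected_final_bytes(current_bytes: int, progress_percent: float) -> float:
    """Project the final file size, assuming growth is linear in task progress."""
    if not 0 < progress_percent <= 100:
        raise ValueError("progress_percent must be in (0, 100]")
    return current_bytes * 100.0 / progress_percent


current_bytes = 84_510 * 1024   # 84,510 KB reported at 47% progress
limit_bytes = 128_000_000       # <max_nbytes> from the file specification

projected = projected_final_bytes(current_bytes, 47.0)
overshoot = projected / limit_bytes - 1.0
print(f"projected: {projected / 1e6:.0f} MB, over the limit by {overshoot:.0%}")
```

Under these assumptions the projection lands a little above 180 MB, well past the 128,000,000-byte limit.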
Fortunately, Retvari's GTX 980 mine-canary should finish in a couple of hours, and the rest of us tortoises will find out whether the hare has crashed and burned - or not. Ahem. We are all working on the same side. If Retvari Zoltan's work units fail, ours will also fail...just a little slower. | |
ID: 39624 | Rating: 0 | rate: / Reply Quote | |
Please don't cancel any more WUs to get them! I checked and they are already all taken at this point so only if they fail on some machines will they get added to the queue again. They are only a single step so they are not coming back. I strongly disagree with you. I also asked if one should abort WU's to obtain the "VERY LONG" ones, but in French, so I guess you did not understand. Furthermore, it was only a question, not a suggestion. http://www.gpugrid.net/forum_thread.php?id=3988&nowrap=true#39586 I don't know Retvari Zoltan in real life, but your post is insulting, as he very often posts here in order to HELP crunchers and to give very good advice. We are here to help science, not to denigrate a particular person. Retvari Zoltan, I want to thank you for the help and advice you give on this forum. Best Regards, Phil1966 | |
ID: 39626 | Rating: 0 | rate: / Reply Quote | |
Good points, both of you! (Richard Haselgrove & Dayle) <file>
<name>4x14-GERARD_VERYLONG_CXCL12_confAna-0-1-RND7430_0_8</name>
<nbytes>0.000000</nbytes>
<max_nbytes>256000000.000000</max_nbytes>
<status>0</status>
<upload_url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</upload_url>
</file>
<file>
<name>4x14-GERARD_VERYLONG_CXCL12_confAna-0-1-RND7430_0_9</name>
<nbytes>0.000000</nbytes>
<max_nbytes>128000000.000000</max_nbytes>
<status>0</status>
<upload_url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</upload_url>
</file> However, it could be easily fixed by manually editing these numbers with a text editor (it is advised to exit BOINC manager & the scientific applications before doing so). | |
ID: 39627 | Rating: 0 | rate: / Reply Quote | |
It looks like the 970 PNY will take about 30 hours to complete these WU's while the 970 Gaming G1 will need about 24 hours. | |
ID: 39628 | Rating: 0 | rate: / Reply Quote | |
Good points, both of you! (Richard Haselgrove & Dayle) Are we supposed to change something manually ? Do we need to increase the <max_nbytes> ? Or do we need to check if this is necessary before doing so ? | |
ID: 39629 | Rating: 0 | rate: / Reply Quote | |
However, it could be easily fixed by manually editing these numbers with a text editor (it is advised to exit BOINC manager & the scientific applications before doing so). Agreed. It would be required to exit BOINC first (stopping the client - the Manager doesn't matter), before making the edit - very carefully, and using only a plain-text editor. Client_state.xml is only ever read by BOINC at start-up - it's effectively a hard-copy backup file, so any edits made while the BOINC client is running are overwritten by the next dump from memory. I think we should wait for confirmation that a real problem exists (or doesn't, as the case may be) before advocating wholesale file editing. I did send a PM to Gerard after my last post here, suggesting that he monitors the early returns, but I haven't received any feedback yet. | |
ID: 39630 | Rating: 0 | rate: / Reply Quote | |
However, it could be easily fixed by manually editing these numbers with a text editor (it is advised to exit BOINC manager & the scientific applications before doing so). It is proven: 23/01/2015 16:47:23 | GPUGRID | Computation for task 4x14-GERARD_VERYLONG_CXCL12_confAna-0-1-RND7430_0 finished
23/01/2015 16:47:23 | GPUGRID | Output file 4x14-GERARD_VERYLONG_CXCL12_confAna-0-1-RND7430_0_9 for task 4x14-GERARD_VERYLONG_CXCL12_confAna-0-1-RND7430_0 exceeds size limit.
23/01/2015 16:47:23 | GPUGRID | File size: 186931932.000000 bytes. Limit: 128000000.000000 bytes | |
ID: 39631 | Rating: 0 | rate: / Reply Quote | |
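For anyone scanning their own event log for this failure, the size and limit can be pulled out of the message shown above. A minimal sketch, assuming the log line keeps exactly the wording quoted in the post (the function name is illustrative only):

```python
import re

# Pattern follows the wording of the log line quoted in the post above.
SIZE_LINE = re.compile(
    r"File size: (?P<size>\d+(?:\.\d+)?) bytes\. "
    r"Limit: (?P<limit>\d+(?:\.\d+)?) bytes"
)


def parse_size_limit(line: str):
    """Return (file_size, limit) in bytes, or None if the line does not match."""
    match = SIZE_LINE.search(line)
    if match is None:
        return None
    return float(match["size"]), float(match["limit"])


line = ("23/01/2015 16:47:23 | GPUGRID | "
        "File size: 186931932.000000 bytes. Limit: 128000000.000000 bytes")
size, limit = parse_size_limit(line)
print(f"over the limit by {size - limit:,.0f} bytes")
```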
I was afraid so. What's more, the error is declared immediately after the task finishes: I was hoping that possibly disabling networking as the task approaches completion would buy enough time for an edit, but evidently not. | |
ID: 39632 | Rating: 0 | rate: / Reply Quote | |
This is deadly serious! Every workunit will fail to upload, so the whole "challenge" will come to naught without user intervention. | |
ID: 39634 | Rating: 0 | rate: / Reply Quote | |
It sounds like the batch is bad. | |
ID: 39635 | Rating: 0 | rate: / Reply Quote | |
OK, so here are the KISS ('Keep it simple, stupid') instructions. <file>
<name>2x23-GERARD_VERYLONG_CXCL12_confAna-0-1-RND3250_0_9</name>
<nbytes>0.000000</nbytes>
<max_nbytes>128000000.000000</max_nbytes>
<status>0</status>
<upload_url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</upload_url>
</file>
7) Make sure you have exactly the right section: the last number before </name> should be _9, and there should be an <upload_url> line.
8) Change the first three numbers after <max_nbytes> from 128 to 256. Just those three numbers - don't accidentally delete any punctuation, change the number of zeroes, or make any other change.
9) Repeat steps (6), (7) and (8) for each separate VERYLONG task that you have on the system.
10) Save the file, restart BOINC, and relax. All done. | |
ID: 39636 | Rating: 0 | rate: / Reply Quote | |
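For those who would rather not hand-edit, steps 7 and 8 above can be sketched as a text substitution. This is an untested-in-the-field illustration, not an official tool: it assumes the BOINC client has been stopped first (as discussed earlier in the thread), that a backup copy of client_state.xml has been made, and that the entries look like the ones quoted in this thread. It works on the file's text rather than re-serialising the XML, so everything else stays byte-for-byte unchanged. The function name is hypothetical.

```python
import re

# One <file>...</file> or <file_info>...</file_info> entry; both spellings
# appear in the excerpts quoted in this thread.
FILE_BLOCK = re.compile(r"<(file|file_info)>.*?</\1>", re.DOTALL)
NAME = re.compile(r"<name>([^<]*)</name>")
HAS_UPLOAD_URL = re.compile(r"<(?:upload_)?url>")
OLD_LIMIT = re.compile(r"(<max_nbytes>)128(000000\.000000</max_nbytes>)")


def raise_verylong_limit(state_text: str) -> str:
    """Change 128 to 256 in <max_nbytes> for VERYLONG output files ending in _9."""

    def fix_block(match: re.Match) -> str:
        block = match.group(0)
        name = NAME.search(block)
        is_target = (
            name is not None
            and "VERYLONG" in name.group(1)
            and name.group(1).endswith("_9")
            and HAS_UPLOAD_URL.search(block) is not None
        )
        # Only the three leading digits change; zeroes and punctuation stay as-is.
        return OLD_LIMIT.sub(r"\g<1>256\g<2>", block) if is_target else block

    return FILE_BLOCK.sub(fix_block, state_text)
```

To use it, one would read client_state.xml into a string, pass it through `raise_verylong_limit`, and write the result back before restarting BOINC. Entries for other tasks and for the other output files of a VERYLONG task are returned untouched.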
It sounds like the batch is bad. And they possibly will be. But in the meantime, we can do something to help. Look at the opening post in this thread: these are an urgent research challenge, "whose results we need as soon as possible (we are in a hurry)". That was over 24 hours ago, and many tasks will be approaching completion. We can save them, and get good data back today or tomorrow. If we have to go through the batch recall process (with the weekend already started in Barcelona), it'll probably be Tuesday before any results are returned. Why wait? | |
ID: 39637 | Rating: 0 | rate: / Reply Quote | |
While your instructions are very well laid out, manual intervention is something that a lot of people probably won't be doing, in my estimation. | |
ID: 39638 | Rating: 0 | rate: / Reply Quote | |
4) Find the file 'client_state.xml' in your BOINC Data folder. Under Windows, this is likely - if you accepted the default installation setting - to be C:\Programdata\BOINC: under Linux, it might be /var/lib/boinc As the ProgramData folder is hidden, it's easier to make a shortcut to notepad, which immediately opens the client_state.xml for editing: 1. right click on an empty area of your desktop 2. select new -> shortcut 3. enter the following text in the input field: C:\Windows\System32\notepad.exe c:\ProgramData\BOINC\client_state.xml 4. click next 5. enter a self-explanatory name for this shortcut: edit client_state.xml | |
ID: 39639 | Rating: 0 | rate: / Reply Quote | |
Instead, it seems most appropriate, to me, for them to immediately cancel the batch and re-issue a corrected one. Why wait? :) Because I for one would resent the waste of 20 hours of perfectly salvageable crunching. Save what can be saved, and only reissue the remainder. This project tends to attract people with a pretty high level of motivation: let them demonstrate that their skill level matches their determination, before you condemn them all as incapable. But I agree - many tasks will fail, simply because not enough people will pick up news of the problem from this thread - unless the admins can post a second news item flagged to appear as a Notice? | |
ID: 39640 | Rating: 0 | rate: / Reply Quote | |
You could just open File Explorer, and then in the address bar, type "%ProgramData%" (without the double-quotes). | |
ID: 39641 | Rating: 0 | rate: / Reply Quote | |
While your instructions are very well laid out, manual intervention is something that a lot of people probably won't be doing, in my estimation. I agree. The instructions are intended for those who want to save 20+ hours of crunching. If the project aborts the batch, those pieces which are being processed won't be aborted, only those which are sitting in the queue. | |
ID: 39642 | Rating: 0 | rate: / Reply Quote | |
I will follow your instructions. | |
ID: 39643 | Rating: 0 | rate: / Reply Quote | |
I agree. Most won't even know. I last checked on mine about 7.5 hrs in and all seemed well. The next time I had a chance to check on it, it was completed. When I came to the site to see how long it actually took, I found that it of course did not upload. So 17.5 hrs wasted. Bummer :( It uploaded at 15.41 UTC so the info was too late for me | |
ID: 39644 | Rating: 0 | rate: / Reply Quote | |
Please do not panic. I'm going to discuss this issue with my superiors. I'll keep you updated. | |
ID: 39645 | Rating: 0 | rate: / Reply Quote | |
Thank you to Richard Haselgrove, Retvari Zoltan* and Jacon Klein for their explanation, pro-activity and involvement. | |
ID: 39646 | Rating: 0 | rate: / Reply Quote | |
OK, so here are the KISS ('Keep it simple, stupid') instructions. Thank you guys for bringing this info out, I'm hoping by the time I get home I will be able to make the edits in time. Like others have stated it is wasteful to use all this energy only to fail during upload. | |
ID: 39647 | Rating: 0 | rate: / Reply Quote | |
Richard, Why is it that max_nbytes only needs to be changed for the _9 file, and not the others? - TIA <file>
<name>2x10-GERARD_VERYLONG_CXCL12_confAna-0-1-RND3907_1_1</name>
<nbytes>0.000000</nbytes>
<max_nbytes>50000000.000000</max_nbytes>
<status>0</status>
<upload_url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</upload_url>
</file>
<file>
<name>2x10-GERARD_VERYLONG_CXCL12_confAna-0-1-RND3907_1_2</name>
<nbytes>0.000000</nbytes>
<max_nbytes>50000000.000000</max_nbytes>
<status>0</status>
<upload_url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</upload_url>
</file>
<file>
<name>2x10-GERARD_VERYLONG_CXCL12_confAna-0-1-RND3907_1_3</name>
<nbytes>0.000000</nbytes>
<max_nbytes>50000000.000000</max_nbytes>
<status>0</status>
<upload_url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</upload_url>
</file>
<file>
<name>2x10-GERARD_VERYLONG_CXCL12_confAna-0-1-RND3907_1_4</name>
<nbytes>0.000000</nbytes>
<max_nbytes>256000000.000000</max_nbytes>
<status>0</status>
<upload_url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</upload_url>
</file>
<file>
<name>2x10-GERARD_VERYLONG_CXCL12_confAna-0-1-RND3907_1_5</name>
<nbytes>0.000000</nbytes>
<max_nbytes>10000000.000000</max_nbytes>
<status>0</status>
<gzip_when_done/>
<upload_url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</upload_url>
</file>
<file>
<name>2x10-GERARD_VERYLONG_CXCL12_confAna-0-1-RND3907_1_6</name>
<nbytes>0.000000</nbytes>
<max_nbytes>10000000.000000</max_nbytes>
<status>0</status>
<gzip_when_done/>
<upload_url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</upload_url>
</file>
<file>
<name>2x10-GERARD_VERYLONG_CXCL12_confAna-0-1-RND3907_1_7</name>
<nbytes>0.000000</nbytes>
<max_nbytes>10000000.000000</max_nbytes>
<status>0</status>
<upload_url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</upload_url>
</file>
<file>
<name>2x10-GERARD_VERYLONG_CXCL12_confAna-0-1-RND3907_1_8</name>
<nbytes>0.000000</nbytes>
<max_nbytes>256000000.000000</max_nbytes>
<status>0</status>
<upload_url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</upload_url>
</file>
<file>
<name>2x10-GERARD_VERYLONG_CXCL12_confAna-0-1-RND3907_1_9</name>
<nbytes>0.000000</nbytes>
<max_nbytes>256000000.000000</max_nbytes>
<status>0</status>
<upload_url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</upload_url>
</file>
<file>
<name>2x10-GERARD_VERYLONG_CXCL12_confAna-0-1-RND3907_1_10</name>
<nbytes>0.000000</nbytes>
<max_nbytes>10000.000000</max_nbytes>
<status>0</status>
<upload_url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</upload_url>
</file>
<file>
<name>2x10-GERARD_VERYLONG_CXCL12_confAna-0-1-RND3907_1_11</name>
<nbytes>0.000000</nbytes>
<max_nbytes>5000000.000000</max_nbytes>
<status>0</status>
<upload_url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</upload_url>
</file> ____________ FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help | |
ID: 39648 | Rating: 0 | rate: / Reply Quote | |
Richard, Because, in my experience, only one of the multiple upload files grows proportionately to the runtime of the task. In the case of the VERYLONG tasks, it's file _9 which is the one which grows - and in this case grows too much. No harm will be done if you extend the limits on the other files as well, but in my experience doing multiple repetitive edits is when I get bored, tired - and sloppy. And I make mistakes. Client_state.xml is a very important and sensitive file, and if you break the file 'shape' - its XML structure - in even the most trivial way, you can lose a lot of work - not just from the project you're trying to tweak. I'd always advocate making the smallest, simplest change possible - hence KISS. | |
ID: 39649 | Rating: 0 | rate: / Reply Quote | |
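To see at a glance which limits a given task carries before deciding what to edit, the entries can be listed rather than changed. A read-only sketch, assuming entries shaped like the listing above; the function name is made up for illustration:

```python
import re

FILE_BLOCK = re.compile(r"<(file|file_info)>.*?</\1>", re.DOTALL)
NAME = re.compile(r"<name>([^<]*)</name>")
LIMIT = re.compile(r"<max_nbytes>([0-9.]+)</max_nbytes>")


def upload_limits(state_text: str, task_name: str) -> dict:
    """Map each output file of task_name to its <max_nbytes> limit in bytes."""
    limits = {}
    for match in FILE_BLOCK.finditer(state_text):
        block = match.group(0)
        name = NAME.search(block)
        limit = LIMIT.search(block)
        # Output files are named <task name>_<index>, as in the listing above.
        if name and limit and name.group(1).startswith(task_name + "_"):
            limits[name.group(1)] = int(float(limit.group(1)))
    return limits
```

Because it only reads the text, this can be run against a copy of client_state.xml without stopping the client.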
My GTX 980 host is uploading the first result. | |
ID: 39650 | Rating: 0 | rate: / Reply Quote | |
OK, so here are the KISS ('Keep it simple, stupid') instructions. Thanks Richard. I found 3 of the verylong WUs and executed your fix. One thing that may make this easier: I simply searched for _0_9 and in each case the first instance found was the correct one. Just make sure it's a verylong WU and not a Noelia that you're editing. All 3 of these are on 750Ti cards and completion looks to be around 50 hours. Edit: Hmm from the example above it looks like some of the WUs may have _1_9 instead of _0_9 that all of mine had: <file> <name>2x10-GERARD_VERYLONG_CXCL12_confAna-0-1-RND3907_1_9</name> <nbytes>0.000000</nbytes> <max_nbytes>256000000.000000</max_nbytes> <status>0</status> <upload_url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</upload_url> </file> | |
ID: 39651 | Rating: 0 | rate: / Reply Quote | |
The BOINC administrator just raised the upload limit to 512 Mb, please let us know if you can upload the WU now. | |
ID: 39652 | Rating: 0 | rate: / Reply Quote | |
The BOINC administrator just raised the upload limit to 512 Mb, please let us know if you can upload the WU now. It should be fine for newly created WUs, but I'm not sure whether the change will propagate to automatically-generated replacements for tasks which fail - we'll need to keep an eye on those. It certainly won't be passed to tasks which are already 'out in the field' - on volunteers' computers. They will have to be modified manually, or allowed to fail. | |
ID: 39653 | Rating: 0 | rate: / Reply Quote | |
The upload of the first result is finished. | |
ID: 39654 | Rating: 0 | rate: / Reply Quote | |
The BOINC administrator just raised the upload limit to 512 Mb, please let us know if you can upload the WU now. Even if you "update" the project ? | |
ID: 39655 | Rating: 0 | rate: / Reply Quote | |
My GTX 980 host is uploading the first result. Credit 600,000.00 Congratulations on the home run - and thanks for the confirmation that the file edit is effective. | |
ID: 39656 | Rating: 0 | rate: / Reply Quote | |
The BOINC administrator just raised the upload limit to 512 Mb, please let us know if you can upload the WU now. I've received a 1894-NOELIA_BI3_unbind-1-10-RND9593_0 just now, and the file info has the old size limit: <file_info>
<name>1894-NOELIA_BI3_unbind-1-10-RND9593_0_8</name>
<nbytes>0.000000</nbytes>
<max_nbytes>256000000.000000</max_nbytes>
<generated_locally/>
<status>0</status>
<upload_when_present/>
<url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</url>
</file_info>
<file_info>
<name>1894-NOELIA_BI3_unbind-1-10-RND9593_0_9</name>
<nbytes>0.000000</nbytes>
<max_nbytes>128000000.000000</max_nbytes>
<generated_locally/>
<status>0</status>
<upload_when_present/>
<url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</url>
</file_info> | |
ID: 39657 | Rating: 0 | rate: / Reply Quote | |
Congratulations on the home run - and thanks for the confirmation that the file edit is effective. Thank you! We had similar upload size problems before, and the solution was the same back then. | |
ID: 39658 | Rating: 0 | rate: / Reply Quote | |
Congratulations on the home run - and thanks for the confirmation that the file edit is effective. And we recently had the same thing at CPDN, which was why I checked - it had been bumped back to the top of my list of "things project administrators forget to do" when they're excited by an interesting bit of research. Which reminds me..... @ Gerard, If you find yourself having to re-generate all or part of this batch of 'verylong' tasks, could you please adjust <rsc_fpops_est> proportionately, so that our BOINC clients show a fair estimate of the task runtime from the beginning, and the task doesn't mess up DCF when it finishes? | |
ID: 39659 | Rating: 0 | rate: / Reply Quote | |
I've raised the limit in the DB for VERYLONG WUs. I'm not sure whether such changes propagate to clients at some time. | |
ID: 39660 | Rating: 0 | rate: / Reply Quote | |
My second very long workunit is uploading. | |
ID: 39661 | Rating: 0 | rate: / Reply Quote | |
I'm glad I checked this thread again. Two of my clients have one of the very long WUs. I implemented the fix as described by Richard. I'll keep an eye on it and check the status on completion. | |
ID: 39662 | Rating: 0 | rate: / Reply Quote | |
Thanks for the detailed guidance Richard and Retvari! | |
ID: 39663 | Rating: 0 | rate: / Reply Quote | |
4x6-GERARD_VERYLONG_CXCL12_confAna-0-1-RND1754_0_0 working now. It started out and continued to count down about 18.5 hours till completion, then at 48.8% finished and about 13 hours it jumped to 13.5 hours left. hah! Anyway, with just around 10 hours showing left I am changing the xml according to Richard's instructions. I only have one, since I only have the one machine I run longs on. Good thing it has the 3 780's in it. I'll keep an eye out to see if I get any more in the near future also. Right now I have 3 queued and 3 working and only 1 of these. | |
ID: 39664 | Rating: 0 | rate: / Reply Quote | |
We just launched 400 very long WU (they will take about 24h in a 780GTX) named VERYLONG_CXCL12_confAna whose results we need as soon as possible (we are in a hurry). Absolutely agree that this is an inappropriate way to handle these large work units. Another "very long" queue like Retvari says or have the server automatically figure out what computers should get them based on the installed graphics cards. ____________ | |
ID: 39665 | Rating: 0 | rate: / Reply Quote | |
Finally got one but on my GTX560ti which I aborted but nothing for my GTX970 | |
ID: 39666 | Rating: 0 | rate: / Reply Quote | |
Absolutely agree that this is an inappropriate way to handle these large work units. Another "very long" queue like Retvari says or have the server automatically figure out what computers should get them based on the installed graphics cards. As far as that goes, the Notice that went out and started this thread is clear that they had a limited number of work units that needed immediate release and ASAP completion. The fact that they are very long is secondary to the fact that they are needed ASAP. Having the priority on the ASAP means that adding a different queue for them involves either voluntary addition to that queue by the end users, maybe in response to a notice that goes out calling for them, or forcing everyone onto that queue which then ends in the exact thing you have right now, which is having them go out to the first come/first serve whether they can be completed or not by those machines. I don't think either of these is an appropriate thing for an on the spot addition of a longer task that needs to be completed ASAP. So that leaves the other option, which is having the servers determine if the machine can run it in the time needed before assigning it to that machine. I suppose that could be done, but I think if it could be currently done immediately and it was not, it was just a bad judgment call. Based on that, I would assume that their side of the system does not currently have that ability past what the user tells them you can do, via the queues you choose for your machines, i.e. Short, Long, Test, CPU, etc. I think assigning these tasks to the machines that are set to receive "normal" long work units and then sending out an official BOINC Notice to flash on the client IS the right way to have done this this time. And then, based on finances and manpower, work on adding more functionality to their back-end to determine what machines can do what tasks to fine tune what is already in place in the voluntary queues. 
It seems very clear that they were not expecting these VERYLONG work units too far in advance to actually have done this any better, making the way it was done the best way it could have been done. And now for the future, if it is to be done on occasion, not much manpower and time needs to go into it to "correct" the process, but if the VERYLONG work units are to become a regular thing for the grid, then time should be invested to add a queue or help their servers better determine the potential of machines to finish them in the times needed. All in all, people who have no better solutions, but only want to share frustrations are better off for everyone involved to state that there is an issue, what they think the issue is, and then know that someone saw the statement and will work to fix it if it needs fixing. We don't need to get emotionally involved unless we are singled out as overtly ignored. And then, there is always more projects or more official channels than the fellow user base and volunteer workers on a forum board. Not flaming, just always want to see solution makers making solutions and agitators making quiet. Life works better that way around. :-) | |
ID: 39667 | Rating: 0 | rate: / Reply Quote | |
I see that after the first VERYLONG units errored out, it has been posted that it was necessary to edit the xml before finishing the task. 24/01/2015 08:46:57 | GPUGRID | Output file 6x13-GERARD_VERYLONG_CXCL12_confAna-0-1-RND0906_0_9 for task 6x13-GERARD_VERYLONG_CXCL12_confAna-0-1-RND0906_0 exceeds size limit. 31 hours of wasted time, which I could have used for 1 abandoned and 1 suspended NOELIA, just because I responded to the notice in my BOINC Manager which asked for help to finish those VERYLONG WU's ASAP. :-( | |
ID: 39668 | Rating: 0 | rate: / Reply Quote | |
Hello. | |
ID: 39670 | Rating: 0 | rate: / Reply Quote | |
Hello. | |
ID: 39671 | Rating: 0 | rate: / Reply Quote | |
4) Find the file 'client_state.xml' in your BOINC Data folder. Under Windows, this is likely - if you accepted the default installation setting - to be C:\Programdata\BOINC: under Linux, it might be /var/lib/boinc For linux, the file is under "/var/lib/boinc-client". You should be able to edit the file in terminal using "sudo nano". | |
ID: 39672 | Rating: 0 | rate: / Reply Quote | |
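The posts above name two different Linux locations, so it may help to check which one exists on a given machine. The following is a minimal sketch, assuming only the three directories mentioned in this thread; the function name `find_client_state` is made up for illustration, and a custom install may keep its data elsewhere.

```python
from pathlib import Path

# Candidate BOINC data directories taken from the posts above.
# These are assumptions about default installs; yours may differ.
CANDIDATE_DIRS = [
    Path(r"C:\ProgramData\BOINC"),
    Path("/var/lib/boinc-client"),
    Path("/var/lib/boinc"),
]


def find_client_state(dirs=CANDIDATE_DIRS):
    """Return the first existing client_state.xml among `dirs`, or None."""
    for directory in dirs:
        candidate = Path(directory) / "client_state.xml"
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    found = find_client_state()
    print(found if found else "client_state.xml not found in the usual places")
```

Stopping the BOINC client before editing the file and keeping a backup copy is probably wise, since the client rewrites this file while it runs.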
Hi, | |
ID: 39677 | Rating: 0 | rate: / Reply Quote | |
Hi, No, not an actual BOINC intrinsic limit, just a configuration oversight in this particular model run. Read on further down this thread. | |
ID: 39678 | Rating: 0 | rate: / Reply Quote | |
Errors can happen, so no hard feelings towards the research project. | |
ID: 39679 | Rating: 0 | rate: / Reply Quote | |
mine also finished with result upload error: | |
ID: 39682 | Rating: 0 | rate: / Reply Quote | |
For those that can't be bothered to read through the topic...Your fix for the "old" WU's. | |
ID: 39683 | Rating: 0 | rate: / Reply Quote | |
http://www.gpugrid.net/forum_thread.php?id=3988&nowrap=true#39636 | |
ID: 39684 | Rating: 0 | rate: / Reply Quote | |
OK, so here are the KISS ('Keep it simple, stupid') instructions. That worked for a while, but I picked up one this morning on my GTX 660 Ti, so apparently it is not foolproof. It is now 10 hours into a 40-hour run. However, the DB fix that Toni mentioned was included in the files that were downloaded, so I have: <max_nbytes>512000000.000000</max_nbytes> We will see if it all works. Thanks for the tips though. | |
ID: 39685 | Rating: 0 | rate: / Reply Quote | |
Provided the _1_9 file has the increased <max_nbytes> 512,000,000 you should be fine (_1 in this case, because it's a resent task). | |
ID: 39686 | Rating: 0 | rate: / Reply Quote | |
Yes, it is the _1_9 file. | |
ID: 39687 | Rating: 0 | rate: / Reply Quote | |
I don't know if a GTX 660 Ti is really worth doing it on, I would think not a great proposition. I unfortunately have 4 of these running on my 750Ti cards. Didn't try to get them. The first one finished in 48.5 hours, so 400k credits with no bonuses, which is far less than simply running the normal WUs would have earned. That, combined with the chance of erroring out, makes me hope we've seen the last of these. Anyway, I'd reserve these guys for only the very fastest GPUs and limit them to those. | |
ID: 39691 | Rating: 0 | rate: / Reply Quote | |
Quite true, but we don't have much flexibility here. Once they start, we may not catch them for a few hours as we both know. Then we have to decide whether to keep going or not. With the lack of any guidance, it is anyone's guess. I am sure they will think of a better way to do it next time. | |
ID: 39693 | Rating: 0 | rate: / Reply Quote | |
Just got another one, but this time I could shunt it off to my one 670 card (thanks flashhawk!). We'll see how that goes. | |
ID: 39694 | Rating: 0 | rate: / Reply Quote | |
I've received another one of these, and I confirm that the upload size has been doubled, so this problem is fixed by now.
<file_info>
<name>8x3-GERARD_VERYLONG_CXCL12_confAna-0-1-RND4073_2_8</name>
<nbytes>0.000000</nbytes>
<max_nbytes>256000000.000000</max_nbytes>
<generated_locally/>
<status>0</status>
<upload_when_present/>
<url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</url>
</file_info>
<file_info>
<name>8x3-GERARD_VERYLONG_CXCL12_confAna-0-1-RND4073_2_9</name>
<nbytes>0.000000</nbytes>
<max_nbytes>512000000.000000</max_nbytes>
<generated_locally/>
<status>0</status>
<upload_when_present/>
<url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</url>
</file_info>
However, it is not clear to me why the size of the _9 file is bigger than the size of the _8 file, while in reality it's the opposite. | |
ID: 39695 | Rating: 0 | rate: / Reply Quote | |
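For anyone who would rather check the limits with a script than by eye, here is a minimal sketch, assuming the `<file_info>` entries have been copied out of client_state.xml into a string. The fragment below is a trimmed copy of the one posted above; the helper names `upload_limits` and `nine_file_ok` are hypothetical, and whether the complete client_state.xml parses with a strict XML parser is not something this thread confirms.

```python
import xml.etree.ElementTree as ET

# Limit carried by the fixed _9 upload file in the posts above.
REQUIRED_MAX_NBYTES = 512_000_000

# Trimmed copy of the fragment quoted above (only name and max_nbytes kept).
FRAGMENT = """
<file_info>
    <name>8x3-GERARD_VERYLONG_CXCL12_confAna-0-1-RND4073_2_8</name>
    <max_nbytes>256000000.000000</max_nbytes>
</file_info>
<file_info>
    <name>8x3-GERARD_VERYLONG_CXCL12_confAna-0-1-RND4073_2_9</name>
    <max_nbytes>512000000.000000</max_nbytes>
</file_info>
"""


def upload_limits(fragment):
    """Map each file name in the fragment to its max_nbytes value."""
    # The fragment has several top-level elements, so wrap it in a dummy root.
    root = ET.fromstring(f"<root>{fragment}</root>")
    return {
        info.findtext("name"): float(info.findtext("max_nbytes"))
        for info in root.iter("file_info")
    }


def nine_file_ok(limits, required=REQUIRED_MAX_NBYTES):
    """True if there is at least one *_9 file and all of them allow `required` bytes."""
    nine = {name: value for name, value in limits.items() if name.endswith("_9")}
    return bool(nine) and all(value >= required for value in nine.values())


if __name__ == "__main__":
    for name, limit in upload_limits(FRAGMENT).items():
        print(f"{name}: {limit:.0f}")
    print("_9 limit OK:", nine_file_ok(upload_limits(FRAGMENT)))
```

The check only looks at the `_9` file because that is the one named in the "exceeds size limit" message earlier in the thread.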
I got 2 of the very long WUs and computed 60 hours total. After reaching 99% each WU aborted with a computing error. | |
ID: 39703 | Rating: 0 | rate: / Reply Quote | |
I got 2 of the very long WUs and computed 60 hours total. After reaching 99% each WU aborted with a computing error. That is unfortunate. Please post the error message. | |
ID: 39707 | Rating: 0 | rate: / Reply Quote | |
I got 2 of the very long WUs and computed 60 hours total. After reaching 99% each WU aborted with a computing error. He doesn't have to. You can click on his name, then click View Computers, then see the 2 failed tasks. They failed because of the "upload file size too big" error that was outlined in this thread. Tough break, it happens. Try not to think of it in terms of "lost credits". Think of it instead in terms of "I tried to help humanity, but it didn't work out. I hope it does next time." Regards, Jacob Klein | |
ID: 39711 | Rating: 0 | rate: / Reply Quote | |
I have just noticed that my 750 Ti got one of these. | |
ID: 39716 | Rating: 0 | rate: / Reply Quote | |
I have just noticed that my 750 Ti got one of these. Check that it doesn't suffer from the "upload file size too big" problem discussed in this thread (it shouldn't - you got a resend issued well after remedial action was taken). Otherwise all your hard work won't help with the urgent research project. | |
ID: 39717 | Rating: 0 | rate: / Reply Quote | |
I have just noticed that my 750 Ti got one of these. Yup, I checked. It was a resend, so the max_nbytes was properly adjusted. | |
ID: 39719 | Rating: 0 | rate: / Reply Quote | |
I'd be fascinated to hear how well we did with the original 400-WU research challenge over the weekend, after that slightly shaky start. | |
ID: 39721 | Rating: 0 | rate: / Reply Quote | |
Before anything, we would like to apologize for the big mess of this last weekend. It obviously was a peculiar situation and as such, our settings proved not to be correctly prepared for it. I hope we have learned the lesson. | |
ID: 39722 | Rating: 0 | rate: / Reply Quote | |
pretty, early in the morning, first and successful :-) | |
ID: 39723 | Rating: 0 | rate: / Reply Quote | |
Hello. | |
ID: 39724 | Rating: 0 | rate: / Reply Quote | |
confirmed new very longs are fixed. | |
ID: 39740 | Rating: 0 | rate: / Reply Quote | |
So I get home from a work trip and see a very long running on my GTX 750. The task is at 49 hours and 77%. Despite missing the bonus (we should all get the bonus if we are putting in the time), I want it to finish. Looks like 15 hours to go. I noticed the 750 was stuck at 400 MHz, so I rebooted twice and it went back up to 1250 MHz. Not sure why or how long it ran at 400, but it is fixed now. | |
ID: 39767 | Rating: 0 | rate: / Reply Quote | |
My question is: do I need to edit the text files for this to upload or will it go on its own now? I'd for sure check it according to the instructions above. I fixed 3 of them and they all finished fine. Without the fix they would have failed. Two more were already fixed, so did 5 in all. Don't care to see any more of these mammoths: aborted the last 3 received so they could run on faster cards. Your 750 time sounds about right as my 750Ti cards completed 4 of them, finishing in 48.5-50.5 hours. My lone 670 ran faster, about 38.5 hours. | |
ID: 39771 | Rating: 0 | rate: / Reply Quote | |
I don't know about you, but I am very impressed with the 750. I can overclock the proc by 50 and the memory by 1 GHz with no problem. Best card for $80 ever. | |
ID: 39772 | Rating: 0 | rate: / Reply Quote | |
This is what I found in my config file: | |
ID: 39773 | Rating: 0 | rate: / Reply Quote | |
Does this look right? It has 512 instead of 128. Yes, that one should run through to the end and report normally, everything else being equal. No action needed on your part. | |
ID: 39779 | Rating: 0 | rate: / Reply Quote | |
Before anything, we would like to apologize for the big mess of this last weekend. It obviously was a peculiar situation and as such, our settings proved not to be correctly prepared for it. I hope we have learned the lesson. Thanks Gerard for the heads-up and these details about this research :) Really appreciated. ____________ [CSF] Thomas H.V. Dupont Founder of the team CRUNCHERS SANS FRONTIERES 2.0 www.crunchersansfrontieres | |
ID: 39789 | Rating: 0 | rate: / Reply Quote | |
It looks like I just got a brand new one 3 hours ago and I am crunching it now with the highest prio. 10.901% at 3:00:00 on my device 2 of 0,1,2. | |
ID: 39792 | Rating: 0 | rate: / Reply Quote | |
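Progress figures like "10.901% at 3:00:00" can be turned into a rough run-time estimate. This is a minimal sketch, assuming progress is linear over the whole task (which the thread does not guarantee, for example if the card changes clock speed part-way); both function names are made up for illustration.

```python
def estimate_total_hours(elapsed_hours, fraction_done):
    """Estimate total run time, assuming progress is linear in time."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in the range (0, 1]")
    return elapsed_hours / fraction_done


def estimate_remaining_hours(elapsed_hours, fraction_done):
    """Estimate the hours still to go under the same linear assumption."""
    return estimate_total_hours(elapsed_hours, fraction_done) - elapsed_hours


if __name__ == "__main__":
    # Figures from the post above: 10.901% done after 3 hours.
    total = estimate_total_hours(3.0, 0.10901)
    print(f"estimated total: {total:.1f} h")
    print(f"estimated remaining: {estimate_remaining_hours(3.0, 0.10901):.1f} h")
```

With the figures from the post this comes out at roughly 27.5 hours in total.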
GTX570 - 53 h 44m 07s ufff done :) | |
ID: 39797 | Rating: 0 | rate: / Reply Quote | |
ERRRRRRR. 217,000 secs of compute for nothing. It says error while computing. That was a waste. Yes, I am frustrated. I know I shouldn't be but 60 hours of compute down the drain. Don't know why. Any ideas? | |
ID: 39798 | Rating: 0 | rate: / Reply Quote | |
ERRRRRRR. 217,000 secs of compute for nothing. It says error while computing. That was a waste. Yes, I am frustrated. I know I shouldn't be but 60 hours of compute down the drain. Don't know why. Any ideas? This task's log contains a lot of error messages:
# The simulation has become unstable. Terminating to avoid lock-up (1)
# Attempting restart (step 36378000)
...
SWAN : FATAL : Cuda driver error 715 in file 'swanlibnv2.cpp' in line 1965.
# SWAN swan_assert 0
...
# The simulation has become unstable. Terminating to avoid lock-up (1)
# Attempting restart (step 37079000)
...
# The simulation has become unstable. Terminating to avoid lock-up (1)
# Attempting restart (step 37133000)
...
# The simulation has become unstable. Terminating to avoid lock-up (1)
# Attempting restart (step 37963000)
...
# The simulation has become unstable. Terminating to avoid lock-up (1)
# Attempting restart (step 41979000)
...
# The simulation has become unstable. Terminating to avoid lock-up (1)
# Attempting restart (step 42124000)
...
# The simulation has become unstable. Terminating to avoid lock-up (1)
# Attempting restart (step 42709000)
...
# The simulation has become unstable. Terminating to avoid lock-up (1)
# Attempting restart (step 42718000)
...
# The simulation has become unstable. Terminating to avoid lock-up (1)
# Attempting restart (step 43071000)
...
# The simulation has become unstable. Terminating to avoid lock-up (1)
# Attempting restart (step 44636000)
...
# The simulation has become unstable. Terminating to avoid lock-up (1)
# Attempting restart (step 44846000)
...
# The simulation has become unstable. Terminating to avoid lock-up (1)
# Attempting restart (step 45046000)
...
# The simulation has become unstable. Terminating to avoid lock-up (1)
# Attempting restart (step 48215000)
...
# The simulation has become unstable. Terminating to avoid lock-up (1)
Probably it's overclocked, and it can't take that much overclocking. Its temperature doesn't go over 73°C, so it's not overheating. You should either decrease the GPU clock by 10, 20 or 50 MHz, or increase the GPU voltage by 25 mV (increase its power target). But I don't recommend the latter. | |
ID: 39802 | Rating: 0 | rate: / Reply Quote | |
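Counting the restarts in a task log gives a quick impression of how unstable a run was. A minimal sketch, assuming the log text uses exactly the "Attempting restart (step N)" wording quoted above; `restart_steps` is a hypothetical helper, and the sample is a shortened copy of the quoted log.

```python
import re

# Pattern for the restart lines exactly as they appear in the quoted log.
RESTART_RE = re.compile(r"# Attempting restart \(step (\d+)\)")

# Shortened copy of the log quoted above.
SAMPLE_LOG = """\
# The simulation has become unstable. Terminating to avoid lock-up (1)
# Attempting restart (step 36378000)
SWAN : FATAL : Cuda driver error 715 in file 'swanlibnv2.cpp' in line 1965.
# SWAN swan_assert 0
# The simulation has become unstable. Terminating to avoid lock-up (1)
# Attempting restart (step 37079000)
"""


def restart_steps(log_text):
    """Return the simulation steps at which a restart was attempted, in log order."""
    return [int(match.group(1)) for match in RESTART_RE.finditer(log_text)]


if __name__ == "__main__":
    steps = restart_steps(SAMPLE_LOG)
    print(f"{len(steps)} restarts, first at step {steps[0]}, last at step {steps[-1]}")
```

A task that restarts many times is a hint to back off the overclock, as suggested in the post above.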
Good to know. Thanks for looking into it. Guess the overclock is too much. | |
ID: 39804 | Rating: 0 | rate: / Reply Quote | |
A good test for overclocking stability is to see if Heaven 4.0 can run for 5 solid hours without any freezing and without any TDRs logged in C:\Windows\LiveKernelReports\WATCHDOG. | |
ID: 39805 | Rating: 0 | rate: / Reply Quote | |
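Checking that folder after a stress test can be scripted. A minimal sketch, assuming the WATCHDOG path quoted in the post and that any file written there during the test counts as a TDR report; `reports_since` is a made-up helper name, and reading the folder may need administrator rights.

```python
import time
from pathlib import Path

# Folder quoted in the post above; Windows only.
WATCHDOG_DIR = Path(r"C:\Windows\LiveKernelReports\WATCHDOG")


def reports_since(start_time, directory=WATCHDOG_DIR):
    """Return files in `directory` modified at or after `start_time` (epoch seconds)."""
    directory = Path(directory)
    if not directory.is_dir():
        return []
    return sorted(
        path
        for path in directory.iterdir()
        if path.is_file() and path.stat().st_mtime >= start_time
    )


if __name__ == "__main__":
    five_hours_ago = time.time() - 5 * 3600
    for report in reports_since(five_hours_ago):
        print(report)
```

An empty result after the 5-hour run would match the "no TDRs logged" condition described above.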
Re-issued task due to time-out past deadline of original release. 26.60 hours, finished with no errors. | |
ID: 39828 | Rating: 0 | rate: / Reply Quote | |
There's another longer than usual batch called GERARD_CXCL12_confAnaFX_ada2, it took 9 hours for my GTX780Ti to complete. | |
ID: 39829 | Rating: 0 | rate: / Reply Quote | |
Knocked it out in 5 hours ... EVGA GeForce 980 .. no sweat | |
ID: 39930 | Rating: 0 | rate: / Reply Quote | |
Knocked it out in 5 hours ... EVGA GeForce 980 .. no sweat This appears to be the only task you've ever done: 854-NOELIA_PNP-5-10-RND5715_0 | |
ID: 39932 | Rating: 0 | rate: / Reply Quote | |
Hi ! | |
ID: 39938 | Rating: 0 | rate: / Reply Quote | |
yes.. just started/joined yesterday late afternoon.. | |
ID: 39939 | Rating: 0 | rate: / Reply Quote | |
Welcome Tiger and great to have you as a fellow volunteer. This forum is about the CXCL12_confAna tasks from a week ago. | |
ID: 39946 | Rating: 0 | rate: / Reply Quote | |
Thanx Dale .. I'm off to the NOELIA thread .. happy crunching | |
ID: 39949 | Rating: 0 | rate: / Reply Quote | |
It seems to me that the WUs are getting longer at a faster pace than the video cards are increasing in speed. I have one suggestion for this: create the ability to run these very long WUs on more than one video card at a time. This would seem to me to be a more prudent use of resources than the multi-CPU apps. | |
ID: 39951 | Rating: 0 | rate: / Reply Quote | |
Is there a way to select to run GERARD_VERYLONG_CXCL12_confAna Tasks? | |
ID: 39999 | Rating: 0 | rate: / Reply Quote | |
Is there a way to select to run GERARD_VERYLONG_CXCL12_confAna Tasks? No. | |
ID: 40011 | Rating: 0 | rate: / Reply Quote | |
Is there a way to select to run GERARD_VERYLONG_CXCL12_confAna Tasks? Thanks for the info. | |
ID: 40013 | Rating: 0 | rate: / Reply Quote | |
Oh no, I just saw that I had two of them. Disk limit exceeded after 120k and 135k of computing ^^ Ouch. It seems I must adjust something for the future, because I use only 40 GB disks on some BOINC machines O.o | |
ID: 40024 | Rating: 0 | rate: / Reply Quote | |
Everything runs smoothly. | |
ID: 40027 | Rating: 0 | rate: / Reply Quote | |
pretty, early in the morning, first and successful :-) Why was this task cancelled? What does that mean?! A couple of weeks ago it was recognized as correctly calculated, and now it has been cleared. It has also been deleted from the performance board. | |
ID: 40064 | Rating: 0 | rate: / Reply Quote | |
What do you call good graphic cards? Mine's an Nvidia GTX 650. | |
ID: 40072 | Rating: 0 | rate: / Reply Quote | |
The number in the hundreds place represents the generation, and the number in the tens place is the relative strength compared to other cards within that generation. Here's a handy chart of their relative usefulness for you, circa February 2nd, 2015: | |
ID: 40073 | Rating: 0 | rate: / Reply Quote | |
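The naming rule from the post above can be written down as a small helper. A minimal sketch, assuming three-digit model numbers such as those mentioned in this thread; `decode_model` is a made-up name, and the rule is only the rule of thumb described above, not an official scheme.

```python
import re

# Match exactly three consecutive digits, e.g. the "650" in "GTX 650".
MODEL_RE = re.compile(r"(?<!\d)(\d{3})(?!\d)")


def decode_model(name):
    """Return (generation, tier) from the hundreds and tens digits of a card name."""
    match = MODEL_RE.search(name)
    if match is None:
        raise ValueError(f"no three-digit model number in {name!r}")
    number = int(match.group(1))
    return number // 100, (number // 10) % 10


if __name__ == "__main__":
    for card in ("GTX 650", "GTX 750 Ti", "GTX780Ti", "GeForce 980"):
        generation, tier = decode_model(card)
        print(f"{card}: generation {generation}, tier {tier}")
```

By this rule a GTX 650 is generation 6, tier 5, while the 780GTX from the original announcement is generation 7, tier 8.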
pretty, early in the morning, first and successful :-) I ask one more time: please explain the disappearance of this task and its removal from the performance board. Are you capable of an intelligent conversation? | |
ID: 40090 | Rating: 0 | rate: / Reply Quote | |
pretty, early in the morning, first and successful :-) All tasks, at all BOINC projects, are purged from the online database after a pre-determined period of time - otherwise the database would grow to an unmanageable size, and performance would slow to a crawl. The oldest valid task in my own current list is dated 28 January: I think we can take it that the 'purge interval' at this project is 10 days, although some projects use 24 hours or even less. Your 26 January task is simply too old to be retained in the online transactional database: that does not mean that it has been 'cancelled' - the science will have been moved to another place, and your credits will be retained in your totals. | |
ID: 40091 | Rating: 0 | rate: / Reply Quote | |
Yeah, I completed 2 of these verylong WUs and the one that was completed in the shortest time is no longer on the performance board either. | |
ID: 40092 | Rating: 0 | rate: / Reply Quote | |
Do not do this by himself uncomprehendingly richard..... | |
ID: 40093 | Rating: 0 | rate: / Reply Quote | |
I do not understand you, Jozef. | |
ID: 40095 | Rating: 0 | rate: / Reply Quote | |
I'm running Intel Core i7-4790 CPU @4.00GHz with NVIDIA GeForce GTX980. Would something like this be able to help you guys out? | |
ID: 40151 | Rating: 0 | rate: / Reply Quote | |
I'm running Intel Core i7-4790 CPU @4.00GHz with NVIDIA GeForce GTX980. Would something like this be able to help you guys out? Sure it would, but this batch is finished now. | |
ID: 40156 | Rating: 0 | rate: / Reply Quote | |