Message boards : Number crunching : Server is out of disk space

Bedrich Hajek
Message 61548 - Posted: 20 Jun 2024 | 21:56:05 UTC

Thu 20 Jun 2024 05:49:06 PM EDT | GPUGRID | [error] Error reported by file upload server: Server is out of disk space

mrchips
Message 61550 - Posted: 21 Jun 2024 | 0:23:57 UTC

still out of space

Bedrich Hajek
Message 61554 - Posted: 21 Jun 2024 | 22:41:23 UTC

It's full again.

Fri 21 Jun 2024 06:40:41 PM EDT | GPUGRID | [error] Error reported by file upload server: Server is out of disk space

pututu
Message 61719 - Posted: 25 Aug 2024 | 17:00:25 UTC
Last modified: 25 Aug 2024 | 17:25:46 UTC

Currently, I'm seeing a disk-full error message.

Edit1: seems to be clearing slowly now.

Edit2: uploading seems to be intermittent...

Bedrich Hajek
Message 61720 - Posted: 25 Aug 2024 | 18:43:54 UTC

Confirmed. Server is out of disk space.

pututu
Message 61722 - Posted: 25 Aug 2024 | 21:07:26 UTC
Last modified: 25 Aug 2024 | 21:09:12 UTC

It seems to be intermittent, meaning you need to retry the network transfers occasionally to upload the completed tasks. Best to do that with a script.

Fingers crossed that the server disk hasn't completely crashed yet as of this posting...
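
A retry script of the kind mentioned above could look roughly like the sketch below. It assumes `boinccmd` is on the PATH and is allowed to talk to the local client (on some setups it has to be run from the BOINC data directory or given a password). `--network_available` is the documented boinccmd option for retrying deferred network communication; the function name, the attempt count, and the 10-minute interval are illustrative choices, not anything the project recommends.

```python
import subprocess
import time


def retry_transfers(run=subprocess.run, attempts=3, interval_s=600, sleep=time.sleep):
    """Ask the local BOINC client to retry deferred network activity.

    Returns how many of the boinccmd calls exited with status 0.
    `run` and `sleep` are parameters only so the loop can be tested
    without a BOINC client or real waiting.
    """
    succeeded = 0
    for i in range(attempts):
        # boinccmd --network_available: retry deferred network communication.
        result = run(["boinccmd", "--network_available"], check=False)
        if result.returncode == 0:
            succeeded += 1
        # Wait between attempts, but not after the last one.
        if i < attempts - 1:
            sleep(interval_s)
    return succeeded
```

Calling `retry_transfers()` with the defaults issues three retries ten minutes apart. Keeping the interval long seems sensible here, since hammering a server that is already out of disk space will not help it.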

Freewill
Message 61728 - Posted: 26 Aug 2024 | 4:55:44 UTC

Still getting this error as of 26 Aug, almost 7:00 AM Madrid time.

Richard Haselgrove
Message 61731 - Posted: 26 Aug 2024 | 8:00:37 UTC

26/08/2024 08:58:40 | GPUGRID | [error] Error reported by file upload server: Maintenance underway: file uploads are temporarily disabled.

Steve
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Message 61734 - Posted: 26 Aug 2024 | 9:13:22 UTC - in response to Message 61731.

Thank you for reporting. It is now back and with more space.

Bedrich Hajek
Message 61779 - Posted: 8 Sep 2024 | 21:07:19 UTC

Server is out of disk space again.


Sun 08 Sep 2024 05:05:46 PM EDT | GPUGRID | Started upload of p38_A31_A28_r0_4-QUICO_ATM_AF_04_Benchmark-16-20-RND3498_1_0
Sun 08 Sep 2024 05:05:47 PM EDT | GPUGRID | [error] Error reported by file upload server: Server is out of disk space
Sun 08 Sep 2024 05:05:47 PM EDT | GPUGRID | Temporarily failed upload of p38_A31_A28_r0_4-QUICO_ATM_AF_04_Benchmark-16-20-RND3498_1_0: transient upload error
Sun 08 Sep 2024 05:05:47 PM EDT | GPUGRID | Backing off 00:24:33 on upload of p38_A31_A28_r0_4-QUICO_ATM_AF_04_Benchmark-16-20-RND3498_1_0



Stacie
Message 61780 - Posted: 9 Sep 2024 | 0:27:54 UTC - in response to Message 61779.

Is this why finished workunits are failing to upload? They are starting to pile up in my queue.

Stacie
Message 61781 - Posted: 9 Sep 2024 | 1:51:46 UTC

...do we lose bonus turnaround credit because the upload server is locked up?

Erich56
Message 61782 - Posted: 9 Sep 2024 | 2:31:47 UTC - in response to Message 61781.

Stacie wrote:

...do we lose bonus turnaround credit because the upload server is locked up?

Yes, at least in the past this has happened. I doubt it has changed.

Erich56
Message 61784 - Posted: 9 Sep 2024 | 7:50:29 UTC - in response to Message 61734.

On August 26 Steve wrote:

Thank you for reporting. It is now back and with more space.

Steve, once again no finished tasks have been able to upload since last night :-(

How come this happens so often? Is there no automated early warning (say, once the disk is 80% full), or some other measure to prevent this problem from recurring every few weeks? This shouldn't be that difficult to implement.
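
For illustration, an early warning of the kind suggested above can be sketched in a few lines of Python. This says nothing about how the GPUGRID server is actually monitored; the function names and the alert text are hypothetical, and a real setup would send mail or page someone instead of returning a string.

```python
import shutil


def disk_fraction_used(path="/"):
    """Return the fraction (0.0 to 1.0) of the filesystem at `path` in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def needs_alert(fraction_used, threshold=0.80):
    """True once usage reaches the warning threshold (80% by default)."""
    return fraction_used >= threshold


def check(path="/", threshold=0.80):
    """Return a warning string if `path` is at or above the threshold, else None."""
    fraction = disk_fraction_used(path)
    if needs_alert(fraction, threshold):
        # Hypothetical alert action: here we only build the message.
        return f"WARNING: {path} is {fraction:.0%} full"
    return None
```

Run from cron every few minutes against the upload directory, something like `check("/path/to/uploads")` would flag the problem well before uploads start failing (the path is a placeholder).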

WPrion
Message 61785 - Posted: 9 Sep 2024 | 10:40:25 UTC - in response to Message 61784.
Last modified: 9 Sep 2024 | 10:41:04 UTC


Erich56 wrote:

How come this happens so often? Is there no automated early warning (say, once the disk is 80% full), or some other measure to prevent this problem from recurring every few weeks? This shouldn't be that difficult to implement.


Kinda ironic. They are super anxious to get this project completed, so they award 10X the points the tasks deserve and thereby attract an army of crunchers, but then they don't keep their hardware up to speed to support the effort.
