Must be one screwed up WU

This is the place for chat on DBB Distributed Computing projects... remember team #39070 for Stanford's Folding@Home project.
KompresZor
DBB Captain
Posts: 919
Joined: Wed Jul 31, 2002 2:01 am
Location: Clearfield, Pennsylvania

Post by KompresZor » Tue Jan 25, 2005 8:58 am

I got 63 points for doing 73% of a 600 pointer :(
EDIT: replaced the code tags w/ quote
--- Opening Log file [January 22 23:19:34]


# Windows Console Edition #####################################################
###############################################################################

Folding@Home Client Version 5.02

http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: E:\Program Files\Folding@Home
Service: E:\Program Files\Folding@Home\FAH502-Console.exe
Arguments: -svcstart -advmethods -verbosity 9 -forceasm

Launched as a service.
Entered E:\Program Files\Folding@Home to do work.

Warning:
By using the -forceasm flag, you are overriding
safeguards in the program. If you did not intend to
do this, please restart the program without -forceasm.
If work units are not completing fully (and particularly
if your machine is overclocked), then please discontinue
use of the flag.

[23:19:34] - Ask before connecting: No
[23:19:34] - User name: KompresZor (Team 39070)
[23:19:34] - User ID: 295C5580429563BE
[23:19:34] - Machine ID: 1
[23:19:34]
[23:19:34] Loaded queue successfully.
[23:19:34] + Benchmarking ...
[23:19:36] The benchmark result is 3072
[23:19:36]
[23:19:36] + Processing work unit
[23:19:36] Core required: FahCore_78.exe
[23:19:36] Core found.
[23:19:36] - Autosending finished units...
[23:19:36] Trying to send all finished work units
[23:19:36] + No unsent completed units remaining.
[23:19:36] - Autosend completed
[23:19:36] Working on Unit 00 [January 22 23:19:36]
[23:19:36] + Working ...
[23:19:36] - Calling 'FahCore_78.exe -dir work/ -suffix 00 -checkpoint 15 -service -forceasm -verbose -lifeline 1336 -version 502'

[23:19:37]
[23:19:37] *------------------------------*
[23:19:37] Folding@Home Gromacs Core
[23:19:37] Version 1.68 (August 18, 2004)
[23:19:37]
[23:19:37] Preparing to commence simulation
[23:19:37] - Assembly optimizations manually forced on.
[23:19:37] - Not checking prior termination.
[23:19:51] - Expanded 2962313 -> 16166417 (decompressed 545.7 percent)
[23:19:55]
[23:19:55] Project: 1141 (Run 92, Clone 18, Gen 2)
[23:19:55]
[23:20:07] Assembly optimizations on if available.
[23:20:07] Entering M.D.
[23:20:34] (Starting from checkpoint)
[23:20:34] Protein: p1141_RIBO_FSpeptide_HEL_nospring
[23:20:34]
blah..
blah...
blah....
[21:34:20] Completed 182500 out of 250000 steps (73)
[21:49:21] Timered checkpoint triggered.
[22:02:08] Gromacs cannot continue further.
[22:02:08] Going to send back what have done.
[22:02:08] logfile size: 31995
[22:02:08] - Writing 32531 bytes of core data to disk...
[22:02:08] ... Done.
[22:02:08]
[22:02:08] Folding@home Core Shutdown: EARLY_UNIT_END
[22:02:13] CoreStatus = 72 (114)
[22:02:13] Sending work to server

[22:02:13] + Attempting to send results
[22:02:13] - Reading file work/wuresults_00.dat from core
[22:02:13] (Read 32531 bytes from disk)
[22:02:13] Connecting to http://171.65.103.156:8080/
[22:02:15] Posted data.
[22:02:15] Initial: 0000; - Uploaded at ~16 kB/s
[22:02:15] - Averaged speed for that direction ~14 kB/s
[22:02:15] + Results successfully sent
[22:02:15] Thank you for your contribution to Folding@Home.
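
For anyone who wants to pull the progress figure out of a log like the one above, here is a minimal Python sketch. It assumes the line format shown in this post ("Completed N out of M steps (P)"). The name `last_progress` is made up for illustration and is not part of the Folding@Home client.

```python
import re

# Matches client log lines such as:
# [21:34:20] Completed 182500 out of 250000 steps (73)
PROGRESS_RE = re.compile(r"Completed (\d+) out of (\d+) steps")


def last_progress(log_text):
    """Return (done, total, fraction) for the last progress line, or None."""
    matches = PROGRESS_RE.findall(log_text)
    if not matches:
        return None
    done, total = (int(x) for x in matches[-1])
    return done, total, done / total


# Sample lines copied from the log in this post.
sample = """\
[21:34:20] Completed 182500 out of 250000 steps (73)
[21:49:21] Timered checkpoint triggered.
[22:02:08] Gromacs cannot continue further.
"""

print(last_progress(sample))
```

Only the last "Completed" line is used, so the sketch reports how far the unit got before the core shut down.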
)))MuSiC(((
DBB Cadet
Posts: 9
Joined: Mon Jan 17, 2005 1:56 pm

Post by )))MuSiC((( » Tue Jan 25, 2005 12:22 pm

I also had this unit mess up on me at various times... I think I might have completed it once, though... but if I'm not mistaken, it's pretty unstable :o
STRESSTEST
DBB DemiGod
Posts: 6574
Joined: Sun Nov 21, 1999 3:01 am

Post by STRESSTEST » Tue Jan 25, 2005 12:33 pm

-verbosity 9?
Deadmeat
DBB Captain
Posts: 631
Joined: Tue Jun 12, 2001 2:01 am
Location: Davis, Ca, USA

Post by Deadmeat » Tue Jan 25, 2005 12:44 pm

Oh crapola, my second rig just started one of those, although it's a different run. Maybe I'll have better luck. Will keep ya posted.
Dev_Null
DBB Ace
Posts: 33
Joined: Sun Nov 14, 2004 9:07 pm
Location: Montgomery, AL

Post by Dev_Null » Tue Jan 25, 2005 12:51 pm

STRESSTEST wrote:-verbosity 9?
That just increases the detail of information that the client spits out into the log file. Good when you're trying to figure out a problem, but I never use it otherwise.

~Dev
KompresZor
DBB Captain
Posts: 919
Joined: Wed Jul 31, 2002 2:01 am
Location: Clearfield, Pennsylvania

Post by KompresZor » Wed Jan 26, 2005 7:36 am

Well, today it says I got 38 points :-/ I hope you're right, Jim, and they take that WU out and try to figure out what's wrong with it soon. It might not bother you guys with the super awesome rigs much, but it took this old 1.4 GHz box 74 hours and change to get that far. That's a lot of folding for 38 points :) Maybe I should switch to an abacus. :wink:
Dev Null
DBB Cadet
Posts: 4
Joined: Wed Jan 01, 2003 3:01 am
Location: Montgomery, AL USA

Post by Dev Null » Wed Jan 26, 2005 2:29 pm

That seems idiotic to me. They could have just as easily based it on the percentage completed versus the total point value. So a failure at 50% completion in a 600 point work unit would yield 300 points. .5 points per frame? That's too bad.

~Dev
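
To put numbers on the two schemes mentioned in this post, here is a rough Python sketch. The function names are invented for illustration, and the frame count is an assumption: the thread never says how many frames a unit has, so the example simply supposes 100 frames per unit (one per percent). Treat it as arithmetic only, not as a description of how Stanford actually awards credit.

```python
def proportional_credit(total_points, fraction_done):
    """Credit scaled by the share of the unit completed (the scheme suggested above)."""
    return total_points * fraction_done


def per_frame_credit(frames_done, points_per_frame=0.5):
    """Flat credit per completed frame (the '.5 point per frame' rule)."""
    return frames_done * points_per_frame


# The example from the post: a failure at 50% of a 600 point unit.
print(proportional_credit(600, 0.50))  # 300.0

# ASSUMPTION: 100 frames per unit, so 50% done means 50 frames.
print(per_frame_credit(50))  # 25.0
```

Under that assumed frame count, the gap between the two schemes is more than tenfold for the same amount of work.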
STRESSTEST
DBB DemiGod
Posts: 6574
Joined: Sun Nov 21, 1999 3:01 am

Post by STRESSTEST » Wed Jan 26, 2005 3:04 pm

Xciter wrote:I've done more research into the failure... while I can't share everything, the failure type was bad... and the given credit is based on .5 point per frame completed.
Have they at least pulled it now?
Dev_Null
DBB Ace
Posts: 33
Joined: Sun Nov 14, 2004 9:07 pm
Location: Montgomery, AL

Post by Dev_Null » Wed Jan 26, 2005 3:43 pm

True. Room for complaint is somewhat narrowed if you run the advmethods flag. As a beta tester myself, I certainly understand the annoyance of losing units due to various shortcomings in the unit, the server, the core, etc.

I don't understand what you mean by it ended in a rare way. It seems after some research that the 600 point work units are plagued by early ends, especially on Macs. Can you clarify exactly what is rare about this particular unit failure?

~Dev
Deadmeat
DBB Captain
Posts: 631
Joined: Tue Jun 12, 2001 2:01 am
Location: Davis, Ca, USA

Post by Deadmeat » Mon Jan 31, 2005 1:25 pm

Here's an update. I finished that P1141 (Run 70, Clone 91, Gen 1) with no problems. It took a while because I can only run my comps during waking hours, so it was started and stopped a few times. This was done on a 2600+ machine that was not online when it tried to send the results. I just built this machine for my GF and it shares her IP, so I have to disconnect it when she wants on. Given this machine was on/off, up/down there were more than enough chances for the project to terminate early, but it didn't.

Yee Haw, 600 more points for the team. :D
STRESSTEST
DBB DemiGod
Posts: 6574
Joined: Sun Nov 21, 1999 3:01 am

Post by STRESSTEST » Tue Feb 01, 2005 10:49 pm

Jesus Christ, I'm sick of this damn WU... I swear I've had 20 of them crash in the last 4 days... at least.
BAAL
DBB Captain
Posts: 706
Joined: Mon Sep 03, 2001 2:01 am
Location: Canada

Post by BAAL » Wed Feb 02, 2005 2:30 pm

I haven't had an 1141 crash on me yet... actually, no WUs have failed on me yet, to my knowledge.
KompresZor
DBB Captain
Posts: 919
Joined: Wed Jul 31, 2002 2:01 am
Location: Clearfield, Pennsylvania

Post by KompresZor » Wed Feb 02, 2005 8:25 pm

I've done 4 of the p1141s on this box and this is the only one that has failed. Between the two PCs I have folding, they've done 10 of the 600 point WUs and this is the only failure. ( crosses fingers ) I'm folding another p1141 right now, it's at 69%. My other box is working on a p1140 and it's at 65%.
KompresZor
DBB Captain
Posts: 919
Joined: Wed Jul 31, 2002 2:01 am
Location: Clearfield, Pennsylvania

Post by KompresZor » Fri Feb 04, 2005 9:18 am

Well, it failed at 92%, but at least this time I got the normal amount of points.
[13:01:58] Project: 1141 (Run 70, Clone 21, Gen 4)
[13:01:58]
[13:02:13] Assembly optimizations on if available.
[13:02:13] Entering M.D.
[13:02:39] (Starting from checkpoint)
[13:02:39] Protein: p1141_RIBO_FSpeptide_HEL_nospring
~~~
[06:32:49] Completed 230000 out of 250000 steps (92)
[06:47:50] Timered checkpoint triggered.
[07:01:36] - Autosending finished units...
[07:01:36] Trying to send all finished work units
[07:01:36] + No unsent completed units remaining.
[07:01:36] - Autosend completed
[07:02:52] Timered checkpoint triggered.
[07:17:52] Timered checkpoint triggered.
[07:31:06] Quit 101 - Fatal error: NaN detected: (ener[13])
[07:31:06]
[07:31:06] Simulation instability has been encountered. The run has entered a
[07:31:06] state from which no further progress can be made.
[07:31:06] If you often see other project units terminating early like this
[07:31:06] too, you may wish to check the stability of your computer (issues
[07:31:06] such as high temperature, overclocking, etc.).
[07:31:06] Going to send back what have done.
[07:31:06] logfile size: 39956
[07:31:06] - Writing 40519 bytes of core data to disk...
[07:31:06] ... Done.
[07:31:06]
[07:31:06] Folding@home Core Shutdown: EARLY_UNIT_END
[07:31:09] CoreStatus = 72 (114)
[07:31:09] Sending work to server
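
The two failed units in this thread ended differently: the first log only says "Gromacs cannot continue further", while this one names a fatal error. As a rough sketch of telling the two apart in a script, assuming only the message strings that appear in these logs (the helper name `early_end_reason` is hypothetical, not a client feature):

```python
import re

# Captures the text after "Fatal error:" on a core log line, e.g.
# [07:31:06] Quit 101 - Fatal error: NaN detected: (ener[13])
FATAL_RE = re.compile(r"Fatal error: (.+)")


def early_end_reason(log_text):
    """Return the fatal-error text, "unknown", or None.

    None      -> the log has no EARLY_UNIT_END shutdown.
    "unknown" -> EARLY_UNIT_END without a "Fatal error:" line.
    """
    if "EARLY_UNIT_END" not in log_text:
        return None
    match = FATAL_RE.search(log_text)
    return match.group(1).strip() if match else "unknown"


# Sample lines copied from the log in this post.
sample = """\
[07:31:06] Quit 101 - Fatal error: NaN detected: (ener[13])
[07:31:06] Folding@home Core Shutdown: EARLY_UNIT_END
"""

print(early_end_reason(sample))
```

Run against the first log in the thread, the same helper would fall through to "unknown", since that log has no "Fatal error:" line.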
Deadmeat
DBB Captain
Posts: 631
Joined: Tue Jun 12, 2001 2:01 am
Location: Davis, Ca, USA

Post by Deadmeat » Sun Feb 06, 2005 1:23 pm

Finished another P1141 (Run 59, Clone 5, Gen 4) on my main rig without a hitch and it started an 1134. My second rig just finished an 1134 and has now started another 1141 (Run 6, Clone 51, Gen 3). Seems since Stress started whining, all I'm getting on my three machines are 600 pointers. Thanx Bill. :D
BAAL
DBB Captain
Posts: 706
Joined: Mon Sep 03, 2001 2:01 am
Location: Canada

Post by BAAL » Sun Feb 06, 2005 2:37 pm

That's all I ever seem to get now as well. I am not complaining, although it takes forever on a Duron 700 (3rd folding box) :lol: