Thread '16e tasks frequently fail for me, with Computation Error - across multiple PCs and OSes'
    
| Author | Message | 
|---|---|
| Send message Joined: 10 May 21 Posts: 2 Credit: 0 RAC: 0 | 
 I run NFS on a number of different machines, primarily macOS and Linux. I've been seeing a high number of failing tasks on all of them, particularly 16e tasks: "Computation Error" within the first couple of minutes. I've seen a couple of other posts about this in this forum, but no real indication of what's going on or how to prevent it. Given that I'm seeing it across multiple machines and OSes, I'm fairly confident it's not environment-specific on my side. Here's an example from today: BOINC 7.16.14 on macOS 10.15.7, on a Mac mini. Updates to BOINC and macOS don't seem to have any impact on the volume or frequency of the failing tasks. In the time it's taken me to write this post, the single non-failed task has progressed to 12.5% quite happily. I do run other projects on all machines where I'm seeing this, and I don't see frequently failing tasks for any of those. For what it's worth, the other machines where I see frequent failures have been running either Ubuntu or Debian (latest versions of either, plus the latest BOINC for those OSes). I don't have NFS running on any of them currently, though, because of the high volume of failed tasks. What can I do to investigate and resolve these problems? Edit: the img tag of my screenshot seems to be failing. Here's the URL: https://imgur.com/a/43Pi8su | 
| Send message Joined: 26 Jun 08 Posts: 651 Credit: 512,825,862 RAC: 15,748               | 
 Those were run on Mac and ended with a segmentation fault. I typically see slightly higher error rates on the Mac app, especially now that we are running a more difficult quartic, but the project-wide Mac error rates are OK. I'm not sure why that host is unhappy. Perhaps you could try one of the other apps with that host to see if this particular number is causing the issue? | 
| Send message Joined: 10 May 21 Posts: 2 Credit: 0 RAC: 0 | 
 OK, thanks very much. I'm running under an Account Manager so I can't directly choose which NFS apps I run, but I'll check whether the person who runs the pool can adjust it for us. | 
| Send message Joined: 6 May 16 Posts: 5 Credit: 14,673,762 RAC: 21,397               | 
 Hi Greg. I think I'm encountering something similar. This is on my latest Mac and OS (MacBook Air 2017 running Catalina), which hasn't done NFS work before. But another Mac running Macintosh OS 10.14.6 build 18G87 might have the same issue, if I can judge from the single task it has sent back so far. I think in the past my Macs had no such problems with NFS, as they have credit. Two workunits might make it through, as they are running further, while another one tells me it's waiting for memory. I suppose a RAM limitation might be the key to this problem. Please have a look at the stderr: <core_client_version>7.14.4</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)</message>
<stderr_txt>
boinc initialized
work files resolved, now working
-> lasieve5f_1.11_x86_64-apple-darwin
-> -r
-> -f
-> 2102270000
-> -c
-> 2000
-> -R
-> -o
-> ../../projects/escatter11.fullerton.edu_nfs/S2L2162_2102270_0_r21334165_0
-> ../../projects/escatter11.fullerton.edu_nfs/S2L2162.poly
SIGSEGV: segmentation violation
Crashed executable name: lasieve5f_1.11_x86_64-apple-darwin
built using BOINC library version 7.5.0
Machine type Intel x86-64h Haswell (64-bit executable)
System version: Macintosh OS 10.15.7 build 19H1030
Mon May 17 07:11:05 2021
atos cannot load symbols for the file lasieve5f_1.11_x86_64-apple-darwin for architecture x86_64.
0   lasieve5f_1.11_x86_64-apple-darwin  0x000000010007d21c  
SIGPIPE: write on a pipe with no reader
1   lasieve5f_1.11_x86_64-apple-darwin  0x0000000100071ad7  
SIGPIPE: write on a pipe with no reader
2   libsystem_platform.dylib            0x00007fff6954a5fd  
Thread 0 crashed with X86 Thread State (64-bit):
  rax: 0x0100001f  rbx: 0x00000003  rcx: 0x7ffeefbfc228  rdx: 0x00000028
  rdi: 0x7ffeefbfc298  rsi: 0x00000003  rbp: 0x7ffeefbfc280  rsp: 0x7ffeefbfc228
   r8: 0x00000607   r9: 0x00000000  r10: 0x000009c8  r11: 0x00000206
  r12: 0x00000003  r13: 0x000009c8  r14: 0x7ffeefbfc298  r15: 0x00000028
  rip: 0x7fff69492dfa  rfl: 0x00000206
Binary Images Description:
       0x100000000 -        0x10009bfff /Library/Application Support/BOINC Data/slots/5/../../projects/escatter11.fullerton.edu_nfs/lasieve5f_1.11_x86_64-apple-darwin
    0x7fff66336000 -     0x7fff66337fff /usr/lib/libSystem.B.dylib
    0x7fff6661c000 -     0x7fff6666efff /usr/lib/libc++.1.dylib
    0x7fff6666f000 -     0x7fff66684fff /usr/lib/libc++abi.dylib
    0x7fff68196000 -     0x7fff681c9fff /usr/lib/libobjc.A.dylib
    0x7fff6861f000 -     0x7fff68669fff /usr/lib/libstdc++.6.dylib
    0x7fff69133000 -     0x7fff69138fff /usr/lib/system/libcache.dylib
    0x7fff69139000 -     0x7fff69144fff /usr/lib/system/libcommonCrypto.dylib
    0x7fff69145000 -     0x7fff6914cfff /usr/lib/system/libcompiler_rt.dylib
    0x7fff6914d000 -     0x7fff69156fff /usr/lib/system/libcopyfile.dylib
    0x7fff69157000 -     0x7fff691e9fff /usr/lib/system/libcorecrypto.dylib
    0x7fff692f6000 -     0x7fff69336fff /usr/lib/system/libdispatch.dylib
    0x7fff69337000 -     0x7fff6936dfff /usr/lib/system/libdyld.dylib
    0x7fff6936e000 -     0x7fff6936efff /usr/lib/system/libkeymgr.dylib
    0x7fff6937c000 -     0x7fff6937cfff /usr/lib/system/liblaunch.dylib
    0x7fff6937d000 -     0x7fff69382fff /usr/lib/system/libmacho.dylib
    0x7fff69383000 -     0x7fff69385fff /usr/lib/system/libquarantine.dylib
    0x7fff69386000 -     0x7fff69387fff /usr/lib/system/libremovefile.dylib
    0x7fff69388000 -     0x7fff6939ffff /usr/lib/system/libsystem_asl.dylib
    0x7fff693a0000 -     0x7fff693a0fff /usr/lib/system/libsystem_blocks.dylib
    0x7fff693a1000 -     0x7fff69428fff /usr/lib/system/libsystem_c.dylib
    0x7fff69429000 -     0x7fff6942cfff /usr/lib/system/libsystem_configuration.dylib
    0x7fff6942d000 -     0x7fff69430fff /usr/lib/system/libsystem_coreservices.dylib
    0x7fff69431000 -     0x7fff69439fff /usr/lib/system/libsystem_darwin.dylib
    0x7fff6943a000 -     0x7fff69441fff /usr/lib/system/libsystem_dnssd.dylib
    0x7fff69442000 -     0x7fff69443fff /usr/lib/system/libsystem_featureflags.dylib
    0x7fff69444000 -     0x7fff69491fff /usr/lib/system/libsystem_info.dylib
    0x7fff69492000 -     0x7fff694befff /usr/lib/system/libsystem_kernel.dylib
    0x7fff694bf000 -     0x7fff69506fff /usr/lib/system/libsystem_m.dylib
    0x7fff69507000 -     0x7fff6952efff /usr/lib/system/libsystem_malloc.dylib
    0x7fff6952f000 -     0x7fff6953cfff /usr/lib/system/libsystem_networkextension.dylib
    0x7fff6953d000 -     0x7fff69546fff /usr/lib/system/libsystem_notify.dylib
    0x7fff69547000 -     0x7fff6954ffff /usr/lib/system/libsystem_platform.dylib
    0x7fff69550000 -     0x7fff6955afff /usr/lib/system/libsystem_pthread.dylib
    0x7fff6955b000 -     0x7fff6955ffff /usr/lib/system/libsystem_sandbox.dylib
    0x7fff69560000 -     0x7fff69562fff /usr/lib/system/libsystem_secinit.dylib
    0x7fff69563000 -     0x7fff6956afff /usr/lib/system/libsystem_symptoms.dylib
    0x7fff6956b000 -     0x7fff69581fff /usr/lib/system/libsystem_trace.dylib
    0x7fff69583000 -     0x7fff69588fff /usr/lib/system/libunwind.dylib
    0x7fff69589000 -     0x7fff695befff /usr/lib/system/libxpc.dylib
Exiting...
</stderr_txt>
]]>For now I only have hands-on access to the MacBook as I'm not at home, but I can have a look later on. Thanks for your time! - - - - - - - - - - Greetings, Jens | 
| Send message Joined: 6 May 16 Posts: 5 Credit: 14,673,762 RAC: 21,397               | 
 Short addition: This doesn't look right to me: Valid (17) · Invalid (0) · Errors (85). - - - - - - - - - - Greetings, Jens |
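
For anyone comparing numbers from their own hosts, here is a minimal Python sketch of the two bits of arithmetic in the posts above. It assumes that the three values in `process exited with code 193 (0xc1, -63)` are the same 8-bit exit status shown as decimal, hex, and signed (the numbers are consistent with that reading, but it is an assumption, not something the project has confirmed here). It also computes the share of errored tasks from the valid/invalid/error counts. The function names are made up for illustration.

```python
def describe_exit_code(code: int) -> dict:
    """Return an 8-bit exit status as decimal, hex, and signed 8-bit.

    Assumption: the BOINC message prints one value in three notations.
    """
    if not 0 <= code <= 255:
        raise ValueError("expected an 8-bit exit status (0-255)")
    # Two's-complement reading of an 8-bit value: 128..255 map to -128..-1.
    signed = code - 256 if code >= 128 else code
    return {"decimal": code, "hex": hex(code), "signed_8bit": signed}


def error_rate(valid: int, invalid: int, errors: int) -> float:
    """Fraction of reported tasks that ended in an error."""
    total = valid + invalid + errors
    if total == 0:
        raise ValueError("no tasks reported")
    return errors / total


if __name__ == "__main__":
    # Exit code taken from the stderr output quoted in the thread.
    print(describe_exit_code(193))
    # Task counts taken from the last post: 17 valid, 0 invalid, 85 errors.
    print(f"{error_rate(17, 0, 85):.1%}")
```

With the counts from the last post, 85 of 102 tasks errored, which is roughly 83%.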