[Aces-support] problem last night on ao-cluster

Jean-Michel Campin jmc at ocean.mit.edu
Thu Nov 8 14:03:51 EST 2007


Hi,

I was running an MPI job last night on:
a54-1727-039
a54-1727-045
and some time between 1 am and 2 am it started getting this error:

Created /home/jmc/test_ACES/gcm_tests/tmp_gnu/MITgcm/verification/global_with_exf/run/PI12494
poll: protocol failure in circuit setup
p0_12574:  p4_error: Child process exited while making connection to remote process on a54-1727-039: 0
p0_12574: (2.009951) net_send: could not write to fd=4, errno = 32

Apparently, the problem was still there after 5:30 am.
Do you know if this has been fixed?

Thanks,
Jean-Michel


