[Aces-support] Re: a54-1727-038

aces-admin at techsquare.com
Fri Apr 4 08:01:05 EDT 2008


hello jmc-

the node had a (large) number of stale
jobs on it, so i have power-cycled it.

[greg]


> Date: Wed, 2 Apr 2008 17:03:34 -0400
> From: Jean-Michel Campin <jmc at ocean.mit.edu>
> Cc: aces-admin at techsquare.com, jmc at mit.edu
> Mime-Version: 1.0
> 
> Hi, 
> 
> it's a 1-node (on a54-1727-038), 2-processor MPI job that fails with this error:
> 
> poll: protocol failure in circuit setup
> p0_30264:  p4_error: Child process exited while making connection to remote process on a54-1727-038: 0
> /usr/local/pkg/mpich/mpich-intel/bin/mpirun: line 1: 30264 Broken pipe             /home/jmc/gcm_ifort_mpi/verification/exp2/run/./mitgcmuv -p4pg /home/jmc/gcm_ifort_mpi/verification/exp2/run/PI30174 -p4wd /home/jmc/gcm_ifort_mpi/verification/exp2/run
> 
> Jean-Michel
> 


