[Aces-support] walltime question

Huajian Yao hjyao at MIT.EDU
Tue Dec 2 13:35:56 EST 2008


Greg,

A 6-hour walltime for the "one" queue may work for my jobs. Thanks a lot for your
help.

Cheers,

Huajian

Quoting aces-admin at techsquare.com:

> hello huajian-
>
> what you suggest will not work, as
> the max_walltime for the 'one' queue
> is less than 12h.
>
> secondly, the default walltime for the
> 'one' queue is 6h. if you point me
> to the url where there is stale data
> i will fix that up.
>
> thirdly, if you are hoping to game the
> queue-system, then you will have to either
> modify your job or become one of the PIs
> who set policy ;)
>
> ;; job-modification
>
> have your job checkpoint every hour or
> two and set them free (with appropriate
> pickup files) in the 'one' queue.
>
> ;; job-submission modification
>
> request all 16 nodes available to your
> job in the 'long' queue and 'manually
> multiplex' your job that way.
>
> i hope that helps...
>
> [greg]
>
>
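The job-modification idea above (checkpoint, then continue from a pickup file in the 'one' queue) can be sketched as a job script that queues its own successor. This is only a sketch under assumptions, not the ACES configuration: the step counts, the pickup.txt file, the chunk_job.sh name, and the do_step placeholder are all hypothetical, and SUBMIT defaults to a dry run that just prints the qsub command.

```shell
#!/bin/sh
# Sketch: run a long computation as a chain of short 'one'-queue jobs.
# All names and numbers below are hypothetical placeholders.
#PBS -q one
#PBS -l nodes=1,walltime=06:00:00

TOTAL_STEPS=${TOTAL_STEPS:-12}        # whole computation, in checkpointable steps
STEPS_PER_JOB=${STEPS_PER_JOB:-5}     # chosen so one job fits inside the walltime
PICKUP=${PICKUP:-pickup.txt}          # holds the number of the last finished step
JOB_SCRIPT=${JOB_SCRIPT:-chunk_job.sh} # path of this script, for resubmission
SUBMIT=${SUBMIT:-echo qsub}           # dry run by default; set SUBMIT=qsub on the cluster

# Work in the submission directory so the pickup file is found again.
cd "${PBS_O_WORKDIR:-.}" || exit 1

do_step() {
    # Placeholder for one unit of real work that ends with saved state.
    echo "step $1 done"
}

run_chunk() {
    # Resume from the pickup file, or start from 0 if there is none.
    start=0
    if [ -f "$PICKUP" ]; then
        start=$(cat "$PICKUP")
    fi
    end=$((start + STEPS_PER_JOB))
    if [ "$end" -gt "$TOTAL_STEPS" ]; then
        end=$TOTAL_STEPS
    fi

    step=$start
    while [ "$step" -lt "$end" ]; do
        step=$((step + 1))
        do_step "$step"
        echo "$step" > "$PICKUP"      # checkpoint after every finished step
    done

    # Queue the next chunk unless the whole computation is finished.
    if [ "$step" -lt "$TOTAL_STEPS" ]; then
        $SUBMIT "$JOB_SCRIPT"
    fi
}

run_chunk
```

With these defaults a 12-step run is covered by three chained jobs of 5, 5, and 2 steps. The real step size would have to be picked so that each chunk finishes well inside the queue's walltime.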
>> Date: Tue, 02 Dec 2008 11:12:23 -0500
>> From: Huajian Yao <hjyao at MIT.EDU>
>> MIME-Version: 1.0
>> Cc:
>> Reply-To: ACES-support at mitgcm.org
>>
>> Greg,
>>
>> Thanks a lot for your quick reply!
>> Maybe you misunderstood my last email. The reason I want to switch to the
>> 'one' queue is that I can run up to 64 jobs at the same time, whereas with
>> the 'long' queue I can have a maximum of only 8 jobs. However, the maximum
>> walltime seems to be 2 hours (according to acesgrid.org), which does not
>> cover the computation time of each job. Is it possible to request a
>> walltime for the 'one' queue like this: qsub -q one -l nodes=1,walltime=12:00:00 ?
>>
>> Huajian
>>
>> Quoting aces-admin at techsquare.com:
>>
>> > hello hiyao-
>> >
>> > i believe that you may be over-thinking this.
>> >
>> > your job, for example, does not take any longer
>> > to run when executed in the 'long' queue than
>> > when executed in the 'one' queue
>> >
>> >  unless your job does not terminate and you
>> >  simply wait for the queue-system to beat it
>> >  up. in this case, you should write some termination
>> >  conditions into your job.
>> >
>> > what you may notice, however, is that jobs in
>> > the 'one' queue are scheduled more readily than
>> > jobs in the 'long' queue. this has more to do
>> > with the default resource requirements for each
>> > queue and you should be able to achieve similar
>> > scheduling results by submitting your job to the
>> > 'long' queue and requesting only 12h of walltime.
>> >
>> >  qsub -q long -l nodes=1,walltime=12:00:00
>> >
>> > did i mis-read your message entirely or is this
>> > 'on the right track' ?
>> >
>> > [greg]
>> >
>> >
>> >
>> >
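The same resource request can also be written as directives inside the job script instead of on the qsub command line. A minimal sketch, assuming a hypothetical job name and a placeholder run_model function standing in for the real single-CPU program; PBS_O_WORKDIR is the submission directory as set by PBS/Torque, with a fallback so the script also runs by hand.

```shell
#!/bin/sh
# Hypothetical job script; submit with: qsub job.sh
#PBS -q long
#PBS -l nodes=1,walltime=12:00:00
#PBS -N single_cpu_run

# Batch jobs do not start in the submission directory, so return to it
# (fall back to the current directory when run outside the queue system).
cd "${PBS_O_WORKDIR:-$PWD}" || exit 1

run_model() {
    # Placeholder for the real program invocation.
    echo "model running in $(pwd)"
}

run_model
```

Options given on the qsub command line take precedence over the directives in the script, so the walltime can still be adjusted per submission.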
>> >> Date: Tue, 02 Dec 2008 10:25:46 -0500
>> >> From: Huajian Yao <hjyao at mit.edu>
>> >> MIME-Version: 1.0
>> >> Cc:
>> >> Reply-To: ACES-support at mitgcm.org
>> >>
>> >> Hi,
>> >>
>> >> I am running (many) jobs on the cluster, each on a single node with one
>> >> CPU. The computation usually takes about 9-12 hours per job. I usually
>> >> run my code as "long" jobs, which works pretty well but takes too much
>> >> time. I have heard from someone that the walltime for "one" type jobs has
>> >> changed to 12 hours. However, on acesgrid.org the walltime for "one" is
>> >> still 2 hours. Yesterday I tried some "one" type jobs on the cluster, but
>> >> all were killed after about 6 hours. Could you please let me know what
>> >> the real walltime for "one" type jobs is? Thanks a lot! And I really hope
>> >> the walltime for "one" can be extended to 12 hours.
>> >>
>> >> Huajian
>> >> _______________________________________________
>> >> Aces-support mailing list
>> >> Aces-support at acesgrid.org
>> >> http://acesgrid.org/mailman/listinfo/aces-support
>> >>
>> >
>>
>>
>>
>




