Common Sense Suggestions when Using Fusion Group Resources
The below should all be common sense, and shouldn't apply to most of CSL
because in general we've been happy with how most users of our systems
behave.
We are primarily discussing the domori and sampaka clusters here, but these
rules apply to any machine solely owned/operated by the
Fusion Group.
We've had some incidents lately and the offenders have claimed
that they somehow didn't understand how to share other groups' computers
properly. So what follows are the terms and conditions for sharing our
resources. This isn't anything new; it's just a formal codification of the
rules.
In general CSL is a friendly environment and we try to share our group's
resources whenever possible. We try to keep our machines open to anyone
affiliated with CSL.
Sometimes it is necessary to take one of the clusters aside for private
use. This can happen for a variety of reasons. Sometimes we are doing
sysadmin work, sometimes we are doing timing runs that can't be perturbed,
sometimes we are having problems with the NBS system and are running jobs
by hand.
In the above cases, we will politely ask anyone currently running jobs to
finish up, and we will disable the batch queue. The disabled batch queue
is a hint that the cluster is not open for general use.
If the batch queue is disabled, this is *not* a hint that you should try
to run jobs in other ways (such as ssh'ing to individual nodes or
using alternate queues).
The MPI queue on the various clusters is for running massively parallel
MPI runs. It is not there for people to get around the closing of the
batch queue.
Here are some general rules:
- Only run on clusters that have the batch queue enabled unless
you've made specific alternate arrangements with the group.
- If you're going to be making a run that uses more than 25% of the
cluster, it is polite to ask first to make sure that's OK.
- Fusion group users and collaborators get highest priority on
Fusion group machines. We are always open to starting new
and exciting collaborations with other groups.
- While we do our best to maintain files and backups, we do not
guarantee this. Always back up your files. We reserve the right
to delete any files off of the /fusion filesystems at any time
with no warning (but we probably won't).
- We also reserve the right to kill any jobs on our systems at any
time for any reason (again, we probably won't do this).
- Subverting the clusters to run jobs when the batch queue is
disabled is grounds for immediate suspension of your fusion
account.
- Using an account other than the one supplied to you will result
in both accounts being banned.
- We're all computer architects here, so we have the same paper
deadlines. You do not get a higher priority on your jobs because
you are not good at planning and leave everything until the last
minute.
- Fusion sysadmins have final say on any activity taking place
on Fusion machines.
- We do not have to justify why we are using a machine
in a certain way. Believe us, there's a reason.
- If you break something on one of our machines, notify us *immediately*.
We will be much more forgiving if you tell us right away,
rather than if we waste time having to figure out what went wrong
and who did it.
- If you disagree with any of these rules, we kindly ask that you
not use our machines.
- No Whining!
Thank you, and again I apologize to those who already use our resources
courteously. We appreciate it.
---- The Fusion Sysadmin staff