We have the following computers.
Host name | Internal IP address | External IP address | Specialities | Benchmark (sec) / capacity | Installed data |
grappa.phys.chiba-u.jp (upgraded) | 10.25.200.1 | 133.82.248.47 | login server (CentOS 7.2) | 9.9 | 2014/09/26 |
drygin.local | 192.168.1.2 | N/A | (CentOS 7.2) | 10.1 | - |
dryvermouth.local | 192.168.1.3 | N/A | (CentOS 7.2) | 10.1 | - |
olive.local | 192.168.1.4 | N/A | (CentOS 7.2) | 10.1 | - |
lemonpeel.local | 192.168.1.5 | N/A | (CentOS 7.2) | 10.1 | - |
vodka.local | 192.168.1.6 | N/A | (CentOS 7.2) | 10.1 | - |
lime.local | 192.168.1.7 | N/A | (CentOS 7.2) | 11.3 | - |
rum.local (upgraded) | 192.168.1.8 | N/A | | 9.3 | - |
cognac.local | 192.168.1.9 | N/A | /disk4 (CentOS 7.2) | 17.0 | Not used now |
caolila.local (upgraded) | 192.168.1.21 | N/A | | 8.3 | 2014/09/26 |
limoncello.local | 192.168.1.13 | N/A | /disk7, /disk8, /disk9 (CentOS 7.2) | 12.4 | - |
amaretto.local | 192.168.1.14 | N/A | (CentOS 7.2) /disk3, old old grappa system (sdc) | 12.4 | - |
campari.local | 192.168.1.15 | N/A | (CentOS 6.4) | 12.8 | OLD OS |
cassis.local | 192.168.1.16 | N/A | (CentOS 7.2) | 10.5 | - |
orange.local | 192.168.1.17 | N/A | (CentOS 7.2) | 10.5 | - |
soda.local | 192.168.1.18 | N/A | (CentOS 7.2) | 8.3 | 2014/09/26 |
sugar.local | 192.168.1.19 | N/A | (CentOS 7.2) | 8.3 | 2014/09/26 |
mint.local | 192.168.1.20 | N/A | (CentOS 7.2) | 8.3 | 2014/09/26 |
sherry.local | 192.168.1.22 | N/A | (CentOS 7.2) | 8.3 | 2017/01/18 |
tequila.local | 192.168.1.200 | N/A | (CentOS 7.2) | - | - |
cachaca.local | 192.168.1.201 | N/A | (CentOS 7.2) | - | - |
bourbon.local | 192.168.1.202 | N/A | (CentOS 7.2) | - | - |
glass.phys.chiba-u.jp (upgraded) | 10.25.200.9 | N/A | /disk0 (CentOS 6.4) | 26 TB | 2014/09/26 |
fridge.phys.chiba-u.jp | 10.25.200.13 | N/A | /disk1 (RedHat3.2) | 26 TB | - |
icebox.local (upgraded) | 192.168.1.102 | N/A | /home (CentOS 6.4) | 9 TB | - |
freezer.local | 192.168.1.104 | N/A | /disk10 (CentOS 4.8) | 9.1 TB | - |
kura.local | 192.168.1.103 | N/A | /disk11 (Netgear) | - | - |
coolbox.local | 192.168.1.105 | N/A | /disk12 (CentOS 4.8) | 4.6 TB | - |
icebucket.local | 192.168.1.106 | N/A | /disk13 (CentOS 4.8) | 4.6 TB | - |
souko.local | 192.168.1.107 | N/A | /disk14 (Netgear) | - | - |
muddler.local | 192.168.1.108 | N/A | /disk15 (CentOS 4.8) | 20 TB | - |
shaker.local | 192.168.1.109 | N/A | /disk16 (CentOS 6.2) | 19 TB | - |
blackbox.local | 192.168.1.110 | N/A | /disk17 (CentOS 6.5) | 40 TB | - |
icepick.local | 192.168.1.111 | N/A | /disk18 (CentOS 6.5?) | 80 TB | 2017/11/08 |
mixer.local | 192.168.1.112 | N/A | /disk19 (CentOS 7.2) | 175 TB | - |
ebisu.local | 192.168.1.204 | N/A | /disk20 (CentOS 7.6) | 255 TB | - |
* The benchmark is a pi calculation. The value is the time in seconds needed to compute pi, so the smaller the value, the faster the machine. For the storage servers the same column gives the disk capacity instead.
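As an illustration of this kind of benchmark, a similar pi timing can be run with bc (this is only a sketch, not the program actually used for the numbers above; the digit count is arbitrary):

time echo "scale=2000; 4*a(1)" | bc -l > /dev/null    # a(1) is arctan(1) = pi/4 in bc's math library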
host name | internal IP address | external IP address | specialities |
printer-ppl.phys.chiba-u.jp | 10.25.200.3 | 133.82.248.71 | 3F, at computer room |
printer-ppl-4f.phys.chiba-u.jp | 10.25.200.19 | - | 4F, at post doc room |
host name | internal IP address | external IP address | specialities |
www.ppl.phys.chiba-u.jp | 10.25.200.2 | 133.82.248.70 | (Vine4.2) |
host name | internal IP address | external IP address | specialities |
hepburn.s.chiba-u.ac.jp | N/A | 133.82.130.190 | (Ubuntu 12.04) |
host name | internal IP address | external IP address | specialities |
pplmac.phys.chiba-u.jp | 10.25.200.7 | 133.82.248.169 | Macintosh (Tiger) |
ppldaq1.phys.chiba-u.jp | 10.25.200.4 | N/A | login as ppldaq (Turbolinux2.96) |
ppldaq2.phys.chiba-u.jp | 10.25.200.5 | N/A | login as ppldaq (Turbolinux2.96) |
ppldaq3.phys.chiba-u.jp | 10.25.200.11 | N/A | login as testdaq (RedHat 3.2) |
pplt.phys.chiba-u.jp | 10.25.200.14 | 133.82.249.27 | temporary IP (RELEASE it when you are finished using it!) |
We have the following disks. A summary of each disk is given below.
/disk0/data/photonics (154 GB) /disk0/data/spline_table /disk0/data/IceCube/DataBranch (175 GB) /disk0/data/IceCube/dominant_data (2.5 GB) /disk0/data/IceCube/JULIeT (47 GB) /disk0/data/IceCube/MCdata/9strings (8.8 GB) /disk0/data/IceCube/MCdata/IceCube++ (14 GB) /disk0/data/IceCube/MCdata/standard-candle (11 GB) /disk0/work (18 GB) /disk0/users (684 GB)
/disk1/data/IceCube/RealData/40strings /disk1/data/IceCube/RealData/79strings /disk1/data/IceCube/RealData/86strings /disk1/data/IceCube/RealData/beacon
/disk2/oldusers (21 GB) /disk2/data/IceCube/RealData/2007 (922 GB)
/disk3/data/IceCube/MCdata/22strings/V02-00-14/juliet
/misc/disk4/data/IceCube/MCdata/22strings/V02-00-14/corsika /misc/disk4/data/IceCube/MCdata/22strings/V02-00-14/juliet
/disk5/data/IceCube/RealData/2009/testdata (33 GB)
/disk7/data/IceCube/IceCube/MCdata/40strings/V02-02-13/juliet
/disk8/data/IceCube/MCdata/40strings/ /disk8/data/IceCube/MCdata/80strings/ /disk8/data/IceCube/MCdata/9strings/
/disk9/data/IceCube/RealData/22strings /disk9/data/IceCube/MCdata/80strings /disk9/data/IceCube/MCdata/DOMdata
/disk10/data/IceCube/Analysis /disk10/data/IceCube/MCdata/79strings /disk10/data/IceCube/MCdata/80strings /disk10/data/IceCube/MCdata/86strings /disk10/data/IceCube/RealData/79strings
/disk11/icecube/data/MCData/22strings/V02-00-01/corsika/SIBYLL/proton (488 GB) /disk11/icecube/data/MCData/22strings/V02-00-01/corsika/SIBYLL/iron (913 GB) /disk11/icecube/data/MCData/22strings/V02-00-01/corsika/QGSII/proton (336 GB) /disk11/icecube/data/MCData/22strings/V02-00-01/corsika/QGSII/iron (861 GB)
/disk12/data/IceCube/MCdata/79strings /disk12/data/IceCube/MCdata/80strings /disk12/data/IceCube/MCdata/dcorsika6900 /disk12/data/IceCube/RealData/40strings /disk12/data/ARA/calibration /disk12/data/ARA/opticalfiber
/disk13/data/IceCube/MCdata/80strings /disk13/data/IceCube/RealData/40strings
/disk14/data/IceCube/MCdata/79strings /disk14/data/IceCube/RealData/40strings
/disk15/data/IceCube/MCdata/79strings /disk15/data/IceCube/MCdata/80strings /disk15/data/IceCube/MCdata/86strings /disk15/data/IceCube/MCdata/SCdata /disk15/data/IceCube/MCdata/dcorsika6900 /disk15/data/IceCube/RealData/40strings /disk15/scratch
/disk16/data/ARA/opticalFiber /disk16/data/ARA/DTM* /disk16/data/IceCube/MCdata/40strings /disk16/data/IceCube/MCdata/59strings /disk16/data/IceCube/MCdata/79strings /disk16/data/IceCube/MCdata/80strings /disk16/data/IceCube/MCdata/86strings /disk16/data/IceCube/MCdata/dcorsika6900 /disk16/data/IceCube/MCdata/dcorsika6980 /disk16/data/IceCube/RealData/40strings /disk16/data/IceCube/RealData/59strings
ATM Network (s.chiba-u.ac.jp)
Gateway: 133.82.130.254, Subnet Mask: 255.255.255.0, DNS: 133.82.130.1
Giga Network (phys.chiba-u.jp)
Gateway: 10.25.254.254, Subnet Mask: 255.255.0.0, DNS: 10.245.1.95
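On the CentOS machines these values end up in the usual ifcfg files; a minimal sketch for a host on the Giga network (eth0 and the 10.25.200.x address are placeholders, not a prescription for any particular machine):

# /etc/sysconfig/network-scripts/ifcfg-eth0 (sketch)
DEVICE=eth0
BOOTPROTO=static
ONBOOT=yes
IPADDR=10.25.200.x
NETMASK=255.255.0.0
GATEWAY=10.25.254.254
DNS1=10.245.1.95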
Instructions for creating a new user can be found in:
/home/icecube/instructions/HOW_TO_MAKE_A_NEW_USER
Some instructions regarding software installations can be found in:
/home/icecube/instructions
Condor is a batch system that manages jobs.
The home page is here, and the manual is here; that is where you can get the detailed information.
You can learn more from those pages, but you don't normally need to read them just to use condor. Here are some useful commands and how to use them.
condor_submit [condor file] : to submit jobs
condor_status : to see computer status
condor_q : to see job status
condor_rm [job number] : to kill jobs
You can find more information on this page.
Let me give you a short course.
Create the following file, named test.cf.
Executable = a.out
Universe = vanilla
Error = err.$(Process)
Input = in.$(Process)
Output = out.$(Process)
Log = foo.log
condor_submit test.cf
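Note that the submit file also needs a Queue statement at the end before condor_submit creates any jobs. A minimal sketch, assuming you want three jobs:

Queue 3

With $(Process) in the file names, the three jobs then read in.0, in.1 and in.2 and write out.0, out.1 and out.2 (plus the corresponding err.* files), all logging to foo.log.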
In order to pass your environment, you can add the following line. (This is needed to run the icecube software, for example.)
GetEnv = TRUE
We now have different architectures in our cluster, namely INTEL and X86_64, though they are all Xeon machines. (limoncello is the only X86_64 machine so far.)
Though I don't completely understand why, if you compile your source code on an INTEL machine and submit it to condor, the job does not run on the X86_64 machine. To avoid this, you can edit the condor file as below:
Requirements = Arch == "INTEL" || Arch == "X86_64"
The X86_64 machine is compatible with INTEL, so there is no problem submitting a job compiled on an INTEL machine to the X86_64 machine. However, a job compiled on an X86_64 machine will not run on the INTEL machines.
The condor system running on grappa can be customized by modifying condor configuration files. These are located here:
/home/condor/condor/etc/condor_config
/home/condor/condor/etc/*.local
/home/condor/condor/etc/config.d
Occasionally the negotiator or the scheduler hangs, resulting in only a subset of the nodes accepting jobs. They can be restarted using these commands:
condor_restart -subsystem negotiator
condor_restart -subsystem schedd
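To check whether the daemons are back and the nodes are accepting jobs again, condor_status can be used (a rough sketch; the exact output depends on the condor version):

condor_status                  # list the slots currently advertised by the nodes
condor_status -schedd          # check that the scheduler daemon is visible
condor_status -negotiator      # check that the negotiator daemon is visible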
The per-machine settings are in /home/condor/condor/etc/machinename.local. For example, customized slots can be configured like this:
##--------------------------------------------------------------------
## condor_startd
##--------------------------------------------------------------------
SLOT_TYPE_1 = cpus=2, ram=8000MB
SLOT_TYPE_2 = cpus=2, ram=5800MB
NUM_SLOTS_TYPE_1 = 5
NUM_SLOTS_TYPE_2 = 1
The official documentation for this can be found here.
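With fixed slot sizes like the ones above, it also helps if jobs state their needs in the submit file; request_cpus and request_memory are standard submit commands (the values below are only examples, not settings required by our configuration):

request_cpus = 2
request_memory = 4000        # interpreted as MB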
The behaviour of the oom-killer can be adjusted in /etc/sysctl.conf. The default value for the oom-killer task choice is
vm.oom_kill_allocating_task = 0
By setting
vm.oom_kill_allocating_task = 1
the oom-killer will kill the process going over the memory limit instead of performing the task scan. (A change in this file only becomes active after rebooting the machine.) Another possibility for handling such jobs is to configure condor to put them on HOLD if they overuse their requested resources. This is handled by the following file on grappa:
/home/condor/condor/etc/config.d/60_startd_hold
This should put jobs on HOLD if they use more memory than they requested plus a small grace value, instead of just quietly killing them. The idea is that this prevents the machines from dropping out of condor and also makes bookkeeping easier for users, because it should be more obvious which jobs are being killed or held.
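To see which jobs were put on HOLD and why, and to release them once the memory request is fixed, the usual commands are (a general sketch, not specific to this configuration):

condor_q -hold                    # list held jobs together with the hold reason
condor_release [job number]       # release one job after fixing its requirements
condor_release -all               # release all of your held jobs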
We have the Intel compiler (version 12.1) on our cluster. It is xx% (still to be evaluated) faster than gcc 4.4.
It's installed at /opt/intel/bin.
To use icc (the Intel compiler), you can alias gcc to icc as shown below.
(You can put this in your .cshrc if you use tcsh. You also have to add /opt/intel/bin to your PATH.)
alias gcc 'icc -static -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE'
alias cc 'icc -static -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE'
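If you use bash instead of tcsh, the equivalent lines for your .bashrc would look like this (same flags as above):

export PATH=/opt/intel/bin:$PATH
alias gcc='icc -static -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE'
alias cc='icc -static -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE'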
1. Login
From outside ICRR, log in to one of the following:
icrlogin1.icrr.u-tokyo.ac.jp
icrlogin2.icrr.u-tokyo.ac.jp
To submit jobs and so on, log in from there to icrhome6.
2. Passwords
The passwords for icrlogin1, icrlogin2, and icrhome6 are managed separately. Changing the password in one place does not change it anywhere else. On icrhome6, use chpasswd instead of passwd.
3. Batch jobs
3-1 Basics
Moab
The software at ICRR corresponding to condor at Chiba is Moab. Usage is almost the same as PBS, but the command for submitting is msub. For details, read the manual (PDF file, HTML).
Submitting
Prepare the shell script (***.sh) you want to run on the compute nodes, then
[someone@icrhome6 somewhere]$ msub ***.sh
Checking
Unlike at Chiba, there are many jobs, so grep is convenient.
[someone@icrhome6 somewhere]$ showq | grep someone
Deleting
Use the number (JOBID) shown in the leftmost column of showq.
[someone@icrhome6 somewhere]$ canceljob JOBID
To cancel all of your own jobs, run
[someone@icrhome6 somewhere]$ canceljob ALL
3-2 Submitting IceTray jobs
Unlike with condor, you cannot submit a batch of jobs at once, so you prepare three files: a job-submission shell script, a batch script, and a python script.
Job-submission shell script
There are two points (a rough sketch is given at the end of this section):
+ Add the -V option so that the IceTray environment variables you currently have set are also reflected on the machine where the job runs.
+ If you submit as [someone@icrhome6 somewhere]$ msub ***.sh 100, the argument "100" is not recognized even if ***.sh uses argv and so on. So turn the numbers or strings you want to pass into environment variables and read them in that way.
Example: JulietwzCmcGr_nue_40str_msub.sh
Batch script
Write the Moab settings on the lines starting with #PBS. To make clear which job log file corresponds to which produced i3 file, this example uses the job number ($PBS_JOBID).
Example: JulietwzCmcGr_nue_40str.sh
Python script
The usual python. The example is a script that generates nu_e for IC40.
Example: JulietwzCmcGr_nue_40str.py
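A rough sketch of the job-submission shell script (the variable name RUNNUM is just a placeholder; the real scripts named above may do this differently):

#!/bin/sh
# Submission wrapper (sketch): the value is passed through an environment variable,
# because an argument given as "msub script.sh 100" is not handed to the script.
export RUNNUM=100
msub -V JulietwzCmcGr_nue_40str.sh    # -V exports the current environment to the job

Inside the batch script the Moab settings go on the #PBS lines, and the value is then available as $RUNNUM (and the job number as $PBS_JOBID).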
4. Environment variables and photonics tables
Everything you need is under /icrr/work/icecube/tools. Concretely, it looks like this:
setenv I3_PORTS /icrr/work/icecube/tools/i3_ports
setenv JAVA_HOME /icrr/work/icecube/tools/java/jdk1.6.0_04
setenv ROOTSYS /icrr/work/icecube/tools/i3_ports/root-v5.16.00
tables -> /icrr/work/icecube/tools/photonics_table/AHAmodel/
Keiichi Mase, Mio Ono
Last modified: 2009-05-18 19:08:55