KIPAC Orange Cluster
Introduction
The KIPAC "orange" cluster comprises 90 compute nodes, 1 head node, a parallel filesystem and a high-speed DDR InfiniBand interconnect. Each node hosts two dual-core AMD Opteron(tm) 2218 processors running at 2.6 GHz and 16 GB of memory. The cluster therefore has 360 cores with 4 GB of memory per core, for a total of 1.44 TB of memory. Complementing the memory and compute capacity is an I/O system intended to allow memory-to-disk read/write cycles in a reasonable amount of time. The expected throughput is 1 GB/s or better for the Lustre 1.6 based system. The nominal configuration will have 2 MDS/MGS machines as an HA pair, and 6 OSS machines serving 6 OSTs that are connected to the OSSs via an FC switch. All nodes, compute and filesystem, will be interconnected via DDR InfiniBand in a tree configuration with a 50% blocking factor. The details of the components are shown in the table:
| item | qty | description |
| interactive node | 1 | Sun X2200 w/2218 CPU, 16x1GB memory, 250 GB HD |
| compute node | 90 | Sun X2200 w/2218 CPU, 16x1GB memory, 250 GB HD |
| Lustre MGS/MDS | 2 | Sun X2200 w/2218 CPU, 8x1GB memory, 250 GB HD |
| Lustre OSS | 6 | Sun X2200 w/2218 CPU, 4x1GB memory, 250 GB HD |
| disk array | 3 | Sun 6140 w/2GB cache, 16x500 GB HD, dual controller, configured as 2x{6+1+1} RAID 5 |
| FC switch | 1 | 16-port McData FC switch |
| racks | 4 | Sun Rack 1000-38 w/power distribution etc. |
| IB switch | 11 | Cisco 7000D series DDR IB switches (4 core, 7 leaf (<=12)) |
| IB HCA | 100 | Cisco DDR HCA |
| IB cables | 156 | 100 1m, 16 3m, 40 5m DDR rated IB cables |
| Ethernet switch | 2 | 48-port Cisco ~3500 class GigE switch as uplink to main network |
| management switch | 4 | Cisco ???, for service processor (SP) connection |
| serial concentrator | 4 | various for console access and logging distribution |
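The headline figures in the introduction can be rechecked from the per-node and per-array numbers in the table. The following is a minimal sketch of that arithmetic, not an official sizing tool. The reading of "2x{6+1+1} RAID 5" as two groups per array, each with 6 data disks, 1 parity disk and 1 hot spare, is an assumption, as are all the constant and function names.

```python
# Figures taken from the introduction and the component table.
COMPUTE_NODES = 90
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 2
MEMORY_PER_NODE_GB = 16

ARRAYS = 3
DISKS_PER_ARRAY = 16
DISK_SIZE_GB = 500

# Assumption: "2x{6+1+1} RAID 5" means two RAID groups per array, each
# built from 6 data disks, 1 parity disk and 1 hot spare.
GROUPS_PER_ARRAY = 2
DATA_DISKS_PER_GROUP = 6


def cluster_totals():
    """Return aggregate core, memory and disk figures for the cluster."""
    cores = COMPUTE_NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET
    memory_gb = COMPUTE_NODES * MEMORY_PER_NODE_GB
    raid_groups = ARRAYS * GROUPS_PER_ARRAY
    return {
        "cores": cores,
        "memory_gb": memory_gb,
        "memory_per_core_gb": memory_gb / cores,
        "raw_disk_gb": ARRAYS * DISKS_PER_ARRAY * DISK_SIZE_GB,
        "raid_groups": raid_groups,
        "usable_disk_gb": raid_groups * DATA_DISKS_PER_GROUP * DISK_SIZE_GB,
    }


if __name__ == "__main__":
    for name, value in cluster_totals().items():
        print(f"{name}: {value}")
```

Under that assumption the three arrays yield six RAID groups, which matches the six OSTs in the nominal Lustre configuration.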
Current Status
The compute nodes are fully functional. Basic monitoring and job control/submission are functional; additional monitoring and control features will be added over time. Nodes 001 to 080 are available for batch processing. The Lustre file system is functional and is being tested on a subset of the compute nodes (081 to 090). Aggregate write/read speeds are typically around 700/800 MB/s with no optimizations yet applied. The Lustre file system is expected to be in service in early July. NFS space is available for job output and long-term storage.
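To put the throughput figures above in context, the sketch below estimates how long it would take to move the cluster's entire 1.44 TB of memory to or from disk at a given aggregate rate. It is a rough illustration only: it assumes decimal units (1 GB = 1000 MB) and a sustained rate, and the function name `transfer_minutes` is hypothetical.

```python
# Total compute-node memory: 90 nodes x 16 GB each.
TOTAL_MEMORY_GB = 90 * 16


def transfer_minutes(size_gb, rate_mb_per_s):
    """Minutes needed to move size_gb at a sustained rate_mb_per_s.

    Assumes decimal units, i.e. 1 GB = 1000 MB.
    """
    seconds = size_gb * 1000 / rate_mb_per_s
    return seconds / 60


if __name__ == "__main__":
    # Design target (1 GB/s) and the measured write/read rates quoted above.
    for label, rate in [("target", 1000), ("write", 700), ("read", 800)]:
        minutes = transfer_minutes(TOTAL_MEMORY_GB, rate)
        print(f"{label} at {rate} MB/s: {minutes:.1f} min")
```

At the 1 GB/s design target a full memory dump takes 24 minutes; at the currently measured rates it is roughly 34 minutes to write and 30 minutes to read.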