Alderaan



Funding

This cluster is funded by NSF grant 2019089 "CC* Compute: Accelerating Science and Education by Campus and Grid Computing" under the NSF CC* program. The cluster will be integrated with the Open Science Grid (OSG). At least 20% of capacity will be contributed to the OSG, as required by the program.

Configuration

Hardware

  • 2048 AMD EPYC 2 cores and 4TB DDR4 memory in 8 compute nodes
  • 256 AMD EPYC 2 cores, 2 NVIDIA Tesla A100 GPUs, and 4TB DDR4 memory in 2 high-memory GPU nodes
  • 128 AMD EPYC 2 cores in a head node
  • 1PB storage (raw)
  • HDR100 InfiniBand interconnect
  • 10Gb/s connectivity path from each node to Internet2
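
The per-node figures and the OSG share implied by the totals above can be worked out in a few lines of Python. This is only a rough sketch: it assumes that cores and memory are spread evenly across the nodes in each group, and that the 20% OSG contribution is counted in CPU cores of the compute and GPU nodes. This page states neither, and the names used below are illustrative.

```python
# Totals are taken from the hardware list; the even per-node split is an assumption.
NODE_GROUPS = {
    "compute": {"nodes": 8, "cores": 2048, "memory_tb": 4},
    "gpu": {"nodes": 2, "cores": 256, "memory_tb": 4},
}
OSG_FRACTION = 0.20  # minimum share required by the CC* program


def per_node(group):
    """Return (cores, memory in TB) per node, assuming an even split."""
    return group["cores"] / group["nodes"], group["memory_tb"] / group["nodes"]


def osg_core_share(groups, fraction=OSG_FRACTION):
    """Cores matching the OSG fraction, counting capacity in cores (assumption)."""
    total = sum(g["cores"] for g in groups.values())
    return fraction * total


if __name__ == "__main__":
    for name, group in NODE_GROUPS.items():
        cores, mem = per_node(group)
        print(f"{name}: {cores:.0f} cores and {mem:.1f} TB per node")
    print(f"OSG share: at least {osg_core_share(NODE_GROUPS):.0f} cores")
```

The head node and storage are left out of the calculation on the assumption that they do not run user jobs.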

Software

  • scheduler, cluster tools, GNU compiler stack with MPI over InfiniBand
  • CUDA for GPUs
  • shared filesystem over InfiniBand

Expected progress

  • Fall semester 2020
    • Purchase order
    • Planning
    • Test software installations
  • Spring semester 2021
    • Delivery and installation
    • Testing with the vendor's software configuration
    • Finalize networking, access, and security plan
    • Open to early users
    • Install user web interfaces (JupyterHub, FastX)
    • Install OSG software to run OSG jobs and to submit jobs to OSG
  • AY 2021/2022
    • Open to users
    • Finalize integration within the local network

User training

  • submission of OSG jobs
  • web access to the cluster
  • SSH command-line usage
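
As an illustration of the first training topic: OSG jobs are typically submitted through HTCondor, using a submit description file. The sketch below generates such a file in Python. It is not the procedure configured on this cluster, which this page does not describe; the function name, the script name hello.sh, and the resource values are hypothetical.

```python
def make_submit_description(executable, arguments="", cpus=1,
                            memory_mb=1024, disk_mb=1024, jobs=1):
    """Render a minimal HTCondor submit description as text."""
    lines = [f"executable = {executable}"]
    if arguments:
        lines.append(f"arguments = {arguments}")
    lines += [
        # $(Cluster) and $(Process) are HTCondor macros for the job identifiers.
        "output = job.$(Cluster).$(Process).out",
        "error = job.$(Cluster).$(Process).err",
        "log = job.$(Cluster).log",
        f"request_cpus = {cpus}",
        f"request_memory = {memory_mb}MB",
        f"request_disk = {disk_mb}MB",
        f"queue {jobs}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # hello.sh is a hypothetical user script; adjust the resources to the job.
    with open("hello.sub", "w") as f:
        f.write(make_submit_description("hello.sh", jobs=3))
```

The resulting file would then be submitted with condor_submit hello.sub from a host that has HTCondor submit access.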

Contact: jan.mandel@ucdenver.edu