WRF-SFIRE and WRFx on Alderaan
[[Category:WRF-SFIRE]]
Latest revision as of 19:05, 4 August 2021
==Initial setup==
Following [https://olucdenver-my.sharepoint.com/:b:/g/personal/jan_mandel_ucdenver_edu/ERpQiDN2uslNhA0JTqOXqi0BMLaWV-rLpnmz9rVIi5hVgA?e=50suU4 Atipa User Guide Phoenix] (link requires CU Denver login).
===SSH===
There was no .ssh directory in my account, and passwordless ssh to compute nodes does not work, contrary to page 8 of the guide. Consequently, commands over compute nodes such as <code>fornodes -s "ps aux | grep user"</code> also do not work.
Setting up ssh:
<pre>
ssh-keygen
cat id_rsa.pub >> authorized_keys
</pre>
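The two commands above can be sketched end to end. This is a hypothetical self-contained version that uses a throwaway directory instead of a real ~/.ssh, and passes <code>-N ""</code> so the key has an empty passphrase, matching the passwordless setup:

```shell
# Hypothetical self-contained version of the key setup above,
# run in a temporary directory instead of ~/.ssh
SSHDIR=$(mktemp -d)
# -q quiet, -N "" empty passphrase, -f output file
ssh-keygen -q -t rsa -N "" -f "$SSHDIR/id_rsa"
# Append the public key to authorized_keys, as above
cat "$SSHDIR/id_rsa.pub" >> "$SSHDIR/authorized_keys"
# authorized_keys must not be group/world readable for sshd to accept it
chmod 600 "$SSHDIR/authorized_keys"
```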
Passwordless ssh to head node works fine. Passwordless ssh to compute nodes was disabled for regular users intentionally.
===Compilers===
<pre>
[jmandel@math-alderaan ~]$ gcc --version
gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5)
[jmandel@math-alderaan ~]$ ls -l /shared
drwxr-xr-x. 3 root root 17 Mar  5 20:21 aocl-linux-gcc-2.2-5
drwxr-xr-x. 3 root root 23 Mar  5 20:20 jemalloc-5.2.1
drwxr-xr-x  7 root root 80 Mar 19 06:26 modulefiles
drwxr-xr-x. 3 root root 23 Mar  5 20:27 openmpi-4.1.0
drwxr-xr-x  3 root root 23 Mar 19 06:22 openmpi-4.1.0-cuda
ls /shared/openmpi-4.1.0/
gcc-9.2.1
ls /shared/openmpi-4.1.0/gcc-9.2.1/bin
mpiCC   mpicxx  mpif90  ompi-clean  opal_wrapper  orte-server  orterun  oshcc    oshmem_info  shmemc++  shmemfort
mpic++  mpiexec mpifort ompi-server orte-clean    ortecc       oshCC    oshcxx   oshrun       shmemcc   shmemrun
mpicc   mpif77  mpirun  ompi_info   orte-info     orted        oshc++   oshfort  shmemCC      shmemcxx
</pre>
So gcc is at 8.3.1 and there is no other compiler on the system, though OpenMPI appears to have been compiled with gcc 9. Created .bash_profile with the line
<pre>
PATH="/shared/openmpi-4.1.0/gcc-9.2.1/bin:$PATH"
</pre>
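As a side note on why the line prepends rather than appends: the shell takes the first matching directory in PATH, so prepending makes the OpenMPI wrappers shadow any same-named binaries elsewhere. A minimal sketch (the directory need not exist for the lookup order to change):

```shell
# Prepend the OpenMPI bin directory, as in .bash_profile above
PATH="/shared/openmpi-4.1.0/gcc-9.2.1/bin:$PATH"
# The first PATH component is now the OpenMPI directory,
# so its mpicc/mpirun are found before any others
echo "${PATH%%:*}"
```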
Copied example files
<pre>
mkdir test
cp -a /opt/phoenix/doc/examples test
</pre>
Fixed a missing <code>int</code> before <code>main</code> in mpi-example.c (called mpihello.c in the guide).
The guide says "Intel Math Kernel Library is installed on all Atipa clusters in /opt/intel/cmkl", but no such directory exists.
===Modules===
But modules are there:
<pre>
[jmandel@math-alderaan examples]$ module avail
------------------------------------------------ /shared/modulefiles ------------------------------------------------
aocl/2.2-5  gcc/9.2.1  jemalloc/5.2.1/gcc/9.2.1  openmpi-cuda/4.1.0/gcc/9.2.1  openmpi/4.1.0/gcc/9.2.1
</pre>
Changed .bash_profile to
<pre>
module load gcc/9.2.1 openmpi/4.1.0/gcc/9.2.1
</pre>
===MPI and scheduler===
Copied slurm_submit.sh from the guide and made minor changes:
<pre>
[jmandel@math-alderaan examples]$ mpicc mpi-example.c
[jmandel@math-alderaan examples]$ cat slurm_submit.sh
#!/bin/bash
### Sets the job's name.
#SBATCH --job-name=mpihello
### Sets the job's output file and path.
#SBATCH --output=mpihello.out.%j
### Sets the job's error output file and path.
#SBATCH --error=mpihello.err.%j
### Requested number of nodes for this job. Can be a single number or a range.
#SBATCH -N 4
### Requested partition (group of nodes, i.e. compute, fat, gpu, etc.) for the resource allocation.
#SBATCH -p compute
### Requested number of tasks to be invoked on each node.
#SBATCH --ntasks-per-node=4
### Limit on the total run time of the job allocation.
#SBATCH --time=10:00
### Amount of real memory required per CPU.
#SBATCH --mem-per-cpu=100
module list
mpirun ./a.out
[jmandel@math-alderaan examples]$ sbatch slurm_submit.sh
</pre>
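For orientation, the script requests 4 nodes with 4 tasks per node, so mpirun starts 16 MPI ranks in total (Slurm also exposes this count to the job as SLURM_NTASKS). The arithmetic, as a quick sketch:

```shell
# Values taken from the slurm_submit.sh above
NODES=4
TASKS_PER_NODE=4
# Total number of MPI ranks mpirun starts under this allocation
TOTAL_RANKS=$((NODES * TASKS_PER_NODE))
echo "total ranks: $TOTAL_RANKS"   # total ranks: 16
```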
==Building WRF-SFIRE==
Following [[Running WRF-SFIRE with real data in the WRFx system]].
===Libraries===
Downloaded libraries and built NETCDF following https://www2.mmm.ucar.edu/wrf/OnLineTutorial/compilation_tutorial.php
Note: NETCDF is built with <code>--disable-netcdf-4</code>!
Changed .bash_profile to
<pre>
module load gcc/9.2.1 openmpi/4.1.0/gcc/9.2.1
DIR=$HOME/libraries
export CC="gcc"
export CXX="g++"
export FC="gfortran"
export FCFLAGS="-m64"
export F77="gfortran"
export FFLAGS="-m64"
export JASPERLIB="$DIR/grib2/lib"
export JASPERINC="$DIR/grib2/include"
export LDFLAGS="-L$DIR/grib2/lib"
export CPPFLAGS="-I$DIR/grib2/include"
export PATH="$DIR/netcdf/bin:$PATH"
export NETCDF="$DIR/netcdf"
</pre>
===WRF-SFIRE===
Following [[Running WRF-SFIRE with real data in the WRFx system]]. Tested building by
<pre>
./configure -d
./compile em_fire
</pre>
and submitting with a modified slurm_submit.sh containing
<pre>
mpirun -np 1 ./ideal.exe
mpirun ./wrf.exe
</pre>
===WPS===
Following [[Running WRF-SFIRE with real data in the WRFx system]]. From the parent directory:
<pre>
git clone https://github.com/openwfm/WPS
ln -s WRF-SFIRE WRF
cd WPS
./configure
./compile >& compile_wps.log &
</pre>
Replace WRF-SFIRE by the directory name used. The WPS repository is an unmodified fork of https://github.com/wrf-model/WPS, currently frozen at release-v4.2.
===PnetCDF===
Add to .bash_profile
<pre>
export MPICC=mpicc
export MPICXX=mpicxx
export MPIF77=mpif77
export MPIF90=mpif90
# export PNETCDF="$DIR/pnetcdf"
</pre>
Use <code>$PNETCDF</code> only when needed. Source .bash_profile again, then
<pre>
wget https://parallel-netcdf.github.io/Release/pnetcdf-1.12.2.tar.gz
tar xvfz pnetcdf-1.12.2.tar.gz
cd pnetcdf-1.12.2/
./configure --prefix=$HOME/libraries/pnetcdf
make
make install
</pre>
==Building WRFx==
Following [[Running WRF-SFIRE with real data in the WRFx system]].
===[[Running_WRF-SFIRE_with_real_data_in_the_WRFx_system#WRFx:_Requirements_and_environment|Installing anaconda]]===
Had to add at the end of .bash_profile
<pre>
unset PYTHONPATH
source ~/.bashrc
</pre>
Do this '''before''' installing anaconda. For some reason the system has PYTHONPATH set, which throws anaconda off, and the login shell does not source .bashrc, which anaconda modifies and relies on to implement conda environments.
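The unset step above can be checked directly in a shell; this sketch only demonstrates clearing the variable, not the anaconda install itself:

```shell
# If the system profile has set PYTHONPATH, report and clear it;
# conda manages sys.path itself and a preset PYTHONPATH confuses it
if [ -n "${PYTHONPATH-}" ]; then
    echo "clearing PYTHONPATH: $PYTHONPATH"
fi
unset PYTHONPATH
# After the unset, the variable is guaranteed to be empty
echo "PYTHONPATH=${PYTHONPATH-}"
```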