Architecture File Structure
Architecture files follow a modular structure that defines how the code is built on a given system: CPU or GPU target, compiler toolchain, and libraries.
They are typically organized into the following sections:
1. Acceleration Control
Controls whether GPU acceleration is enabled.
Acceleration = 0 # CPU-only build
Acceleration = 1 # GPU-enabled build
When enabled, CUDA-related variables and libraries must be defined.
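The requirement above can be checked early with a small guard. This is only a sketch in GNU make syntax: the guard itself is an assumption and not part of the documented architecture files. It also illustrates that GNU make keeps the whitespace in front of a trailing comment, so the value is stripped before it is compared.

```make
# Sketch only: this guard is hypothetical, not something the build system ships.
Acceleration = 1 # GPU-enabled build

# The value above is "1 " (with a trailing space) because of the inline
# comment, so strip it before comparing.
ifeq ($(strip $(Acceleration)),1)
  ifeq ($(strip $(ACCEL_PATH)),)
    $(error Acceleration = 1 but ACCEL_PATH is empty; set it to the CUDA install root)
  endif
endif
```

A guard like this would have to appear after the line that sets ACCEL_PATH, otherwise it fires even for a correctly configured file.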
2. Library Paths
Defines paths to external dependencies.
Common variables:
HDF5_PATH
LIBXC_PATH
FFTW_PATH
P3DFFT_PATH
LUA_PATH
ACCEL_PATH
MATH_PATH
Notes:
- Empty paths indicate the package will be built internally.
- External installations should be explicitly provided.
- GPU builds require ACCEL_PATH (CUDA).
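As an illustration of these notes, a path section might look like the sketch below. Every location shown is a placeholder chosen for the example, not a default of the build system.

```make
# Placeholder locations for illustration; substitute the paths on your system.

# Left empty: built internally, per the notes above.
HDF5_PATH   =
LIBXC_PATH  =
P3DFFT_PATH =
LUA_PATH    =

# Explicit external installations (hypothetical locations).
FFTW_PATH   = /opt/fftw
MATH_PATH   = /opt/openblas

# Needed for GPU builds only (hypothetical location).
ACCEL_PATH  = /usr/local/cuda
```

Comments are kept on their own lines here so that no trailing whitespace ends up inside a path value.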
3. Library Linking (LIBS)
Specifies libraries required during linking.
Core external libraries (HDF5, LIBXC, FFTW, P3DFFT) that are built internally should NOT be included here.
Typical patterns:
Minimal CPU (Generic BLAS/LAPACK):
LIBS += -lblas -llapack -lm -lpthread
Intel MKL (Sequential):
LIBS += -L${MKL_ROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core \
-lpthread -lm -ldl
Intel MKL (MPI + ScaLAPACK):
LIBS += -L${MKL_ROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_core -lmkl_sequential \
-lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64 \
-lpthread -lm -ldl
Intel oneAPI (Simplified):
LIBS += -qmkl=cluster -lifcore
GNU + OpenBLAS:
LIBS += -lopenblas -lm -lpthread
NVHPC with NVPL:
LIBS += -Mnvpl -Mscalapack
NVHPC + explicit BLAS/LAPACK:
LIBS += -lblas -llapack -lscalapack
CUDA (Basic GPU support):
LIBS += -L${NVHPC_ROOT}/cuda/lib64 -lcudart -lcuda \
-L${NVHPC_ROOT}/math_libs/lib64 -lcublas -lcusolver \
-lstdc++ -lm
CUDA + NVHPC (optimized GPU build):
LIBS += -L${NVHPC_ROOT}/compilers/lib -lblas -llapack \
-L${NVHPC_ROOT}/cuda/lib64 \
-lcudart -lnvtx3interop \
-cuda -cudalib=cublas,cusolver \
-gpu=cc90,cuda12.9,lineinfo \
-lstdc++
MKL + CUDA hybrid:
LIBS += -L${MKL_ROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_core -lmkl_sequential \
-lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64 \
-L${NVHPC_ROOT}/cuda/lib64 -lcudart -lcuda \
-L${NVHPC_ROOT}/math_libs/lib64 -lcublas -lcusolver \
-lstdc++ -lpthread -lm -ldl
MKL + CUDA + GNU Fortran runtime:
LIBS += -L${MKL_ROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_core -lmkl_sequential \
-lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64 \
-L${NVHPC_ROOT}/cuda/lib64 -lcudart -lcuda \
-L${NVHPC_ROOT}/math_libs/lib64 -lcublas -lcusolver \
-lstdc++ -lpthread -lm -ldl -lgfortran
Notes:
- Choose only ONE math backend (MKL, OpenBLAS, or NVPL).
- GPU builds must include CUDA runtime libraries.
- Add -lgfortran when GNU Fortran objects are linked together with C/C++ code by a driver that does not pull in the GNU Fortran runtime itself.
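One way to enforce the "only one math backend" rule is a single switch variable. In the sketch below, MATH_BACKEND is a hypothetical helper name introduced for this example; the link lines are copied from the patterns above.

```make
# MATH_BACKEND is a hypothetical helper variable, not one the build system reads.
MATH_BACKEND = openblas

ifeq ($(MATH_BACKEND),mkl)
  LIBS += -L${MKL_ROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core \
          -lpthread -lm -ldl
else ifeq ($(MATH_BACKEND),openblas)
  LIBS += -lopenblas -lm -lpthread
else ifeq ($(MATH_BACKEND),nvpl)
  LIBS += -Mnvpl -Mscalapack
else
  $(error Unknown MATH_BACKEND '$(MATH_BACKEND)': expected mkl, openblas or nvpl)
endif
```

Because exactly one branch is taken, two BLAS/LAPACK implementations can never end up on the same link line.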
4. Compiler Toolchain
Defines MPI-enabled compilers.
CC = mpicc
CXX = mpicxx
F77 = mpif90
FC = mpif90
MPICC = mpicc
ACCEL_CXX = nvcc -arch=sm_XX
ARCHV = ar -r
Notes:
- Intel toolchains may use mpiicx / mpiifort.
- GPU builds require nvcc.
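For comparison, an Intel variant of the same block might look like the following sketch. The mpiicx and mpiifort names come from the note above; mpiicpx as the C++ wrapper is an assumption that depends on the installed Intel MPI version.

```make
# Assumed Intel oneAPI variant; check which wrappers your Intel MPI provides.
CC    = mpiicx
# mpiicpx is an assumption (C++ counterpart of mpiicx).
CXX   = mpiicpx
F77   = mpiifort
FC    = mpiifort
MPICC = mpiicx
ARCHV = ar -r
```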
5. Compilation and Preprocessor Flags
Controls optimization, parallelism, and preprocessing.
FPPDEFS = -cpp
CPPDEFS = -cpp
FPPFLAGS = -DMPI -DMaxOutProcs=1
CFLAGS = -O3
CXXFLAGS = -O3 -std=c++14
FFLAGS = -O3 -I.
Optional flags:
-DUSE_SCALAPACK
-march=native
-fallow-argument-mismatch
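The optional flags could be appended to the base settings as in the sketch below. Which variable each flag belongs in is an assumption made for this example; -fallow-argument-mismatch is a gfortran option (version 10 and later) and should be left out for other Fortran compilers.

```make
# Sketch: the placement of each optional flag is an assumption.

# Enable ScaLAPACK code paths through the preprocessor.
FPPFLAGS += -DUSE_SCALAPACK

# Tune for the build host; the binaries may not run on older CPUs.
CFLAGS   += -march=native
CXXFLAGS += -march=native
FFLAGS   += -march=native

# gfortran 10+ only: downgrade argument-mismatch errors in legacy code.
FFLAGS   += -fallow-argument-mismatch
```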
6. OpenMP Support
Defines shared-memory parallelism.
OPT_OPENMP = -fopenmp # GNU / Intel
OPT_OPENMP = -mp # NVHPC
Used in:
CFLAGS, CXXFLAGS, FFLAGS, LD_FLAGS
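A minimal sketch of how the flag might be threaded through those variables, assuming a GNU toolchain and assuming the flags are appended with += (a real architecture file may instead place $(OPT_OPENMP) directly in each definition):

```make
# Assumes a GNU toolchain; use -mp instead for NVHPC.
OPT_OPENMP = -fopenmp

# Compile every language with OpenMP enabled.
CFLAGS   += $(OPT_OPENMP)
CXXFLAGS += $(OPT_OPENMP)
FFLAGS   += $(OPT_OPENMP)

# The link step needs the same flag so the OpenMP runtime is pulled in.
LD_FLAGS  = $(OPT_OPENMP)
```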
7. Linker Settings
LD_FLAGS = $(OPT_OPENMP)
LD = $(FC) $(LD_FLAGS)
Notes:
- Some builds omit LD_FLAGS.
- GPU builds may include additional CUDA linking flags.
8. External Package Configuration
Flags used when building the internally built dependencies.
HDF5_CONFIG_FLAGS
LIBXC_CONFIG_FLAGS
P3DFFT_CONFIG_FLAGS
FFTW_CONFIG_FLAGS
Typical pattern:
HDF5_CONFIG_FLAGS = --enable-fortran CC=$(CC) FC=$(FC)
Notes:
- MPI and OpenMP are usually enabled.
- Compiler wrappers must match the toolchain.
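A sketch of what "MPI and OpenMP enabled" could look like for two of the packages. The options shown are standard configure switches of HDF5 and FFTW; whether the internal build expects exactly this set is an assumption.

```make
# Sketch only: the exact flag set the internal build expects is an assumption.

# HDF5: Fortran bindings plus parallel (MPI) I/O, built with the MPI wrappers.
HDF5_CONFIG_FLAGS = --enable-fortran --enable-parallel CC=$(CC) FC=$(FC)

# FFTW: MPI and OpenMP transforms, using the same compiler wrappers.
FFTW_CONFIG_FLAGS = --enable-mpi --enable-openmp CC=$(CC) F77=$(F77) MPICC=$(MPICC)
```

Passing $(CC), $(FC) and $(MPICC) through keeps the dependencies on the same toolchain as the main code, which is the point of the second note.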
9. GPU-Specific Configuration
Only required when acceleration is enabled (Acceleration = 1).
ACCEL = CUDA
ACCEL_PATH = /path/to/cuda
ACCEL_CXX = nvcc -arch=sm_XX
Typical GPU libraries:
-lcudart -lcublas -lcusolver
Architecture examples:
sm_60, sm_70, sm_80, sm_90
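Putting the pieces of this section together, a GPU block might look like the sketch below. GPU_ARCH is a hypothetical helper variable, and the CUDA location and lib64 layout are assumptions that match a standalone CUDA toolkit; NVHPC installations keep cuBLAS and cuSOLVER under math_libs/lib64, as in the LIBS patterns earlier.

```make
# GPU_ARCH is a hypothetical helper; pick the value matching your device.
GPU_ARCH   = sm_80

ACCEL      = CUDA
# Placeholder location; adjust to your CUDA installation.
ACCEL_PATH = /usr/local/cuda
ACCEL_CXX  = nvcc -arch=$(GPU_ARCH)

# Assumes a standalone CUDA toolkit with all libraries under lib64.
LIBS      += -L$(ACCEL_PATH)/lib64 -lcudart -lcublas -lcusolver
```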