
Research Projects


 Exascale Computing Project


Heterogeneous catalysis and the design of new catalysts are grand challenge problems in computational chemistry that will require the capabilities of exascale computing. The GAMESS project is extending algorithms based on chemical fragmentation methods and coupling them with high-fidelity quantum chemistry (QC) simulations to address these problems.


As part of the Exascale Computing Project, GAMESS is currently being refactored to take advantage of modern computer hardware and software, and the capabilities of the C++ libcchem code that is co-developed with GAMESS are being greatly expanded.


Our group is at the forefront of the GAMESS ECP initiative, which is a concerted effort of the Oak Ridge and Argonne Leadership Computing Facilities, the Ames National Lab, and various other US collaborators and vendors. 


Our development work in this project leverages novel hardware architectures and programming models, with the ultimate goal of devising software that can run on the exascale machines Frontier and Aurora and push the limits of what is currently achievable in chemical modelling.

Hardware Synergistic Algorithms 


The evolution of scientific calculations is contingent on the concerted advancement of the underlying algorithms and the computer systems that they use.


Computing is undergoing a fundamental change. The end of Dennard scaling and the inevitable slowdown of Moore's law are progressively bringing the era of the general-purpose processor to a close, because such processors use transistors power-inefficiently. To achieve higher performance than general-purpose systems, novel supercomputers adopt heterogeneous architectures in which CPUs are dedicated mostly to flow control while special-purpose accelerators absorb the vast majority of the computational workload.


The focus of this project is to design novel computational science algorithms that are more scalable and that can efficiently reap performance benefits from the increasingly complex and heterogeneous computer hardware.


Two examples of hardware-synergistic algorithms that we developed for application in computational chemistry are the Fragmentation-Based Accelerated Hartree-Fock algorithm in libcchem (GAMESS), and the Q-MP2 algorithm in Q-Chem.

Architecture and Structure Aware Linear Algebra

Linear algebra (LA) operations are fundamental to a large number of computational science algorithms. Their applications span the sciences, with machine learning (ML) algorithms among the most reliant on LA operations; they provide the mathematics that underpins much of what we do. Historically, this has driven the development of a plethora of libraries providing high-performance implementations of LA algorithms: BLAS, OpenBLAS, cuBLAS, clBLAS, LAPACK, ARPACK, ATLAS, cuSOLVER, MAGMA and many more. For a given LA operation, the choice can be bewildering for the programmer, especially since a single library may offer several algorithms whose performance differs depending, for example, on the specific structure of the matrices involved.


The pursuit of optimal LA algorithms is significantly complicated by the increasing architectural heterogeneity of high-performance computing (HPC) platforms, which combine a variable mix of general-purpose processors (CPUs) and accelerators (GPUs, DSPs, FPGAs, etc.) with complex associated memory hierarchies and file systems.


This project aims to build an Architecture and Data-Structure Aware Linear Algebra (ADSALA) software package that will use machine learning to learn the hardware/data-structure/package/algorithm relationships when compiled on a specific hardware architecture for a spectrum of LA packages. At runtime, after analysing the structural features of the data structures involved, ADSALA will choose the most appropriate package/algorithm for a given LA operation, and assign the computation to the best combination of hardware resources (CPU, GPU), seeking to minimize execution time – all subject to specific user-defined constraints.
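The runtime selection step described above can be illustrated with a small sketch. This is not the ADSALA implementation: every name here (`MatrixFeatures`, `Candidate`, `extract_features`, `choose_backend`, `DEMO_CANDIDATES`) is hypothetical, and the hand-written cost formulas are placeholders standing in for whatever machine-learned performance model would be trained on the target hardware. The sketch only shows the shape of the idea: inspect the data structure, predict a time for each package/algorithm/device combination, and pick the fastest one that satisfies a user-defined constraint.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Set


@dataclass(frozen=True)
class MatrixFeatures:
    """Structural features inspected at runtime (hypothetical feature set)."""
    rows: int
    cols: int
    density: float  # fraction of non-zero entries, in [0, 1]


@dataclass(frozen=True)
class Candidate:
    """One package/algorithm/device combination with a predicted-time model."""
    name: str
    device: str  # "cpu" or "gpu"
    predict_seconds: Callable[[MatrixFeatures], float]


def extract_features(matrix: Sequence[Sequence[float]]) -> MatrixFeatures:
    """Compute simple structural features of a dense list-of-rows matrix."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    total = rows * cols
    nonzero = sum(1 for row in matrix for value in row if value != 0)
    density = nonzero / total if total else 0.0
    return MatrixFeatures(rows, cols, density)


def choose_backend(
    features: MatrixFeatures,
    candidates: List[Candidate],
    allowed_devices: Optional[Set[str]] = None,
) -> Candidate:
    """Return the candidate with the lowest predicted time.

    `allowed_devices` models a user-defined constraint, e.g. {"cpu"} when
    no accelerator may be used.
    """
    eligible = [
        c for c in candidates
        if allowed_devices is None or c.device in allowed_devices
    ]
    if not eligible:
        raise ValueError("no candidate satisfies the device constraint")
    return min(eligible, key=lambda c: c.predict_seconds(features))


# Placeholder cost models (assumptions for illustration only). In the scheme
# described in the text, these predictions would come from a model trained
# on the specific hardware rather than from fixed formulas.
DEMO_CANDIDATES = [
    Candidate("dense_cpu", "cpu", lambda f: 2e-9 * f.rows * f.cols),
    Candidate("sparse_cpu", "cpu",
              lambda f: 1e-8 * f.rows * f.cols * f.density + 1e-6),
    # The constant term mimics a fixed host-to-device transfer overhead.
    Candidate("dense_gpu", "gpu", lambda f: 1e-4 + 2e-10 * f.rows * f.cols),
]
```

With these placeholder models, a small matrix stays on the CPU because the fixed transfer overhead dominates, a large dense matrix is assigned to the GPU, and a large, very sparse matrix goes to the sparse CPU routine. Passing `allowed_devices={"cpu"}` restricts the choice to CPU candidates.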


