Audit of a Density Functional Theory application suggested a potentially improved communications scheme

Software for Chemistry & Materials (SCM) is an Amsterdam-based computational chemistry software company. It was originally a spinout from the Vrije Universiteit. SCM supports and develops the ADF Modelling Suite, centred around the flagship program Amsterdam Density Functional (ADF), which was originally developed in the 1970s.

A standalone part of the modelling suite, named DFTB, is a fast, approximate Density Functional Theory approach for molecules and periodic systems.

What is DFTB?

The Density-Functional based Tight-Binding technique enables calculations on large systems for long timescales, even on a desktop computer. Relatively accurate results are obtained at a fraction of the cost of full DFT by reading in pre-calculated parameters (Slater–Koster files), using a minimal basis and only including nearest-neighbour interactions. Long-range interactions are described with empirical dispersion corrections, and third-order corrections accurately handle charged systems.

The SCM implementation of the DFTB method can perform single point calculations, geometry optimizations, transition state searches, frequency calculations, and molecular dynamics. Molecules as well as periodic systems can be handled, ensuring a smooth link with full DFT codes. It can be used as a stand-alone command line program or from the graphical interface.

Objective

The objective of this work was to investigate ways of improving the performance of the DFTB application for large systems on distributed-memory machines.

Method

Audit – An initial audit of the code took two months and focused on a density matrix purification technique in DFTB. A set of performance metrics was used to assess the performance and identify any limiting issues; these metrics relate to computational scaling, load balance and communication. The audit yielded a number of interesting results and identified an opportunity to improve the matrix multiplication kernel used by the purification technique.
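To show why matrix multiplication dominates the cost of density matrix purification, here is a minimal sketch in plain Python. The audit summary does not say which purification scheme DFTB uses, so McWeeny purification (P <- 3P^2 - 2P^3) stands in as a well-known textbook example. The helper names and the 2x2 starting matrix are hypothetical.

```python
# Illustrative only: McWeeny purification is an assumed example scheme,
# not necessarily the one implemented in DFTB.

def matmul(a, b):
    """Dense product of two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mcweeny_step(p):
    """One purification step, P <- 3 P^2 - 2 P^3 (two matrix products)."""
    p2 = matmul(p, p)
    p3 = matmul(p2, p)
    n = len(p)
    return [[3.0 * p2[i][j] - 2.0 * p3[i][j] for j in range(n)]
            for i in range(n)]

def idempotency_error(p):
    """Largest element of |P^2 - P|; zero for an exact projector."""
    p2 = matmul(p, p)
    n = len(p)
    return max(abs(p2[i][j] - p[i][j]) for i in range(n) for j in range(n))

def purify(p, tol=1e-12, max_iter=50):
    """Repeat the step until P is idempotent to within tol."""
    for step in range(max_iter):
        if idempotency_error(p) < tol:
            return p, step
        p = mcweeny_step(p)
    return p, max_iter

# Hypothetical starting guess: symmetric, with eigenvalues 0.9 and 0.1.
p0 = [[0.5, 0.4], [0.4, 0.5]]
p_final, steps = purify(p0)
```

Each step consists almost entirely of matrix products, which is why the multiplication kernel, and in a distributed setting the communication it requires, is a natural target for optimisation.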

Report – The audit findings were delivered to the DFTB development group. The application was shown to have very good computational load balance, but improvements were recommended in the MPI-based communications code. These changes were expected to improve scalability on distributed-memory HPC platforms. In particular, it was suggested that it would be worthwhile to look into ways of carrying out communication at the same time as computation.
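The idea of overlapping communication with computation can be sketched without MPI. In the sketch below a background thread stands in for a non-blocking transfer: the next block is requested before work starts on the current one. In an MPI code the corresponding pattern would typically use non-blocking calls such as MPI_Irecv and MPI_Wait, but the audit summary does not say which MPI features were recommended, so this is only an analogy. All function names, block sizes and delays are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_block(i, delay=0.01):
    """Hypothetical stand-in for receiving block i from another process."""
    time.sleep(delay)                  # pretend this is network latency
    return [float(i)] * 4

def compute(block):
    """Hypothetical stand-in for the local computation on one block."""
    return sum(x * x for x in block)

def run_blocking(n_blocks):
    """Receive, then compute, one block at a time (no overlap)."""
    total = 0.0
    for i in range(n_blocks):
        total += compute(fetch_block(i))
    return total

def run_overlapped(n_blocks):
    """Request block i+1 before computing on block i."""
    total = 0.0
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(fetch_block, 0)
        for i in range(n_blocks):
            block = pending.result()   # wait for the transfer in flight
            if i + 1 < n_blocks:
                # Post the next transfer before starting to compute.
                pending = pool.submit(fetch_block, i + 1)
            total += compute(block)    # runs while the next block arrives
    return total
```

Both variants produce the same result. The overlapped one can hide transfer time behind computation only to the extent that the two genuinely proceed concurrently, which in a real MPI application depends on the library and the hardware.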


Analysis of the distributions of instructions and periods of computation ("useful duration") showed that computational load was well-balanced across participating processes, so the main scope for improvement was in the communication scheme.
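As an illustration of how such a conclusion can be read from timings, the sketch below computes three efficiency metrics from per-process useful-computation times. The definitions follow those commonly used in the POP methodology. The timing values are invented for illustration and are not measurements from the DFTB audit.

```python
def load_balance(useful):
    """Average useful time divided by maximum useful time (1.0 is ideal)."""
    return (sum(useful) / len(useful)) / max(useful)

def communication_efficiency(useful, runtime):
    """Maximum useful time divided by total runtime (1.0 is ideal)."""
    return max(useful) / runtime

def parallel_efficiency(useful, runtime):
    """Product of load balance and communication efficiency."""
    return load_balance(useful) * communication_efficiency(useful, runtime)

# Hypothetical timings in seconds for four processes.
useful = [7.8, 8.0, 7.9, 7.9]   # time each process spends computing
runtime = 10.0                  # wall-clock time of the whole run

lb = load_balance(useful)
ce = communication_efficiency(useful, runtime)
pe = parallel_efficiency(useful, runtime)
```

With these made-up numbers the load balance is close to 1 while the communication efficiency is noticeably lower, which is the pattern that points to the communication scheme rather than the work distribution.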

Code improvements – For this project the advised enhancements were implemented as a proof of concept (POC) by the NAG team with direction from the SCM developers. The POC took four months to deliver and was tested on a large HPC machine. It indicated that a more up-to-date version of the MPI library had the potential to improve the performance of the application, but unfortunately such a version was not available for testing on the development machine used at the time.

On-going – Even though the POC did not include an explicit test with the newest versions of the MPI library, the results were encouraging enough that the developers of the application decided to continue developing the improved code. Further investigations into performance and scalability improvements will take place when time allows.

Note: This work was carried out by NAG staff working under the remit of the Performance Optimisation and Productivity Centre of Excellence in Computing Applications (POP).