Notes on Chemistry codes

Last year I attended, and gave a tutorial at, the 11th LCI International Conference at the Pittsburgh Supercomputing Center. There, I had the honor of meeting several leading researchers in quantum chemistry HPC codes, one of them being Dr. Wang Yang. At the reception, I picked his brain on how quantum chemistry codes work and why they matter to supercomputing research. The half page of notes that I took gradually turned into a semi-research document as I filled in the details over the last year. Quantum chemistry codes are some of the most important for supercomputing research, as they scale extremely well. At the time, Dr. Yang was able to scale his research simulation to almost 120,000 cores on Jaguar at Oak Ridge, then the new #1 supercomputer.

Here are my notes on chemistry codes. Please feel free to send me comments or suggestions; I hope they help you learn more about the field as well.

The properties and interactions of molecules depend largely on their electronic structure, particularly that of the outer or "most exposed" electron shell. A number of methods are commonly used for electronic structure calculations in physics and quantum chemistry. These calculations tell us about the physical arrangement or distribution of the electrons, their quantum mechanical states, molecular structures, and interatomic bond strengths. From this information, we can derive a bulk description of the material, including its conductivity, mechanical properties, bulk structure, and many other attributes.

By looking at electron distributions between atoms, we can determine the type and properties of molecular bonds.

At the root of this approach is determining solutions to the Schrödinger equation. Unfortunately, as we add particles to the problem (the nuclei and electrons of the atoms in the molecule), the equation quickly becomes virtually unsolvable, except for the simplest of cases, because of the immense computational effort involved. Even within a simple solid consisting of one type of atom, we must consider interactions among the electrons of neighboring atoms. There are simply too many electrons, and because of the electron-electron interaction term the many-particle Schrödinger equation cannot be separated into independent single-particle problems. During the 1960s, Walter Kohn and others developed Density Functional Theory (DFT), an iterative approach toward understanding electronic properties.
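
For reference, here is the equation in question, written out in a common textbook form. This is a sketch under assumptions that my notes do not state explicitly: atomic units, and nuclei held fixed at positions R_A with charges Z_A (the Born-Oppenheimer picture). The last sum is the electron-electron interaction term that prevents the equation from separating.

```latex
\hat{H}\,\Psi(\mathbf{r}_1,\dots,\mathbf{r}_N) = E\,\Psi(\mathbf{r}_1,\dots,\mathbf{r}_N),
\qquad
\hat{H} = -\frac{1}{2}\sum_{i=1}^{N}\nabla_i^{2}
          \;-\; \sum_{i=1}^{N}\sum_{A}\frac{Z_A}{|\mathbf{r}_i-\mathbf{R}_A|}
          \;+\; \sum_{i<j}\frac{1}{|\mathbf{r}_i-\mathbf{r}_j|}
```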

Basically, DFT reduces the scope of the calculation to a single electron problem. What started out as a problem with many electrons and nuclei is now a many nuclei, single electron problem, which is much easier to solve. We still need to consider electron-electron and electron-nucleus interactions. At this point, DFT makes a further simplification: the mean field representation of all other electrons in the system.

As a result, the effective electron-electron interaction becomes an electron-mean field interaction. All other individual electrons are no longer in the picture and the only thing left is the mean field, a function of the electron density.

Now we solve the single-electron Schrödinger equation, whose solution is the wave function for the system. From the wave function, we determine the electron density and arrive at an expression for the mean field. The latter introduces an exchange-correlation potential that can be calculated using approximations such as the local-density approximation (LDA), which depends only on the density itself; the generalized gradient approximation (GGA), which depends on both the density and its gradient; or more advanced functionals. Finally, we arrive at the mean field and our first iteration is complete. We plug the mean field back into the Schrödinger equation and repeat the calculation until we converge to a consistent mean field.
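
In symbols, the loop just described is usually written in the Kohn-Sham form below. I am assuming that standard formulation here (atomic units, one equation per occupied orbital), which is a bit more specific than the informal description above. The LDA exchange energy is shown as the simplest concrete example; the GGA line only indicates the general shape, since the function f differs from one functional to another.

```latex
\left[-\frac{1}{2}\nabla^{2} + v_{\mathrm{ext}}(\mathbf{r})
      + \int \frac{n(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}'
      + v_{\mathrm{xc}}[n](\mathbf{r})\right]\psi_i(\mathbf{r})
  = \varepsilon_i\,\psi_i(\mathbf{r}),
\qquad
n(\mathbf{r}) = \sum_{i}^{\mathrm{occ}} |\psi_i(\mathbf{r})|^{2}

E_{x}^{\mathrm{LDA}}[n] = -\frac{3}{4}\left(\frac{3}{\pi}\right)^{1/3}
  \int n(\mathbf{r})^{4/3}\,d\mathbf{r},
\qquad
E_{xc}^{\mathrm{GGA}}[n] = \int f\bigl(n(\mathbf{r}),\,\nabla n(\mathbf{r})\bigr)\,d\mathbf{r}
```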

To summarize:

At the beginning, the electron wave function and mean field are both unknown. We assume a mean field as a starting function, solve the single-electron equation for the wave function, and calculate the electron density, which is the square of the absolute value of the wave function. The density gives us a new mean field, and we iterate the calculation until we arrive at a self-consistent result.
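
To make the loop concrete, here is a minimal sketch of a self-consistent iteration on a toy problem. Everything specific in it is my own assumption for illustration and not how any production code works: a 1D grid, two electrons in a harmonic well, a softened Coulomb kernel for the mean field, the 3D LDA exchange potential borrowed as a stand-in, and simple linear mixing of old and new densities. The function name `scf_1d` and its parameters are hypothetical.

```python
import numpy as np


def scf_1d(n_electrons=2, n_grid=201, box=10.0, mix=0.3, tol=1e-6, max_iter=500):
    """Toy self-consistent field loop on a 1D grid (atomic units)."""
    x = np.linspace(-box, box, n_grid)
    dx = x[1] - x[0]

    # Kinetic energy operator from a second-order finite-difference Laplacian.
    off_diag = np.ones(n_grid - 1)
    laplacian = (np.diag(off_diag, -1) - 2.0 * np.eye(n_grid)
                 + np.diag(off_diag, 1)) / dx**2
    kinetic = -0.5 * laplacian

    v_ext = 0.5 * x**2  # assumed external potential: a harmonic well
    # Assumed softened Coulomb kernel, 1 / sqrt((x - x')^2 + 1).
    kernel = 1.0 / np.sqrt((x[:, None] - x[None, :]) ** 2 + 1.0)

    # Starting guess: a flat density that integrates to n_electrons.
    density = np.full(n_grid, n_electrons / (n_grid * dx))
    n_occ = n_electrons // 2  # doubly occupied orbitals (closed shell)

    for iteration in range(1, max_iter + 1):
        # Mean field built from the current density.
        v_hartree = kernel @ density * dx
        v_exchange = -(3.0 * density / np.pi) ** (1.0 / 3.0)  # LDA-style stand-in

        # Solve the single-electron equation in that mean field.
        hamiltonian = kinetic + np.diag(v_ext + v_hartree + v_exchange)
        energies, orbitals = np.linalg.eigh(hamiltonian)

        # Normalise so that sum(|psi|^2) * dx = 1, then build the new density.
        occupied = orbitals[:, :n_occ] / np.sqrt(dx)
        new_density = 2.0 * np.sum(np.abs(occupied) ** 2, axis=1)

        # Self-consistency check, then damped update of the density.
        change = np.max(np.abs(new_density - density))
        density = (1.0 - mix) * density + mix * new_density
        if change < tol:
            return x, density, energies[:n_occ], iteration

    raise RuntimeError("SCF loop did not converge")


if __name__ == "__main__":
    grid, rho, eps, n_iter = scf_1d()
    print("iterations:", n_iter)
    print("electrons :", np.sum(rho) * (grid[1] - grid[0]))
    print("orbital energy:", eps[0])
```

The mixing factor is there because feeding the new density straight back in can oscillate; real codes use more sophisticated mixing schemes, but the structure of the loop is the same.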

Since we don't know exactly how the mean field depends on electron density, we use approximations (such as LDA or GGA). Similar calculations are performed in molecular simulation by packages such as VASP (Vienna Ab initio Simulation Package) and Gaussian.

The molecular dynamics approach based on DFT is called Car-Parrinello Molecular Dynamics (CPMD). This is an ab-initio quantum mechanical method in which we use the Schrödinger equation and assume the nuclei are at equilibrium and unperturbed.

Instead of such quantum mechanical methods, we could use a simplified quasi-classical representation of the interactions by means of a suitable force field. In this approach, the primary variables are distances between atoms. We do not calculate any densities or mean fields, since the force field is fixed. Although less accurate, this approach is popular when studying biological systems, where the number of atoms (electrons and nuclei) is huge (hundreds, thousands, or more). Such systems usually involve a small variety of atoms, namely carbon, hydrogen, nitrogen, and oxygen, which has motivated researchers to devise force field functions optimized for calculations on molecules consisting of these atoms. Some classical force fields used in biomolecular and organic chemistry include AMBER, CHARMM, OPLS, and ECEPP. More general force fields applicable to atoms of most or all elements in the periodic table also exist, such as the Universal Force Field (UFF). These force fields and their variations are the culmination of years or decades of development effort and comparison to experimentally observed molecular structures and properties.
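
As a rough illustration of what "the primary variables are distances between atoms" means, here is a small sketch that sums a harmonic bond term and a Lennard-Jones term over pairs of atoms. These are generic textbook forms with made-up parameters, not the actual functional forms or parameter sets of AMBER, CHARMM, OPLS, ECEPP, or UFF, which each define their own (and also include angle, torsion, and electrostatic terms). All function names here are hypothetical.

```python
import itertools
import math


def harmonic_bond(r, k, r0):
    """Energy of a bonded pair at distance r (spring constant k, rest length r0)."""
    return 0.5 * k * (r - r0) ** 2


def lennard_jones(r, epsilon, sigma):
    """Energy of a non-bonded pair at distance r (well depth epsilon, size sigma)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)


def total_energy(positions, bonds, k, r0, epsilon, sigma):
    """Sum pair energies: harmonic for bonded pairs, Lennard-Jones for the rest."""
    bonded = {frozenset(pair) for pair in bonds}
    energy = 0.0
    for i, j in itertools.combinations(range(len(positions)), 2):
        r = math.dist(positions[i], positions[j])  # the only input is a distance
        if frozenset((i, j)) in bonded:
            energy += harmonic_bond(r, k, r0)
        else:
            energy += lennard_jones(r, epsilon, sigma)
    return energy


if __name__ == "__main__":
    # Three atoms in a line; illustrative parameters in arbitrary units.
    atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    print(total_energy(atoms, bonds=[(0, 1), (1, 2)],
                       k=100.0, r0=1.0, epsilon=0.2, sigma=1.5))
```

Because the parameters are fixed in advance, evaluating the energy costs only a loop over pairs, with no densities or mean fields to iterate; that is what makes the approach affordable for very large systems.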

I am also proud to note that AMBER and GAMESS are available on the Windows HPC platform, and a port of Gaussian is in the works. If you would like to know more, please feel free to send me a note via the blog portal.