Check autograd forces against analytical forces

Check autograd forces against analytical forces#

This notebook compares the forces computed with automatic differentiation (with respect to bond vectors, not atom coordinates) against the analytical forces for the Al99 Embedded Atom potential, as implemented by ASE.

First, fetch the EAM potential data from interatomic potentials repository:

!curl https://www.ctcms.nist.gov/potentials/Download/1999--Mishin-Y-Farkas-D-Mehl-M-J-Papaconstantopoulos-D-A--Al/2/Al99.eam.alloy -o Al99.eam.alloy
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  762k  100  762k    0     0  3592k      0 --:--:-- --:--:-- --:--:-- 3612k

Gradient checking is best done in 64-bit floating point precision:

from pathlib import Path

import torch
from ase.calculators.eam import EAM
from ase.build import bulk

import nfflr
from nfflr.data import graph
from nfflr.models.classical.eam import TorchEAM

torch.set_default_dtype(torch.float64)
/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  from .autonotebook import tqdm as notebook_tqdm
2024-03-08 20:06:50,341	INFO util.py:154 -- Missing packages: ['ipywidgets']. Run `pip install -U ipywidgets`, then restart the notebook server for rich notebook output.
2024-03-08 20:06:50,495	INFO util.py:154 -- Missing packages: ['ipywidgets']. Run `pip install -U ipywidgets`, then restart the notebook server for rich notebook output.
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[2], line 9
      7 import nfflr
      8 from nfflr.data import graph
----> 9 from nfflr.models.classical.eam import TorchEAM
     11 torch.set_default_dtype(torch.float64)

File ~/work/nfflr/nfflr/nfflr/models/classical/eam.py:12
      9 import torch
     10 from torch import nn
---> 12 from torchcubicspline import natural_cubic_spline_coeffs, NaturalCubicSpline
     14 import nfflr
     17 @dataclass
     18 class EAMData:

ModuleNotFoundError: No module named 'torchcubicspline'

We set up a small FCC aluminum system and add a small amount of jitter to the atomic coordinates:

# set up a small bulk Aluminum calculation
a = 4.05  
al_ase = bulk("Al", "fcc", a) * [4, 4, 4]
al_ase.rattle(stdev=0.05)
al_ase.wrap()
ase_eam = EAM(potential="Al99.eam.alloy")
al_ase.set_calculator(ase_eam)

# set up pytorch version
al = nfflr.Atoms(al_ase.get_cell().array, al_ase.get_scaled_positions(), al_ase.numbers)
torch_eam = TorchEAM("Al99.eam.alloy", dtype=torch.float64)

The ASE implementation computes the forces using the analytical gradients of the EAM spline components, while the pytorch implementation uses automatic differentiation to compute the gradient of the energy with respect to the bond displacement vectors. A sum reduction is used to aggregate these into the force components on individual atoms.

Both the energies and the force components on all the atoms match to within floating point precision:

# construct radius graph matching ASE cutoff
g = graph.periodic_radius_graph(al, r=torch_eam.data.cutoff, dtype=torch.float64)

# evaluate energies and forces with pytorch implementation
e_dgl, force_dgl = torch_eam(g)
e_dgl, force_dgl = e_dgl.detach(), force_dgl.detach()

# evaluate energy and forces with ASE
e_ase = al_ase.get_potential_energy()
force_ase = al_ase.get_forces()

print(f"{al_ase.get_potential_energy()=}")
print(f"{e_dgl.detach().item()=}")
print(f"potential energy matches: {np.isclose(e_ase, e_dgl.item())}")
print(f"force components match? : {np.isclose(force_ase, force_dgl).all()}")
/Users/bld/.pyenv/versions/3.10.9/envs/nfflr/lib/python3.10/site-packages/dgl/backend/pytorch/tensor.py:445: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  assert input.numel() == input.storage().size(), (
al_ase.get_potential_energy()=-214.15885773313198
e_dgl.detach().item()=-214.15885773313158
potential energy matches: True
force components match? : True

Tersoff potential#

This section performs the same diagnostic using the Tersoff potential, as implemented by LAMMPS. This demonstrates that forces computed by autograd with respect to relative position vectors reduce to the correct atomic forces.

The parameters used correspond to https://www.ctcms.nist.gov/potentials/entry/1988–Tersoff-J–Si-b/:

!curl https://www.ctcms.nist.gov/potentials/Download/1988--Tersoff-J--Si-b/1/1988_Si\(B\).tersoff -o Si.tersoff
!cat Si.tersoff
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   532  100   532    0     0     49      0  0:00:10  0:00:10 --:--:--   143 0 --:--:--  0:00:02 --:--:--     0     0 --:--:--  0:00:03 --:--:--     0
# Parameters for the Tersoff Si(B) potential. CITATION: J. Tersoff, Phys. Rev. B 37, 6991 (1988)

# Parameter values verified by Lucas Hale.
# Identical values in Si.tersoff of August 22, 2018 LAMMPS distribution.
# Identical values in openKIM model MO_245095684871_001 parameter file.

# Values are in LAMMPS "metal" units.

# e1 e2 e3 m   gamma lambda3 c      d      costheta0 n      beta    lambda2 B      R   D   lambda1 A
  Si Si Si 3.0 1.0   1.3258  4.8381 2.0417 0.0       22.956 0.33675 1.3258  95.373 3.0 0.2 3.2394  3264.7

The ASE LAMMPS calculator seems to still have issues parsing thermo data from lammps log files; we can work around that in the short term by monkey-patching read_lammps_log following the discussion here.

import ase.io
from ase.calculators.lammpsrun import LAMMPS

from ase.parallel import paropen
from re import compile as re_compile, IGNORECASE
from ase.calculators.lammps import CALCULATION_END_MARK

def read_lammps_log(self, lammps_log=None, thermo_args=None):
    """Monkey patched calc.read_lammps_log based on https://gitlab.com/ase/ase/-/issues/1096
    
    Method which reads a LAMMPS output log file.
    """

    if thermo_args is None:
        thermo_args = self.parameters.thermo_args

    if lammps_log is None:
        lammps_log = self.label + ".log"

    if isinstance(lammps_log, str):
        fileobj = paropen(lammps_log, "rb")
        close_log_file = True
    else:
        # Expect lammps_in to be a file-like object
        fileobj = lammps_log
        close_log_file = False

    # read_log depends on that the first (three) thermo_style custom args
    # can be capitalized and matched against the log output. I.e.
    # don't use e.g. 'ke' or 'cpu' which are labeled KinEng and CPU.
    # mark_re = r"^\s*" + r"\s+".join(
    #     [x.capitalize() for x in thermo_args[0:3]]
    # )
    # _custom_thermo_mark = re_compile(mark_re)
    mark_re = "^\s*" + "\s*".join(
        [x.capitalize() for x in thermo_args[0:3]]
    )
    _custom_thermo_mark = re_compile(mark_re)

    # !TODO: regex-magic necessary?
    # Match something which can be converted to a float
    f_re = r"([+-]?(?:(?:\d+(?:\.\d*)?|\.\d+)(?:e[+-]?\d+)?|nan|inf))"
    n_args = len(thermo_args)
    # Create a re matching exactly N white space separated floatish things
    _custom_thermo_re = re_compile(
        r"^\s*" + r"\s+".join([f_re] * n_args) + r"\s*$", flags=IGNORECASE
    )

    thermo_content = []

    line = fileobj.readline().decode("utf-8")
    while line and line.strip() != CALCULATION_END_MARK:
        # check error
        if 'ERROR:' in line:
            if close_log_file:
                fileobj.close()
            raise RuntimeError(f'LAMMPS exits with error message: {line}')

        # get thermo output
        if _custom_thermo_mark.match(line):
            bool_match = True
            while bool_match:
                line = fileobj.readline().decode("utf-8")
                bool_match = _custom_thermo_re.match(line)
                if bool_match:
                    # create a dictionary between each of the
                    # thermo_style args and it's corresponding value
                    thermo_content.append(
                        dict(
                            zip(
                                thermo_args,
                                map(float, bool_match.groups()),
                            )
                        )
                    )
        else:
            line = fileobj.readline().decode("utf-8")

    if close_log_file:
        fileobj.close()

    self.thermo_content = thermo_content
    # return thermo_content

    
# monkey patch the LAMMPS log parser
LAMMPS.read_lammps_log = read_lammps_log 

Small silicon system#

Now, set up a small Si system and a LAMMPS calculator to serve as the reference Tersoff potential implementation:

# set up a small Si example crystal
si_ase = bulk("Si", "diamond", 5.43) * [4, 4, 4]
si_ase.rattle(stdev=0.01, seed=36)
si_ase.wrap()

# configure LAMMPS Tersoff potential
calc = LAMMPS()
calc.parameters["command"] = "/opt/homebrew/bin/lmp_serial"
calc.parameters["files"] = ["Si.tersoff"]
calc.parameters["binary_dump"] = False
calc.parameters.update(
    {"pair_style": "tersoff", "pair_coeff": ["* * Si.tersoff Si"]}
    )
si_ase.set_calculator(calc)

# calculate Tersoff energy and forces with LAMMPS
e_lammps = si_ase.get_potential_energy()
f_lammps = si_ase.get_forces()

print(f"{e_lammps=}")
print(f"{f_lammps[:5]=}")
e_lammps=-592.398567149167
f_lammps[:5]=array([[-0.20522987, -0.26851818,  0.14179856],
       [ 0.06219078,  0.05221548,  0.0728464 ],
       [ 0.03842853, -0.02849059,  0.04544221],
       [-0.21198316,  0.12308603, -0.00683117],
       [ 0.5269759 , -0.12608296,  0.29867076]])

And

g = graph.periodic_radius_graph(nfflr.Atoms(si_ase), r=3.5, dtype=torch.float64)
tersoff = nfflr.models.Tersoff()
out = tersoff(g)
e_tersoff = out["total_energy"].detach()
f_tersoff = out["forces"].detach()
stress_tersoff = out["stress"].detach() / si_ase.get_volume()

print(f"{e_tersoff=}")
print(f"{f_tersoff[:5]=}")
print(f"{stress_tersoff=}")
e_tersoff=tensor(-592.3986)
f_tersoff[:5]=tensor([[-0.2052, -0.2685,  0.1418],
        [ 0.0622,  0.0522,  0.0728],
        [ 0.0384, -0.0285,  0.0454],
        [-0.2120,  0.1231, -0.0068],
        [ 0.5270, -0.1261,  0.2987]])
stress_tersoff=tensor([[[-6.5235e-04,  1.6092e-03,  4.2400e-04],
         [ 1.6092e-03, -6.5744e-04,  2.0972e-06],
         [ 4.2400e-04,  2.0972e-06, -6.4832e-04]]])

The total energies match to within numerical precision:

np.isclose(e_tersoff, e_lammps)
True

And so do the forces (using the numerical precision settings used by torch.autograd.gradcheck:

np.isclose(f_lammps, f_tersoff, atol=1e-05, rtol=0.001).all()
True

The largest force component discrepancy is about \(7 \times 10^{-7} \; eV/\mathrm{\AA}\)

(f_lammps - f_tersoff.numpy()).max()
6.749736085540081e-07

And the stress tensor values also match:

si_ase.get_stress()
array([-6.52352915e-04, -6.57444097e-04, -6.48324331e-04,  2.09715783e-06,
        4.23995521e-04,  1.60922348e-03])
np.isclose(
    stress_tersoff, 
    ase.stress.voigt_6_to_full_3x3_stress(si_ase.get_stress())
)
array([[[ True,  True,  True],
        [ True,  True,  True],
        [ True,  True,  True]]])