First example

This very simple tutorial constructs an ACE1 model for Si by fitting to an empirical potential.

Make sure to first read the installation notes. Now start by importing the required packages:

using ACEpotentials
import Random
using LinearAlgebra: norm, Diagonal

Step 1: specify the ACE Model

The parameters have the following meaning:

  • elements: list of chemical species, symbols
  • order : correlation order
  • totaldegree: maximum total polynomial degree used for the basis
  • rcut : cutoff radius (optional, defaults are provided)
model = acemodel(elements = [:Si,],
                 order = 3,
                 totaldegree = 10,
                 rcut = 5.0)
@show length(model.basis);
length(model.basis) = 110

Step 2: Generate a training set

Next we need to generate some training data to estimate the model parameters. Normally one would generate a training set using DFT data, stored as an .xyz file. Here, we create a random training set for simplicity. Please note that this is generally not a good strategy to generate data!

  • gen_dat() generates a single training configuration wrapped in an ACEpotentials.AtomsData structure. Each d::AtomsData contains the structure d.atoms, and energy value and a force vector to train against.
  • train is then a collection of such random training configurations.
data_keys = (energy_key = "energy", force_key = "forces")

function gen_dat()
   sw = StillingerWeber()
   at = rattle!(bulk(:Si, cubic=true) * rand(2:3), 0.3)
   set_data!(at, data_keys.energy_key, energy(sw,at))
   set_data!(at, data_keys.force_key, forces(sw,at))
   return at
end

Random.seed!(0)
train = [gen_dat() for _=1:20];

Step 3: Estimate Parameters

We specify a solver and then let ACEfit.jl to do all the work for us. More fine-grained control is possible; see the ACEfit.jl documentation. For sake of illustration we use a Bayesian Ridge Regression solver. This will automatically determine the regularisation for us.

solver = ACEfit.BLR()
acefit!(model, train; solver=solver, data_keys...);
┌─────────┬──────────┬───────┬────┬──────┬─────┐
│    Type  #Configs  #Envs  #E    #F   #V │
├─────────┼──────────┼───────┼────┼──────┼─────┤
│ default │       20 │  2344 │ 20 │ 7032 │   0 │
├─────────┼──────────┼───────┼────┼──────┼─────┤
│   total │       20 │  2344 │ 20 │ 7032 │   0 │
│ missing │        0 │     0 │  0 │    0 │ 120 │
└─────────┴──────────┴───────┴────┴──────┴─────┘
[ Info: Assembling linear problem.
[ Info:   - Creating feature matrix with size (7052, 110).
[ Info:   - Beginning assembly with processor count:  1.

Progress:  10%|████▏                                    |  ETA: 0:00:06
Progress:  60%|████████████████████████▋                |  ETA: 0:00:03
Progress: 100%|█████████████████████████████████████████| Time: 0:00:06
[ Info:   - Assembly completed.
[ Info: Assembling full weight vector.
Iter     Function value   Gradient norm
     0     6.727465e+03     6.769388e+03
 * time: 4.792213439941406e-5
     1    -1.653584e+03     2.072568e+04
 * time: 0.015866994857788086
     2    -4.158362e+03     2.522873e+04
 * time: 0.019505023956298828
     3    -5.498239e+03     1.598170e+04
 * time: 0.02332305908203125
     4    -5.550432e+03     2.292239e+04
 * time: 0.025217056274414062
     5    -5.676258e+03     3.198384e+03
 * time: 0.02672290802001953
     6    -5.705056e+03     6.107128e+03
 * time: 0.029703855514526367
     7    -6.198132e+03     1.814330e+04
 * time: 0.03394198417663574
     8    -9.292725e+03     5.804096e+04
 * time: 0.039563894271850586
     9    -1.102307e+04     3.289065e+04
 * time: 0.04235100746154785
    10    -1.109925e+04     1.042603e+04
 * time: 0.04446601867675781
    11    -1.114862e+04     1.825906e+04
 * time: 0.046530961990356445
    12    -1.139397e+04     1.388445e+04
 * time: 0.049237966537475586
    13    -1.201983e+04     4.704148e+04
 * time: 0.05193305015563965
    14    -1.219235e+04     1.962891e+04
 * time: 0.0539548397064209
    15    -1.238623e+04     2.176163e+04
 * time: 0.05597805976867676
    16    -1.253177e+04     1.966207e+04
 * time: 0.057968854904174805
    17    -1.257962e+04     2.096011e+03
 * time: 0.060066938400268555
    18    -1.259943e+04     1.193816e+04
 * time: 0.06228184700012207
    19    -1.260732e+04     1.447764e+03
 * time: 0.06470394134521484
    20    -1.260816e+04     1.061123e+03
 * time: 0.06671285629272461
    21    -1.260824e+04     5.962821e+01
 * time: 0.06860899925231934
    22    -1.260824e+04     2.535835e+00
 * time: 0.07046985626220703
    23    -1.260824e+04     7.334983e-02
 * time: 0.0718379020690918
    24    -1.260824e+04     1.743906e-01
 * time: 0.07486796379089355
    25    -1.260824e+04     3.080896e-01
 * time: 0.07619595527648926
    26    -1.260824e+04     2.300227e-01
 * time: 0.07748198509216309
    27    -1.260824e+04     1.465356e+01
 * time: 0.07876300811767578
    28    -1.260824e+04     3.198283e-02
 * time: 0.08015894889831543
    29    -1.260824e+04     1.465899e-02
 * time: 0.08147192001342773
 * Status: success (objective increased between iterations)

 * Candidate solution
    Final objective value:     -1.260824e+04

 * Found with
    Algorithm:     L-BFGS

 * Convergence measures
    |x - x'|               = 1.88e-09 ≤ 1.0e-08
    |x - x'|/|x'|          = 1.82e-11 ≰ 0.0e+00
    |f(x) - f(x')|         = 2.35e-03 ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 1.87e-07 ≰ 0.0e+00
    |g(x)|                 = 1.47e-02 ≰ 0.0e+00

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    29
    f(x) calls:    123
    ∇f(x) calls:   123

To see the training errors we can use

@info("Training Errors")
ACEpotentials.linear_errors(train, model; data_keys...);
[ Info: Training Errors
[ Info: RMSE Table
┌─────────┬─────────┬──────────┬─────────┐
│    Type  E [meV]  F [eV/A]  V [meV] │
├─────────┼─────────┼──────────┼─────────┤
│ default │   0.396 │    0.037 │   0.000 │
├─────────┼─────────┼──────────┼─────────┤
│     set │   0.396 │    0.037 │   0.000 │
└─────────┴─────────┴──────────┴─────────┘
[ Info: MAE Table
┌─────────┬─────────┬──────────┬─────────┐
│    Type  E [meV]  F [eV/A]  V [meV] │
├─────────┼─────────┼──────────┼─────────┤
│ default │   0.299 │    0.028 │   0.000 │
├─────────┼─────────┼──────────┼─────────┤
│     set │   0.299 │    0.028 │   0.000 │
└─────────┴─────────┴──────────┴─────────┘

Step 4: Run some tests

At a minimum one should have a test set, check the errors on that test set, and confirm that they are similar as the training errors.

@info("Test Errors")
test =  [gen_dat() for _=1:20]
ACEpotentials.linear_errors(test, model; data_keys...);
[ Info: Test Errors
[ Info: RMSE Table
┌─────────┬─────────┬──────────┬─────────┐
│    Type  E [meV]  F [eV/A]  V [meV] │
├─────────┼─────────┼──────────┼─────────┤
│ default │   0.223 │    0.037 │   0.000 │
├─────────┼─────────┼──────────┼─────────┤
│     set │   0.223 │    0.037 │   0.000 │
└─────────┴─────────┴──────────┴─────────┘
[ Info: MAE Table
┌─────────┬─────────┬──────────┬─────────┐
│    Type  E [meV]  F [eV/A]  V [meV] │
├─────────┼─────────┼──────────┼─────────┤
│ default │   0.172 │    0.028 │   0.000 │
├─────────┼─────────┼──────────┼─────────┤
│     set │   0.172 │    0.028 │   0.000 │
└─────────┴─────────┴──────────┴─────────┘

If we wanted to perform such a test ``manually'' it might look like this:

@info("Manual RMSE Test")
potential = model.potential
test_energies = [ JuLIP.get_data(at, "energy") / length(at) for at in test]
model_energies = [energy(potential, at) / length(at) for at in test]
rmse_energy = norm(test_energies - model_energies) / sqrt(length(test))
@show rmse_energy;
[ Info: Manual RMSE Test
rmse_energy = 0.00022336031187506652

But in practice, one should run more extensive test simulations to check how robust the fitted potential is.

Step 5: export the model

The fitted model can be exported to a JSON or YAML file, or to a LAMMPs compatible yace file. We won't go through that in this tutoral. See export2json and export2lammps for further information.

Step 6: Using the model

Let's do something very simple: relax a vacancy.

We create a small Si cell, delete an atom and rattle the rest

at = bulk(:Si, cubic=true) * 3
deleteat!(at, 1)
rattle!(at, 0.03 * rnn(:Si))
JuLIP.Atoms with 215 atoms

we can now minimize the ACE energy.

set_calculator!(at, potential);
minimise!(at)
E_ace = energy(at)
-924.9764086077141

If we want a formation energy, we could get it like this.

at0 = bulk(:Si)
E0_ace = energy(potential, at0);
Evac_ace = E_ace - (length(at)-1)/length(at0) * E0_ace
@show Evac_ace;
Evac_ace = 2.7148087885758514

Note that there are no vacancy structures in the training set, so this is a prediction out of sample. We have no guarantee of the accuracy of this prediction. In fact the prediction is quite far off:

sw = StillingerWeber()
set_calculator!(at, sw)
minimise!(at; verbose=false);
E_sw = energy(at)
E0_sw = energy(sw, bulk(:Si))
Evac_sw = E_sw - (length(at)-1)/length(at0) * E0_sw
@show Evac_sw;
Evac_sw = -1.1122892601633794e-5

To obtain accurate predictions on a vacancy structure, we must add it to the training set. This iterative model development process goes beyond the scope of this tutorial.


This page was generated using Literate.jl.