First example
This very simple tutorial constructs an ACE1 model for Si by fitting to an empirical potential.
Make sure to first read the installation notes. Now start by importing the required packages:
using ACEpotentials
import Random
using LinearAlgebra: norm, Diagonal
Step 1: specify the ACE Model
The parameters have the following meaning:
elements
: list of chemical species, symbolsorder
: correlation ordertotaldegree
: maximum total polynomial degree used for the basisrcut
: cutoff radius (optional, defaults are provided)
model = acemodel(elements = [:Si,],
order = 3,
totaldegree = 10,
rcut = 5.0)
@show length(model.basis);
length(model.basis) = 110
Step 2: Generate a training set
Next we need to generate some training data to estimate the model parameters. Normally one would generate a training set using DFT data, stored as an .xyz
file. Here, we create a random training set for simplicity. Please note that this is generally not a good strategy to generate data!
gen_dat()
generates a single training configuration wrapped in anACEpotentials.AtomsData
structure. Eachd::AtomsData
contains the structured.atoms
, and energy value and a force vector to train against.train
is then a collection of such random training configurations.
data_keys = (energy_key = "energy", force_key = "forces")
function gen_dat()
sw = StillingerWeber()
at = rattle!(bulk(:Si, cubic=true) * rand(2:3), 0.3)
set_data!(at, data_keys.energy_key, energy(sw,at))
set_data!(at, data_keys.force_key, forces(sw,at))
return at
end
Random.seed!(0)
train = [gen_dat() for _=1:20];
Step 3: Estimate Parameters
We specify a solver and then let ACEfit.jl
to do all the work for us. More fine-grained control is possible; see the ACEfit.jl
documentation. For sake of illustration we use a Bayesian Ridge Regression solver. This will automatically determine the regularisation for us.
solver = ACEfit.BLR()
acefit!(model, train; solver=solver, data_keys...);
┌─────────┬──────────┬───────┬────┬──────┬─────┐
│ Type │ #Configs │ #Envs │ #E │ #F │ #V │
├─────────┼──────────┼───────┼────┼──────┼─────┤
│ default │ 20 │ 2344 │ 20 │ 7032 │ 0 │
├─────────┼──────────┼───────┼────┼──────┼─────┤
│ total │ 20 │ 2344 │ 20 │ 7032 │ 0 │
│ missing │ 0 │ 0 │ 0 │ 0 │ 120 │
└─────────┴──────────┴───────┴────┴──────┴─────┘
[ Info: Assembling linear problem.
[ Info: - Creating feature matrix with size (7052, 110).
[ Info: - Beginning assembly with processor count: 1.
Progress: 10%|████▏ | ETA: 0:00:06
Progress: 60%|████████████████████████▋ | ETA: 0:00:03
Progress: 100%|█████████████████████████████████████████| Time: 0:00:06
[ Info: - Assembly completed.
[ Info: Assembling full weight vector.
Iter Function value Gradient norm
0 6.727465e+03 6.769388e+03
* time: 4.792213439941406e-5
1 -1.653584e+03 2.072568e+04
* time: 0.015866994857788086
2 -4.158362e+03 2.522873e+04
* time: 0.019505023956298828
3 -5.498239e+03 1.598170e+04
* time: 0.02332305908203125
4 -5.550432e+03 2.292239e+04
* time: 0.025217056274414062
5 -5.676258e+03 3.198384e+03
* time: 0.02672290802001953
6 -5.705056e+03 6.107128e+03
* time: 0.029703855514526367
7 -6.198132e+03 1.814330e+04
* time: 0.03394198417663574
8 -9.292725e+03 5.804096e+04
* time: 0.039563894271850586
9 -1.102307e+04 3.289065e+04
* time: 0.04235100746154785
10 -1.109925e+04 1.042603e+04
* time: 0.04446601867675781
11 -1.114862e+04 1.825906e+04
* time: 0.046530961990356445
12 -1.139397e+04 1.388445e+04
* time: 0.049237966537475586
13 -1.201983e+04 4.704148e+04
* time: 0.05193305015563965
14 -1.219235e+04 1.962891e+04
* time: 0.0539548397064209
15 -1.238623e+04 2.176163e+04
* time: 0.05597805976867676
16 -1.253177e+04 1.966207e+04
* time: 0.057968854904174805
17 -1.257962e+04 2.096011e+03
* time: 0.060066938400268555
18 -1.259943e+04 1.193816e+04
* time: 0.06228184700012207
19 -1.260732e+04 1.447764e+03
* time: 0.06470394134521484
20 -1.260816e+04 1.061123e+03
* time: 0.06671285629272461
21 -1.260824e+04 5.962821e+01
* time: 0.06860899925231934
22 -1.260824e+04 2.535835e+00
* time: 0.07046985626220703
23 -1.260824e+04 7.334983e-02
* time: 0.0718379020690918
24 -1.260824e+04 1.743906e-01
* time: 0.07486796379089355
25 -1.260824e+04 3.080896e-01
* time: 0.07619595527648926
26 -1.260824e+04 2.300227e-01
* time: 0.07748198509216309
27 -1.260824e+04 1.465356e+01
* time: 0.07876300811767578
28 -1.260824e+04 3.198283e-02
* time: 0.08015894889831543
29 -1.260824e+04 1.465899e-02
* time: 0.08147192001342773
* Status: success (objective increased between iterations)
* Candidate solution
Final objective value: -1.260824e+04
* Found with
Algorithm: L-BFGS
* Convergence measures
|x - x'| = 1.88e-09 ≤ 1.0e-08
|x - x'|/|x'| = 1.82e-11 ≰ 0.0e+00
|f(x) - f(x')| = 2.35e-03 ≰ 0.0e+00
|f(x) - f(x')|/|f(x')| = 1.87e-07 ≰ 0.0e+00
|g(x)| = 1.47e-02 ≰ 0.0e+00
* Work counters
Seconds run: 0 (vs limit Inf)
Iterations: 29
f(x) calls: 123
∇f(x) calls: 123
To see the training errors we can use
@info("Training Errors")
ACEpotentials.linear_errors(train, model; data_keys...);
[ Info: Training Errors
[ Info: RMSE Table
┌─────────┬─────────┬──────────┬─────────┐
│ Type │ E [meV] │ F [eV/A] │ V [meV] │
├─────────┼─────────┼──────────┼─────────┤
│ default │ 0.396 │ 0.037 │ 0.000 │
├─────────┼─────────┼──────────┼─────────┤
│ set │ 0.396 │ 0.037 │ 0.000 │
└─────────┴─────────┴──────────┴─────────┘
[ Info: MAE Table
┌─────────┬─────────┬──────────┬─────────┐
│ Type │ E [meV] │ F [eV/A] │ V [meV] │
├─────────┼─────────┼──────────┼─────────┤
│ default │ 0.299 │ 0.028 │ 0.000 │
├─────────┼─────────┼──────────┼─────────┤
│ set │ 0.299 │ 0.028 │ 0.000 │
└─────────┴─────────┴──────────┴─────────┘
Step 4: Run some tests
At a minimum one should have a test set, check the errors on that test set, and confirm that they are similar as the training errors.
@info("Test Errors")
test = [gen_dat() for _=1:20]
ACEpotentials.linear_errors(test, model; data_keys...);
[ Info: Test Errors
[ Info: RMSE Table
┌─────────┬─────────┬──────────┬─────────┐
│ Type │ E [meV] │ F [eV/A] │ V [meV] │
├─────────┼─────────┼──────────┼─────────┤
│ default │ 0.223 │ 0.037 │ 0.000 │
├─────────┼─────────┼──────────┼─────────┤
│ set │ 0.223 │ 0.037 │ 0.000 │
└─────────┴─────────┴──────────┴─────────┘
[ Info: MAE Table
┌─────────┬─────────┬──────────┬─────────┐
│ Type │ E [meV] │ F [eV/A] │ V [meV] │
├─────────┼─────────┼──────────┼─────────┤
│ default │ 0.172 │ 0.028 │ 0.000 │
├─────────┼─────────┼──────────┼─────────┤
│ set │ 0.172 │ 0.028 │ 0.000 │
└─────────┴─────────┴──────────┴─────────┘
If we wanted to perform such a test ``manually'' it might look like this:
@info("Manual RMSE Test")
potential = model.potential
test_energies = [ JuLIP.get_data(at, "energy") / length(at) for at in test]
model_energies = [energy(potential, at) / length(at) for at in test]
rmse_energy = norm(test_energies - model_energies) / sqrt(length(test))
@show rmse_energy;
[ Info: Manual RMSE Test
rmse_energy = 0.00022336031187506652
But in practice, one should run more extensive test simulations to check how robust the fitted potential is.
Step 5: export the model
The fitted model can be exported to a JSON or YAML file, or to a LAMMPs compatible yace
file. We won't go through that in this tutoral. See export2json
and export2lammps
for further information.
Step 6: Using the model
Let's do something very simple: relax a vacancy.
We create a small Si cell, delete an atom and rattle the rest
at = bulk(:Si, cubic=true) * 3
deleteat!(at, 1)
rattle!(at, 0.03 * rnn(:Si))
JuLIP.Atoms with 215 atoms
we can now minimize the ACE energy.
set_calculator!(at, potential);
minimise!(at)
E_ace = energy(at)
-924.9764086077141
If we want a formation energy, we could get it like this.
at0 = bulk(:Si)
E0_ace = energy(potential, at0);
Evac_ace = E_ace - (length(at)-1)/length(at0) * E0_ace
@show Evac_ace;
Evac_ace = 2.7148087885758514
Note that there are no vacancy structures in the training set, so this is a prediction out of sample. We have no guarantee of the accuracy of this prediction. In fact the prediction is quite far off:
sw = StillingerWeber()
set_calculator!(at, sw)
minimise!(at; verbose=false);
E_sw = energy(at)
E0_sw = energy(sw, bulk(:Si))
Evac_sw = E_sw - (length(at)-1)/length(at0) * E0_sw
@show Evac_sw;
Evac_sw = -1.1122892601633794e-5
To obtain accurate predictions on a vacancy structure, we must add it to the training set. This iterative model development process goes beyond the scope of this tutorial.
This page was generated using Literate.jl.