The `ace_basis` interface

Start by importing the required libraries

using ACEpotentials

We need a dataset TiAl_tutorial.xyz for this tutorial. Normally we would get the path to a datset and then use read_extxyz to load in the training set.

data_file = "path/to/TiAl_tutorial.xyz"
data = read_extxyz(data_file)

For convenience we provide this dataset as a Julia artifact and make it conveniently accessible via ACEpotentials.example_dataset. We keep only a small subset of the training structures to keep the regression problem small.

data, _, meta = ACEpotentials.example_dataset("TiAl_tutorial")
train_data = data[1:5:end];

 Downloading artifact: TiAl_tutorial
     Failure artifact: TiAl_tutorial
 Downloading artifact: TiAl_tutorial

The next step is to generate a basis set:

order = 3 : We take 3-correlation, i.e. a 4-body potential,
totaldegree = 6 : a very low polynomial degree just for testing
rcut = 5.5 : this is a typical cutoff radius, there is also a good default which is a bit higher

These three are the most important approximation parameters to explore when trying to improve the fit-accuracy. In addition there is

The parameter r0 is just a scaling parameter and the fits should not be very sensitive to its choice. A rough estimate for the nearest-neighbour distance is usually ok. (NB: if you change to a non-trivial distance transform, then the parameter r0 may become important.)

r0 = 2.88
basis = ACE1x.ace_basis(elements = [:Ti, :Al],
                  order = 3,
                  totaldegree = 6,
                  rcut = 5.5,
                  r0 = r0,)
@show length(basis);

length(basis) = 190

Vref specifies a reference potential, which is subtracted from the training data and the ACE parameters are then estimated from the difference. This reference potential will in the end be added to the ACE model. Here we use a one-body potential i.e. a reference atom energy for each individual species. Usage of a one-body reference potential generally results in very slightly reduced fit accuracy but significantly improved 2-body potentials with a realistic dimer shape.

Vref = OneBody(:Ti => -1586.0195, :Al => -105.5954);

The next line specifies the regression weights: in the least squares loss different observations are given different weights,

\[ \sum_{R} \Big( w^E_R | E(R) - y_R^E |^2 + w^F_R | {\rm forces}(R) - y_R^F |^2 + w^V_R | {\rm virial}(R) - y_R^V |^2 \Big),\]

and this is specificed via the following dictionary. The keys correspond to the config_type of the training structures.

weights = Dict(
        "FLD_TiAl" => Dict("E" => 60.0, "F" => 1.0 , "V" => 1.0 ),
        "TiAl_T5000" => Dict("E" => 5.0, "F" => 1.0 , "V" => 1.0 ));

The next step is to evaluate the basis on the training set. Precomputing the basis once (and possibly save it to disk) makes experimenting with different regression parameters much more efficient. This is demonstrated below by showing various different solver options. Similarly once could also explore different data weights (see weights below).

datakeys = (energy_key = "energy", force_key = "force", virial_key = "virial")
train = [ACEpotentials.AtomsData(t; weights=weights, v_ref=Vref, datakeys...) for t in train_data]
A, Y, W = ACEfit.assemble(train, basis);

[ Info: Assembling linear problem.
[ Info:   - Creating feature matrix with size (1770, 190).
[ Info:   - Beginning assembly with processor count:  1.

Progress:   3%|█▎                                       |  ETA: 0:04:24
Progress:  11%|████▍                                    |  ETA: 0:01:26
Progress:  12%|█████                                    |  ETA: 0:01:15
Progress:  14%|█████▋                                   |  ETA: 0:01:07
Progress:  15%|██████▎                                  |  ETA: 0:01:00
Progress:  17%|██████▉                                  |  ETA: 0:00:55
Progress:  18%|███████▌                                 |  ETA: 0:00:50
Progress:  20%|████████▏                                |  ETA: 0:00:46
Progress:  21%|████████▊                                |  ETA: 0:00:43
Progress:  23%|█████████▍                               |  ETA: 0:00:40
Progress:  24%|██████████                               |  ETA: 0:00:37
Progress:  26%|██████████▌                              |  ETA: 0:00:35
Progress:  27%|███████████▏                             |  ETA: 0:00:33
Progress:  29%|███████████▊                             |  ETA: 0:00:31
Progress:  30%|████████████▍                            |  ETA: 0:00:29
Progress:  32%|█████████████                            |  ETA: 0:00:28
Progress:  33%|█████████████▋                           |  ETA: 0:00:26
Progress:  35%|██████████████▎                          |  ETA: 0:00:25
Progress:  36%|██████████████▉                          |  ETA: 0:00:24
Progress:  38%|███████████████▌                         |  ETA: 0:00:22
Progress:  39%|████████████████▏                        |  ETA: 0:00:21
Progress:  41%|████████████████▊                        |  ETA: 0:00:20
Progress:  42%|█████████████████▍                       |  ETA: 0:00:19
Progress:  44%|██████████████████                       |  ETA: 0:00:18
Progress:  45%|██████████████████▋                      |  ETA: 0:00:18
Progress:  47%|███████████████████▎                     |  ETA: 0:00:17
Progress:  48%|███████████████████▉                     |  ETA: 0:00:16
Progress:  50%|████████████████████▌                    |  ETA: 0:00:15
Progress:  52%|█████████████████████▏                   |  ETA: 0:00:15
Progress:  53%|█████████████████████▊                   |  ETA: 0:00:14
Progress:  55%|██████████████████████▍                  |  ETA: 0:00:13
Progress:  56%|███████████████████████                  |  ETA: 0:00:13
Progress:  58%|███████████████████████▋                 |  ETA: 0:00:12
Progress:  59%|████████████████████████▎                |  ETA: 0:00:11
Progress:  61%|████████████████████████▉                |  ETA: 0:00:11
Progress:  62%|█████████████████████████▌               |  ETA: 0:00:10
Progress:  64%|██████████████████████████▏              |  ETA: 0:00:10
Progress:  65%|██████████████████████████▊              |  ETA: 0:00:09
Progress:  67%|███████████████████████████▍             |  ETA: 0:00:09
Progress:  68%|████████████████████████████             |  ETA: 0:00:08
Progress:  70%|████████████████████████████▋            |  ETA: 0:00:08
Progress:  71%|█████████████████████████████▎           |  ETA: 0:00:07
Progress:  73%|█████████████████████████████▉           |  ETA: 0:00:07
Progress:  74%|██████████████████████████████▌          |  ETA: 0:00:06
Progress:  76%|███████████████████████████████          |  ETA: 0:00:06
Progress:  77%|███████████████████████████████▋         |  ETA: 0:00:06
Progress:  79%|████████████████████████████████▎        |  ETA: 0:00:05
Progress:  80%|████████████████████████████████▉        |  ETA: 0:00:05
Progress:  82%|█████████████████████████████████▌       |  ETA: 0:00:04
Progress:  83%|██████████████████████████████████▏      |  ETA: 0:00:04
Progress:  85%|██████████████████████████████████▊      |  ETA: 0:00:04
Progress:  86%|███████████████████████████████████▍     |  ETA: 0:00:03
Progress:  88%|████████████████████████████████████     |  ETA: 0:00:03
Progress:  89%|████████████████████████████████████▋    |  ETA: 0:00:02
Progress:  91%|█████████████████████████████████████▎   |  ETA: 0:00:02
Progress:  92%|█████████████████████████████████████▉   |  ETA: 0:00:02
Progress:  94%|██████████████████████████████████████▌  |  ETA: 0:00:01
Progress:  95%|███████████████████████████████████████▏ |  ETA: 0:00:01
Progress:  97%|███████████████████████████████████████▊ |  ETA: 0:00:01
Progress:  98%|████████████████████████████████████████▍|  ETA: 0:00:00
Progress: 100%|█████████████████████████████████████████| Time: 0:00:21
[ Info:   - Assembly completed.
[ Info: Assembling full weight vector.

Progress:   3%|█▎                                       |  ETA: 0:00:05
Progress: 100%|█████████████████████████████████████████| Time: 0:00:00

ACE1.jl has a heuristic smoothness prior built in which assigns to each basis function Bi a scaling parameter si that estimates how "rough" that basis function is. The following line generates a regularizer (prior) with si^q on the diagonal, thus penalizing rougher basis functions and enforcing a smoother fitted potential.

P = smoothness_prior(basis; p = 4)

190×190 LinearAlgebra.Diagonal{Float64, Vector{Float64}}:
 2.0    ⋅     ⋅      ⋅      ⋅       ⋅   …    ⋅         ⋅        ⋅        ⋅ 
  ⋅   17.0    ⋅      ⋅      ⋅       ⋅        ⋅         ⋅        ⋅        ⋅ 
  ⋅     ⋅   82.0     ⋅      ⋅       ⋅        ⋅         ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅   257.0     ⋅       ⋅        ⋅         ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅   626.0      ⋅        ⋅         ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅   1297.0  …    ⋅         ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅        ⋅         ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅        ⋅         ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅        ⋅         ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅        ⋅         ⋅        ⋅        ⋅ 
 ⋮                                 ⋮    ⋱                             
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅        ⋅         ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅        ⋅         ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅        ⋅         ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅        ⋅         ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅   …    ⋅         ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅      29.9748     ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅        ⋅      101.908     ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅        ⋅         ⋅     154.616     ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅        ⋅         ⋅        ⋅     331.982

Once all the solver parameters have been determined, we use ACEfit to estimate the parameters. This routine will return the fitted interatomic potential IP as well as the a dictionary lsqfit with some information about the fitting process.

solver = ACEfit.LSQR(damp = 1e-2, atol = 1e-6, P = P)
results = ACEfit.solve(solver, W .* A, W .* Y)
pot_1 = JuLIP.MLIPs.SumIP(Vref, JuLIP.MLIPs.combine(basis, results["C"]));

┌ Warning: Need to apply preconditioner in LSQR.
└ @ ACEfit ~/.julia/packages/ACEfit/9u2t5/src/solvers.jl:94
damp  0.01
atol  1.0e-6
maxiter  100000
Converged after 722 iterations.
relative RMS error  0.002820185107857736

The advantage of working with the ACE basis rather than the ACE model interface is that we can now make some changes to the fitting parameters and refit. For example, we might want different weights, change the smoothness prior, and switch to a RRQR solver.

weights["FLD_TiAl"]["E"] = 20.0
W = ACEfit.assemble_weights(train)
solver = ACEfit.RRQR(; rtol = 1e-8, P = smoothness_prior(basis; p = 2))
results = ACEfit.solve(solver, W .* A, W .* Y)
pot_2 = JuLIP.MLIPs.SumIP(Vref, JuLIP.MLIPs.combine(basis, results["C"]));

[ Info: Assembling full weight vector.

We can now compare the errors in a nice table. Depending on the choice of solver, and solver parameters, the test errors might be very poor. Exploring different parameters in different applications can lead to significantly improved predictions.

test = [ACEpotentials.AtomsData(t; weights=weights, v_ref=Vref, datakeys...) for t in data[2:10:end]]

@info("Test Error Tables")
@info("First Potential: ")
ACEpotentials.linear_errors(test, pot_1);

@info("Second Potential: ")
ACEpotentials.linear_errors(test, pot_2);

[ Info: Test Error Tables
[ Info: First Potential:
[ Info: RMSE Table
┌────────────┬─────────┬──────────┬─────────┐
│       Type │ E [meV] │ F [eV/A] │ V [meV] │
├────────────┼─────────┼──────────┼─────────┤
│   FLD_TiAl │  11.201 │    0.069 │  98.576 │
│ TiAl_T5000 │   2.003 │    0.416 │ 111.649 │
├────────────┼─────────┼──────────┼─────────┤
│        set │  11.035 │    0.286 │  98.997 │
└────────────┴─────────┴──────────┴─────────┘
[ Info: MAE Table
┌────────────┬─────────┬──────────┬─────────┐
│       Type │ E [meV] │ F [eV/A] │ V [meV] │
├────────────┼─────────┼──────────┼─────────┤
│   FLD_TiAl │   6.942 │    0.050 │  64.266 │
│ TiAl_T5000 │   2.003 │    0.335 │ 100.680 │
├────────────┼─────────┼──────────┼─────────┤
│        set │   6.792 │    0.181 │  65.369 │
└────────────┴─────────┴──────────┴─────────┘
[ Info: Second Potential:
[ Info: RMSE Table
┌────────────┬─────────┬──────────┬─────────┐
│       Type │ E [meV] │ F [eV/A] │ V [meV] │
├────────────┼─────────┼──────────┼─────────┤
│   FLD_TiAl │  14.068 │    0.098 │  95.835 │
│ TiAl_T5000 │   1.703 │    0.398 │ 169.739 │
├────────────┼─────────┼──────────┼─────────┤
│        set │  13.857 │    0.278 │  98.889 │
└────────────┴─────────┴──────────┴─────────┘
[ Info: MAE Table
┌────────────┬─────────┬──────────┬─────────┐
│       Type │ E [meV] │ F [eV/A] │ V [meV] │
├────────────┼─────────┼──────────┼─────────┤
│   FLD_TiAl │   8.836 │    0.055 │  56.490 │
│ TiAl_T5000 │   1.703 │    0.320 │ 146.366 │
├────────────┼─────────┼──────────┼─────────┤
│        set │   8.619 │    0.177 │  59.213 │
└────────────┴─────────┴──────────┴─────────┘

If we want to save the fitted potentials to disk to later use we can use one of the following commands: the first saves the potential as an ACE1.jl compatible potential, while the second line exports it to a format that can be ready by the pacemaker code to be used within LAMMPS.

save_potential("./TiAl_tutorial_pot.json", pot_1)

The fitted potential can also be exported to a format compatible with LAMMPS (ML-PACE)

export2lammps("./TiAl_tutorial_pot.yace", pot_1)

This page was generated using Literate.jl.

The ace_basis interface

The `ace_basis` interface