Basic Workflow

This short tutorial introduces the minimal workflow for people working purely in Julia. Separate tutorials on using ACEpotentials via shell scripts or via Python, as well as on more advanced usage, are available or in development.

Start by importing ACEpotentials (and possibly other required libraries):

using ACEpotentials

We need a dataset, TiAl_tutorial.xyz, for this tutorial. Normally we would obtain the path to a dataset and then load the training set with ExtXYZ.load:

# (don't execute this block)
using ExtXYZ
data_file = "path/to/TiAl_tutorial.xyz"
data = ExtXYZ.load(data_file)

For convenience we provide this dataset as a Julia artifact and make it accessible via ACEpotentials.example_dataset. We keep only a small subset of the structures for training and testing to keep the regression problem small.

data, _, meta = ACEpotentials.example_dataset("TiAl_tutorial")
train_data = data[1:5:end];
test_data = data[2:10:end];
 Downloading artifact: TiAl_tutorial
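
As a quick sanity check we can inspect the sizes of the resulting splits; this only inspects the objects created above:

@show length(data) length(train_data) length(test_data);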

The next step is to generate a model. Here we construct a linear ACE model using the ACE1compat interface, which produces models that are essentially equivalent to those of the discontinued ACE1.jl package.

  • order = 3 : We take 3-correlation, i.e. a 4-body potential,
  • totaldegree = 6 : a very low polynomial degree just for testing
  • rcut = 5.5 : this is a typical cutoff radius for metals

These three are the most important approximation parameters to explore when trying to improve the fit accuracy. There are many other parameters to explore, which are documented in ?ace1_model. Even further model refinements are possible by studying the internals of ACE1.jl and ACE1x.jl. We also specify a reference potential that will be added to the learned 2-body and many-body potential components. Here we use a one-body potential, i.e. a reference atomic energy for each individual species. Using a one-body reference potential generally results in very slightly reduced fit accuracy but significantly improved 2-body potentials with a realistic dimer shape, and improved robustness in predictions.

hyperparams = (elements = [:Ti, :Al],
               order = 3,
               totaldegree = 6,
               rcut = 5.5,
               Eref = [:Ti => -1586.0195, :Al => -105.5954])
model = ace1_model(; hyperparams...)
@show length_basis(model);
length_basis(model) = 270
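
To get a feel for how the approximation parameters discussed above affect the model size, one can vary totaldegree and inspect the basis length. A small illustrative sweep (the resulting basis sizes depend on the ACEpotentials version and on the remaining default parameters):

for deg in (6, 8, 10)
    # same model as above, but with increasing polynomial degree
    m = ace1_model(elements = [:Ti, :Al], order = 3,
                   totaldegree = deg, rcut = 5.5)
    @show deg, length_basis(m)
end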

The next step specifies the regression weights: in the least-squares loss, different observations are given different weights,

\[ \sum_{R} \Big( w_{E,R}^2 | E(R) - y_R^E |^2 + w_{F,R}^2 | {\rm forces}(R) - y_R^F |^2 + w_{V,R}^2 | {\rm virial}(R) - y_R^V |^2 \Big),\]

and these are specified via the following dictionary. The keys correspond to the config_type of the training structures.

weights = Dict(
        "FLD_TiAl" => Dict("E" => 60.0, "F" => 1.0 , "V" => 1.0 ),
        "TiAl_T5000" => Dict("E" => 5.0, "F" => 1.0 , "V" => 1.0 ));

To estimate the parameters we still need to choose a solver for the least squares system. In this tutorial we use a Bayesian linear regression, which is the recommended default at the moment. Many other solvers are available, and can be explored by looking at the documentation of ACEfit.jl.

solver = ACEfit.BLR()
ACEfit.BLR(Dict{Any, Any}())
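
Swapping in a different solver only requires changing this one line. A sketch of two alternatives available in recent ACEfit.jl versions (constructor keyword arguments may differ between versions):

# direct QR factorization of the least-squares system
solver_qr = ACEfit.QR()
# iterative LSQR solver, useful for larger systems
solver_lsqr = ACEfit.LSQR()

Either of these could be passed to acefit! below in place of the BLR solver.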

ACEpotentials provides a heuristic smoothness prior which assigns to each basis function B_i a scaling parameter s_i that estimates how "rough" that basis function is. The following line generates a diagonal regularizer (prior) with entries s_i^p, thus penalizing rougher basis functions and enforcing a smoother fitted potential.

P = algebraic_smoothness_prior(model; p = 4)    #  (p = 4 is in fact the default)
270×270 LinearAlgebra.Diagonal{Float64, Vector{Float64}}:
 1.0    ⋅     ⋅      ⋅      ⋅       ⋅   …      ⋅        ⋅        ⋅        ⋅ 
  ⋅   16.0    ⋅      ⋅      ⋅       ⋅          ⋅        ⋅        ⋅        ⋅ 
  ⋅     ⋅   81.0     ⋅      ⋅       ⋅          ⋅        ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅   256.0     ⋅       ⋅          ⋅        ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅   625.0      ⋅          ⋅        ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅   1296.0  …      ⋅        ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅          ⋅        ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅          ⋅        ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅          ⋅        ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅          ⋅        ⋅        ⋅        ⋅ 
 ⋮                                 ⋮    ⋱                            
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅          ⋅        ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅          ⋅        ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅          ⋅        ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅          ⋅        ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅   …      ⋅        ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅      6561.0       ⋅        ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅          ⋅   10000.0       ⋅        ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅          ⋅        ⋅   14641.0       ⋅ 
  ⋅     ⋅     ⋅      ⋅      ⋅       ⋅          ⋅        ⋅        ⋅   20736.0
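
To see the idea behind these diagonal entries, here is a purely illustrative toy reconstruction; the real scaling parameters are computed from the basis specification, not from a simple running index:

using LinearAlgebra
p = 4
s = 1.0:6.0                          # hypothetical roughness scalings s_i
P_toy = Diagonal(collect(s) .^ p)    # diagonal entries s_i^p
# P_toy carries 1.0, 16.0, 81.0, 256.0, 625.0, 1296.0 on its diagonal,
# matching the first few entries of the actual prior shown above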

We are now ready to estimate the parameters. We take a subset of the training data to speed up the tutorial. The prior is passed to the acefit! function via the prior keyword argument.

result = acefit!(train_data, model; solver=solver, prior = P, weights=weights);
┌────────────┬──────────┬───────┬────┬──────┬─────┐
│       Type │ #Configs │ #Envs │ #E │   #F │  #V │
├────────────┼──────────┼───────┼────┼──────┼─────┤
│   FLD_TiAl │       63 │   126 │ 63 │  378 │ 378 │
│ TiAl_T5000 │        3 │   310 │  3 │  930 │  18 │
├────────────┼──────────┼───────┼────┼──────┼─────┤
│      total │       66 │   436 │ 66 │ 1308 │ 396 │
│    missing │        0 │     0 │  0 │    0 │   0 │
└────────────┴──────────┴───────┴────┴──────┴─────┘
[ Info: Assembling linear problem.
[ Info:   - Creating feature matrix with size (1770, 270).
[ Info:   - Beginning assembly with processor count:  1.

Progress: 100%|█████████████████████████████████████████| Time: 0:00:44
[ Info:   - Assembly completed.
[ Info: Assembling full weight vector.
┌ Warning: x_tol is deprecated. Use x_abstol or x_reltol instead. The provided value (1.0e-8) will be used as x_abstol.
@ Optim ~/.julia/packages/Optim/gmigl/src/types.jl:110
Iter     Function value   Gradient norm
     0     4.671494e+03     2.457943e+03
     1     3.952872e+03     4.057872e+02
     2     3.371262e+03     4.421460e+02
     ⋮
    30     8.031153e+02     4.597944e-05
 * Status: success

 * Candidate solution
    Final objective value:     8.031153e+02

 * Found with
    Algorithm:     L-BFGS

 * Convergence measures
    |x - x'|               = 1.59e-10 ≤ 1.0e-08
    |x - x'|/|x'|          = 1.21e-12 ≰ 0.0e+00
    |f(x) - f(x')|         = 1.96e-06 ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 2.44e-09 ≰ 0.0e+00
    |g(x)|                 = 4.60e-05 ≰ 0.0e+00

 * Work counters
    Seconds run:   1  (vs limit Inf)
    Iterations:    30
    f(x) calls:    136
    ∇f(x) calls:   136

We can display an error table as follows:

@info("Training Error Table")
err_train = ACEpotentials.compute_errors(train_data, model; weights=weights);
[ Info: Training Error Table
[ Info: RMSE Table
┌────────────┬─────────┬──────────┬─────────┐
│       Type │ E [meV] │ F [eV/A] │ V [meV] │
├────────────┼─────────┼──────────┼─────────┤
│   FLD_TiAl │   3.615 │    0.043 │  59.849 │
│ TiAl_T5000 │  10.270 │    0.300 │  26.345 │
├────────────┼─────────┼──────────┼─────────┤
│        set │   4.155 │    0.254 │  58.742 │
└────────────┴─────────┴──────────┴─────────┘
[ Info: MAE Table
┌────────────┬─────────┬──────────┬─────────┐
│       Type │ E [meV] │ F [eV/A] │ V [meV] │
├────────────┼─────────┼──────────┼─────────┤
│   FLD_TiAl │   2.556 │    0.029 │  38.571 │
│ TiAl_T5000 │   9.737 │    0.232 │  21.113 │
├────────────┼─────────┼──────────┼─────────┤
│        set │   2.882 │    0.173 │  37.778 │
└────────────┴─────────┴──────────┴─────────┘

We should of course also look at test errors, which can be done as follows. Depending on the choice of solver and solver parameters, the test errors may be very poor; exploring different parameters for a given application can lead to significantly improved predictions.

@info("Test Error Table")
err_test = ACEpotentials.compute_errors(test_data, model; weights=weights);
[ Info: Test Error Table
[ Info: RMSE Table
┌────────────┬─────────┬──────────┬─────────┐
│       Type │ E [meV] │ F [eV/A] │ V [meV] │
├────────────┼─────────┼──────────┼─────────┤
│   FLD_TiAl │   7.571 │    0.049 │  66.284 │
│ TiAl_T5000 │   9.361 │    0.402 │  94.015 │
├────────────┼─────────┼──────────┼─────────┤
│        set │   7.632 │    0.274 │  67.293 │
└────────────┴─────────┴──────────┴─────────┘
[ Info: MAE Table
┌────────────┬─────────┬──────────┬─────────┐
│       Type │ E [meV] │ F [eV/A] │ V [meV] │
├────────────┼─────────┼──────────┼─────────┤
│   FLD_TiAl │   4.463 │    0.035 │  41.038 │
│ TiAl_T5000 │   9.361 │    0.313 │  83.386 │
├────────────┼─────────┼──────────┼─────────┤
│        set │   4.612 │    0.162 │  42.321 │
└────────────┴─────────┴──────────┴─────────┘

If we want to save the fitted potential to disk for later use, we can simply save the hyperparameters and the fitted parameters. At the moment this must be done manually; a more complete and convenient interface, including various sanity checks, will be provided in the future.

using JSON
open("TiAl_model.json", "w") do f
	 JSON.print(f, Dict("hyperparams" => hyperparams, "params" => model.ps))
end

To load the model back from disk it is safest to work within the same Julia project, i.e. with the same versions of all packages; ideally the Manifest should not change. One then generates the model again, loads the parameters from disk, and sets them in the model. Again, this will be automated in the future; a sketch of the manual procedure follows.
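
A minimal sketch of the round trip, assuming a parameter setter such as set_parameters! exists; the exact function name and the conversion of the JSON-decoded parameters may differ between ACEpotentials versions:

using JSON
D = JSON.parsefile("TiAl_model.json")
# regenerate the model with the same hyperparameters used for the fit
model2 = ace1_model(elements = [:Ti, :Al], order = 3, totaldegree = 6,
                    rcut = 5.5, Eref = [:Ti => -1586.0195, :Al => -105.5954])
# restore the fitted parameters; `set_parameters!` is an assumption here --
# consult the ACEpotentials documentation for the supported setter
ACEpotentials.set_parameters!(model2, D["params"])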

Finally, we delete the saved model file to clean up.

rm("TiAl_model.json")

Fast Evaluator

ACEpotentials.jl provides an experimental "fast evaluator". This tries to merge some of the operations in the full model, resulting in a "slimmer" and usually faster evaluator. In some cases the performance gain can be several-fold, up to an order of magnitude; it is particularly significant when the parameter estimation solver sparsifies the model.

To construct the fast evaluator, simply use

fpot = fast_evaluator(model)

An optional keyword argument aa_static = true can be used to optimize the n-correlation layer for very small models (at most a few hundred parameters). For larger models this leads to a stack overflow.
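
As a usage example, one might compare predictions of the full model and the fast evaluator on a training structure. A sketch, assuming the dataset structures are AtomsBase-compatible systems and that both evaluators support the AtomsCalculators interface:

using AtomsCalculators: potential_energy
sys = train_data[1]                    # one structure from the dataset
E_full = potential_energy(sys, model)  # full model prediction
E_fast = potential_energy(sys, fpot)   # fast evaluator prediction
# the two energies should agree to within numerical precision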


This page was generated using Literate.jl.