Preliminary AD Capabilities

Our aim is to make ACE.jl fully differentiable. The first steps in this direction are now complete and provide some initial AD capabilities. This page documents them in preliminary form and records some known limitations and pitfalls.
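The examples below assume that a basis (an ACE.SymmetricBasis) and a configuration cfg have already been constructed. How to build the basis depends on the ACE.jl version, so the following sketch shows only the required imports and a small random configuration; the State and ACEConfig constructors reflect ACE.jl at the time of writing and may need adjusting:

using ACE, StaticArrays, Zygote
using ACE: evaluate, val

# a small random configuration of 10 states; basis is assumed to be given
cfg = ACEConfig([ ACE.State(rr = rand(SVector{3, Float64})) for _ = 1:10 ])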

Example:

# initialize the linear ACE model with two invariant properties
c_m = rand(SVector{2,Float64}, length(basis))
model = ACE.LinearACEModel(basis, c_m)
# wrap a Finnis-Sinclair-type nonlinearity around it
FS = p -> sum( (1 .+ val.(p).^2).^0.5 )
fsmodel = (model, cfg) -> FS(evaluate(model, cfg))
# AD w.r.t. the configuration to obtain the (negative) forces
grad_fsmodel = (model, cfg) -> Zygote.gradient(x -> fsmodel(model, x), cfg)[1]
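Each entry of the output of grad_fsmodel is the gradient with respect to one state of the configuration, and the actual vector lives in its rr field; this is how it is accessed in the loss below:

g = grad_fsmodel(model, cfg)    # one gradient per state in cfg
g[1].rr                         # an SVector{3, Float64}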

# now define some loss that involves the model gradients
ys = randn(SVector{3, Float64}, length(cfg))
loss = model -> sum( sum(abs2, g.rr - y)
                     for (g, y) in zip(grad_fsmodel(model, cfg), ys) )
# ... and differentiate it w.r.t. the model parameters
Zygote.gradient(loss, model)[1]
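The same pattern extends to a loss over several training configurations. In the following sketch, cfgs and Ys are assumed to be lists of configurations and matching lists of target vectors (they are not defined above):

total_loss = model -> sum( sum(abs2, g.rr - y)
                           for (cfg, ys) in zip(cfgs, Ys)
                           for (g, y) in zip(grad_fsmodel(model, cfg), ys) )
Zygote.gradient(total_loss, model)[1]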

Remarks:

  • val : evaluate(model, cfg) returns an SVector containing two Invariants. To extract an actual value from such a property we use val, which is simply defined as val(x) = x.val. The point, though, is that we have also defined adjoints for val that propagate through the differentiation; this is why FS uses val in its definition. A toy sketch of such an adjoint is shown after this list.
  • The AD capabilities of ACE.jl in particular are very much a first draft, implemented without much concern for performance.
  • Composition of ACE models is not supported yet, but is hopefully coming.
  • What happens in the above example if the linear ACE model produces covariant instead of invariant properties is untested and will most likely break. Please file issues.
  • There are a few places that are still "hacks"; see the TODOs in the main ACE.jl code.
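To illustrate the adjoint mechanism mentioned in the first remark, here is a self-contained toy version built around a hypothetical MyInvariant wrapper; ACE.jl defines its own adjoints for val, so this is illustrative only:

using Zygote

# a stand-in for ACE.Invariant, purely for illustration
struct MyInvariant
   val::Float64
end

myval(x::MyInvariant) = x.val

# the adjoint wraps the incoming sensitivity back into the property type,
# so Zygote can keep propagating through the code that produced the property
Zygote.@adjoint myval(x) = myval(x), dv -> ( MyInvariant(dv), )

p = MyInvariant(2.0)
Zygote.gradient(x -> myval(x)^2, p)[1]    # MyInvariant(4.0)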