API for Tensor Layers
P4ML also implements a few standard layers that occur in atomic cluster expansion (ACE) type models. They are all expressed as (usually symmetric) tensor operations and are documented here. Some of these operations are fairly specialized (sparse symmetric tensor contractions), while others are more standard (`LinearLayer`) and are provided for conformity with our interface.
- LinearLayer : todo add doc string
- Sparse product : todo add doc string
- Fused tensor product and pooling
PooledSparseProduct
- Sparse symmetric product
SparseSymmProd
- Recursive sparse symmetric product implementation
SparseSymmProdDAG
Their usage differs slightly from the polynomial embeddings. Evaluating a layer is the same and can be done both in-place and allocating, e.g.,
abasis::PooledSparseProduct
evaluate!(A, abasis, BB)
A = evaluate(abasis, BB)
We refer to the individual documentation for the details of the arguments to each layer.
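As a rough illustration of the in-place versus allocating convention, the following sketch implements a toy pooled sparse product in plain Julia. It assumes the usual ACE-style definition `A[k] = sum_j B1[j, k1] * B2[j, k2]`; the names `ToyPooledProduct`, `toy_evaluate!` and `toy_evaluate` are hypothetical stand-ins and not the P4ML types or functions.

```julia
# Hypothetical toy layer (not the P4ML type): spec[k] = (k1, k2) selects
# one column from each of the two input matrices.
struct ToyPooledProduct
    spec::Vector{NTuple{2,Int}}
end

# In-place evaluation: A[k] = sum_j B1[j, k1] * B2[j, k2]
function toy_evaluate!(A::AbstractVector, layer::ToyPooledProduct, BB::Tuple)
    B1, B2 = BB
    @assert size(B1, 1) == size(B2, 1)
    fill!(A, zero(eltype(A)))
    for (k, (k1, k2)) in enumerate(layer.spec)
        for j in axes(B1, 1)
            A[k] += B1[j, k1] * B2[j, k2]
        end
    end
    return A
end

# Allocating wrapper: creates the output, then calls the in-place version
function toy_evaluate(layer::ToyPooledProduct, BB::Tuple)
    T = promote_type(eltype(BB[1]), eltype(BB[2]))
    A = zeros(T, length(layer.spec))
    return toy_evaluate!(A, layer, BB)
end

layer = ToyPooledProduct([(1, 1), (1, 2), (2, 2)])
B1 = [1.0 2.0; 3.0 4.0]
B2 = [0.5 1.0; 1.5 2.0]

A_alloc = toy_evaluate(layer, (B1, B2))                # allocating call
A_inplace = toy_evaluate!(zeros(3), layer, (B1, B2))   # in-place call
```

Both calls produce the same vector; the in-place form simply lets the caller own the output buffer.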
All tensor layers can be conveniently used in a non-allocating way from within a Bumper @no_escape block, e.g.,
A = @withalloc evaluate!(abasis, BB)
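For orientation, here is a minimal sketch of the manual Bumper pattern that such a call presumably automates: allocate scratch memory with `@alloc` inside `@no_escape`, run an in-place kernel, and let only a non-array result leave the block. The kernel `toy_evaluate!` and the helper `pooled_sum` are hypothetical, and the claim that `@withalloc` wraps the `@alloc` step is our reading of WithAlloc.jl rather than something stated here.

```julia
using Bumper

spec = [(1, 1), (1, 2), (2, 2)]   # hypothetical sparse index list

# Hypothetical in-place kernel: A[k] = sum_j B1[j, k1] * B2[j, k2]
function toy_evaluate!(A, spec, B1, B2)
    fill!(A, zero(eltype(A)))
    for (k, (k1, k2)) in enumerate(spec), j in axes(B1, 1)
        A[k] += B1[j, k1] * B2[j, k2]
    end
    return A
end

function pooled_sum(spec, B1, B2)
    @no_escape begin
        A = @alloc(Float64, length(spec))   # scratch memory from Bumper's buffer
        toy_evaluate!(A, spec, B1, B2)
        sum(A)                              # only a scalar leaves the block
    end
end

total = pooled_sum(spec, [1.0 2.0; 3.0 4.0], [0.5 1.0; 1.5 2.0])
```

The array `A` must not escape the `@no_escape` block, since its memory is reclaimed when the block ends.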
Pullbacks
All tensor layers have custom pullbacks implemented that can be accessed via non-allocating or allocating calls:
pullback!(∂X, ∂P, layer, X)
∂X = pullback(∂P, layer, X)
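To make the role of a pullback concrete, the sketch below writes one by hand for a toy pooled product and checks it against a central finite difference. The functions `toy_evaluate` and `toy_pullback` and the index list `spec` are hypothetical; this illustrates the adjoint relation a pullback satisfies and is not the P4ML implementation.

```julia
using LinearAlgebra: dot

spec = [(1, 1), (1, 2), (2, 2)]   # hypothetical sparse index list

# Forward map: A[k] = sum_j B1[j, k1] * B2[j, k2]
toy_evaluate(spec, B1, B2) =
    [sum(B1[j, k1] * B2[j, k2] for j in axes(B1, 1)) for (k1, k2) in spec]

# Hand-written pullback: maps an output cotangent ∂A to input cotangents
function toy_pullback(∂A, spec, B1, B2)
    ∂B1 = zero(B1)
    ∂B2 = zero(B2)
    for (k, (k1, k2)) in enumerate(spec), j in axes(B1, 1)
        ∂B1[j, k1] += ∂A[k] * B2[j, k2]
        ∂B2[j, k2] += ∂A[k] * B1[j, k1]
    end
    return ∂B1, ∂B2
end

B1 = [1.0 2.0; 3.0 4.0]
B2 = [0.5 1.0; 1.5 2.0]
∂A = [1.0, -2.0, 0.5]
D1 = [0.1 0.2; 0.3 0.4]           # perturbation direction for B1
h = 1e-3

∂B1, ∂B2 = toy_pullback(∂A, spec, B1, B2)

# Adjoint check: <∂A, J * D1> should equal <∂B1, D1>
JD1 = (toy_evaluate(spec, B1 .+ h .* D1, B2) .-
       toy_evaluate(spec, B1 .- h .* D1, B2)) ./ (2h)
lhs = dot(∂A, JD1)
rhs = dot(∂B1, D1)
```

Because the toy map is linear in `B1`, the central difference is exact up to rounding, so `lhs` and `rhs` agree to high precision.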
Pushforwards and Second-order derivatives
Pushforwards and reverse-over-reverse are implemented using ForwardDiff. This is quasi-optimal even for reverse-over-reverse, because the latter can be interpreted as a directional derivative of evaluate and pullback (after swapping the order of differentiation). We generally recommend not using these functions directly; ChainRules integration would give an easier usage pattern. For optimal performance, the same technique should be applied to an entire model architecture rather than to each individual layer, which would avoid several unnecessary intermediate allocations.
The syntax for pushforwards is straightforward:
pushforward!(P, ∂P, layer, X, ∂X)
P, ∂P = pushforward(layer, X, ∂X)
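A pushforward returns the value together with its directional derivative along `∂X`. The sketch below obtains this for a toy pooled product with `ForwardDiff.derivative`; the helper names `toy_evaluate` and `toy_pushforward` are hypothetical, and for simplicity the value is computed in a separate pass, which an optimized implementation would presumably avoid.

```julia
using ForwardDiff

spec = [(1, 1), (1, 2), (2, 2)]   # hypothetical sparse index list

# Forward map: P[k] = sum_j B1[j, k1] * B2[j, k2]
toy_evaluate(spec, B1, B2) =
    [sum(B1[j, k1] * B2[j, k2] for j in axes(B1, 1)) for (k1, k2) in spec]

# Value and directional derivative along (∂B1, ∂B2)
function toy_pushforward(spec, B1, B2, ∂B1, ∂B2)
    P = toy_evaluate(spec, B1, B2)
    ∂P = ForwardDiff.derivative(
        t -> toy_evaluate(spec, B1 .+ t .* ∂B1, B2 .+ t .* ∂B2), 0.0)
    return P, ∂P
end

B1 = [1.0 2.0; 3.0 4.0]
B2 = [0.5 1.0; 1.5 2.0]
∂B1 = [0.1 0.2; 0.3 0.4]
∂B2 = [1.0 0.0; 0.0 1.0]

P, ∂P = toy_pushforward(spec, B1, B2, ∂B1, ∂B2)

# Product-rule reference, valid because the toy map is bilinear
∂P_ref = toy_evaluate(spec, ∂B1, B2) .+ toy_evaluate(spec, B1, ∂B2)
```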
For second-order pullbacks the syntax is
∇_∂P, ∇_X = pullback2(∂∂X, ∂P, layer, X)
pullback2!(∇_∂P, ∇_X, ∂∂X, ∂P, layer, X)
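The directional-derivative interpretation can be sketched as follows, assuming that the second-order pullback returns the gradients of `g(∂P, X) = <∂∂X, pullback(∂P, X)>` with respect to `∂P` and `X`. Under that assumption, `∇_∂P` is the derivative of evaluate along `∂∂X`, and `∇_X` is the derivative of pullback along `∂∂X`. All function names below are hypothetical, and the toy layer is a simple symmetric product rather than a P4ML layer.

```julia
using ForwardDiff
using LinearAlgebra: dot

spec = [(1, 1), (1, 2), (2, 2)]   # hypothetical sparse index list

# Toy symmetric product: P[k] = sum_j X[j, k1] * X[j, k2]
toy_evaluate(spec, X) =
    [sum(X[j, k1] * X[j, k2] for j in axes(X, 1)) for (k1, k2) in spec]

# First-order pullback: ∂X = J(X)' * ∂P
function toy_pullback(∂P, spec, X)
    ∂X = zeros(promote_type(eltype(∂P), eltype(X)), size(X))
    for (k, (k1, k2)) in enumerate(spec), j in axes(X, 1)
        ∂X[j, k1] += ∂P[k] * X[j, k2]
        ∂X[j, k2] += ∂P[k] * X[j, k1]
    end
    return ∂X
end

# Second-order pullback via two directional derivatives along ∂∂X
function toy_pullback2(∂∂X, ∂P, spec, X)
    ∇_∂P = ForwardDiff.derivative(t -> toy_evaluate(spec, X .+ t .* ∂∂X), 0.0)
    ∇_X = ForwardDiff.derivative(t -> toy_pullback(∂P, spec, X .+ t .* ∂∂X), 0.0)
    return ∇_∂P, ∇_X
end

X = [1.0 2.0; 3.0 4.0]
∂P = [1.0, -2.0, 0.5]
∂∂X = [0.1 0.2; 0.3 0.4]

∇_∂P, ∇_X = toy_pullback2(∂∂X, ∂P, spec, X)

# Reference: differentiate g directly with respect to each argument
g(q, Y) = dot(∂∂X, toy_pullback(q, spec, Y))
∇_∂P_ref = ForwardDiff.gradient(q -> g(q, X), ∂P)
∇_X_ref = ForwardDiff.gradient(Y -> g(∂P, Y), X)
```

The two directional derivatives cost roughly as much as one evaluate plus one pullback, whereas the reference gradients need one dual component per input entry.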
Bumper and WithAlloc usage
Using the WithAlloc.jl interface, these can be used conveniently as follows (always from within a @no_escape block):
∂X = @withalloc pullback!(∂P, layer, X)
P, ∂P = @withalloc pushforward!(layer, X, ∂X)