The submodule ACEfriction.DataUtils.jl provides structures for storing friction data internally in ACEfriction.jl, and functions for importing and exporting friction data from and to custom-formatted HDF5 files.
Friction Data Representation
For pre-training storage and manipulation of friction data, ACEfriction.jl implements the structure FrictionData to represent a single observation $(R_i,\Gamma_i)$ of an atomic configuration $R_i$ and the corresponding friction tensor $\Gamma_i$:
struct FrictionData
    atoms
    friction_tensor
    friction_indices
end
where
atoms – stores the atomic configuration and is assumed to be of type JuLIP.Atoms,
friction_tensor – stores the friction tensor and is assumed to be of type SparseMatrix{SMatrix{3, 3, T, 9}}, where T<:Float and Ti<:Int. That is, the friction tensor is stored as a sparse matrix with $3 \times 3$-matrix-valued block entries,
friction_indices – is a one-dimensional integer array containing the indices of all atoms for which the friction tensor is defined.
Importing & Exporting Friction Data
ACEfriction.DataUtils implements the function save_h5fdata to save arrays of friction data to a custom-formatted HDF5 file, and the function load_h5fdata to load friction data from such custom-formatted HDF5 files:
ACEfriction.DataUtils.save_h5fdata – Function
save_h5fdata(rdata::Vector{FrictionData}, filename::String)
Saves friction tensor data to a custom-formatted HDF5 file.
Arguments
rdata : Vector{FrictionData} : A vector of friction data entries. Each entry is a structure of type FrictionData with the following fields:
  at : JuLIP.Atoms : Atoms object containing the atomic positions, cell, and periodic boundary conditions.
  friction_tensor : SparseMatrix{SMatrix{3,3,Float64,9}} : Sparse matrix representation of the friction tensor.
  friction_indices : Vector{Int} : Indices of the atoms for which the friction tensor is defined.
filename : String : Name of the file to save to (including the h5 extension).
ACEfriction.DataUtils.load_h5fdata – Function
load_h5fdata(filename::String)
Loads friction tensor data from a custom-formatted HDF5 file.
Arguments
filename : String : Name of the file to load from (including the h5 extension).
Returns
rdata : Vector{FrictionData} : A vector of friction data entries. Each entry is a structure of type FrictionData with the following fields:
  at : JuLIP.Atoms : Atoms object containing the atomic positions, cell, and periodic boundary conditions.
  friction_tensor : SparseMatrix{SMatrix{3,3,Float64,9}} : Sparse matrix representation of the friction tensor.
  friction_indices : Vector{Int} : Indices of the atoms for which the friction tensor is defined.
Custom HDF5 File Format for Friction Data
The hierarchical structure of such HDF5 files is as follows:
Each observation $(R_i,\Gamma_i)$ is saved in a separate group named by its index $i$, i.e., the root level of the HDF5 file contains the following groups:
├─ 📂 1
├─ 📂 2
├─ 📂 3
│ :
├─ 📂 N_obs
Within each of these groups, the data of the respective atomic configuration $R_i$ and friction tensor $\Gamma_i$ are saved in the subgroups atoms and friction_tensor, respectively:
├─ 📂 i                      # Index of data point
│  ├─ 📂 atoms               # Atom configuration data
│  │  ├─ 🔢 atypes
│  │  ├─ 🔢 cell
│  │  │  └─ 🏷️ column_major
│  │  ├─ 🔢 pbc
│  │  └─ 🔢 positions
│  │     └─ 🏷️ column_major
│  └─ 📂 friction_tensor     # Friction tensor data
│     ├─ 🔢 ft_I
│     ├─ 🔢 ft_J
│     ├─ 🔢 ft_mask
│     └─ 🔢 ft_val
│        └─ 🏷️ column_major
Datasets in the group atoms store the same information as the attributes positions, numbers, cell, and pbc of atoms objects, i.e., an atomic configuration of N atoms is described by the following datasets contained in the group atoms:
atypes – A one-dimensional Integer dataset of length N. The ith entry is the atomic number of the ith atom in the configuration. (Note: atypes corresponds to the attribute numbers of atoms objects in ase.)
cell – A two-dimensional Float64 dataset of size 3 x 3.
pbc – A one-dimensional Integer array of length 3 indicating the periodicity in the x, y, and z dimensions, e.g., pbc = [1,0,0] describes periodic boundary conditions in the x dimension and non-periodic boundary conditions in the y and z dimensions.
positions – A two-dimensional Float64 dataset of size N x 3. The ith row corresponds to the position of the ith atom in the configuration.
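As a rough illustration of the shapes listed above, the following Python sketch assembles the four datasets of the atoms group as plain nested lists and checks their dimensions. The helper name `make_atoms_datasets` and the use of plain lists instead of an HDF5 library are assumptions made for illustration only; neither is part of ACEfriction.jl.

```python
def make_atoms_datasets(numbers, positions, cell, pbc):
    """Return a dict mirroring the datasets of the `atoms` group.

    Hypothetical helper: it only checks the documented sizes and
    normalises the value types; it does not write any file.
    """
    n_atoms = len(numbers)
    if len(positions) != n_atoms or any(len(p) != 3 for p in positions):
        raise ValueError("positions must have size N x 3")
    if len(cell) != 3 or any(len(row) != 3 for row in cell):
        raise ValueError("cell must have size 3 x 3")
    if len(pbc) != 3:
        raise ValueError("pbc must have length 3")
    return {
        "atypes": [int(z) for z in numbers],                        # length N
        "positions": [[float(x) for x in p] for p in positions],    # N x 3
        "cell": [[float(x) for x in row] for row in cell],          # 3 x 3
        "pbc": [1 if flag else 0 for flag in pbc],                  # length 3
    }


# Example: two hydrogen atoms in a cubic cell, periodic in x only.
atoms_group = make_atoms_datasets(
    numbers=[1, 1],
    positions=[[0.0, 0.0, 0.0], [0.74, 0.0, 0.0]],
    cell=[[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]],
    pbc=[True, False, False],
)
```

In an actual export, each entry of the returned dictionary would be written as one dataset of the atoms group, with the column_major attribute attached to cell and positions.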
Datasets in the group friction_tensor store the friction tensor as an N x N sparse matrix with 3 x 3 block entries, i.e., a friction tensor with m non-zero 3 x 3 blocks is stored in the following datasets:
ft_I – A one-dimensional Integer dataset of length m specifying the column indices of the non-zero block entries of the friction tensor.
ft_J – A one-dimensional Integer dataset of length m specifying the row indices of the non-zero block entries of the friction tensor.
ft_val – A three-dimensional Float64 dataset of size m x 3 x 3 specifying the values of the non-zero 3 x 3 block entries of the friction tensor. For example, the 3 x 3 array ft_val[k,:,:] corresponds to the 3 x 3 block entry of the friction tensor with column index ft_I[k] and row index ft_J[k].
ft_mask – A one-dimensional Integer dataset containing the indices of the atoms for which friction information is provided.
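The coordinate-style layout described above can be sketched in plain Python as follows. This is a minimal illustration, not the ACEfriction.jl implementation: the function names are hypothetical, and the 1-based index convention (`index_base=1`, as in Julia) is an assumption, since the index base is not stated here.

```python
def dense_to_block_coo(blocks, index_base=1):
    """Convert an N x N grid of 3 x 3 blocks into ft_I, ft_J, ft_val lists.

    index_base=1 is an assumption (Julia-style indices).
    """
    ft_I, ft_J, ft_val = [], [], []
    for row, block_row in enumerate(blocks):
        for col, block in enumerate(block_row):
            if any(v != 0.0 for line in block for v in line):
                ft_I.append(col + index_base)  # column index, as documented
                ft_J.append(row + index_base)  # row index, as documented
                ft_val.append([list(line) for line in block])
    return ft_I, ft_J, ft_val


def block_coo_to_dense(ft_I, ft_J, ft_val, n_atoms, index_base=1):
    """Rebuild the N x N grid of 3 x 3 blocks from the three lists."""
    blocks = [[[[0.0] * 3 for _ in range(3)] for _ in range(n_atoms)]
              for _ in range(n_atoms)]
    for k in range(len(ft_val)):
        row = ft_J[k] - index_base
        col = ft_I[k] - index_base
        blocks[row][col] = [list(line) for line in ft_val[k]]
    return blocks


# Example with N = 2 atoms and m = 3 non-zero blocks. The grid is
# deliberately non-symmetric so that row and column indices differ.
Z = [[0.0] * 3 for _ in range(3)]
A = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
B = [[0.0, 0.5, 0.0], [0.5, 0.0, 0.0], [0.0, 0.0, 0.0]]
dense = [[A, B], [Z, A]]

ft_I, ft_J, ft_val = dense_to_block_coo(dense)
roundtrip = block_coo_to_dense(ft_I, ft_J, ft_val, n_atoms=2)
```

Here `ft_val[k]` plays the role of `ft_val[k,:,:]` in the description above, and the block `B` in block row 1, block column 2 is recorded with `ft_J[k] = 1` and `ft_I[k] = 2`.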
All two- or three-dimensional datasets in the groups atoms and friction_tensor have an additional attribute column_major. If the HDF5 file is created in a language that stores matrices in column-major form (e.g., julia), this attribute must be set to 1 (True). If the HDF5 file is created in a language that stores matrices in row-major form (e.g., python), this attribute must be set to 0 (False).
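To see why the storage order has to be recorded, the following Python sketch computes where one element of an m x 3 x 3 array lands in a flat buffer under each convention. It is an illustration only: the helper names are hypothetical, and how ACEfriction.jl uses the attribute when loading a file is not shown here.

```python
def flat_index_row_major(idx, shape):
    """Offset of element idx in a flat buffer when the last index varies fastest."""
    offset = 0
    for i, n in zip(idx, shape):
        offset = offset * n + i
    return offset


def flat_index_column_major(idx, shape):
    """Offset of element idx in a flat buffer when the first index varies fastest."""
    offset = 0
    for i, n in zip(reversed(idx), reversed(shape)):
        offset = offset * n + i
    return offset


shape = (2, 3, 3)   # an m x 3 x 3 array with m = 2
idx = (1, 0, 2)     # zero-based element (k, a, b)

row_major_offset = flat_index_row_major(idx, shape)
col_major_offset = flat_index_column_major(idx, shape)

# A column-major buffer read with row-major rules looks like an array
# whose axes are reversed (3 x 3 x m instead of m x 3 x 3).
same_as_reversed = col_major_offset == flat_index_row_major(idx[::-1], shape[::-1])
```

The same element ends up at different offsets under the two conventions, so a reader that does not know which one was used would see the axes of the stored array in reversed order.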