Generic decorators

Decorate structures composed of Element with site properties.

This module provides generic classes and functions for defining algorithms that map VASP-calculated site properties to species labels: BaseDecorator, MixtureGaussianDecorator, GpOptimizedDecorator and NoTrainDecorator. These abstract classes are meant to be inherited by any decorator class that maps a specific site property.

Currently, only charge can be decorated. Support for decorating spin is planned for a future update.

Note

All entries should be re-decorated and all decorators should be retrained after an iteration.

class BaseDecorator(labels=None, **kwargs)

Bases: MSONable

Abstract decorator class.

  1. Each decorator should only be used to decorate one property.

  2. Currently, labels can only be assigned from one scalar site property, and that site property must be accessible from ComputedStructureEntry, which should be sufficient for most purposes.

  3. Entries with partial disorder cannot be decorated.

Initialize.

Parameters:

labels (dict of str or Species to list) – optional. A table of labels to decorate each element with. Keys are species symbols; values are the possible decorated property values, such as oxidation states or magnetic spin directions. Values must be ordered so that the corresponding cluster centers of the required property are increasing. For example, for high-spin Mn(2, 3, 4)+, the magnetic moments increase in the order [Mn4+, Mn3+, Mn2+], so labels should be provided as {Element("Mn"): [4, 3, 2]}. Keys can be Element or Species objects, or their string representations. Decoration of Vacancy is currently not supported. If there are multiple required properties, or the required properties have multiple dimensions, the order of the labels must match the sort order over self.required_properties, where properties are sorted lexicographically. This argument may not be necessary for some decorators, such as GuessChargeDecorator. Be sure to provide labels for all the species you wish to assign a property to; otherwise, any resulting error is your own responsibility.
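The ordering rule for labels can be illustrated with a minimal sketch. The cluster centers below are made-up magnetic moments chosen for illustration only, plain strings stand in for Element keys, and order_labels_by_center is a hypothetical helper that is not part of this module.

```python
def order_labels_by_center(centers):
    """Return labels sorted so that their cluster centers are increasing.

    centers maps a label (here an oxidation state) to the cluster center
    of the required site property for that label.
    """
    return [label for label, _ in sorted(centers.items(), key=lambda kv: kv[1])]


# Made-up magnetic moments for high-spin Mn, for illustration only.
mn_centers = {2: 4.6, 3: 3.8, 4: 3.0}

# The smallest moment belongs to Mn4+, so label 4 comes first.
labels = {"Mn": order_labels_by_center(mn_centers)}
print(labels)
```

The resulting list is in the order the parameter description asks for: the label whose cluster center is smallest comes first.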

as_dict()

Serialize the decorator.

abstract decorate(entries)

Decorate entries using the trained model.

If a decorated entry is invalid (for example, in charge assignment, when the decorated structure is not charge neutral), None is returned in its place.

Parameters:

entries (list of ComputedStructureEntry) – Entries of computed, undecorated structures.

Returns:

Entries with decorated structures, with None in place of each entry that failed.

Return type:

list of NoneType or ComputedStructureEntry
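A minimal sketch of how this return convention might be consumed, assuming the returned list lines up one-to-one with the input entries. Plain lists of site charges stand in for ComputedStructureEntry objects, and assign_or_none is a hypothetical helper rather than part of this module.

```python
def assign_or_none(site_charges, tol=1e-8):
    """Return the charges if the structure is charge neutral, otherwise None."""
    if abs(sum(site_charges)) > tol:
        return None
    return list(site_charges)


# Stand-ins for three decorated structures: lists of assigned site charges.
candidates = [[2, -2], [3, -2], [4, -2, -2]]

# The output keeps one slot per input, so failures stay traceable by index.
decorated = [assign_or_none(charges) for charges in candidates]

# Keep only the successfully decorated ones together with their original index.
valid = [(i, d) for i, d in enumerate(decorated) if d is not None]
print(valid)
```

Keeping the None placeholders rather than dropping failures lets the caller match each result back to the entry it came from.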

decorated_prop_name = None

abstract classmethod from_dict(d)

Deserialization.

static group_site_by_species(entries)

Group required properties on sites by species.

Parameters:

entries (list of ComputedStructureEntry) – Entries of computed structures.

Returns:

The (entry index, site index) pairs belonging to each species.

Return type:

defaultdict
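The shape of the returned mapping can be sketched as follows. This is an illustration only: lists of element symbols stand in for the structures held by ComputedStructureEntry objects, and group_sites is a hypothetical function, not the module's implementation.

```python
from collections import defaultdict


def group_sites(structures):
    """Map each species symbol to the (entry index, site index) pairs it occupies."""
    groups = defaultdict(list)
    for entry_id, species_list in enumerate(structures):
        for site_id, symbol in enumerate(species_list):
            groups[symbol].append((entry_id, site_id))
    return groups


# Two toy structures, each given as the list of species on its sites.
structures = [["Li", "Mn", "O", "O"], ["Mn", "O"]]

groups = group_sites(structures)
print(dict(groups))
```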

abstract property is_trained

Whether this decorator has already been trained.

If it has, train() skips training again unless reset=True is passed.

Returns:

Whether the model has been trained.

Return type:

bool

required_prop_names = None

abstract train(entries, reset=False)

Train the decoration model.

Model or model parameters should be stored in a property of the object.

Parameters:
  • entries (list of ComputedStructureEntry) – Entries of computed structures.

  • reset (bool) – optional. Set to True to re-train the decorator model; otherwise, training is skipped if the model is already trained. Defaults to False.
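The interplay between is_trained and reset described above can be sketched with a toy class. ToyDecorator and its mean-value "model" are hypothetical and only mimic the documented behavior; a real subclass would fit a model to site properties of the entries.

```python
class ToyDecorator:
    """Toy stand-in that stores its model on the object, as train() requires."""

    def __init__(self):
        self.model = None
        self.n_fits = 0

    @property
    def is_trained(self):
        return self.model is not None

    def train(self, values, reset=False):
        # Skip training when a model already exists and no reset is requested.
        if self.is_trained and not reset:
            return
        self.model = sum(values) / len(values)
        self.n_fits += 1


dec = ToyDecorator()
dec.train([1.0, 3.0])                 # first call fits the model
dec.train([10.0, 20.0])               # skipped: already trained
model_after_skip = dec.model
dec.train([10.0, 20.0], reset=True)   # forced re-training
print(model_after_skip, dec.model, dec.n_fits)
```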

class GpOptimizedDecorator(labels, cuts=None, **kwargs)

Bases: BaseDecorator

Gaussian process decorator class.

Uses the Gaussian process optimization described by J. H. Yang et al.

Currently, this class accepts only a single scalar property per site.

Initialize.

Parameters:
  • labels (dict of str to list) – optional. A table of labels to decorate each element with. Keys are species symbols; values are the possible decorated property values, such as oxidation states or magnetic spin directions. Values must be ordered so that the cluster centers of the required property are increasing. For example, for Mn(2, 3, 4)+, all high spin, the magnetic moments increase in the order [Mn4+, Mn3+, Mn2+], so labels should be provided as {Element("Mn"): [4, 3, 2]}. Keys can be Element or Species objects, or their string representations. Decoration of Vacancy is currently not supported. If there are multiple required properties, or the required properties have multiple dimensions, the order of the labels must match the sort order over self.required_properties, where properties are sorted lexicographically. This argument may not be necessary for some subclasses, such as GuessChargeDecorator. Be sure to provide labels for all the species you wish to assign a property to; otherwise, any resulting error is your own responsibility.

  • cuts (dict of str or Species to list) –

    optional. Cuts that divide the values of the required property into sectors, which determine the label each value belongs to. Keys are the same as in the argument "labels". For example, if labels={Element("Mn"): [4, 3, 2]} and cuts={Element("Mn"): [0.5, 1.0]}, and the required property is "total_magmom", then Mn atoms with magnetic moment < 0.5 are assigned label 4, atoms with 0.5 <= magnetic moment < 1.0 are assigned label 3, and atoms with magnetic moment >= 1.0 are assigned label 2. If provided:

    1. Cut values must be monotonically increasing,

    2. Must satisfy len(labels[key]) == len(cuts[key]) + 1 for any key.
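The cut rule above amounts to a binary search over the cut values. The following is a minimal sketch using the standard-library bisect module; label_from_cuts is a hypothetical helper, not the class's own method, and treating the cuts as strictly increasing is an assumption of this sketch.

```python
from bisect import bisect_right


def label_from_cuts(value, labels, cuts):
    """Pick the label of the sector that value falls into."""
    if len(labels) != len(cuts) + 1:
        raise ValueError("need len(labels) == len(cuts) + 1")
    # Assumption of this sketch: cuts must be strictly increasing.
    if any(a >= b for a, b in zip(cuts, cuts[1:])):
        raise ValueError("cuts must be increasing")
    # bisect_right counts the cuts that are <= value, which is the sector index.
    return labels[bisect_right(cuts, value)]


mn_labels = [4, 3, 2]
mn_cuts = [0.5, 1.0]

for moment in (0.2, 0.5, 0.99, 1.0, 3.0):
    print(moment, label_from_cuts(moment, mn_labels, mn_cuts))
```

Using bisect_right rather than bisect_left places a value that sits exactly on a cut into the upper sector, which matches the 0.5 <= magnetic moment < 1.0 convention in the example.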

as_dict()

Serialize the decorator.

decorate(entries)

Decorate entries using the trained model.

If a decorated entry is invalid (for example, in charge assignment, when the decorated structure is not charge neutral), None is returned in its place.

Parameters:

entries (list of ComputedStructureEntry) – Entries of computed, undecorated structures.

Returns:

Entries with decorated structures, with None in place of each entry that failed.

Return type:

list of NoneType or ComputedStructureEntry

decorated_prop_name = ''

classmethod from_dict(d)

Deserialization.

property is_trained

Whether this decorator has already been trained.

If it has, train() skips training again unless reset=True is passed.

Returns:

Whether the model is trained.

Return type:

bool

required_prop_names = []

train(entries, reset=False, n_calls=50)

Train the decoration model.

First initializes with a mixture of Gaussians, then optimizes an objective function with a Gaussian process.

Parameters:
  • entries (list of ComputedStructureEntry) – Entries of computed structures.

  • reset (bool) – optional. Set to True to re-train the decorator model; otherwise, training is skipped if the model is already trained. Defaults to False.

  • n_calls (int) – optional. The number of iterations used by gp_minimize(). Defaults to 50.

class MixtureGaussianDecorator(labels, gaussian_models=None, **kwargs)

Bases: BaseDecorator

Mixture of Gaussians (MoGs) decorator class.

Uses the mixture of Gaussians method to label each species.

Note

No test has been added for this specific class yet.

Initialize.

Parameters:
  • labels (dict of str to list) – optional. A table of labels to decorate each element with. Keys are species symbols; values are the possible decorated property values, such as oxidation states or magnetic spin directions. Values must be ordered so that the cluster centers of the required property are increasing. For example, for Mn(2, 3, 4)+, all high spin, the magnetic moments increase in the order [Mn4+, Mn3+, Mn2+], so labels should be provided as {Element("Mn"): [4, 3, 2]}. If there are multiple required properties, or the required properties have multiple dimensions, the order of the labels must match the sort order over self.required_properties, where properties are sorted lexicographically. Keys can be Element or Species objects, or their string representations. Decoration of Vacancy is currently not supported. This argument may not be necessary for some subclasses, such as GuessChargeDecorator. Be sure to provide labels for all the species you wish to assign a property to; otherwise, any resulting error is your own responsibility.

  • gaussian_models (dict of str or Element or Species to GaussianMixture) – Gaussian models corresponding to each key in argument labels.

as_dict()

Serialize to dict.

decorate(entries)

Decorate entries using the trained model.

If a decorated entry is invalid (for example, in charge assignment, when the decorated structure is not charge neutral), None is returned in its place.

Parameters:

entries (list of ComputedStructureEntry) – Entries of computed, undecorated structures.

Returns:

Entries with decorated structures, with None in place of each entry that failed.

Return type:

list of NoneType or ComputedStructureEntry

decorated_prop_name = None

static deserialize_gaussian_model(data)

Recover gaussian model from dict.

classmethod from_dict(d)

Load from dict.

gaussian_model_keys = ('weights_', 'means_', 'covariances_', 'precisions_', 'precisions_cholesky_', 'converged_', 'n_iter_', 'lower_bound_')

property is_trained

Determine whether the decorator has been trained.

Returns:

Whether the model has been trained.

Return type:

bool

static is_trained_gaussian_model(model)

Whether a gaussian model is trained.

required_prop_names = None

static serialize_gaussian_model(model)

Serialize gaussian model into dict.
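One plausible reading of how gaussian_model_keys relates to the three static helpers is sketched below. This is an assumption, not the module's implementation: a types.SimpleNamespace stands in for a fitted GaussianMixture, plain lists stand in for arrays, and the three function names are hypothetical.

```python
from types import SimpleNamespace

# Same attribute names as gaussian_model_keys in the class above.
MODEL_KEYS = (
    "weights_", "means_", "covariances_", "precisions_",
    "precisions_cholesky_", "converged_", "n_iter_", "lower_bound_",
)


def is_trained_model(model):
    """Assumed criterion: a model is trained once all fitted attributes exist."""
    return all(hasattr(model, key) for key in MODEL_KEYS)


def serialize_model(model):
    """Collect the fitted attributes that are present into a plain dict."""
    return {key: getattr(model, key) for key in MODEL_KEYS if hasattr(model, key)}


def deserialize_model(data):
    """Rebuild a stand-in model object from the dict."""
    return SimpleNamespace(**{key: data[key] for key in MODEL_KEYS if key in data})


# Stand-in for a fitted one-dimensional, two-component model (made-up numbers).
fitted = SimpleNamespace(
    weights_=[0.5, 0.5], means_=[[3.0], [4.0]],
    covariances_=[[[0.1]], [[0.1]]], precisions_=[[[10.0]], [[10.0]]],
    precisions_cholesky_=[[[3.16]], [[3.16]]],
    converged_=True, n_iter_=5, lower_bound_=-1.2,
)
unfitted = SimpleNamespace()

restored = deserialize_model(serialize_model(fitted))
print(is_trained_model(fitted), is_trained_model(unfitted), restored == fitted)
```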

train(entries, reset=False)

Train the decoration model.

Model or model parameters should be stored in a property of the object.

Parameters:
  • entries (list of ComputedStructureEntry) – Entries of computed structures.

  • reset (bool) – optional. Set to True to re-train the decorator model; otherwise, training is skipped if the model is already trained. Defaults to False.

class NoTrainDecorator(labels, **kwargs)

Bases: BaseDecorator

Decorator that does not need training.

Initialize.

Parameters:

labels (dict of str or Species to list) – optional. A table of labels to decorate each element with. Keys are species symbols; values are the possible decorated property values, such as oxidation states or magnetic spin directions. Values must be ordered so that the corresponding cluster centers of the required property are increasing. For example, for high-spin Mn(2, 3, 4)+, the magnetic moments increase in the order [Mn4+, Mn3+, Mn2+], so labels should be provided as {Element("Mn"): [4, 3, 2]}. Keys can be Element or Species objects, or their string representations. Decoration of Vacancy is currently not supported. If there are multiple required properties, or the required properties have multiple dimensions, the order of the labels must match the sort order over self.required_properties, where properties are sorted lexicographically. This argument may not be necessary for some decorators, such as GuessChargeDecorator. Be sure to provide labels for all the species you wish to assign a property to; otherwise, any resulting error is your own responsibility.

property is_trained

Always considered trained.

train(entries=None, reset=False)

Train the model.

This decorator does not require training. The method is kept only for interface consistency.

decorator_factory(decorator_type, *args, **kwargs)

Create a decorator from the name of a BaseDecorator subclass.

Parameters:
  • decorator_type (str) – The name of a subclass of BaseDecorator.

  • *args – Positional arguments used to initialize the class.

  • **kwargs – Keyword arguments used to initialize the class.

Returns:

The initialized decorator.

Return type:

BaseDecorator
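The name-based lookup can be sketched with a toy class hierarchy. How decorator_factory resolves names internally is not stated here, so the subclass walk below is an assumption, and ToyBase, ToyChargeDecorator, ToyFixedDecorator and make_decorator are all hypothetical names.

```python
class ToyBase:
    def __init__(self, labels=None, **kwargs):
        self.labels = labels


class ToyChargeDecorator(ToyBase):
    pass


class ToyFixedDecorator(ToyChargeDecorator):
    pass


def all_subclasses(cls):
    """Collect direct and indirect subclasses of cls, keyed by class name."""
    found = {}
    for sub in cls.__subclasses__():
        found[sub.__name__] = sub
        found.update(all_subclasses(sub))
    return found


def make_decorator(decorator_type, *args, **kwargs):
    """Instantiate the subclass of ToyBase whose name is decorator_type."""
    registry = all_subclasses(ToyBase)
    if decorator_type not in registry:
        raise ValueError(f"{decorator_type} is not a known subclass of ToyBase")
    return registry[decorator_type](*args, **kwargs)


dec = make_decorator("ToyFixedDecorator", labels={"Mn": [4, 3, 2]})
print(type(dec).__name__, dec.labels)
```

Selecting the class by its string name is what allows a decorator to be chosen from a configuration file or other plain-text options.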

get_site_property_query_names_from_decorator(decname)

Get the required properties from a decorator name.

Parameters:

decname (str) – Decorator name.

Returns:

The names of the site properties required by the decorator.

Return type:

list of str