Analysis tools

Functions to filter data in a StructureWrangler and calculate fitting weights.

For example, weights can be calculated from the energy above the hull or from the energy above the minimum energy at each composition.

max_ewald_energy_indices(wrangler, max_relative_energy, return_compliment=False)

Return set of indices of structures by electrostatic interaction energy.

Filter the input structures to keep only those with low electrostatic energies (no large charge separation in cell). This energy is referenced to the lowest value at that composition. Note that this is before the division by the relative dielectric constant and is per primitive cell in the cluster expansion. A threshold of 1.5 eV/atom seems to be a reasonable value for dielectric constants around 10.

Parameters:
  • wrangler (StructureWrangler) – a StructureWrangler containing data to be filtered.

  • max_relative_energy (float) – Ewald threshold. The maximum Ewald energy relative to the minimum energy at a structure’s composition (normalized per prim).

  • return_compliment (bool) – optional. If True, the complement of the selected indices is returned as well.

Returns:

indices, complement
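The filtering rule described above can be sketched in plain NumPy, without a StructureWrangler. This is only an illustration under stated assumptions: the helper name filter_by_relative_energy, the use of string labels for compositions, and the inclusive comparison (<=) are hypothetical and are not taken from the library.

```python
import numpy as np


def filter_by_relative_energy(energies, compositions, max_relative_energy):
    """Hypothetical helper: keep entries whose energy lies within
    max_relative_energy of the lowest energy at the same composition."""
    energies = np.asarray(energies, dtype=float)

    # Lowest energy found for each composition label.
    min_by_comp = {}
    for comp, energy in zip(compositions, energies):
        min_by_comp[comp] = min(energy, min_by_comp.get(comp, np.inf))

    # Energy of each entry relative to the minimum at its composition.
    reference = np.array([min_by_comp[comp] for comp in compositions])
    keep = (energies - reference) <= max_relative_energy  # inclusive: an assumption

    indices = np.flatnonzero(keep).tolist()
    complement = np.flatnonzero(~keep).tolist()
    return indices, complement


# Toy data: three entries at composition "A", two at composition "B".
indices, complement = filter_by_relative_energy(
    [0.0, 1.0, 2.0, 5.0, 5.5], ["A", "A", "A", "B", "B"], max_relative_energy=1.5
)
```

In this toy data only the third entry lies more than 1.5 above the minimum at its composition, so it is the only one placed in the complement.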

unique_corr_vector_indices(wrangler, property_key, filter_by='min', cutoffs=None, return_compliment=False)

Return set of indices of structures with unique correlation vectors.

Among structures with duplicate correlation vectors, keep the one with the min or max value of the given property. Note that correlation vectors exclude external terms, so structures whose correlations are all the same are considered duplicates even if their external terms differ.

Parameters:
  • wrangler (StructureWrangler) – a StructureWrangler containing data to be filtered.

  • property_key (str) – name of property to consider when returning structure indices for the feature matrix with duplicate corr vectors removed.

  • filter_by (str) – the criterion for the property value to keep. Options are min or max.

  • cutoffs (dict) – optional. Dictionary with cluster diameter cutoffs for correlation functions to consider in correlation vectors.

  • return_compliment (bool) – optional. If True, the complement of the unique indices is returned as well.

Returns:

indices, complement
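The deduplication described above can be sketched on a bare feature matrix. This is a sketch, not the library's implementation: the helper name unique_rows_by_property is hypothetical, rows are compared by exact equality (an assumption; a real implementation may compare differently), and the feature matrix is assumed to already exclude external terms.

```python
import numpy as np


def unique_rows_by_property(feature_matrix, values, filter_by="min"):
    """Hypothetical helper: for each group of identical rows, keep the
    index of the row with the min (or max) property value."""
    if filter_by not in ("min", "max"):
        raise ValueError("filter_by must be 'min' or 'max'")
    feature_matrix = np.asarray(feature_matrix, dtype=float)
    values = np.asarray(values, dtype=float)

    # Label each row with the id of the unique row it matches exactly.
    _, inverse = np.unique(feature_matrix, axis=0, return_inverse=True)
    inverse = inverse.ravel()

    pick = np.argmin if filter_by == "min" else np.argmax
    indices = []
    for group in np.unique(inverse):
        members = np.flatnonzero(inverse == group)
        indices.append(int(members[pick(values[members])]))

    indices.sort()
    complement = sorted(set(range(len(values))) - set(indices))
    return indices, complement


# Rows 0, 1 and 3 are identical; row 2 is distinct.
rows = [[1.0, 0.5], [1.0, 0.5], [1.0, -0.5], [1.0, 0.5]]
props = [3.0, 1.0, 2.0, 2.5]
kept_min, dropped_min = unique_rows_by_property(rows, props, filter_by="min")
kept_max, dropped_max = unique_rows_by_property(rows, props, filter_by="max")
```

With filter_by="min" the duplicate group is represented by row 1 (lowest property value); with filter_by="max" it is represented by row 0.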

weights_energy_above_composition(structures, energies, temperature=2000)

Compute weights from each structure's energy above the minimum energy at its reduced composition.

Parameters:
  • structures (list) – list of pymatgen Structures

  • energies (ndarray) – energies of corresponding Structures

  • temperature (float) – temperature to use in Boltzmann weight

Returns:

weights for each structure.

Return type:

array
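A Boltzmann weight of this kind can be sketched as follows. The form exp(-ΔE / (k_B T)) is the standard Boltzmann factor and is assumed here, not quoted from the library; the helper name, the use of string labels for reduced compositions, and energies in eV are likewise assumptions made for illustration.

```python
import numpy as np

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def boltzmann_weights_above_composition(compositions, energies, temperature=2000):
    """Hypothetical helper: weight each entry by exp(-dE / (k_B * T)), where
    dE is its energy above the lowest energy at the same composition."""
    energies = np.asarray(energies, dtype=float)

    # Lowest energy found for each composition label.
    min_by_comp = {}
    for comp, energy in zip(compositions, energies):
        min_by_comp[comp] = min(energy, min_by_comp.get(comp, np.inf))

    e_above = energies - np.array([min_by_comp[comp] for comp in compositions])
    return np.exp(-e_above / (K_B * temperature))


# Two entries at "AB" (0.1 eV apart) and one at "A2B".
weights = boltzmann_weights_above_composition(
    ["AB", "AB", "A2B"], [-1.0, -0.9, -2.0], temperature=2000
)
```

The lowest-energy entry at each composition gets weight 1, and higher-energy entries are down-weighted more strongly as the temperature decreases.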

weights_energy_above_hull(structures, energies, cs_structure, temperature=2000)

Compute weights for structure energy above the hull of given structures.

Parameters:
  • structures (list) – list of pymatgen Structures

  • energies (ndarray) – energies of corresponding Structures.

  • cs_structure (Structure) – The pymatgen Structure used to define the ClusterSubspace

  • temperature (float) – temperature to use in Boltzmann weight.

Returns:

weights for each structure.

Return type:

array
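To illustrate the idea of weighting by energy above the hull, the toy sketch below handles only a binary system described by a single mole fraction. It is not how the function above is implemented: the helper names are hypothetical, the hull is built with a simple monotone-chain scan instead of a phase-diagram library, and energies are assumed to be per atom in eV.

```python
import numpy as np

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def binary_energy_above_hull(fractions, energies):
    """Hypothetical helper: energy of each entry above the lower convex hull
    of (mole fraction, energy per atom) points in a binary system."""
    x = np.asarray(fractions, dtype=float)
    e = np.asarray(energies, dtype=float)

    # Lowest energy at each distinct composition, sorted by composition.
    ux, inverse = np.unique(x, return_inverse=True)
    e_min = np.full(ux.shape, np.inf)
    np.minimum.at(e_min, inverse, e)

    # Monotone-chain scan: drop any point that lies on or above the line
    # joining its neighbours, leaving only the lower hull vertices.
    hull = []
    for xi, ei in zip(ux, e_min):
        while len(hull) >= 2:
            (x1, e1), (x2, e2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (ei - e1) - (e2 - e1) * (xi - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append((xi, ei))

    hull_x, hull_e = zip(*hull)
    return e - np.interp(x, hull_x, hull_e)


def hull_weights(fractions, energies, temperature=2000):
    """Boltzmann-type weights exp(-E_above_hull / (k_B * T)); assumed form."""
    e_above = binary_energy_above_hull(fractions, energies)
    return np.exp(-e_above / (K_B * temperature))


x = [0.0, 0.5, 0.5, 1.0, 0.25]
e = [0.0, -1.0, -0.5, 0.0, -0.2]
e_above = binary_energy_above_hull(x, e)
w = hull_weights(x, e)
```

Entries on the hull have zero energy above it and therefore weight 1; the entry at fraction 0.25 lies above the tie line between the end member and the ground state at 0.5, so it is down-weighted even though it is the lowest-energy entry at its own composition.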