Structure selection

Structure selection methods from feature matrices.

select_added_rows(femat, old_femat, n_select=10, method='leverage', keep_indices=None, num_external_terms=0, domain_matrix=None)

Select structures to add to an existing CE project.

We select structures either by minimizing the leverage score under a given domain matrix, or fully at random. Refer to T. Mueller et al.

Parameters:
  • femat (2D arraylike) – Correlation vectors of new structures.

  • old_femat (2D arraylike) – Feature matrix of the structures already in the project.

  • n_select (int, optional) – Number of structures to select. Default is 10.

  • method (str, optional) – The method used to select structures. The default, “leverage”, maximizes the reduction in leverage score. “random” is also supported.

  • keep_indices (list of int, optional) – Indices of structures that must be selected, typically those of important ground-state structures.

  • num_external_terms (int, optional) – Number of external terms in the cluster subspace. These terms are not compared during structure selection. Default is 0.

  • domain_matrix (2D arraylike, optional) – The domain matrix used to compute the leverage score. By default, an identity matrix is used.

Returns:

Indices of selected rows in the feature matrix, corresponding to the selected structures.

Return type:

list of int
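To make the idea of leverage-based selection concrete, here is a minimal NumPy sketch of one plausible reading of the description above. It is not the library's implementation: the helper name `greedy_leverage_selection`, the `ridge` parameter, and the objective trace(D (XᵀX)⁻¹) (with D the domain matrix and X the stacked feature matrix) are all assumptions made for illustration. Handling of `keep_indices` and `num_external_terms` is left out for brevity.

```python
import numpy as np


def greedy_leverage_selection(femat, old_femat, n_select=10,
                              domain_matrix=None, ridge=1e-8):
    """Greedily pick new rows that most reduce trace(D @ inv(X.T @ X)).

    Hypothetical helper for illustration only; the objective is an
    assumed interpretation of "leverage score under a domain matrix".
    """
    femat = np.asarray(femat, dtype=float)
    old = np.asarray(old_femat, dtype=float)
    n_new, n_cols = femat.shape
    if domain_matrix is None:
        domain = np.eye(n_cols)  # identity domain matrix by default
    else:
        domain = np.asarray(domain_matrix, dtype=float)

    # Information matrix of the existing rows; the small ridge term
    # (an assumption) keeps it invertible if old_femat is rank deficient.
    info = old.T @ old + ridge * np.eye(n_cols)

    selected = []
    for _ in range(min(n_select, n_new)):
        best_index, best_score = None, np.inf
        for i in range(n_new):
            if i in selected:
                continue
            x = femat[i]
            # Score the pool as if candidate row i had been added.
            score = np.trace(domain @ np.linalg.inv(info + np.outer(x, x)))
            if score < best_score:
                best_index, best_score = i, score
        selected.append(best_index)
        # Commit the chosen row before evaluating the next pick.
        x_best = femat[best_index]
        info += np.outer(x_best, x_best)
    return selected
```

The greedy loop re-inverts the matrix for every candidate, which is simple but costs O(n·d³) per pick; a rank-one update of the inverse would be the usual optimization for large candidate pools.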

select_initial_rows(femat, n_select=10, method='leverage', num_external_terms=0, keep_indices=None)

Select structures to initialize an empty CE project.

Parameters:
  • femat (2D arraylike) – Correlation vectors of each structure.

  • n_select (int, optional) – Number of structures to select. Default is 10.

  • method (str, optional) – The method used to select structures. The default, “leverage”, is based on the leverage score and minimizes the Frobenius norm of the difference between the covariance matrix of all structures and the covariance matrix of the selection. “random” is also supported.

  • num_external_terms (int, optional) – Number of external terms in the cluster subspace. These terms are not compared during structure selection. Default is 0.

  • keep_indices (list of int, optional) – Indices of structures that must be selected, typically those of important ground-state structures.

Returns:

Indices of selected rows in the feature matrix, corresponding to the selected structures.

Return type:

list of int
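The covariance-matching criterion described for the “leverage” method can be sketched with a simple greedy loop. This is an illustrative sketch, not the library's implementation: the helper name `greedy_covariance_selection` is hypothetical, the covariance is taken as the uncentered AᵀA / n, and external terms are assumed to occupy the last `num_external_terms` columns of the feature matrix.

```python
import numpy as np


def greedy_covariance_selection(femat, n_select=10, keep_indices=None,
                                num_external_terms=0):
    """Greedily pick rows whose covariance best matches that of all rows.

    Hypothetical helper for illustration only.
    """
    femat = np.asarray(femat, dtype=float)
    n_rows, n_cols = femat.shape
    # Drop external-term columns (assumed to be the trailing columns).
    a = femat[:, : n_cols - num_external_terms]
    cov_all = a.T @ a / n_rows

    # Start from the rows that must be kept, removing duplicates
    # while preserving their order.
    selected = list(dict.fromkeys(keep_indices or []))
    n_select = min(n_select, n_rows)

    while len(selected) < n_select:
        best_index, best_norm = None, np.inf
        for i in range(n_rows):
            if i in selected:
                continue
            trial = a[selected + [i]]
            cov_trial = trial.T @ trial / len(trial)
            # Frobenius norm of the covariance mismatch for this trial set.
            norm = np.linalg.norm(cov_all - cov_trial, ord="fro")
            if norm < best_norm:
                best_index, best_norm = i, norm
        selected.append(best_index)
    return selected
```

For example, `greedy_covariance_selection(femat, n_select=5, keep_indices=[3])` returns five distinct row indices beginning with 3. Because the search is greedy, the result is only a local optimum of the Frobenius-norm objective.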