autoemxsp.utils.helper module

Utility Functions for EDS Spectrum Analysis and Data Handling

A collection of general-purpose utility functions and lightweight classes for data handling, visualization, and file management within the EDS analysis and modeling framework.

This module is designed to be self-contained and importable across different parts of the project, covering tasks from compositional conversions to structured file loading and formatted console output.

Main Features

Compositional Conversions
- atomic_to_weight_fr(): Convert atomic fractions to weight fractions.
- weight_to_atomic_fr(): Convert weight fractions to atomic fractions.

Both functions use pymatgen.Element for accurate atomic mass values.

Formula Handling
- get_std_comp_from_formula(): Parse chemical formulas into standardized element dictionaries.
- to_latex_formula(): Convert a chemical formula into LaTeX format (e.g., Fe2O3 → Fe$_2$O$_3$).

String and Table Utilities
- print_nice_1d_row(): Print a formatted 1D table row (with adjustable width and alignment).
- print_element_fractions_table(): Display element names with their atomic and weight percentages.
- print_single_separator(): Print a single horizontal line separator.
- print_double_separator(): Print a double horizontal line separator for clear sectioning.
- AlphabetMapper: Map integers to alphabetic letters (0 → ‘A’, 25 → ‘Z’, etc.).

Image Utilities
- draw_scalebar(): Draw a scale bar on an image using OpenCV, with optional text label and styling.

File and Directory Handling
- get_sample_dir(): Locate a directory named after a given sample ID by searching a root directory and its subdirectories.
- make_unique_path(): Generate a unique file or directory path, appending a counter if the base path already exists (e.g., file_1.png).
- load_configurations_from_json(): Reconstruct configuration dataclasses and metadata from a JSON file.
- extract_spectral_data(): Extract spectra, quantification data, and coordinates from a Data.csv file.
- load_msa(): Load and parse a .msa spectral file, returning energy and count arrays plus the header metadata.

User Interaction
- Prompt_User: Display a Tkinter prompt window for manual user confirmation (e.g., proceed/stop execution).

Error Handling
- EDSError: Custom exception class for handling EDS-related errors gracefully.

Created on Fri Jun 28 11:50:53 2024

@author: Andrea Giunto

autoemxsp.utils.helper.atomic_to_weight_fr(atomic_fractions: Sequence[float] | ndarray, elements: List[str], verbose: bool = True) → ndarray

Convert atomic fractions to weight fractions for a given set of elements.

Parameters:

atomic_fractions (Sequence[float] or np.ndarray) – The atomic fractions of the elements. Should sum to approximately 1.
elements (List[str]) – List of element symbols (e.g., [“Si”, “O”, “Al”]).
verbose (bool, optional) – If True, prints warnings about normalization. Default is True.

Returns:

weight_fractions – The corresponding weight fractions, summing to 1.

Return type:

np.ndarray

Raises:

ValueError – If the lengths of atomic_fractions and elements do not match.

Examples

>>> atomic_to_weight_fr([0.333, 0.667], ["Si", "O"])
array([0.467..., 0.532...])
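
The conversion weights each atomic fraction by its element's atomic mass and renormalizes. A minimal sketch of that arithmetic follows. It assumes a small hard-coded mass table (`ATOMIC_MASS`, hypothetical) in place of the pymatgen.Element lookup, and returns a plain list rather than an ndarray to stay dependency-free. `atomic_to_weight_sketch` is an illustrative name, not the module's implementation.

```python
# Hypothetical mass table (g/mol) standing in for the pymatgen.Element lookup.
ATOMIC_MASS = {"Si": 28.0855, "O": 15.9994, "Al": 26.9815}


def atomic_to_weight_sketch(atomic_fractions, elements):
    """Illustrative only: w_i = x_i * M_i / sum_j(x_j * M_j)."""
    if len(atomic_fractions) != len(elements):
        raise ValueError("atomic_fractions and elements must have the same length")
    # Mass contributed by each element per mole of atoms.
    weighted = [x * ATOMIC_MASS[el] for x, el in zip(atomic_fractions, elements)]
    total = sum(weighted)
    return [w / total for w in weighted]


weights = atomic_to_weight_sketch([0.333, 0.667], ["Si", "O"])
print(weights)
```

Dividing by the total also renormalizes inputs that do not sum exactly to 1.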

autoemxsp.utils.helper.weight_to_atomic_fr(weight_fractions: Sequence[float] | ndarray, elements: List[str], verbose: bool = True) → ndarray

Convert weight fractions to atomic fractions for a given set of elements.

Parameters:

weight_fractions (Sequence[float] or np.ndarray) – The weight fractions of the elements. Should sum to approximately 1.
elements (List[str]) – List of element symbols (e.g., [“Si”, “O”, “Al”]).
verbose (bool, optional) – If True, prints warnings about normalization. Default is True.

Returns:

atomic_fractions – The corresponding atomic fractions, summing to 1.

Return type:

np.ndarray

Raises:

ValueError – If the lengths of weight_fractions and elements do not match.

Examples

>>> weight_to_atomic_fr([0.467, 0.533], ["Si", "O"])
array([0.332..., 0.667...])
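
The inverse conversion divides each weight fraction by its atomic mass to get relative mole amounts, then renormalizes. The sketch below makes the same assumptions as any standalone illustration of this module: a hypothetical hard-coded `ATOMIC_MASS` table instead of pymatgen.Element, and a plain list as the return value. `weight_to_atomic_sketch` is an illustrative name only.

```python
# Hypothetical mass table (g/mol) standing in for the pymatgen.Element lookup.
ATOMIC_MASS = {"Si": 28.0855, "O": 15.9994}


def weight_to_atomic_sketch(weight_fractions, elements):
    """Illustrative only: x_i = (w_i / M_i) / sum_j(w_j / M_j)."""
    if len(weight_fractions) != len(elements):
        raise ValueError("weight_fractions and elements must have the same length")
    # Relative number of moles of each element per unit mass.
    moles = [w / ATOMIC_MASS[el] for w, el in zip(weight_fractions, elements)]
    total = sum(moles)
    return [m / total for m in moles]


atomic = weight_to_atomic_sketch([0.467, 0.533], ["Si", "O"])
print(atomic)
```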

autoemxsp.utils.helper.to_latex_formula(formula: str, include_dollar_signs: bool = True) → str

Convert a chemical formula string into its LaTeX representation.

Supports nested parentheses, element subscripts, and group multipliers. For example:

‘Al2(SO4)3’ -> ‘Al$_{2}$(SO$_{4}$)$_{3}$’

Parameters:

formula (str) – The chemical formula as a string.
include_dollar_signs (bool, optional) – Whether to wrap the LaTeX in $…$. Default is True.

Returns:

The LaTeX-formatted formula.

Return type:

str
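
One way to reproduce the documented ‘Al2(SO4)3’ example is a single regular-expression substitution that subscripts every number, leaving element symbols and parentheses untouched. This is only a sketch: `to_latex_sketch` is a hypothetical name, and its output when include_dollar_signs is False (bare `_{n}` subscripts) is an assumption, not documented behavior.

```python
import re


def to_latex_sketch(formula, include_dollar_signs=True):
    """Illustrative only: turn every number in the formula into a subscript."""
    # Assumption: without dollar signs, bare _{n} subscripts are emitted.
    template = r"$_{\1}$" if include_dollar_signs else r"_{\1}"
    # \d+(?:\.\d+)? matches integer or decimal multipliers such as 2 or 0.5.
    return re.sub(r"(\d+(?:\.\d+)?)", template, formula)


latex = to_latex_sketch("Al2(SO4)3")
print(latex)  # Al$_{2}$(SO$_{4}$)$_{3}$
```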

autoemxsp.utils.helper.print_nice_1d_row(first_col: Any, row: Sequence[Any], floatfmt: str = '.2f', first_col_width: int = 10, col_width: int = 10) → None

Print a single table row: the first entry (label) is left-aligned and the remaining entries are printed as formatted floats. Used to print quantification results during ZAF corrections.

Parameters:

first_col – The value for the first (label) column.
row – Sequence of values for the remaining columns (numbers or strings).
floatfmt – Format for floating point numbers (default: ‘.2f’).
first_col_width – Width for the first (label) column.
col_width – Width for each numeric column.
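
The layout described by these parameters can be sketched with format specifications. The helper below returns the row as a string instead of printing it, so the result can be inspected. `format_row_sketch` is a hypothetical name, and right-aligning the value columns is an assumption.

```python
def format_row_sketch(first_col, row, floatfmt=".2f", first_col_width=10, col_width=10):
    """Illustrative only: build the row string instead of printing it."""
    # Label column: left-aligned and padded to first_col_width.
    cells = [f"{str(first_col):<{first_col_width}}"]
    for value in row:
        # Floats get the numeric format; everything else is shown as-is.
        text = format(value, floatfmt) if isinstance(value, float) else str(value)
        # Assumption: value columns are right-aligned.
        cells.append(f"{text:>{col_width}}")
    return "".join(cells)


line = format_row_sketch("Si", [46.71, 53.3, "n/a"])
print(line)
```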

autoemxsp.utils.helper.print_element_fractions_table(formula)

Given a chemical formula, print a table with elements, atomic %, and weight % using pymatgen.

autoemxsp.utils.helper.print_single_separator()

Print a single-line separator (50 dashes) for visual clarity.

autoemxsp.utils.helper.print_double_separator()

Print a double-line separator (50 equals signs) for visual clarity.

class autoemxsp.utils.helper.AlphabetMapper

Bases: object

Maps a zero-based integer index to Excel-style column letters. 0 -> ‘A’, 1 -> ‘B’, …, 25 -> ‘Z’, 26 -> ‘AA’, etc.

Used for labeling frames analyzed during particle search.

get_letter(index: int) → str

Convert a zero-based index to Excel-style column letters.

Parameters: index (int) – Zero-based index (e.g., 0 for ‘A’, 25 for ‘Z’, 26 for ‘AA’, …)
Returns: Excel-style column letters.
Return type: str
Raises: ValueError – If index is negative.
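
Excel-style lettering is a bijective base-26 numbering: there is no zero digit, so ‘Z’ is followed by ‘AA’ rather than ‘BA’. A minimal sketch of the mapping follows. `index_to_letters` is a hypothetical standalone function, not the class method itself.

```python
def index_to_letters(index):
    """Illustrative only: zero-based index -> Excel-style column letters."""
    if index < 0:
        raise ValueError("index must be non-negative")
    letters = ""
    n = index + 1  # work 1-based so that 'Z' rolls over to 'AA'
    while n > 0:
        # Subtract 1 before dividing because the numbering has no zero digit.
        n, remainder = divmod(n - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


labels = [index_to_letters(i) for i in (0, 25, 26, 27)]
print(labels)  # ['A', 'Z', 'AA', 'AB']
```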

autoemxsp.utils.helper.load_configurations_from_json(json_path, config_classes_dict)

Load configuration dataclasses and metadata from a spectrum collection info JSON file.

Parameters:

json_path (str) – Path to the JSON file saved by EMXSp_Composition_Analyzer._save_spectrum_collection_info.
config_classes_dict (dict) –
Mapping from JSON keys to dataclass types, e.g.:
{‘sample_cfg’: SampleConfig, …}

See configurations in tools/config_classes.py:
- MicroscopeConfig: Settings for microscope hardware, calibration, and imaging parameters.
- SampleConfig: Defines the sample’s identity, elements, and spatial properties.
- SampleSubstrateConfig: Specifies the substrate composition and geometry supporting the sample.
- MeasurementConfig: Controls measurement type, beam parameters, and acquisition settings.
- FittingConfig: Parameters for spectral fitting and background handling.
- QuantConfig: Options for quantification and filtering of X-ray spectra.
- PowderMeasurementConfig: Settings for analyzing powder samples and particle selection.
- BulkMeasurementConfig: Settings for analyzing bulk samples.
- ExpStandardsConfig: Settings for experimental standard measurements.
- ClusteringConfig: Configures clustering algorithms and feature selection for data grouping.
- PlotConfig: Options for saving, displaying, and customizing plots.

Returns:

configs (dict) – Dictionary of configuration objects reconstructed from JSON, keyed by their JSON key. If a key from config_classes_dict is missing in the JSON, it will not be present in configs.
metadata (dict) – Dictionary of any additional metadata (e.g., timestamp) found in the JSON.

Raises:

FileNotFoundError – If the JSON file does not exist.

Example

>>> config_classes = {'sample_cfg': SampleConfig, ...}
>>> configs, metadata = load_configurations_from_json('acquisition_info.json', config_classes)

autoemxsp.utils.helper.extract_spectral_data(data_csv_path)

Extract spectrum quantification results, spectral data, and coordinates from a Data.csv file.

Parameters:

data_csv_path (str) – Path to the Data.csv file.
cnst (module or object) – Should provide all the necessary attribute keys as in your code.

Returns:

spectra_quant (list of dict or None) – List of quantification results per spectrum (None if not quantified). If all entries are None, returns None.
spectral_data (dict) – Dictionary of lists for each spectral data column (e.g., spectrum, background, real_time, live_time, etc.). If a column is missing, the value is an empty list.
sp_coords (list of dict) – List of dicts for each spectrum’s coordinates, keys as in cnst.LIST_SPECTRUM_COORDINATES_KEYS.
df (pandas.DataFrame) – The loaded DataFrame from the CSV file.

autoemxsp.utils.helper.make_unique_path(parent_dir: str | Path, base_name: str | Path, extension: str | None = None) → Path

Generate a unique file or directory path inside parent_dir based on base_name. If a path with the base name already exists, a counter is appended (e.g., ‘Sample1_2’). An extension can optionally be supplied for files.

Parameters:

parent_dir (str or Path) – The parent directory in which to generate the new path.
base_name (str or Path) – The base name for the new file or directory.
extension (str, optional) – The file extension (e.g., ‘txt’ or ‘.txt’). If None, treat as directory.

Returns:

unique_path – The full, unique path (not created).

Return type:

Path

Raises:

ValueError – If parent_dir or base_name is invalid.
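
The counter logic can be sketched with pathlib. `make_unique_path_sketch` is a hypothetical name, and starting the counter at 1 is an assumption: the real function may number its first duplicate differently. Input validation is omitted.

```python
import tempfile
from pathlib import Path


def make_unique_path_sketch(parent_dir, base_name, extension=None):
    """Illustrative only: append _1, _2, ... until the path is unused."""
    # Accept both 'png' and '.png'; no extension means a directory path.
    suffix = "" if extension is None else "." + extension.lstrip(".")
    parent = Path(parent_dir)
    candidate = parent / f"{base_name}{suffix}"
    counter = 1  # assumption: the real counter may start at a different value
    while candidate.exists():
        candidate = parent / f"{base_name}_{counter}{suffix}"
        counter += 1
    return candidate  # the path is returned, not created


with tempfile.TemporaryDirectory() as tmp:
    first = make_unique_path_sketch(tmp, "plot", "png")
    first.touch()  # occupy the base path so the next call must add a counter
    second = make_unique_path_sketch(tmp, "plot", ".png")
    names = (first.name, second.name)
print(names)  # ('plot.png', 'plot_1.png')
```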

autoemxsp.utils.helper.get_sample_dir(results_path: str, sample_ID: str, case_insensitive: bool = True, verbose: bool = False) → str

Find a directory named sample_ID under results_path or its subdirectories.

Strategy:

- Walk the entire directory tree under results_path and collect exact matches.
- If exactly one match -> return its full path.
- If multiple matches -> raise RuntimeError (ambiguous).
- If none -> show close matches and raise FileNotFoundError.

Parameters:

results_path (str) – Root directory to search from.
sample_ID (str) – Directory name to search for (exact match).
case_insensitive (bool, optional) – Whether to ignore case when matching. Default is True.
verbose (bool, optional) – Print additional debug information. Default is False.

Returns:

sample_dir – Full path to the matched directory.

Return type:

str

Raises:

RuntimeError – If multiple matches are found.
FileNotFoundError – If no match is found.
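
The search strategy can be sketched with os.walk. `find_sample_dir_sketch` is a hypothetical name, and using difflib.get_close_matches to suggest near misses is an assumption about how "close matches" might be produced. The verbose option is left out.

```python
import difflib
import os
import tempfile


def find_sample_dir_sketch(results_path, sample_ID, case_insensitive=True):
    """Illustrative only: walk the tree and require exactly one name match."""
    norm = (lambda s: s.lower()) if case_insensitive else (lambda s: s)
    matches, all_names = [], []
    for root, dirs, _files in os.walk(results_path):
        for name in dirs:
            all_names.append(name)
            if norm(name) == norm(sample_ID):
                matches.append(os.path.join(root, name))
    if len(matches) > 1:
        raise RuntimeError(f"Ambiguous sample_ID {sample_ID!r}: {matches}")
    if not matches:
        # Assumption: difflib supplies the 'close matches' shown to the user.
        close = difflib.get_close_matches(sample_ID, all_names, n=3)
        raise FileNotFoundError(f"No directory named {sample_ID!r}; close matches: {close}")
    return matches[0]


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "batch1", "Sample_A"))
    found = os.path.basename(find_sample_dir_sketch(tmp, "sample_a"))
    try:
        find_sample_dir_sketch(tmp, "Sample_B")
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True
```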

autoemxsp.utils.helper.load_msa(filepath: str) → Tuple[ndarray, ndarray, Dict[str, str]]

Load a .msa or .msg file containing Y-only spectral data and compute the energy scale. Designed for raw spectra exported from Thermo Fisher Phenom systems. May work with other EMSA/MAS format files, but minor variations can occur.

Parameters:

filepath (str) – Path to the .msa or .msg file.

Returns:

energy (np.ndarray) – Energy values computed from OFFSET and XPERCHAN (in eV).
counts (np.ndarray) – Measured counts per energy channel.
metadata (dict) – Parsed metadata from the header (all values as strings).
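
For Y-only EMSA/MAS data, the energy of channel i follows from the header as OFFSET + i * XPERCHAN. The sketch below parses such a file from a string and applies that relation. `parse_msa_sketch` and the sample text are hypothetical, the result is returned as plain lists rather than arrays, and vendor-specific header variations are not handled.

```python
def parse_msa_sketch(text):
    """Illustrative only: parse Y-only EMSA/MAS text into energy, counts, metadata."""
    metadata, counts, in_data = {}, [], False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header lines look like '#KEYWORD : value'.
            key, _, value = line[1:].partition(":")
            key = key.strip().upper()
            if key == "SPECTRUM":
                in_data = True  # counts start on the next line
            elif key == "ENDOFDATA":
                break
            else:
                metadata[key] = value.strip()
        elif in_data:
            # Y-only data: one or more comma-separated counts per line.
            counts.extend(float(v) for v in line.split(",") if v.strip())
    offset = float(metadata["OFFSET"])
    step = float(metadata["XPERCHAN"])
    energy = [offset + step * i for i in range(len(counts))]
    return energy, counts, metadata


SAMPLE = """#FORMAT      : EMSA/MAS Spectral Data File
#XPERCHAN    : 10.0
#OFFSET      : -20.0
#SPECTRUM    : Spectral Data Starts Here
5, 7,
9
#ENDOFDATA   :
"""
energy, counts, metadata = parse_msa_sketch(SAMPLE)
print(energy, counts)
```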

autoemxsp.utils.helper.draw_scalebar(image, pixel_size_um, bar_width=0.25)

Draw a scale bar on the given image.

The scale bar is drawn as a filled white rectangle in the bottom-left corner of the image, with a label indicating its length in micrometers (um). The actual scale bar length is chosen from a set of standard values (0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 50, 100, 200 um) to best match the ‘bar_width’, expressed as fraction of image width.

Parameters:

image (np.ndarray) – Input image (grayscale or color, as a NumPy array). The scale bar will be drawn directly on this image.
pixel_size_um (float) – Size of a pixel in micrometers (um/pixel).
bar_width (float, optional) – Target width of scale bar, in fraction of image width. Default: 0.25

Returns:

image – The same image with the scale bar and label drawn on it.

Return type:

np.ndarray

Notes

The function detects if the image is grayscale or color and draws the scale bar in white accordingly.
The scale bar is placed in the bottom-left corner with a margin from the edges.
The label is centered above the scale bar rectangle.
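
The choice of bar length can be sketched without any drawing code. The target length is bar_width times the image width in micrometers, and the standard value nearest to it is selected. `pick_scalebar_sketch` is a hypothetical helper, and reading "best match" as the smallest absolute difference is an assumption.

```python
# Standard lengths listed in the description above, in micrometers.
STANDARD_LENGTHS_UM = [0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 50, 100, 200]


def pick_scalebar_sketch(image_width_px, pixel_size_um, bar_width=0.25):
    """Illustrative only: choose the standard length closest to the target width."""
    # Target bar length in micrometers for the requested fraction of the image.
    target_um = bar_width * image_width_px * pixel_size_um
    # Assumption: 'best match' means the smallest absolute difference.
    length_um = min(STANDARD_LENGTHS_UM, key=lambda length: abs(length - target_um))
    # Convert the chosen physical length back to pixels for drawing.
    length_px = int(round(length_um / pixel_size_um))
    return length_um, length_px


length_um, length_px = pick_scalebar_sketch(1000, 0.1)
print(length_um, length_px)  # 20 200
```

For a 1000-pixel-wide image at 0.1 um/pixel, the target is 25 um, so the 20 um bar (200 pixels) is chosen over the 50 um one.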

class autoemxsp.utils.helper.Prompt_User(title: str, message: str)

Bases: object

A simple GUI prompt using Tkinter to display a message and wait for user confirmation.

This is useful for pausing execution and prompting the user to take action, such as selecting a position at the microscope for manual EDS spectrum collection.

title

The window title.

Type: str

message

The message to display in the prompt.

Type: str

execution_stopped

True if the user closed the window or pressed Esc.

Type: bool

ok_pressed

True if the user pressed OK or Return.

Type: bool

root

The Tkinter root window.

Type: tk.Tk or None

press_ok()

Handle OK button or Return key press.

stop_execution()

Handle window close (X) or Esc key press.

run()

Start the prompt window and wait for user interaction. Sets ok_pressed or execution_stopped depending on user action.

exception autoemxsp.utils.helper.RefLineError

Bases: Exception

Exception raised for errors related to reference lines.

exception autoemxsp.utils.helper.MissingHintError

Bases: Exception

Exception raised when a required hint is missing.

exception autoemxsp.utils.helper.EMError(message: str, code: int | None = None)

Bases: Exception

Custom exception class for electron microscope (EM)-related errors.

Parameters:

message (str) – Description of the error.
code (int, optional) – Optional error code.

exception autoemxsp.utils.helper.EDSError(message: str, code: int | None = None)

Bases: EMError

Custom exception class for EDS-related errors.

Parameters:

message (str) – Description of the error.
code (int, optional) – Optional error code.
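
Because EDSError derives from EMError, a handler written for the base class also catches EDS-specific failures. The sketch below mirrors that hierarchy with hypothetical stand-in classes. Storing message and code as attributes is an assumption about the real implementation.

```python
class EMErrorSketch(Exception):
    """Illustrative stand-in for EMError: a message plus an optional code."""

    def __init__(self, message, code=None):
        super().__init__(message)
        self.message = message
        self.code = code  # assumption: the code is kept as an attribute


class EDSErrorSketch(EMErrorSketch):
    """Illustrative stand-in for EDSError."""


try:
    raise EDSErrorSketch("Detector not responding", code=3)
except EMErrorSketch as err:  # the base-class handler also catches the subclass
    caught = (type(err).__name__, str(err), err.code)
print(caught)
```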