autoemxsp.utils.helper module

Utility Functions for EDS Spectrum Analysis and Data Handling

A collection of general-purpose utility functions and lightweight classes for data handling, visualization, and file management within the EDS analysis and modeling framework.

This module is designed to be self-contained and importable across different parts of the project, covering tasks from compositional conversions to structured file loading and formatted console output.


Main Features

Compositional Conversions
  • atomic_to_weight_fr(): Convert atomic fractions to weight fractions.

  • weight_to_atomic_fr(): Convert weight fractions to atomic fractions.

Both functions use pymatgen.Element for accurate atomic mass values.

Formula Handling
  • get_std_comp_from_formula(): Parse chemical formulas into standardized element dictionaries.

  • to_latex_formula(): Convert a chemical formula into LaTeX format (e.g., Fe2O3 → Fe$_2$O$_3$).

String and Table Utilities
  • print_nice_1d_row(): Print a formatted 1D table row (with adjustable column widths and float format).

  • print_element_fractions_table(): Display element names with their atomic and weight percentages.

  • print_single_separator(): Print a single horizontal line separator.

  • print_double_separator(): Print a double horizontal line separator for clear sectioning.

  • AlphabetMapper: Map zero-based integers to Excel-style column letters (0 → ‘A’, 25 → ‘Z’, 26 → ‘AA’, etc.).

Image Utilities
  • draw_scalebar(): Draw a scale bar on an image using OpenCV, with optional text label and styling.

File and Directory Handling
  • get_sample_dir(): Locate a directory named after a given sample ID, searching subdirectories recursively.

  • make_unique_path(): Generate a unique file or directory path if one already exists (e.g., Sample1_2.txt).

  • load_configurations_from_json(): Reconstruct configuration dataclasses and metadata from a JSON file.

  • extract_spectral_data(): Extract spectra, quantification data, and coordinates from a Data.csv file.

  • load_msa(): Load and parse a .msa spectral file, returning energy and intensity arrays plus header metadata.

User Interaction
  • Prompt_User: Display a Tkinter prompt window for manual user confirmation (e.g., proceed/stop execution).

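Error Handling
  • EMError: Custom exception class for electron microscope (EM)-related errors.

  • EDSError: Custom exception class (derived from EMError) for handling EDS-related errors gracefully.

  • RefLineError, MissingHintError: Exceptions for reference-line errors and missing required hints.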

Created on Fri Jun 28 11:50:53 2024

@author: Andrea Giunto

autoemxsp.utils.helper.atomic_to_weight_fr(atomic_fractions: Sequence[float] | ndarray, elements: List[str], verbose: bool = True) -> ndarray

Convert atomic fractions to weight fractions for a given set of elements.

Parameters:
  • atomic_fractions (Sequence[float] or np.ndarray) – The atomic fractions of the elements. Should sum to approximately 1.

  • elements (List[str]) – List of element symbols (e.g., ["Si", "O", "Al"]).

  • verbose (bool, optional) – If True, prints warnings about normalization. Default is True.

Returns:

weight_fractions – The corresponding weight fractions, summing to 1.

Return type:

np.ndarray

Raises:

ValueError – If the lengths of atomic_fractions and elements do not match.

Examples

>>> atomic_to_weight_fr([0.333, 0.667], ["Si", "O"])
array([0.467..., 0.532...])
autoemxsp.utils.helper.weight_to_atomic_fr(weight_fractions: Sequence[float] | ndarray, elements: List[str], verbose: bool = True) -> ndarray

Convert weight fractions to atomic fractions for a given set of elements.

Parameters:
  • weight_fractions (Sequence[float] or np.ndarray) – The weight fractions of the elements. Should sum to approximately 1.

  • elements (List[str]) – List of element symbols (e.g., ["Si", "O", "Al"]).

  • verbose (bool, optional) – If True, prints warnings about normalization. Default is True.

Returns:

atomic_fractions – The corresponding atomic fractions, summing to 1.

Return type:

np.ndarray

Raises:

ValueError – If the lengths of weight_fractions and elements do not match.

Examples

>>> weight_to_atomic_fr([0.467, 0.533], ["Si", "O"])
array([0.332..., 0.667...])
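
The two conversions documented above are inverses of each other, and the arithmetic can be sketched without any dependency. This is a minimal illustration, not the module's implementation: the names atomic_to_weight, weight_to_atomic, and ATOMIC_MASS are hypothetical, and the atomic masses are hard-coded here, whereas the module is documented as reading them from pymatgen.

```python
# Standard atomic masses in g/mol, hard-coded for this sketch only
# (assumption: the real module obtains them from pymatgen instead).
ATOMIC_MASS = {"Si": 28.0855, "O": 15.9994, "Al": 26.9815}


def atomic_to_weight(atomic_fractions, elements):
    """Weight fraction w_i = x_i * M_i / sum_j(x_j * M_j)."""
    if len(atomic_fractions) != len(elements):
        raise ValueError("atomic_fractions and elements must have the same length")
    masses = [x * ATOMIC_MASS[el] for x, el in zip(atomic_fractions, elements)]
    total = sum(masses)
    return [m / total for m in masses]


def weight_to_atomic(weight_fractions, elements):
    """Atomic fraction x_i = (w_i / M_i) / sum_j(w_j / M_j)."""
    if len(weight_fractions) != len(elements):
        raise ValueError("weight_fractions and elements must have the same length")
    moles = [w / ATOMIC_MASS[el] for w, el in zip(weight_fractions, elements)]
    total = sum(moles)
    return [n / total for n in moles]


w = atomic_to_weight([1 / 3, 2 / 3], ["Si", "O"])  # SiO2 stoichiometry
x = weight_to_atomic(w, ["Si", "O"])               # round trip back to atomic fractions
```

Dividing by the total in each function also normalizes the result, so inputs that do not sum exactly to 1 still yield fractions that do.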
autoemxsp.utils.helper.to_latex_formula(formula: str, include_dollar_signs: bool = True) -> str

Convert a chemical formula string into its LaTeX representation.

Supports nested parentheses, element subscripts, and group multipliers. For example:

'Al2(SO4)3' -> 'Al$_{2}$(SO$_{4}$)$_{3}$'

Parameters:
  • formula (str) – The chemical formula as a string.

  • include_dollar_signs (bool) – Whether to wrap the LaTeX in $…$.

Returns:

The LaTeX-formatted formula.

Return type:

str
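
One way to obtain the output shown in the example above is a single regular-expression substitution that turns every number in the formula into a subscript. This is a sketch under assumptions: sketch_latex_formula is a hypothetical name, and the real function may parse the formula differently or interpret include_dollar_signs in another way.

```python
import re


def sketch_latex_formula(formula, include_dollar_signs=True):
    """Wrap every integer or decimal number in a LaTeX subscript."""
    # Assumption: each subscript is wrapped individually in $...$,
    # matching the 'Al2(SO4)3' example in the documentation.
    template = r"$_{\1}$" if include_dollar_signs else r"_{\1}"
    return re.sub(r"(\d+(?:\.\d+)?)", template, formula)


with_dollars = sketch_latex_formula("Al2(SO4)3")
without_dollars = sketch_latex_formula("Fe2O3", include_dollar_signs=False)
```

Because parentheses are left untouched and only the numbers are rewritten, group multipliers such as the trailing 3 in Al2(SO4)3 are handled by the same rule as element subscripts.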

autoemxsp.utils.helper.print_nice_1d_row(first_col: Any, row: Sequence[Any], floatfmt: str = '.2f', first_col_width: int = 10, col_width: int = 10) -> None

Print a single table row: the first entry (label) is left-aligned and the remaining entries are printed as formatted floats. Used to print quantification results during ZAF corrections.

Parameters:
  • first_col – The value for the first (label) column.

  • row – Sequence of values for the remaining columns (numbers or strings).

  • floatfmt – Format for floating point numbers (default: ‘.2f’).

  • first_col_width – Width for the first (label) column.

  • col_width – Width for each numeric column.
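
The layout described by these parameters can be sketched with Python format specifications. The helper below is hypothetical (format_1d_row is not part of the module) and returns the string instead of printing it; right-aligning the value columns is an assumption, since the documentation only states that the label is left-aligned.

```python
def format_1d_row(first_col, row, floatfmt=".2f", first_col_width=10, col_width=10):
    """Build one table row: left-aligned label, then fixed-width value cells."""
    cells = [f"{str(first_col):<{first_col_width}}"]
    for value in row:
        # Floats use the requested format; anything else is shown as-is.
        text = format(value, floatfmt) if isinstance(value, float) else str(value)
        cells.append(f"{text:>{col_width}}")  # assumption: values are right-aligned
    return "".join(cells)


line = format_1d_row("Si", [46.743, 53.257, "ok"])
print(line)
```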

autoemxsp.utils.helper.print_element_fractions_table(formula)

Given a chemical formula, print a table with elements, atomic %, and weight % using pymatgen.

autoemxsp.utils.helper.print_single_separator()

Print a single-line separator (50 dashes) for visual clarity.

autoemxsp.utils.helper.print_double_separator()

Print a double-line separator (50 equals signs) for visual clarity.

class autoemxsp.utils.helper.AlphabetMapper

Bases: object

Maps a zero-based integer index to Excel-style column letters. 0 -> ‘A’, 1 -> ‘B’, …, 25 -> ‘Z’, 26 -> ‘AA’, etc.

Used for labeling frames analyzed during particle search.

get_letter(index: int) -> str

Convert a zero-based index to Excel-style column letters.

Parameters:

index (int) – Zero-based index (e.g., 0 for ‘A’, 25 for ‘Z’, 26 for ‘AA’, …)

Returns:

Excel-style column letters.

Return type:

str

Raises:

ValueError – If index is negative.
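
Excel-style lettering is a bijective base-26 numbering: there is no zero digit, so ‘Z’ is followed by ‘AA’ rather than ‘BA’. A minimal sketch of that conversion follows; index_to_letters is a hypothetical standalone function, not the class method itself.

```python
def index_to_letters(index):
    """Convert a zero-based index to Excel-style column letters."""
    if index < 0:
        raise ValueError("index must be non-negative")
    letters = ""
    n = index + 1  # work with a 1-based number: 1 -> 'A', 26 -> 'Z', 27 -> 'AA'
    while n > 0:
        # Subtracting 1 before dividing is what makes the numbering bijective.
        n, remainder = divmod(n - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


labels = [index_to_letters(i) for i in (0, 25, 26, 27, 701, 702)]
```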

autoemxsp.utils.helper.load_configurations_from_json(json_path, config_classes_dict)

Load configuration dataclasses and metadata from a spectrum collection info JSON file.

Parameters:
  • json_path (str) – Path to the JSON file saved by EMXSp_Composition_Analyzer._save_spectrum_collection_info.

  • config_classes_dict (dict) –

    Mapping from JSON keys to dataclass types, e.g.:

    {'sample_cfg': SampleConfig, …}

    See configurations in tools/config_classes.py:
    • MicroscopeConfig: Settings for microscope hardware, calibration, and imaging parameters.

    • SampleConfig: Defines the sample’s identity, elements, and spatial properties.

    • SampleSubstrateConfig: Specifies the substrate composition and geometry supporting the sample.

    • MeasurementConfig: Controls measurement type, beam parameters, and acquisition settings.

    • FittingConfig: Parameters for spectral fitting and background handling.

    • QuantConfig: Options for quantification and filtering of X-ray spectra.

    • PowderMeasurementConfig: Settings for analyzing powder samples and particle selection.

    • BulkMeasurementConfig: Settings for analyzing bulk samples.

    • ExpStandardsConfig: Settings for experimental standard measurements.

    • ClusteringConfig: Configures clustering algorithms and feature selection for data grouping.

    • PlotConfig: Options for saving, displaying, and customizing plots.

Returns:

  • configs (dict) – Dictionary of configuration objects reconstructed from JSON, keyed by their JSON key. If a key from config_classes_dict is missing in the JSON, it will not be present in configs.

  • metadata (dict) – Dictionary of any additional metadata (e.g., timestamp) found in the JSON.

Raises:

FileNotFoundError – If the JSON file does not exist.

Example

>>> config_classes = {'sample_cfg': SampleConfig, ...}
>>> configs, metadata = load_configurations_from_json('acquisition_info.json', config_classes)
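
The split between configs and metadata described above can be illustrated with the standard library alone. The sketch below is not the module's code: DemoSampleConfig and its fields are hypothetical stand-ins for the real dataclasses, and it assumes each JSON entry is a flat dictionary of constructor arguments.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass


@dataclass
class DemoSampleConfig:  # hypothetical stand-in for SampleConfig
    sample_id: str
    elements: list


def load_configs_sketch(json_path, config_classes_dict):
    """Rebuild dataclasses for known keys; treat every other key as metadata."""
    with open(json_path, "r", encoding="utf-8") as fh:  # raises FileNotFoundError if missing
        raw = json.load(fh)
    configs, metadata = {}, {}
    for key, value in raw.items():
        if key in config_classes_dict:
            configs[key] = config_classes_dict[key](**value)
        else:
            metadata[key] = value
    return configs, metadata


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "acquisition_info.json")
    payload = {
        "sample_cfg": asdict(DemoSampleConfig("S1", ["Si", "O"])),
        "timestamp": "2024-06-28",
    }
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh)
    # 'quant_cfg' is requested but absent from the file, so it is simply skipped.
    configs, metadata = load_configs_sketch(
        path, {"sample_cfg": DemoSampleConfig, "quant_cfg": dict}
    )
```

Iterating over the keys found in the file, rather than over config_classes_dict, is what leaves a missing key out of configs instead of raising an error.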
autoemxsp.utils.helper.extract_spectral_data(data_csv_path)

Extract spectra quantification, spectral data, and coordinates from a Data.csv file.

Parameters:
  • data_csv_path (str) – Path to the Data.csv file.

  • cnst (module or object) – Listed in the docstring but not present in the signature above. It provides the attribute keys used to read the file (e.g., cnst.LIST_SPECTRUM_COORDINATES_KEYS).

Returns:

  • spectra_quant (list of dict or None) – List of quantification results per spectrum (None if not quantified). If all entries are None, returns None.

  • spectral_data (dict) – Dictionary of lists for each spectral data column (e.g., spectrum, background, real_time, live_time, etc.). If a column is missing, the value is an empty list.

  • sp_coords (list of dict) – List of dicts for each spectrum’s coordinates, keys as in cnst.LIST_SPECTRUM_COORDINATES_KEYS.

  • df (pandas.DataFrame) – The loaded DataFrame from the CSV file.

autoemxsp.utils.helper.make_unique_path(parent_dir: str, base_name: str, extension: str = None) -> str

Generate a unique file or directory path inside parent_dir based on base_name. If a path with the base name exists, appends a counter (e.g., ‘Sample1_2’). Optionally, add an extension for files.

Parameters:
  • parent_dir (str) – The parent directory in which to generate the new path.

  • base_name (str) – The base name for the new file or directory.

  • extension (str, optional) – The file extension (e.g., ‘txt’ or ‘.txt’). If None, treat as directory.

Returns:

unique_path – The full, unique path (not created).

Return type:

str

Raises:

ValueError – If parent_dir or base_name is invalid.

Example

>>> make_unique_path('./results', 'Sample1')
'./results/Sample1'
>>> make_unique_path('./results', 'Sample1', extension='txt')
'./results/Sample1.txt'
>>> make_unique_path('./results', 'Sample1', extension='txt')  # If exists
'./results/Sample1_2.txt'
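
The counter behavior in the example above can be sketched as a loop over candidate names. This is an illustrative reimplementation under assumptions: unique_path_sketch is a hypothetical name, and starting the counter at 2 is inferred from the 'Sample1_2' example rather than confirmed by the source.

```python
import os
import tempfile


def unique_path_sketch(parent_dir, base_name, extension=None):
    """Return a path under parent_dir that does not exist yet (nothing is created)."""
    if not parent_dir or not base_name:
        raise ValueError("parent_dir and base_name must be non-empty strings")
    suffix = ""
    if extension:
        # Accept both 'txt' and '.txt'.
        suffix = extension if extension.startswith(".") else "." + extension
    candidate = os.path.join(parent_dir, base_name + suffix)
    counter = 2  # assumption: the first fallback name is '<base_name>_2'
    while os.path.exists(candidate):
        candidate = os.path.join(parent_dir, f"{base_name}_{counter}{suffix}")
        counter += 1
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    first = unique_path_sketch(tmp, "Sample1", "txt")
    open(first, "w").close()  # occupy the first name
    second = unique_path_sketch(tmp, "Sample1", ".txt")
    first_name = os.path.basename(first)
    second_name = os.path.basename(second)
```

Since the path is only returned and not created, two callers asking at the same time could receive the same name; creating the file or directory immediately after the call narrows that window.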
autoemxsp.utils.helper.get_sample_dir(results_path: str, sample_ID: str, case_insensitive: bool = True, verbose: bool = False) -> str

Find a directory named sample_ID under results_path or its subdirectories.

Strategy:
  1. Walk the entire directory tree under results_path and collect exact matches.

  2. If multiple matches -> raise RuntimeError (ambiguous).

  3. If none -> show close matches and raise FileNotFoundError.

Parameters:
  • results_path (str) – Root directory to search from.

  • sample_ID (str) – Directory name to search for (exact match).

  • case_insensitive (bool, optional) – Whether to ignore case when matching. Default is True.

  • verbose (bool, optional) – Print additional debug information. Default is False.

Returns:

sample_dir – Full path to the matched directory.

Return type:

str

Raises:
  • RuntimeError – If multiple matches are found.

  • FileNotFoundError – If no match is found.
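
The three-step strategy above maps naturally onto os.walk. The following is a sketch, not the module's code: find_sample_dir_sketch is a hypothetical name, and using difflib.get_close_matches to produce the close-match suggestions is an assumption about how they could be generated.

```python
import difflib
import os
import tempfile


def find_sample_dir_sketch(results_path, sample_id, case_insensitive=True):
    """Return the single directory named sample_id anywhere under results_path."""
    def norm(name):
        return name.lower() if case_insensitive else name

    matches, all_dirs = [], []
    for root, dirs, _files in os.walk(results_path):
        for name in dirs:
            all_dirs.append(name)
            if norm(name) == norm(sample_id):
                matches.append(os.path.join(root, name))
    if len(matches) > 1:
        raise RuntimeError(f"Ambiguous sample ID {sample_id!r}: {matches}")
    if not matches:
        close = difflib.get_close_matches(sample_id, all_dirs, n=3)
        raise FileNotFoundError(f"No directory named {sample_id!r}; close matches: {close}")
    return matches[0]


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "batch_A", "Sample1"))
    os.makedirs(os.path.join(tmp, "batch_B", "Sample2"))
    found = os.path.relpath(find_sample_dir_sketch(tmp, "sample1"), tmp)
    try:
        find_sample_dir_sketch(tmp, "Sample3")
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True
```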

autoemxsp.utils.helper.load_msa(filepath: str) -> Tuple[ndarray, ndarray, Dict[str, str]]

Load a .msa or .msg file containing Y-only spectral data and compute the energy scale. Designed for raw spectra exported from Thermo Fisher Phenom systems. May work with other EMSA/MAS format files, but minor variations can occur.

Parameters:

filepath (str) – Path to the .msa or .msg file.

Returns:

  • energy (np.ndarray) – Energy values computed from OFFSET and XPERCHAN.

  • counts (np.ndarray) – Measured counts per energy channel.

  • metadata (dict) – Parsed metadata from the header (all values as strings).
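
For Y-only data the energy axis is not stored in the file, so it has to be rebuilt as OFFSET + i × XPERCHAN for channel index i. The sketch below parses EMSA/MAS-style text with the standard library. It is an assumption-laden illustration: parse_msa_sketch is a hypothetical name, it takes lines instead of a file path, it returns plain lists rather than np.ndarray, and the demo header is invented for the example.

```python
def parse_msa_sketch(lines):
    """Parse EMSA/MAS-style lines into (energy, counts, metadata)."""
    metadata, counts, in_data = {}, [], False
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header lines look like '#KEY : value'.
            key, _, value = line[1:].partition(":")
            key = key.strip().upper()
            if key == "SPECTRUM":
                in_data = True       # data values start on the next line
            elif key == "ENDOFDATA":
                break
            else:
                metadata[key] = value.strip()  # all values kept as strings
        elif in_data:
            counts.extend(float(tok) for tok in line.replace(",", " ").split())
    offset = float(metadata["OFFSET"])
    step = float(metadata["XPERCHAN"])
    energy = [offset + i * step for i in range(len(counts))]
    return energy, counts, metadata


demo = """#FORMAT      : EMSA/MAS Spectral Data File
#VERSION     : 1.0
#NPOINTS     : 4
#XUNITS      : keV
#DATATYPE    : Y
#XPERCHAN    : 0.01
#OFFSET      : -0.05
#SPECTRUM    : Spectral Data Starts Here
12, 15
20, 18
#ENDOFDATA   :
""".splitlines()

energy, counts, metadata = parse_msa_sketch(demo)
```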

autoemxsp.utils.helper.draw_scalebar(image, pixel_size_um, bar_width=0.25)

Draw a scale bar on the given image.

The scale bar is drawn as a filled white rectangle in the bottom-left corner of the image, with a label indicating its length in micrometers (um). The actual scale bar length is chosen from a set of standard values (0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 50, 100, 200 um) to best match ‘bar_width’, which is expressed as a fraction of the image width.

Parameters:
  • image (np.ndarray) – Input image (grayscale or color, as a NumPy array). The scale bar will be drawn directly on this image.

  • pixel_size_um (float) – Size of a pixel in micrometers (um/pixel).

  • bar_width (float, optional) – Target width of the scale bar, as a fraction of the image width. Default: 0.25

Returns:

image – The same image with the scale bar and label drawn on it.

Return type:

np.ndarray

Notes

  • The function detects if the image is grayscale or color and draws the scale bar in white accordingly.

  • The scale bar is placed in the bottom-left corner with a margin from the edges.

  • The label is centered above the scale bar rectangle.
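
The length selection can be sketched independently of the OpenCV drawing calls. The helper below is hypothetical (pick_scalebar_length is not part of the module), and it assumes that "best match" means the standard value closest to the target width in micrometers; the real function may use a different criterion.

```python
STANDARD_LENGTHS_UM = [0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 50, 100, 200]


def pick_scalebar_length(image_width_px, pixel_size_um, bar_width=0.25):
    """Return (length in um, length in pixels) for the scale bar."""
    # Physical width the bar should ideally cover.
    target_um = bar_width * image_width_px * pixel_size_um
    # Assumption: choose the standard length closest to that target.
    length_um = min(STANDARD_LENGTHS_UM, key=lambda s: abs(s - target_um))
    length_px = int(round(length_um / pixel_size_um))
    return length_um, length_px


# A 1000-pixel-wide image at 0.1 um/pixel spans 100 um, so the target is 25 um.
length_um, length_px = pick_scalebar_length(1000, 0.1)
```

Here the 25 um target snaps to the 20 um standard value, which corresponds to a 200-pixel rectangle.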

class autoemxsp.utils.helper.Prompt_User(title: str, message: str)

Bases: object

A simple GUI prompt using Tkinter to display a message and wait for user confirmation.

This is useful for pausing execution and prompting the user to take action, such as selecting a position at the microscope for manual EDS spectrum collection.

title

The window title.

Type:

str

message

The message to display in the prompt.

Type:

str

execution_stopped

True if the user closed the window or pressed Esc.

Type:

bool

ok_pressed

True if the user pressed OK or Return.

Type:

bool

root

The Tkinter root window.

Type:

tk.Tk or None

press_ok()

Handle OK button or Return key press.

stop_execution()

Handle window close (X) or Esc key press.

run()

Start the prompt window and wait for user interaction. Sets ok_pressed or execution_stopped depending on user action.

exception autoemxsp.utils.helper.RefLineError

Bases: Exception

Exception raised for errors related to reference lines.

exception autoemxsp.utils.helper.MissingHintError

Bases: Exception

Exception raised when a required hint is missing.

exception autoemxsp.utils.helper.EMError(message: str, code: int | None = None)

Bases: Exception

Custom exception class for electron microscope (EM)-related errors.

Parameters:
  • message (str) – Description of the error.

  • code (int, optional) – Optional error code.

exception autoemxsp.utils.helper.EDSError(message: str, code: int | None = None)

Bases: EMError

Custom exception class for EDS-related errors.

Parameters:
  • message (str) – Description of the error.

  • code (int, optional) – Optional error code.
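
Because EDSError derives from EMError, a handler written for EMError also catches EDS-specific failures. The sketch below mirrors the documented signatures with hypothetical stand-in classes; storing message and code as attributes is an assumption about the real implementation.

```python
class EMErrorSketch(Exception):
    """Stand-in for EMError: a message plus an optional numeric code."""

    def __init__(self, message, code=None):
        super().__init__(message)
        self.message = message  # assumption: kept as attributes
        self.code = code


class EDSErrorSketch(EMErrorSketch):
    """Stand-in for EDSError; inherits message and code handling."""


def acquire_sketch():
    # Hypothetical acquisition step that fails.
    raise EDSErrorSketch("Detector timeout", code=3)


try:
    acquire_sketch()
    caught = None
except EMErrorSketch as exc:  # the base-class handler also catches the subclass
    caught = exc
```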