autoemxsp.utils.helper module
Utility Functions for EDS Spectrum Analysis and Data Handling
A collection of general-purpose utility functions and lightweight classes for data handling, visualization, and file management within the EDS analysis and modeling framework.
This module is designed to be self-contained and importable across different parts of the project, covering tasks from compositional conversions to structured file loading and formatted console output.
Main Features
Compositional Conversions
- atomic_to_weight_fr(): Convert atomic fractions to weight fractions.
- weight_to_atomic_fr(): Convert weight fractions to atomic fractions.
Both functions use pymatgen.Element for accurate atomic mass values.
Formula Handling
- get_std_comp_from_formula(): Parse chemical formulas into standardized element dictionaries.
- to_latex_formula(): Convert a chemical formula into LaTeX format (e.g., Fe2O3 → Fe$_2$O$_3$).
String and Table Utilities
- print_nice_1d_row(): Print a formatted 1D table row (with adjustable width and alignment).
- print_element_fractions_table(): Display element names with their atomic and weight percentages.
- print_single_separator(): Print a single horizontal line separator.
- print_double_separator(): Print a double horizontal line separator for clear sectioning.
- AlphabetMapper: Map integers to alphabetic letters (0 → ‘A’, 25 → ‘Z’, etc.).
Image Utilities
- draw_scalebar(): Draw a scale bar on an image using OpenCV, with optional text label and styling.
File and Directory Handling
- get_sample_dir(): Locate a directory named after a given sample ID, with optional recursive search.
- make_unique_path(): Generate a unique file or directory path if one already exists (e.g., Sample1_2.txt).
- load_configurations_from_json(): Reconstruct configuration dataclasses and metadata from a JSON file.
- extract_spectral_data(): Extract spectra, quantification data, and coordinates from a Data.csv file.
- load_msa(): Load and parse a .msa spectral file, returning energy and intensity arrays.
User Interaction
- Prompt_User: Display a Tkinter prompt window for manual user confirmation (e.g., proceed/stop execution).
Error Handling
- EDSError: Custom exception class for handling EDS-related errors gracefully.
Created on Fri Jun 28 11:50:53 2024
@author: Andrea Giunto
- autoemxsp.utils.helper.atomic_to_weight_fr(atomic_fractions: Sequence[float] | ndarray, elements: List[str], verbose: bool = True) -> ndarray
Convert atomic fractions to weight fractions for a given set of elements.
- Parameters:
atomic_fractions (Sequence[float] or np.ndarray) – The atomic fractions of the elements. Should sum to approximately 1.
elements (List[str]) – List of element symbols (e.g., [“Si”, “O”, “Al”]).
verbose (bool, optional) – If True, prints warnings about normalization. Default is True.
- Returns:
weight_fractions – The corresponding weight fractions, summing to 1.
- Return type:
np.ndarray
- Raises:
ValueError – If the lengths of atomic_fractions and elements do not match.
Examples
>>> atomic_to_weight_fr([0.333, 0.667], ["Si", "O"])
array([0.467..., 0.532...])
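The arithmetic behind this conversion can be sketched as follows. This is an illustrative re-implementation, not the library's code: the atomic masses are hard-coded here (the documented function takes them from pymatgen.Element), and the name atomic_to_weight_sketch is hypothetical.

```python
import numpy as np

# Standard atomic masses in g/mol, hard-coded so the sketch needs no pymatgen.
ATOMIC_MASS = {"Si": 28.0855, "O": 15.9994, "Al": 26.9815}


def atomic_to_weight_sketch(atomic_fractions, elements):
    """Hypothetical helper: weight fraction_i = x_i * M_i / sum_j(x_j * M_j)."""
    x = np.asarray(atomic_fractions, dtype=float)
    if len(x) != len(elements):
        raise ValueError("atomic_fractions and elements must have the same length")
    masses = np.array([ATOMIC_MASS[el] for el in elements])
    weighted = x * masses             # mass contributed by each element
    return weighted / weighted.sum()  # normalize so the result sums to 1


w = atomic_to_weight_sketch([1 / 3, 2 / 3], ["Si", "O"])  # stoichiometric SiO2
print(w)
```

Dividing by the sum of the weighted terms also normalizes the result, so inputs that do not sum exactly to 1 still yield fractions that do.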
- autoemxsp.utils.helper.weight_to_atomic_fr(weight_fractions: Sequence[float] | ndarray, elements: List[str], verbose: bool = True) -> ndarray
Convert weight fractions to atomic fractions for a given set of elements.
- Parameters:
weight_fractions (Sequence[float] or np.ndarray) – The weight fractions of the elements. Should sum to approximately 1.
elements (List[str]) – List of element symbols (e.g., [“Si”, “O”, “Al”]).
verbose (bool, optional) – If True, prints warnings about normalization. Default is True.
- Returns:
atomic_fractions – The corresponding atomic fractions, summing to 1.
- Return type:
np.ndarray
- Raises:
ValueError – If the lengths of weight_fractions and elements do not match.
Examples
>>> weight_to_atomic_fr([0.467, 0.533], ["Si", "O"])
array([0.332..., 0.667...])
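The inverse conversion divides each weight fraction by its atomic mass and renormalizes. The sketch below is an assumption-labeled illustration: masses are hard-coded rather than read from pymatgen.Element, and weight_to_atomic_sketch is a hypothetical name.

```python
import numpy as np

# Standard atomic masses in g/mol, hard-coded so the sketch needs no pymatgen.
ATOMIC_MASS = {"Si": 28.0855, "O": 15.9994}


def weight_to_atomic_sketch(weight_fractions, elements):
    """Hypothetical helper: atomic fraction_i = (w_i / M_i) / sum_j(w_j / M_j)."""
    w = np.asarray(weight_fractions, dtype=float)
    if len(w) != len(elements):
        raise ValueError("weight_fractions and elements must have the same length")
    masses = np.array([ATOMIC_MASS[el] for el in elements])
    moles = w / masses          # relative number of atoms of each element
    return moles / moles.sum()  # normalize so the result sums to 1


a = weight_to_atomic_sketch([0.4674, 0.5326], ["Si", "O"])  # approximately SiO2
print(a)
```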
- autoemxsp.utils.helper.to_latex_formula(formula: str, include_dollar_signs: bool = True) -> str
Convert a chemical formula string into its LaTeX representation.
Supports nested parentheses, element subscripts, and group multipliers. For example:
‘Al2(SO4)3’ -> ‘Al$_{2}$(SO$_{4}$)$_{3}$’
- Parameters:
formula (str) – The chemical formula as a string.
include_dollar_signs (bool) – Whether to wrap the LaTeX in $…$.
- Returns:
The LaTeX-formatted formula.
- Return type:
str
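One way to obtain the output shown above is a single regular-expression substitution. This is a minimal sketch and not necessarily how the library does it; to_latex_sketch is a hypothetical name, and the output produced when include_dollar_signs is False is an assumption (the delimiters are simply omitted).

```python
import re


def to_latex_sketch(formula, include_dollar_signs=True):
    """Wrap each count that follows an element symbol or ')' in a LaTeX subscript."""

    def subscript(match):
        body = "_{" + match.group(2) + "}"
        if include_dollar_signs:
            body = "$" + body + "$"
        return match.group(1) + body

    # group 1: element symbol ('Al', 'O') or a closing parenthesis; group 2: the count
    return re.sub(r"([A-Z][a-z]?|\))(\d+(?:\.\d+)?)", subscript, formula)


latex = to_latex_sketch("Al2(SO4)3")
print(latex)
```

Because the pattern also matches a count after a closing parenthesis, group multipliers and nested groups are handled without explicit recursion.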
- autoemxsp.utils.helper.print_nice_1d_row(first_col: Any, row: Sequence[Any], floatfmt: str = '.2f', first_col_width: int = 10, col_width: int = 10) -> None
Print a single table row: the first entry (label) is left-aligned and the remaining entries are printed as formatted floats. Used to print quantification results during ZAF corrections.
- Parameters:
first_col – The value for the first (label) column.
row – Sequence of values for the remaining columns (numbers or strings).
floatfmt – Format for floating point numbers (default: ‘.2f’).
first_col_width – Width for the first (label) column.
col_width – Width for each numeric column.
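The row layout described by these parameters can be sketched with format specifications. The helper below is hypothetical and returns the string instead of printing it; right-aligning the value columns is an assumption, since the source only states that the label is left-aligned.

```python
def format_row_sketch(first_col, row, floatfmt=".2f", first_col_width=10, col_width=10):
    """Hypothetical helper: build one table row as a string."""
    cells = [f"{str(first_col):<{first_col_width}}"]  # label, left-aligned
    for value in row:
        if isinstance(value, float):
            text = format(value, floatfmt)  # floats use the requested format
        else:
            text = str(value)               # strings and ints are shown as-is
        cells.append(f"{text:>{col_width}}")  # assumed right alignment
    return "".join(cells)


line = format_row_sketch("Si", [46.74, 0.5, "n/a"])
print(line)
```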
- autoemxsp.utils.helper.print_element_fractions_table(formula)
Given a chemical formula, print a table with elements, atomic %, and weight % using pymatgen.
- autoemxsp.utils.helper.print_single_separator()
Print a single-line separator (50 dashes) for visual clarity.
- autoemxsp.utils.helper.print_double_separator()
Print a double-line separator (50 equals signs) for visual clarity.
- class autoemxsp.utils.helper.AlphabetMapper
Bases: object
Maps a zero-based integer index to Excel-style column letters: 0 -> ‘A’, 1 -> ‘B’, …, 25 -> ‘Z’, 26 -> ‘AA’, etc.
Used for labeling frames analyzed during particle search.
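Excel-style lettering is a bijective base-26 numbering, which is why 26 maps to ‘AA’ rather than ‘BA’. The function below is a minimal sketch of that mapping; index_to_letters is a hypothetical name, and the class's actual interface is not shown in the source.

```python
def index_to_letters(index):
    """Hypothetical helper: 0 -> 'A', 25 -> 'Z', 26 -> 'AA', 27 -> 'AB', ..."""
    if index < 0:
        raise ValueError("index must be non-negative")
    letters = ""
    n = index + 1  # work in 1-based bijective base 26
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


labels = [index_to_letters(i) for i in (0, 25, 26, 701, 702)]
print(labels)
```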
- autoemxsp.utils.helper.load_configurations_from_json(json_path, config_classes_dict)
Load configuration dataclasses and metadata from a spectrum collection info JSON file.
- Parameters:
json_path (str) – Path to the JSON file saved by EMXSp_Composition_Analyzer._save_spectrum_collection_info.
config_classes_dict (dict) –
- Mapping from JSON keys to dataclass types, e.g.:
{‘sample_cfg’: SampleConfig, …}
- See configurations in tools/config_classes.py:
MicroscopeConfig: Settings for microscope hardware, calibration, and imaging parameters.
SampleConfig: Defines the sample’s identity, elements, and spatial properties.
SampleSubstrateConfig: Specifies the substrate composition and geometry supporting the sample.
MeasurementConfig: Controls measurement type, beam parameters, and acquisition settings.
FittingConfig: Parameters for spectral fitting and background handling.
QuantConfig: Options for quantification and filtering of X-ray spectra.
PowderMeasurementConfig: Settings for analyzing powder samples and particle selection.
BulkMeasurementConfig: Settings for analyzing bulk samples.
ExpStandardsConfig: Settings for experimental standard measurements.
ClusteringConfig: Configures clustering algorithms and feature selection for data grouping.
PlotConfig: Options for saving, displaying, and customizing plots.
- Returns:
configs (dict) – Dictionary of configuration objects reconstructed from JSON, keyed by their JSON key. If a key from config_classes_dict is missing in the JSON, it will not be present in configs.
metadata (dict) – Dictionary of any additional metadata (e.g., timestamp) found in the JSON.
- Raises:
FileNotFoundError – If the JSON file does not exist.
Example
>>> config_classes = {'sample_cfg': SampleConfig, ...}
>>> configs, metadata = load_configurations_from_json('acquisition_info.json', config_classes)
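The behavior described above (known keys are rebuilt as dataclasses, everything else is returned as metadata) can be sketched as follows. This is an illustration under assumptions: SampleConfigDemo is a hypothetical stand-in for the real SampleConfig, the JSON layout is invented for the demo, and load_configs_sketch is not the library function.

```python
import json
import os
import tempfile
from dataclasses import dataclass


@dataclass
class SampleConfigDemo:  # hypothetical stand-in, not the real SampleConfig
    ID: str
    elements: list


def load_configs_sketch(json_path, config_classes_dict):
    if not os.path.exists(json_path):
        raise FileNotFoundError(json_path)
    with open(json_path) as fh:
        raw = json.load(fh)
    configs, metadata = {}, {}
    for key, value in raw.items():
        if key in config_classes_dict:
            configs[key] = config_classes_dict[key](**value)  # rebuild the dataclass
        else:
            metadata[key] = value  # anything unrecognized is kept as metadata
    return configs, metadata


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "acquisition_info.json")
    with open(path, "w") as fh:
        json.dump({"sample_cfg": {"ID": "S1", "elements": ["Si", "O"]},
                   "timestamp": "2024-06-28"}, fh)
    configs, metadata = load_configs_sketch(path, {"sample_cfg": SampleConfigDemo})

print(configs, metadata)
```

Keys listed in config_classes_dict but absent from the file never enter the loop, so they are simply missing from configs, matching the documented behavior.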
- autoemxsp.utils.helper.extract_spectral_data(data_csv_path)
Extract spectrum quantification results, spectral data, and coordinates from a Data.csv file.
- Parameters:
data_csv_path (str) – Path to the Data.csv file.
cnst (module or object) – Documented here but absent from the signature above. Expected to provide the attribute keys the function relies on (e.g., LIST_SPECTRUM_COORDINATES_KEYS).
- Returns:
spectra_quant (list of dict or None) – List of quantification results per spectrum (None if not quantified). If all entries are None, returns None.
spectral_data (dict) – Dictionary of lists for each spectral data column (e.g., spectrum, background, real_time, live_time, etc.). If a column is missing, the value is an empty list.
sp_coords (list of dict) – List of dicts for each spectrum’s coordinates, keys as in cnst.LIST_SPECTRUM_COORDINATES_KEYS.
df (pandas.DataFrame) – The loaded DataFrame from the CSV file.
- autoemxsp.utils.helper.make_unique_path(parent_dir: str, base_name: str, extension: str = None) -> str
Generate a unique file or directory path inside parent_dir based on base_name. If a path with the base name exists, appends a counter (e.g., ‘Sample1_2’). Optionally, add an extension for files.
- Parameters:
parent_dir (str) – The parent directory in which to generate the new path.
base_name (str) – The base name for the new file or directory.
extension (str, optional) – The file extension (e.g., ‘txt’ or ‘.txt’). If None, treat as directory.
- Returns:
unique_path – The full, unique path (not created).
- Return type:
str
- Raises:
ValueError – If parent_dir or base_name is invalid.
Example
>>> make_unique_path('./results', 'Sample1')
'./results/Sample1'
>>> make_unique_path('./results', 'Sample1', extension='txt')
'./results/Sample1.txt'
>>> make_unique_path('./results', 'Sample1', extension='txt')  # if './results/Sample1.txt' already exists
'./results/Sample1_2.txt'
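The counter logic can be sketched as below. This is a hypothetical re-implementation: starting the counter at 2 follows the ‘Sample1_2’ example, and the exact validation performed by the library is not shown in the source.

```python
import os
import tempfile


def make_unique_path_sketch(parent_dir, base_name, extension=None):
    if not parent_dir or not base_name:
        raise ValueError("parent_dir and base_name must be non-empty")
    suffix = ""
    if extension:
        suffix = "." + extension.lstrip(".")  # accept both 'txt' and '.txt'
    candidate = os.path.join(parent_dir, base_name + suffix)
    counter = 2  # assumed starting value, following the 'Sample1_2' example
    while os.path.exists(candidate):
        candidate = os.path.join(parent_dir, f"{base_name}_{counter}{suffix}")
        counter += 1
    return candidate  # the path is returned, not created


with tempfile.TemporaryDirectory() as tmp:
    first = make_unique_path_sketch(tmp, "Sample1", extension="txt")
    open(first, "w").close()  # occupy the first name
    second = make_unique_path_sketch(tmp, "Sample1", extension=".txt")
    names = (os.path.basename(first), os.path.basename(second))

print(names)
```

Because the path is only returned and not created, two callers that ask for a name before either one writes to it can receive the same path.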
- autoemxsp.utils.helper.get_sample_dir(results_path: str, sample_ID: str, case_insensitive: bool = True, verbose: bool = False) -> str
Find a directory named sample_ID under results_path or its subdirectories.
- Strategy:
Walk the entire directory tree under results_path and collect exact matches.
If multiple matches -> raise RuntimeError (ambiguous).
If none -> show close matches and raise FileNotFoundError.
- Parameters:
results_path (str) – Root directory to search from.
sample_ID (str) – Directory name to search for. The whole name must match, not just a substring; case is ignored unless case_insensitive is False.
case_insensitive (bool, optional) – Whether to ignore case when matching. Default is True.
verbose (bool, optional) – Print additional debug information. Default is False.
- Returns:
sample_dir – Full path to the matched directory.
- Return type:
str
- Raises:
RuntimeError – If multiple matches are found.
FileNotFoundError – If no match is found.
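The search strategy above can be sketched with os.walk. This is an illustration, not the library's code: get_sample_dir_sketch is a hypothetical name, and using difflib to suggest close matches is an assumption about how near misses might be reported.

```python
import difflib
import os
import tempfile


def get_sample_dir_sketch(results_path, sample_ID, case_insensitive=True):
    norm = (lambda s: s.lower()) if case_insensitive else (lambda s: s)
    matches, seen = [], []
    for root, dirs, _files in os.walk(results_path):  # walk the whole tree
        for name in dirs:
            seen.append(name)
            if norm(name) == norm(sample_ID):
                matches.append(os.path.join(root, name))
    if len(matches) > 1:
        raise RuntimeError(f"Ambiguous sample_ID {sample_ID!r}: {matches}")
    if not matches:
        close = difflib.get_close_matches(sample_ID, seen)  # assumed suggestion method
        raise FileNotFoundError(f"No directory named {sample_ID!r}; close matches: {close}")
    return matches[0]


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "batch_A", "Sample1"))
    found = os.path.relpath(get_sample_dir_sketch(tmp, "sample1"), tmp)

print(found)
```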
- autoemxsp.utils.helper.load_msa(filepath: str) -> Tuple[ndarray, ndarray, Dict[str, str]]
Load a .msa or .msg file containing Y-only spectral data and compute the energy scale. Designed for raw spectra exported from Thermo Fisher Phenom systems. May work with other EMSA/MAS format files, but minor variations can occur.
- Parameters:
filepath (str) – Path to the .msa or .msg file.
- Returns:
energy (np.ndarray) – Energy values computed from OFFSET and XPERCHAN.
counts (np.ndarray) – Measured counts per energy channel.
metadata (dict) – Parsed metadata from the header (all values as strings).
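For Y-only data the energy axis is not stored in the file, so it has to be rebuilt as OFFSET + XPERCHAN × channel index. The sketch below parses an in-memory string rather than a file; parse_msa_text is a hypothetical name, and the header is a shortened, made-up example of the EMSA/MAS layout.

```python
import numpy as np


def parse_msa_text(text):
    """Hypothetical parser for Y-only EMSA/MAS text."""
    metadata, counts = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):  # header line: '#KEY : value'
            key, _, value = line[1:].partition(":")
            metadata[key.strip()] = value.strip()  # values are kept as strings
        else:  # data line: comma- or space-separated counts
            counts.extend(float(tok) for tok in line.replace(",", " ").split())
    counts = np.array(counts)
    offset = float(metadata["OFFSET"])
    step = float(metadata["XPERCHAN"])
    energy = offset + step * np.arange(len(counts))  # one energy value per channel
    return energy, counts, metadata


demo = """#FORMAT      : EMSA/MAS Spectral Data File
#XPERCHAN    : 0.01
#OFFSET      : -0.1
#DATATYPE    : Y
#SPECTRUM    : Spectral Data Starts Here
12, 15, 11, 9
#ENDOFDATA   :
"""
energy, counts, metadata = parse_msa_text(demo)
print(energy, counts)
```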
- autoemxsp.utils.helper.draw_scalebar(image, pixel_size_um, bar_width=0.25)
Draw a scale bar on the given image.
The scale bar is drawn as a filled white rectangle in the bottom-left corner of the image, with a label indicating its length in micrometers (um). The actual scale bar length is chosen from a set of standard values (0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 50, 100, 200 um) to best match the ‘bar_width’, expressed as fraction of image width.
- Parameters:
image (np.ndarray) – Input image (grayscale or color, as a NumPy array). The scale bar will be drawn directly on this image.
pixel_size_um (float) – Size of a pixel in micrometers (um/pixel).
bar_width (float, optional) – Target width of scale bar, in fraction of image width. Default: 0.25
- Returns:
image – The same image with the scale bar and label drawn on it.
- Return type:
np.ndarray
Notes
The function detects if the image is grayscale or color and draws the scale bar in white accordingly.
The scale bar is placed in the bottom-left corner with a margin from the edges.
The label is centered above the scale bar rectangle.
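The choice of bar length can be sketched without any drawing code. The helper below is hypothetical, and interpreting "best match" as the standard value closest to the target length is an assumption; the library may use a different rule.

```python
STANDARD_LENGTHS_UM = [0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 50, 100, 200]


def pick_scalebar_sketch(image_width_px, pixel_size_um, bar_width=0.25):
    """Hypothetical helper: return the bar length in um and in pixels."""
    target_um = bar_width * image_width_px * pixel_size_um  # desired physical length
    # Assumed rule: take the standard length nearest to the target.
    length_um = min(STANDARD_LENGTHS_UM, key=lambda s: abs(s - target_um))
    length_px = int(round(length_um / pixel_size_um))
    return length_um, length_px


# A 1024-pixel-wide image at 0.05 um/pixel gives a target of 12.8 um.
length_um, length_px = pick_scalebar_sketch(1024, 0.05)
print(length_um, length_px)
```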
- class autoemxsp.utils.helper.Prompt_User(title: str, message: str)
Bases: object
A simple GUI prompt using Tkinter to display a message and wait for user confirmation.
This is useful for pausing execution and prompting the user to take action, such as selecting a position at the microscope for manual EDS spectrum collection.
- title
The window title.
- Type:
str
- message
The message to display in the prompt.
- Type:
str
- execution_stopped
True if the user closed the window or pressed Esc.
- Type:
bool
- ok_pressed
True if the user pressed OK or Return.
- Type:
bool
- root
The Tkinter root window.
- Type:
tk.Tk or None
- exception autoemxsp.utils.helper.RefLineError
Bases: Exception
Exception raised for errors related to reference lines.
- exception autoemxsp.utils.helper.MissingHintError
Bases: Exception
Exception raised when a required hint is missing.