photon_mosaic.dataset_discovery
Dataset discovery module.
This module provides functions to discover datasets using regex patterns. All filtering and transformations are handled through regex substitutions.
Functions
discover_datasets | Discover datasets and their TIFF files in a directory using regex patterns.
- photon_mosaic.dataset_discovery.discover_datasets(base_path, pattern='.*', exclude_patterns=None, substitutions=None, tiff_patterns=['*.tif'])
Discover datasets and their TIFF files in a directory using regex patterns.
- Parameters:
base_path (str or Path) – Base path to search for datasets.
pattern (str, optional) – Regex pattern to match dataset names, defaults to ".*" (all directories).
exclude_patterns (List[str], optional) – List of regex patterns for datasets to exclude.
substitutions (List[Dict[str, str]], optional) – List of regex substitution pairs to transform dataset names. Each dict should have 'pattern' and 'repl' keys for re.sub().
tiff_patterns (list, optional) – List of glob patterns for TIFF files. Each pattern corresponds to a session. Defaults to ["*.tif"] for a single session.
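The way pattern, exclude_patterns and substitutions interact can be sketched with the standard re module. This is an illustration of the documented parameter shapes, not the photon_mosaic implementation: the helper name filter_and_transform is hypothetical, and the use of re.match and the in-order application of substitutions are assumptions.

```python
import re


def filter_and_transform(names, pattern=".*", exclude_patterns=None, substitutions=None):
    """Hypothetical stand-in that mimics the documented name handling."""
    kept = []
    for name in sorted(names):
        # Assumption: the include pattern is tested with re.match.
        if not re.match(pattern, name):
            continue
        # Assumption: a dataset is dropped if any exclude pattern matches.
        if any(re.match(ex, name) for ex in exclude_patterns or []):
            continue
        kept.append(name)

    transformed = []
    for name in kept:
        new_name = name
        # Each dict supplies the 'pattern' and 'repl' arguments of re.sub().
        # Assumption: substitutions are applied one after another, in list order.
        for sub in substitutions or []:
            new_name = re.sub(sub["pattern"], sub["repl"], new_name)
        transformed.append(new_name)
    return kept, transformed


names = ["mouse_01", "mouse_02", "test_run", "mouse_03_bad"]
kept, transformed = filter_and_transform(
    names,
    pattern=r"mouse_\d+",
    exclude_patterns=[r".*_bad$"],
    substitutions=[{"pattern": r"^mouse_", "repl": "sub-"}],
)
print(kept)
print(transformed)
```

Here "test_run" fails the include pattern and "mouse_03_bad" is removed by the exclude pattern, so only the two remaining names are renamed.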
- Returns:
List of original dataset names (sorted)
List of transformed dataset names (sorted)
Dictionary mapping each original dataset name to its TIFF files, grouped by session (session index as key, list of files as value)
List of all TIFF files found across all datasets
- Return type:
Tuple[List[str], List[str], Dict[str, Dict[int, List[str]]], List[str]]
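A small sketch of how a caller might unpack and walk the four-element return value. The summarize helper and the sample tuple are hypothetical and hand-written to match the documented return type; they are not output from discover_datasets, and whether the file entries are full paths or bare file names is not stated here.

```python
from typing import Dict, List, Tuple

# Alias for the documented return type.
DiscoveryResult = Tuple[
    List[str], List[str], Dict[str, Dict[int, List[str]]], List[str]
]


def summarize(result: DiscoveryResult) -> Dict[str, Dict[int, int]]:
    """Hypothetical helper: count TIFF files per dataset and session."""
    originals, transformed, tiffs_by_session, all_tiffs = result
    counts = {}
    for name in originals:
        sessions = tiffs_by_session.get(name, {})
        counts[name] = {idx: len(files) for idx, files in sorted(sessions.items())}
    return counts


# Hand-written sample shaped like the documented return value (illustrative only).
example: DiscoveryResult = (
    ["mouse_01", "mouse_02"],
    ["sub-01", "sub-02"],
    {
        "mouse_01": {0: ["a.tif", "b.tif"], 1: []},
        "mouse_02": {0: ["c.tif"], 1: ["d.tif"]},
    },
    ["a.tif", "b.tif", "c.tif", "d.tif"],
)

counts = summarize(example)
print(counts)
```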
Notes
Datasets without any TIFF files are automatically excluded from the results
Both original and transformed dataset lists are sorted alphabetically
Sessions are numbered starting from 0 based on the order in tiff_patterns
For datasets that are kept, empty sessions (no files found for a pattern) are still included, each with an empty list
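The notes above can be illustrated with a self-contained sketch built on pathlib and a temporary directory. It is not the library's code: find_tiffs_by_session is a hypothetical helper, and it assumes that each glob pattern is evaluated directly inside the dataset directory (no recursion) and that file names rather than full paths are collected.

```python
import tempfile
from pathlib import Path


def find_tiffs_by_session(dataset_dir, tiff_patterns=("*.tif",)):
    """Hypothetical helper: one session per glob pattern, numbered from 0."""
    dataset_dir = Path(dataset_dir)
    sessions = {}
    for index, glob_pattern in enumerate(tiff_patterns):
        # A session with no matching files keeps an empty list.
        sessions[index] = sorted(p.name for p in dataset_dir.glob(glob_pattern))
    return sessions


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    layout = {"mouse_01": ["s0_a.tif", "s0_b.tif"], "mouse_02": []}
    for dataset, files in layout.items():
        (base / dataset).mkdir()
        for file_name in files:
            (base / dataset / file_name).touch()

    patterns = ["s0_*.tif", "s1_*.tif"]
    found = {}
    for dataset_dir in sorted(base.iterdir()):
        sessions = find_tiffs_by_session(dataset_dir, patterns)
        # Datasets with no TIFF files in any session are left out.
        if any(sessions.values()):
            found[dataset_dir.name] = sessions

print(found)
```

In this layout "mouse_02" has no TIFF files and is dropped, while "mouse_01" keeps session 1 as an empty list because only the first pattern matched.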