15. Catalogs and registries

This reference collects registry-backed catalogs (datasets, steps, methods, metrics) and shows how to query them.

15.1 What this page is

ModSSC exposes registry-backed lists for datasets, preprocess steps/models, augmentation ops, methods, and metrics through CLI commands and Python APIs. [1][2][3][4][5][6]

Use this page when you need IDs for configs or when you want to check optional dependencies. Dataset specs and preprocess registries expose a required_extra field, and the modssc doctor command reports missing CLI bricks. [7][8][9][10]

Use the CLI blocks for quick terminal inspection, and use the Python blocks when you want registry metadata inside a script.
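
Because the IDs listed on this page usually end up in configs, it can help to check a config's IDs against a registry listing before running anything. The sketch below is one minimal way to do that and is not part of ModSSC: check_ids is a hypothetical helper, and the known list is a hand-written stand-in for whatever a registry helper returns.

```python
from __future__ import annotations

from difflib import get_close_matches
from typing import Iterable


def check_ids(requested: Iterable[str], known: Iterable[str]) -> dict[str, list[str]]:
    """Map each unknown ID to up to three close matches from the known IDs."""
    known_list = sorted(set(known))
    problems: dict[str, list[str]] = {}
    for name in requested:
        if name not in known_list:
            # An empty suggestion list means nothing in the registry looks similar.
            problems[name] = get_close_matches(name, known_list, n=3, cutoff=0.6)
    return problems


# Stand-in listing built from IDs that appear on this page; in practice,
# pass the output of a registry helper instead (assumed to be a list of IDs).
known = ["core.ensure_2d", "text.word_dropout"]
print(check_ids(["core.ensure_2d", "text.word_dropuot"], known))
```

IDs that are already valid are left out of the result, so an empty dict means every requested ID was found.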

15.2 Datasets and providers

Use providers to understand which backends are available, and use dataset keys when wiring configs or CLI commands. Dataset info includes modality and required_extra. [1][7][11]

CLI: modssc datasets in src/modssc/cli/datasets.py. Python: data loader helpers in src/modssc/data_loader/api.py. [1][11]

CLI:

modssc datasets providers
modssc datasets list --modalities text
modssc datasets info --dataset toy

Python:

from modssc.data_loader import available_datasets, dataset_info, provider_names

print(provider_names())
print(available_datasets())
print(dataset_info("toy").as_dict())
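
To see which datasets depend on an optional extra, the info dicts can be grouped by their required_extra value. The following is a sketch under stated assumptions: group_by_extra is a hypothetical helper, it expects a mapping from dataset key to a dict like the one as_dict() returns, and it assumes that dict carries a required_extra entry. The sample entries are made up so the block runs without ModSSC installed.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Any, Mapping


def group_by_extra(infos: Mapping[str, Mapping[str, Any]]) -> dict[str, list[str]]:
    """Group dataset keys by the value of their required_extra field."""
    groups = defaultdict(list)
    for key, info in infos.items():
        # Entries with no required_extra (missing, None, or empty) go under "(core)".
        extra = info.get("required_extra") or "(core)"
        groups[str(extra)].append(key)
    return {extra: sorted(keys) for extra, keys in sorted(groups.items())}


# Made-up entries for illustration only; these are not real ModSSC datasets or extras.
sample = {
    "example_a": {"modality": "text", "required_extra": "hypothetical-extra"},
    "example_b": {"modality": "text", "required_extra": None},
    "example_c": {"modality": "vision", "required_extra": "hypothetical-extra"},
}
print(group_by_extra(sample))
```

With ModSSC installed, the input could be built as {key: dataset_info(key).as_dict() for key in available_datasets()}, assuming available_datasets() yields keys that dataset_info accepts.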

15.3 Preprocess steps

Steps are registered in the preprocess catalog and surfaced through the CLI and registry helpers. Use step_info to check required_extra before you add a step to a plan. [2][8][12]

CLI: modssc preprocess in src/modssc/cli/preprocess.py. Python: preprocess registry in src/modssc/preprocess/registry.py. [2][12]

CLI:

modssc preprocess steps list
modssc preprocess steps info core.ensure_2d

Python:

from modssc.preprocess import available_steps, step_info

print(available_steps())
print(step_info("core.ensure_2d"))

15.4 Pretrained models

Pretrained encoder models are listed by the preprocess model registry. Use the CLI for quick inspection or the Python helpers when you need the metadata in code. [2][9]

CLI: modssc preprocess in src/modssc/cli/preprocess.py. Python: model registry in src/modssc/preprocess/models.py. [2][9]

CLI:

modssc preprocess models list --modality text
modssc preprocess models info stub:text

Python:

from modssc.preprocess import available_models, model_info

print(available_models(modality="text"))
print(model_info("stub:text"))

15.5 Augmentation ops

Augmentation operations are registered in the augmentation registry and can be listed or inspected from the CLI. [3][13]

CLI: modssc augmentation in src/modssc/cli/augmentation.py. Python: augmentation registry in src/modssc/data_augmentation/registry.py. [3][13]

CLI:

modssc augmentation list --modality text
modssc augmentation info text.word_dropout --as-json

Python:

from modssc.data_augmentation.registry import available_ops, op_info

print(available_ops(modality="text"))
print(op_info("text.word_dropout"))
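
The --as-json flag shown in the CLI block suggests the output can be consumed by a script. A minimal sketch of that pattern follows, assuming the command prints a single JSON document to standard output and nothing else; run_json is a hypothetical helper, and the demo uses the Python interpreter as a stand-in command so the block runs without ModSSC installed.

```python
from __future__ import annotations

import json
import subprocess
import sys
from typing import Any, Sequence


def run_json(command: Sequence[str]) -> Any:
    """Run a command and parse its standard output as one JSON document."""
    # check=True raises CalledProcessError if the command exits with a non-zero status.
    result = subprocess.run(list(command), capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


# Stand-in command: the current Python interpreter prints a small JSON object.
demo = [sys.executable, "-c", "import json; print(json.dumps({'ok': True}))"]
print(run_json(demo))
```

With modssc on the PATH, the call would look like run_json(["modssc", "augmentation", "info", "text.word_dropout", "--as-json"]). If the command mixes log lines into standard output, json.loads raises an error and the output would need filtering first.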

15.6 Methods

Inductive and transductive registries expose method IDs. Use --available-only if you want to exclude planned or unresolvable methods. [4][5][14][15]

CLI: inductive/transductive CLIs in src/modssc/cli/inductive.py and src/modssc/cli/transductive.py. Python: registries in src/modssc/inductive/registry.py and src/modssc/transductive/registry.py. [4][5][14][15]

CLI:

modssc inductive methods list --available-only
modssc transductive methods list --available-only

Python:

from modssc.inductive import registry as inductive_registry
from modssc.transductive import registry as transductive_registry

print(inductive_registry.available_methods())
print(transductive_registry.available_methods())

15.7 Evaluation metrics

Metric names are listed by the evaluation module and exposed in the CLI. [6][16]

CLI: modssc evaluation in src/modssc/cli/evaluation.py. Python: metric helpers in src/modssc/evaluation/metrics.py. [6][16]

CLI:

modssc evaluation list

Python:

from modssc.evaluation import list_metrics

print(list_metrics())

Sources
  1. src/modssc/cli/datasets.py
  2. src/modssc/cli/preprocess.py
  3. src/modssc/cli/augmentation.py
  4. src/modssc/cli/inductive.py
  5. src/modssc/cli/transductive.py
  6. src/modssc/cli/evaluation.py
  7. src/modssc/data_loader/types.py
  8. src/modssc/preprocess/catalog.py
  9. src/modssc/preprocess/models.py
  10. src/modssc/cli/app.py
  11. src/modssc/data_loader/api.py
  12. src/modssc/preprocess/registry.py
  13. src/modssc/data_augmentation/registry.py
  14. src/modssc/inductive/registry.py
  15. src/modssc/transductive/registry.py
  16. src/modssc/evaluation/metrics.py