5. Concepts

This page introduces the key ideas and vocabulary used throughout the documentation. For runnable examples, go to the inductive tutorial or transductive tutorial.

5.1 Problem framing

ModSSC targets semi-supervised classification, in which a model learns from a small labeled set together with a larger unlabeled set. Both the inductive and the transductive bricks, and the dataset types they accept, are built around this framing. [1][2][3]

5.2 Inductive vs transductive in this project

Inductive methods operate on feature matrices and labeled/unlabeled splits, without requiring a graph. The inductive brick is located in src/modssc/inductive/ and validates InductiveDataset inputs. [4][5][1]

Transductive methods operate on a fixed graph over all nodes and accept NodeDataset-like objects that carry a graph and optional masks. For graph datasets, sampling outputs are expressed as masks such as train, val, test, labeled, and unlabeled. The transductive brick is located in src/modssc/transductive/, and graph utilities are in src/modssc/graph/. [2][6][7][8]
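To make the mask vocabulary concrete, here is a minimal NumPy-only sketch of how boolean node masks can relate to each other. It does not call ModSSC. The 60/20/20 split, the dictionary keys, and the rule that unlabeled nodes are the training nodes without labels are assumptions made for illustration, and the `to_mask` helper is hypothetical.

```python
import numpy as np

n_nodes = 20
rng = np.random.default_rng(0)
perm = rng.permutation(n_nodes)

# Hypothetical 60/20/20 split over node indices.
train_idx, val_idx, test_idx = perm[:12], perm[12:16], perm[16:]


def to_mask(idx, n):
    """Return a boolean mask of length n that is True at the given indices."""
    mask = np.zeros(n, dtype=bool)
    mask[idx] = True
    return mask


masks = {
    "train": to_mask(train_idx, n_nodes),
    "val": to_mask(val_idx, n_nodes),
    "test": to_mask(test_idx, n_nodes),
}

# Assumption: only a few training nodes keep their labels,
# and the remaining training nodes are treated as unlabeled.
masks["labeled"] = to_mask(train_idx[:3], n_nodes)
masks["unlabeled"] = masks["train"] & ~masks["labeled"]
```

Every mask has one entry per node, so the graph itself stays fixed while the masks decide which nodes contribute labels.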

5.3 Key abstractions in this codebase

  • Dataset catalog and providers: curated dataset keys and provider URIs for downloading and caching. [9][10][11]

  • Sampling plans: deterministic split and labeling specifications that produce reproducible indices/masks. [12][13]

  • Preprocess plans: ordered steps that transform raw datasets into feature representations. [14][15]

  • Graph specs and views: graph construction specifications and view generation (attr/diffusion/struct). [16][17]

  • View plans: multi-view feature generation for methods like co-training. [18][19]

  • Registries: method registries for inductive and transductive algorithms. [20][21]

  • Benchmark configurations: end-to-end experiment configuration for reproducible runs. [22][23]
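As an illustration of what "deterministic" means for a sampling plan, the sketch below uses a small frozen dataclass and a seeded NumPy generator. `ToySamplingPlan` and `apply_plan` are hypothetical names invented for this example and are not the ModSSC classes or API. The point is only that applying the same plan to the same labels yields the same indices.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class ToySamplingPlan:
    """Hypothetical plan object: illustrates the idea, not the ModSSC class."""

    test_fraction: float
    labeled_per_class: int
    seed: int


def apply_plan(plan, y):
    """Return (labeled, unlabeled, test) index arrays for the label vector y."""
    rng = np.random.default_rng(plan.seed)  # all randomness comes from the plan's seed
    perm = rng.permutation(len(y))
    n_test = int(round(plan.test_fraction * len(y)))
    test_idx, train_idx = perm[:n_test], perm[n_test:]

    # Keep a fixed number of labeled samples per class inside the training part.
    labeled = []
    for cls in np.unique(y[train_idx]):
        cls_idx = train_idx[y[train_idx] == cls]
        labeled.extend(cls_idx[: plan.labeled_per_class])
    labeled_idx = np.sort(np.array(labeled, dtype=np.int64))
    unlabeled_idx = np.setdiff1d(train_idx, labeled_idx)
    return labeled_idx, unlabeled_idx, np.sort(test_idx)


y = np.repeat(np.arange(3), 20)  # 60 samples, 3 balanced classes
plan = ToySamplingPlan(test_fraction=0.2, labeled_per_class=2, seed=42)
first = apply_plan(plan, y)
second = apply_plan(plan, y)  # same plan, same labels: identical indices
```

Because the plan is an immutable value that carries its own seed, it can be stored next to an experiment configuration and replayed later.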

5.4 Code layout in practice

  • The top-level brick packages (modssc.data_loader, modssc.preprocess, modssc.sampling, modssc.views, modssc.graph, and others) are the public Python entrypoints.
  • Runtime-facing utilities now live under modssc.runtime.
  • Cache and optional-dependency support code now lives under modssc.cache and modssc.dependencies.
  • Several bricks use internal support directories such as services/, helpers/, adapters/, or bundle_factories/; those folders reflect implementation structure and are not additional public import paths.

For the current repository layout, see the Architecture page.

5.5 Small illustrative examples

Inductive dataset payload (labeled + unlabeled):

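import numpy as np
from modssc.inductive import InductiveDataset

# 10 labeled samples with 4 features each, and integer labels drawn from 3 classes
X_l = np.random.randn(10, 4)
y_l = np.random.randint(0, 3, size=(10,))
# 50 unlabeled samples with the same feature dimension
X_u = np.random.randn(50, 4)

payload = InductiveDataset(X_l=X_l, y_l=y_l, X_u=X_u)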

Transductive dataset payload (graph + masks):

import numpy as np
from modssc.graph import GraphBuilderSpec, build_graph
from modssc.graph.artifacts import NodeDataset

# 20 nodes with 8 float32 features each
X = np.random.randn(20, 8).astype(np.float32)
# kNN graph spec with k=3 neighbors and the cosine metric
edge_spec = GraphBuilderSpec(scheme="knn", metric="cosine", k=3)
graph = build_graph(X, spec=edge_spec, seed=0, cache=False)

# Mark the first 3 nodes as training nodes
train_mask = np.zeros((20,), dtype=bool)
train_mask[:3] = True
node_data = NodeDataset(
    X=X,
    y=np.zeros((20,), dtype=np.int64),  # placeholder labels (all class 0)
    graph=graph,
    masks={"train_mask": train_mask},
)

The inductive and transductive dataset types are defined in src/modssc/inductive/types.py and src/modssc/graph/artifacts.py, and graph construction is implemented in src/modssc/graph/construction/api.py. [1][6][24]
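For readers unfamiliar with kNN graphs, the following NumPy-only sketch shows the general idea behind a cosine kNN graph over a feature matrix. It is not the implementation in src/modssc/graph/construction/api.py. The `knn_cosine_edges` helper, the (2, n_edges) edge layout, and the choice of directed, unweighted edges are assumptions made for illustration, and the library may symmetrize, weight, or store edges differently.

```python
import numpy as np


def knn_cosine_edges(X, k):
    """Return a (2, n * k) array of directed edges from each row of X
    to its k most cosine-similar other rows."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X_unit = X / np.maximum(norms, 1e-12)  # guard against zero-norm rows
    sim = X_unit @ X_unit.T                # pairwise cosine similarity
    np.fill_diagonal(sim, -np.inf)         # exclude self-loops
    neighbors = np.argsort(-sim, axis=1)[:, :k]  # k highest similarities per row
    sources = np.repeat(np.arange(X.shape[0]), k)
    return np.stack([sources, neighbors.ravel()])


rng = np.random.default_rng(0)
X = rng.standard_normal((20, 8)).astype(np.float32)
edge_index = knn_cosine_edges(X, k=3)  # 20 nodes * 3 neighbors = 60 directed edges
```

The dense similarity matrix keeps the sketch short but scales quadratically with the number of nodes, so it is only suitable for small illustrative inputs.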

Sources
  1. src/modssc/inductive/types.py
  2. src/modssc/transductive/base.py
  3. README.md
  4. src/modssc/inductive/
  5. src/modssc/inductive/validation.py
  6. src/modssc/graph/artifacts.py
  7. src/modssc/sampling/result.py
  8. src/modssc/graph/
  9. src/modssc/data_loader/catalog/
  10. src/modssc/data_loader/providers/
  11. src/modssc/data_loader/api.py
  12. src/modssc/sampling/plan.py
  13. src/modssc/sampling/api.py
  14. src/modssc/preprocess/plan.py
  15. src/modssc/preprocess/catalog.py
  16. src/modssc/graph/specs.py
  17. src/modssc/graph/featurization/api.py
  18. src/modssc/views/plan.py
  19. src/modssc/views/api.py
  20. src/modssc/inductive/registry.py
  21. src/modssc/transductive/registry.py
  22. bench/schema.py
  23. bench/configs/experiments/
  24. src/modssc/graph/construction/api.py