5. Transductive tutorial: toy label propagation

This walkthrough runs a transductive experiment end to end, covering graph construction and label propagation. To assemble the pipeline brick by brick instead, start with the dataset, sampling, preprocess, and graph how-to guides.

5.1 Goal

Run a full transductive SSL experiment on the toy dataset using a graph construction spec and label propagation. [1][2][3]

5.2 Why this tutorial

Use this tutorial when your method expects a graph and node masks (NodeDatasetLike) and you plan to run a graph construction step. If you only need feature matrices without a graph, use the inductive tutorial instead. [14]

This walkthrough uses the bench runner because it validates a single YAML config and orchestrates dataset, sampling, preprocess, graph build, and method execution. For individual bricks, start with the dataset, sampling, preprocess, and graph how-to guides instead. [1][10]

5.3 Prerequisites

  • Python 3.11+ with ModSSC installed from source, because the bench runner lives in the repo. [4][5]

  • The toy dataset and the numpy graph backend require no extra dependencies. [2][6]

5.4 Files used

5.5 Step by step commands

1) Install the repo in editable mode:

python -m pip install -e "."

2) Run the transductive toy experiment:

python -m bench.main --config bench/configs/experiments/toy_transductive.yaml

The bench runner entry point is bench/main.py, and the experiment config is bench/configs/experiments/toy_transductive.yaml. [1][2]

5.6 Full YAML config used

This is the full config file from bench/configs/experiments/toy_transductive.yaml:

run:
  name: "toy_label_propagation_knn"
  seed: 7
  output_dir: "runs"
  fail_fast: true

dataset:
  id: "toy"

sampling:
  seed: 7
  plan:
    split:
      kind: "holdout"
      test_fraction: 0.0
      val_fraction: 0.2
      stratify: true
      shuffle: true
    labeling:
      mode: "fraction"
      value: 0.1
      strategy: "balanced"
      min_per_class: 1
    imbalance:
      kind: "none"
    policy:
      respect_official_test: true
      allow_override_official: false

preprocess:
  seed: 7
  fit_on: "train_labeled"
  cache: true
  plan:
    output_key: "features.X"
    steps:
      - id: "core.ensure_2d"
      - id: "core.to_numpy"

graph:
  enabled: true
  seed: 7
  cache: true
  spec:
    scheme: "knn"
    metric: "euclidean"
    k: 8
    symmetrize: "mutual"
    weights:
      kind: "heat"
      sigma: 1.0
    normalize: "rw"
    self_loops: true
    backend: "numpy"
    chunk_size: 128
    feature_field: "features.X"

method:
  kind: "transductive"
  id: "label_propagation"
  device:
    device: "auto"
    dtype: "float32"
  params:
    max_iter: 50
    tol: 1.0e-4
    normalize_rows: true

evaluation:
  report_splits: ["val", "test"]
  metrics: ["accuracy", "macro_f1"]
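To make the graph.spec block above more concrete, here is a minimal NumPy sketch of what a kNN graph with those settings could look like. It is not the ModSSC implementation: the function name build_knn_graph is hypothetical, the heat kernel is assumed to be exp(-d^2 / (2 * sigma^2)), self loops are assumed to get weight 1 before normalization, and the dense distance matrix ignores chunk_size.

```python
import numpy as np


def build_knn_graph(X, k=8, sigma=1.0, self_loops=True):
    """Hypothetical dense kNN graph: euclidean metric, mutual symmetrization,
    heat weights, random-walk ("rw") normalization. Assumes at least 2 samples."""
    X = np.asarray(X, dtype=np.float64)
    n = X.shape[0]

    # Pairwise squared Euclidean distances (clipped to avoid tiny negatives).
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, np.inf)  # a node is never its own neighbor

    # Indices of the k nearest neighbors of each node (unordered).
    k = min(k, n - 1)
    nn = np.argpartition(d2, k - 1, axis=1)[:, :k]

    # Directed kNN adjacency, then "mutual": keep an edge only if both
    # endpoints list each other among their k nearest neighbors.
    A = np.zeros((n, n), dtype=bool)
    A[np.repeat(np.arange(n), k), nn.ravel()] = True
    A = A & A.T

    # Heat-kernel weights on surviving edges (assumed formula, see lead-in).
    W = np.where(A, np.exp(-np.where(A, d2, 0.0) / (2.0 * sigma**2)), 0.0)
    if self_loops:
        np.fill_diagonal(W, 1.0)  # assumed self-loop weight

    # Random-walk normalization: each row sums to 1 (isolated rows stay zero).
    deg = W.sum(axis=1, keepdims=True)
    deg[deg == 0.0] = 1.0
    return W / deg
```

With self_loops enabled every row has a nonzero degree, so the result is a row-stochastic transition matrix that a propagation method can multiply against a label matrix.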

5.7 Expected outputs and where they appear

The run creates a directory under runs/ (the run.output_dir value) that contains a snapshot of the config and a run.json summary. [7][8]

Graph artifacts are cached when graph.cache: true is set. The cache layout is managed by modssc.graph.cache.GraphCache. [9][2]
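To inspect the result of a run from Python, something like the following sketch may help. It assumes only what is stated above, namely that a run.json file exists somewhere under the output directory; the helper name latest_run_summary is hypothetical, and the layout of the run directory and the keys inside run.json are not assumed.

```python
import json
from pathlib import Path


def latest_run_summary(output_dir="runs"):
    """Return (path, parsed JSON) for the most recently modified run.json."""
    root = Path(output_dir)
    # Search recursively, since the exact run directory naming is not assumed.
    candidates = sorted(root.rglob("run.json"), key=lambda p: p.stat().st_mtime)
    if not candidates:
        raise FileNotFoundError(f"no run.json found under {root}")
    path = candidates[-1]
    with path.open(encoding="utf-8") as fh:
        return path, json.load(fh)
```

Calling latest_run_summary() from the directory where the command was launched returns the file path and the parsed summary, which can then be printed or compared across runs.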

5.8 How it works

  • The bench runner validates the config and orchestrates dataset, sampling, preprocess, graph build, and method execution. [1][10]

  • The graph is constructed using the GraphBuilderSpec fields in the config. [11][12]

  • Label propagation runs over the graph with hard clamping: labeled nodes are reset to their known labels after every propagation step. [3]
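The last step can be sketched in a few lines of NumPy. This is an illustrative version of label propagation with hard clamping, not the code in label_propagation.py: the function name and signature are hypothetical, unlabeled nodes are assumed to be marked by a boolean mask, and normalize_rows is assumed to mean rescaling each row of the final score matrix to sum to 1.

```python
import numpy as np


def propagate_labels(P, y, labeled_mask, n_classes,
                     max_iter=50, tol=1e-4, normalize_rows=True):
    """Hypothetical label propagation over a row-normalized graph P (n x n)."""
    idx = np.flatnonzero(labeled_mask)

    # One-hot seed matrix: labeled rows carry their class, others are zero.
    Y0 = np.zeros((P.shape[0], n_classes))
    Y0[idx, y[idx]] = 1.0

    F = Y0.copy()
    for _ in range(max_iter):
        F_new = P @ F          # spread scores to neighbors
        F_new[idx] = Y0[idx]   # hard clamp: restore the known labels
        delta = np.abs(F_new - F).max()
        F = F_new
        if delta < tol:        # stop once scores no longer move
            break

    if normalize_rows:
        sums = F.sum(axis=1, keepdims=True)
        sums[sums == 0.0] = 1.0  # leave unreachable nodes as all zeros
        F = F / sums
    return F
```

Predictions are then the row-wise argmax of the returned matrix. The default max_iter and tol mirror the method.params values in the config.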

5.9 Common pitfalls and troubleshooting

Warning

Transductive methods require a graph; if graph.enabled is false and the dataset is not a graph dataset, the bench runner raises a config error. [1]
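The rule in this warning can be expressed as a small pre-flight check on an already parsed config. The sketch below only mirrors the rule as described here; the function check_graph_requirement and its dataset_is_graph flag are hypothetical, and the real validation lives in the bench runner.

```python
def check_graph_requirement(config, dataset_is_graph=False):
    """Raise ValueError if a transductive method has no graph to run on."""
    method_kind = config.get("method", {}).get("kind")
    graph_enabled = bool(config.get("graph", {}).get("enabled", False))
    if method_kind == "transductive" and not graph_enabled and not dataset_is_graph:
        raise ValueError(
            "transductive methods need a graph: set graph.enabled to true "
            "or use a graph dataset"
        )


# Example: the shape of the config in this tutorial passes the check.
check_graph_requirement(
    {"method": {"kind": "transductive"}, "graph": {"enabled": True}}
)
```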

Tip

Use modssc graph build --help to see graph options and validate the spec. [13]

Sources
  1. bench/main.py
  2. bench/configs/experiments/toy_transductive.yaml
  3. src/modssc/transductive/methods/classic/label_propagation.py
  4. pyproject.toml
  5. bench/README.md
  6. src/modssc/graph/construction/backends/numpy_backend.py
  7. bench/context.py
  8. bench/orchestrators/reporting.py
  9. src/modssc/graph/cache.py
  10. bench/schema.py
  11. src/modssc/graph/specs.py
  12. src/modssc/graph/construction/api.py
  13. src/modssc/cli/graph.py
  14. src/modssc/transductive/base.py