11. How to generate multi-view features

Multi-view methods expect multiple feature sets for the same samples. This recipe shows how to define and generate those views, and it includes a bench config excerpt for reference. For an end-to-end example, see the inductive tutorial.

11.1 Problem statement

Classic multi-view semi-supervised learning (SSL) methods such as co-training need multiple feature views of the same dataset. ^[1][2][3] If your method consumes a single feature matrix, you can skip views and rely on preprocessing alone.

11.2 When to use

Use this recipe when a method expects data.views instead of a single feature matrix (for example, co-training). ^[3][4]

11.3 Steps

1) Define a ViewsPlan (two or more views). ^[1]

2) Optionally attach preprocessing to each view. ^[2][5]

3) Generate views and pass them to inductive methods. ^[2][4]

11.4 Copy-paste example

Use the Python helper when you want to generate views directly in code. The YAML excerpt shows the equivalent views block inside a bench config. ^[1][6][7]

Python:

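from modssc.data_loader import load_dataset
from modssc.views import generate_views, two_view_random_feature_split

# Build a two-view plan that splits the feature columns at random.
plan = two_view_random_feature_split(fraction=0.5)
# Load the "toy" dataset.
ds = load_dataset("toy", download=True)
# Generate the views from the plan with a fixed seed.
views = generate_views(ds, plan=plan, seed=0)
# List the names of the generated views.
print(list(views.views.keys()))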

Bench config excerpt (co-training views): ^[6]

views:
  seed: 2
  plan:
    views:
    - name: view_a
      columns:
        mode: random
        fraction: 0.5
    - name: view_b
      columns:
        mode: complement
        complement_of: view_a

The view plan schema is defined in src/modssc/views/plan.py, and the bench schema accepts views.plan when present. ^[1][7]
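To make the random/complement pairing in the excerpt concrete, here is a minimal, stdlib-only sketch of what such a split amounts to conceptually. It is not ModSSC's implementation: the function name split_columns, the rounding rule, and the use of random.Random are all assumptions made for illustration.

```python
import random


def split_columns(n_features, fraction, seed):
    """Hypothetical helper: split column indices into a random view and its complement."""
    if n_features < 2:
        raise ValueError("need at least two columns to build two views")
    rng = random.Random(seed)
    # Assumed rounding rule: keep at least one column in each view.
    k = min(max(1, round(n_features * fraction)), n_features - 1)
    # view_a: a random subset of the column indices.
    view_a = sorted(rng.sample(range(n_features), k))
    chosen = set(view_a)
    # view_b: every column that view_a did not take.
    view_b = [j for j in range(n_features) if j not in chosen]
    return view_a, view_b


view_a, view_b = split_columns(n_features=10, fraction=0.5, seed=2)
print(len(view_a), len(view_b))  # prints: 5 5
```

Under these assumptions the two views are disjoint and together cover every column, and repeating the call with the same seed reproduces the same split.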

11.5 Pitfalls

Warning

ViewsPlan must define at least two views, and complement views must refer to an earlier view in the plan. ^[1]
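The two rules in this warning can be sketched as a check on a plain dict shaped like the YAML excerpt. ModSSC does its own validation; check_plan below is a hypothetical stand-in that only illustrates the rules, and the dict layout is assumed to mirror the bench config.

```python
def check_plan(plan):
    """Hypothetical validator for a plan dict shaped like the YAML excerpt."""
    views = plan.get("views", [])
    # Rule 1: a plan needs at least two views.
    if len(views) < 2:
        raise ValueError("a plan needs at least two views")
    seen = set()
    for view in views:
        columns = view.get("columns", {})
        # Rule 2: a complement view may only point at a view defined before it.
        if columns.get("mode") == "complement":
            target = columns.get("complement_of")
            if target not in seen:
                raise ValueError(
                    f"view {view['name']!r}: complement_of must name an "
                    f"earlier view, got {target!r}"
                )
        seen.add(view["name"])


good = {"views": [
    {"name": "view_a", "columns": {"mode": "random", "fraction": 0.5}},
    {"name": "view_b", "columns": {"mode": "complement", "complement_of": "view_a"}},
]}
check_plan(good)  # passes silently

# Reversing the order puts the complement view before the view it refers to.
bad = {"views": list(reversed(good["views"]))}
try:
    check_plan(bad)
except ValueError as err:
    print(err)
```

Reversing the view order is enough to trip the second rule, which is why the complement view is listed last in the excerpt.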

Tip

If you need preprocessing per view, attach a PreprocessPlan to each ViewSpec. ^[1][5]

Sources