
11. How to generate multi-view features

Multi-view methods expect multiple feature sets for the same samples. This recipe shows how to define and generate those views, and it includes a bench config excerpt for reference. For an end-to-end example, see the inductive tutorial.

11.1 Problem statement

You need multiple feature views of the same dataset for classic multi-view semi-supervised learning (SSL) methods such as co-training. [1][2][3] If your method consumes a single feature matrix, you can skip views and rely on preprocessing alone.

11.2 When to use

Use this when a method expects data.views instead of a single feature matrix (for example, co-training). [3][4]

11.3 Steps

1) Define a ViewsPlan (two or more views). [1]

2) Optionally attach preprocessing to each view. [2][5]

3) Generate views and pass them to inductive methods. [2][4]

11.4 Copy-paste example

Use the Python helper when you want to generate views directly in code. The YAML excerpt shows the equivalent views block inside a bench config. [1][6][7]

Python:

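from modssc.data_loader import load_dataset
from modssc.views import generate_views, two_view_random_feature_split

# Plan two views built from a random 50% feature split.
plan = two_view_random_feature_split(fraction=0.5)
# Load the toy dataset.
ds = load_dataset("toy", download=True)
# Generate the views; a fixed seed keeps the split reproducible.
views = generate_views(ds, plan=plan, seed=0)
# Print the names of the generated views.
print(list(views.views.keys()))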

Bench config excerpt (co-training views): [6]

views:
  seed: 2
  plan:
    views:
    - name: view_a
      columns:
        mode: random
        fraction: 0.5
    - name: view_b
      columns:
        mode: complement
        complement_of: view_a

The view plan schema is defined in src/modssc/views/plan.py, and the bench schema accepts views.plan when present. [1][7]
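
To make the random and complement column modes concrete, here is a minimal standard-library sketch of how a plan shaped like the excerpt could resolve into column indices. It is an illustration only, not the modssc implementation: the function name resolve_columns, the dict layout, and the rounding rule are assumptions, and "complement" is read here as "every column not used by the referenced view".

```python
import random


def resolve_columns(plan, n_features, seed):
    """Map each view name to a sorted list of column indices (hypothetical helper)."""
    rng = random.Random(seed)  # one seeded generator keeps the split reproducible
    resolved = {}
    for spec in plan["views"]:
        cols = spec["columns"]
        if cols["mode"] == "random":
            # Assumed rounding rule: keep at least one column.
            k = max(1, round(cols["fraction"] * n_features))
            resolved[spec["name"]] = sorted(rng.sample(range(n_features), k))
        elif cols["mode"] == "complement":
            # Take every column the referenced (earlier) view did not use.
            taken = set(resolved[cols["complement_of"]])
            resolved[spec["name"]] = [j for j in range(n_features) if j not in taken]
        else:
            raise ValueError(f"unsupported mode: {cols['mode']}")
    return resolved


# Same shape as the YAML excerpt, written as a Python dict.
plan = {
    "views": [
        {"name": "view_a", "columns": {"mode": "random", "fraction": 0.5}},
        {"name": "view_b", "columns": {"mode": "complement", "complement_of": "view_a"}},
    ]
}
columns = resolve_columns(plan, n_features=10, seed=2)
print(columns)
```

With this reading, view_a and view_b never share a column and together cover every feature, which is the property co-training style methods typically rely on.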

11.5 Pitfalls

Warning: ViewsPlan must define at least two views, and complement views must refer to an earlier view in the plan. [1]
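
The two constraints can be checked before a plan is handed to the library. The sketch below is a hypothetical pre-flight check on a plain dict shaped like the bench excerpt; validate_plan is not a modssc function, and the real validation in src/modssc/views/plan.py may differ.

```python
def validate_plan(plan):
    """Raise ValueError if the plan breaks either constraint (hypothetical helper)."""
    views = plan.get("views", [])
    if len(views) < 2:
        raise ValueError("a views plan needs at least two views")
    seen = set()  # names of views defined so far, in plan order
    for spec in views:
        cols = spec.get("columns", {})
        if cols.get("mode") == "complement" and cols.get("complement_of") not in seen:
            raise ValueError(
                f"view {spec['name']!r} must complement a view defined earlier in the plan"
            )
        seen.add(spec["name"])


valid_plan = {
    "views": [
        {"name": "view_a", "columns": {"mode": "random", "fraction": 0.5}},
        {"name": "view_b", "columns": {"mode": "complement", "complement_of": "view_a"}},
    ]
}
# Same views in the wrong order: view_b now refers to a view defined after it.
bad_plan = {"views": list(reversed(valid_plan["views"]))}

validate_plan(valid_plan)  # passes silently

error = None
try:
    validate_plan(bad_plan)
except ValueError as exc:
    error = str(exc)
print(error)
```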

Tip: If you need preprocessing per view, attach a PreprocessPlan to each ViewSpec. [1][5]
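
Conceptually, per-view preprocessing means each view runs through its own chain of steps, independently of the others. The sketch below illustrates that idea with plain Python lists; it does not use the PreprocessPlan or ViewSpec APIs, and the step functions and the steps_per_view mapping are hypothetical names chosen for the example.

```python
def center(rows):
    """Subtract each column's mean."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def scale_to_unit_max(rows):
    """Divide each column by its largest absolute value (all-zero columns are left as-is)."""
    peaks = [max(abs(x) for x in col) or 1.0 for col in zip(*rows)]
    return [[x / p for x, p in zip(row, peaks)] for row in rows]


# Two views of the same two samples (rows are aligned across views).
views = {
    "view_a": [[1.0, 10.0], [3.0, 30.0]],
    "view_b": [[2.0], [4.0]],
}
# Hypothetical per-view step lists: each view gets its own preprocessing chain.
steps_per_view = {"view_a": [center], "view_b": [scale_to_unit_max]}

processed = {}
for name, rows in views.items():
    for step in steps_per_view.get(name, []):
        rows = step(rows)
    processed[name] = rows
print(processed)
```

Keeping the chains separate matters when views differ in scale or type, since a transform fitted on one view's columns would otherwise leak into the other.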

Sources
  1. src/modssc/views/plan.py
  2. src/modssc/views/api.py
  3. src/modssc/inductive/methods/co_training.py
  4. src/modssc/inductive/types.py
  5. src/modssc/preprocess/plan.py
  6. bench/configs/experiments/best/inductive/co_training/text/imdb.yaml
  7. bench/schema.py