11. How to generate multi-view features
Multi-view methods expect multiple feature sets for the same samples. This recipe shows how to define and generate those views, and it includes a bench config excerpt for reference. For an end-to-end example, see the inductive tutorial.
11.1 Problem statement
You need multiple feature views of the same dataset for classic multi-view semi-supervised learning (SSL) methods such as co-training. [1][2][3] If your method consumes a single feature matrix, you can skip views and rely on preprocessing alone.
11.2 When to use
Use this recipe when a method expects data.views instead of a single feature matrix (for example, co-training). [3][4]
11.3 Steps
1) Define a ViewsPlan (two or more views). [1]
2) Optionally attach preprocessing to each view. [2][5]
3) Generate views and pass them to inductive methods. [2][4]
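To make the idea behind a random/complement feature split concrete, here is a minimal standard-library sketch. It is not ModSSC's implementation: `split_columns` and `take_columns` are hypothetical helpers, and the rounding rule for `fraction` is an assumption made for illustration. It only shows how one random subset of column indices and its complement yield two disjoint views of the same samples.

```python
import random


def split_columns(n_features, fraction=0.5, seed=0):
    """Return (view_a_cols, view_b_cols): a random subset and its complement.

    Hypothetical helper for illustration; not part of ModSSC.
    """
    if n_features < 2:
        raise ValueError("need at least two features to build two views")
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    # Assumed rounding rule: keep at least one column on each side.
    k = max(1, min(n_features - 1, round(n_features * fraction)))
    view_a = sorted(rng.sample(range(n_features), k))
    chosen = set(view_a)
    view_b = [j for j in range(n_features) if j not in chosen]
    return view_a, view_b


def take_columns(rows, cols):
    """Select the given column indices from a list-of-rows matrix."""
    return [[row[j] for j in cols] for row in rows]


# Toy data: 4 samples, 6 features.
X = [[float(i * 10 + j) for j in range(6)] for i in range(4)]

cols_a, cols_b = split_columns(n_features=6, fraction=0.5, seed=0)
views = {
    "view_a": take_columns(X, cols_a),
    "view_b": take_columns(X, cols_b),
}
```

Both views keep every sample (row) and differ only in which columns they expose, which is the property multi-view methods such as co-training rely on.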
11.4 Copy-paste example
Use the Python helper when you want to generate views directly in code. The YAML excerpt shows the equivalent views block inside a bench config. [1][6][7]
Python:
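from modssc.data_loader import load_dataset
from modssc.views import generate_views, two_view_random_feature_split

# Build a two-view plan from a random feature split with fraction=0.5.
plan = two_view_random_feature_split(fraction=0.5)
# Load the toy dataset (download=True permits fetching it).
ds = load_dataset("toy", download=True)
# Generate the views with a fixed seed so the split is reproducible.
views = generate_views(ds, plan=plan, seed=0)
# Print the names of the generated views.
print(list(views.views.keys()))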
Bench config excerpt (co-training views): [6]
views:
  seed: 2
  plan:
    views:
      - name: view_a
        columns:
          mode: random
          fraction: 0.5
      - name: view_b
        columns:
          mode: complement
          complement_of: view_a
The view plan schema is defined in src/modssc/views/plan.py, and the bench schema accepts views.plan when present. [1][7]
11.5 Pitfalls
Warning
ViewsPlan must define at least two views, and complement views must refer to an earlier view in the plan. [1]
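The two rules in this warning can be checked before running anything. The sketch below is a hedged illustration, not ModSSC's own validation: `validate_plan` is a hypothetical helper, and the dictionary layout is an assumption that mirrors the `plan` block of the bench config excerpt (a `views` list whose entries carry `name` and `columns`).

```python
def validate_plan(plan):
    """Check the two pitfalls on a plain-dict plan.

    Hypothetical helper for illustration; the dict layout is assumed to
    mirror the `plan` block of the bench config excerpt.
    """
    views = plan.get("views", [])
    if len(views) < 2:
        raise ValueError("a plan must define at least two views")

    seen = set()  # names of views declared earlier in the plan
    for view in views:
        columns = view.get("columns", {})
        if columns.get("mode") == "complement":
            target = columns.get("complement_of")
            if target not in seen:
                raise ValueError(
                    f"view {view['name']!r} complements {target!r}, "
                    "which is not an earlier view in the plan"
                )
        seen.add(view["name"])


# Same structure as the excerpt: view_b complements the earlier view_a.
good_plan = {
    "views": [
        {"name": "view_a", "columns": {"mode": "random", "fraction": 0.5}},
        {"name": "view_b", "columns": {"mode": "complement", "complement_of": "view_a"}},
    ]
}
validate_plan(good_plan)  # passes silently
```

Reversing the order of the two entries in `good_plan` makes the check raise `ValueError`, because `view_b` would then refer to a view that has not been declared yet.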