TabM

A model based on MLP and variations of BatchEnsemble.

Functions

def _init_scaling_by_sections(weight: Tensor, distribution: Literal['normal', 'random-signs'], init_sections: list[int]) -> None

Initializes ensemble scaling weights in place, section by section.

Parameters:

weight (Tensor) - Weight tensor to initialize.
distribution (str) - Initialization distribution ('normal' or 'random-signs').
init_sections (list[int]) - List of section sizes.
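
A minimal sketch of what section-wise initialization could look like. It assumes, without confirmation from this page, that the weight has shape (k, d), that the section sizes sum to d, and that each ensemble member gets one shared random value per section. The name init_by_sections_sketch is hypothetical and is not part of the library.

```python
import torch
from torch import Tensor


@torch.no_grad()
def init_by_sections_sketch(weight: Tensor, distribution: str, sections: list[int]) -> None:
    """Hypothetical sketch: one shared random value per (member, section)."""
    if weight.ndim != 2 or weight.shape[1] != sum(sections):
        raise ValueError("sections must sum to weight.shape[1]")
    k = weight.shape[0]
    start = 0
    for size in sections:
        # Draw one value per ensemble member (row) for this section.
        if distribution == "normal":
            value = torch.randn(k, 1, dtype=weight.dtype, device=weight.device)
        elif distribution == "random-signs":
            value = torch.randint(0, 2, (k, 1), device=weight.device).to(weight.dtype) * 2 - 1
        else:
            raise ValueError(f"unknown distribution: {distribution}")
        # Broadcast the per-member value across all columns of the section.
        weight[:, start:start + size] = value
        start += size


w = torch.empty(4, 5)
init_by_sections_sketch(w, "random-signs", [2, 3])
```

Under these assumptions, all columns that belong to the same section hold the same value within a row, so one feature's representation is scaled consistently for a given ensemble member.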

def init_rsqrt_uniform_(x: Tensor, d: int) -> Tensor

Fills the tensor in place with values drawn uniformly from [-1/sqrt(d), 1/sqrt(d)].

Parameters:

x (Tensor) - Tensor to initialize.
d (int) - Dimension for scaling.

Returns:

Tensor - Initialized tensor.
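
A short sketch of this kind of initialization, assuming the bound is 1/sqrt(d) as the function name suggests. The name rsqrt_uniform_sketch_ is hypothetical; only torch.nn.init.uniform_ is a real PyTorch call.

```python
import torch
from torch import Tensor, nn


def rsqrt_uniform_sketch_(x: Tensor, d: int) -> Tensor:
    """Hypothetical sketch: fill x in place with U(-d**-0.5, d**-0.5)."""
    if d <= 0:
        raise ValueError("d must be positive")
    bound = d ** -0.5
    # nn.init.uniform_ modifies x in place and returns it.
    return nn.init.uniform_(x, -bound, bound)


x = rsqrt_uniform_sketch_(torch.empty(3, 16), d=16)
```

With d=16 the bound is 0.25, so every entry of x lies in [-0.25, 0.25].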

def init_random_signs_(x: Tensor) -> Tensor

Initializes tensor with random signs (-1 or 1).

Parameters:

x (Tensor) - Tensor to initialize.

Returns:

Tensor - Tensor with random signs.
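
One possible way to produce random signs in place, shown as a sketch rather than the library's code. It assumes each sign is drawn with probability 0.5; random_signs_sketch_ is a hypothetical name.

```python
import torch
from torch import Tensor


@torch.no_grad()
def random_signs_sketch_(x: Tensor) -> Tensor:
    """Hypothetical sketch: fill x in place with -1 or 1, each with probability 0.5."""
    # bernoulli_ yields 0/1 values; mapping v -> 2*v - 1 turns them into -1/+1.
    return x.bernoulli_(0.5).mul_(2).sub_(1)


signs = random_signs_sketch_(torch.empty(2, 3))
```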

class Identity(nn.Module)

Identity module that returns input unchanged.

Input:

x (Tensor) - Input tensor.

Output:

Tensor - Same as input.

class Mean(nn.Module)

Computes mean along specified dimension.

Parameters:

dim (int) - Dimension to compute mean along.

Input:

x (Tensor) - Input tensor.

Output:

Tensor - Mean along specified dimension.
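
As an illustration, a module like this can average the predictions of the ensemble members. The sketch below assumes predictions of shape (B, K, D) and averaging over the member dimension; MeanSketch is a hypothetical stand-in, not the class documented here.

```python
import torch
from torch import Tensor, nn


class MeanSketch(nn.Module):
    """Hypothetical stand-in: mean along a fixed dimension."""

    def __init__(self, dim: int) -> None:
        super().__init__()
        self.dim = dim

    def forward(self, x: Tensor) -> Tensor:
        return x.mean(dim=self.dim)


# Two samples, two ensemble members, one output each: shape (B=2, K=2, 1).
preds = torch.tensor([[[1.0], [3.0]], [[2.0], [4.0]]])
avg = MeanSketch(dim=1)(preds)  # shape (2, 1)
```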

class ScaleEnsemble(nn.Module)

Scales ensemble members with learnable weights.

Parameters:

k (int) - Number of ensemble members.
d (int) - Feature dimension.
init (str) - Weight initialization ('ones', 'normal', 'random-signs').

Input:

x (Tensor) - Input tensor of shape (B, K, D).

Output:

Tensor - Scaled tensor.
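
A minimal sketch of per-member scaling, assuming a learnable weight of shape (k, d) that is broadcast over the batch dimension. ScaleEnsembleSketch is a hypothetical name, and the handling of the init options is an assumption based on their names.

```python
import torch
from torch import Tensor, nn


class ScaleEnsembleSketch(nn.Module):
    """Hypothetical sketch: multiply each ensemble member by its own vector."""

    def __init__(self, k: int, d: int, init: str = "ones") -> None:
        super().__init__()
        self.weight = nn.Parameter(torch.empty(k, d))
        if init == "ones":
            nn.init.ones_(self.weight)
        elif init == "normal":
            nn.init.normal_(self.weight)
        elif init == "random-signs":
            with torch.no_grad():
                self.weight.bernoulli_(0.5).mul_(2).sub_(1)
        else:
            raise ValueError(f"unknown init: {init}")

    def forward(self, x: Tensor) -> Tensor:
        if x.ndim != 3:
            raise ValueError("expected input of shape (B, K, D)")
        # (B, K, D) * (K, D) broadcasts over the batch dimension.
        return x * self.weight


layer = ScaleEnsembleSketch(k=4, d=8, init="ones")
x = torch.randn(2, 4, 8)
y = layer(x)
```

With init="ones" the layer starts as an identity map, so all members initially behave the same until training moves the weights apart.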

class ElementwiseAffineEnsemble(nn.Module)

Element-wise affine transformation for ensemble members.

Parameters:

k (int) - Number of ensemble members.
d (int) - Feature dimension.
bias (bool) - Whether to use bias.
weight_init (str) - Weight initialization method.

Input:

x (Tensor) - Input tensor of shape (B, K, D).

Output:

Tensor - Transformed tensor.
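
A sketch of an element-wise affine transform with one weight vector and one optional bias vector per ensemble member. The shapes (k, d), the ones initialization of the weight, and the zero initialization of the bias are assumptions; AffineEnsembleSketch is a hypothetical name.

```python
import torch
from torch import Tensor, nn


class AffineEnsembleSketch(nn.Module):
    """Hypothetical sketch: y = x * weight + bias, per ensemble member."""

    def __init__(self, k: int, d: int, bias: bool = True) -> None:
        super().__init__()
        self.weight = nn.Parameter(torch.ones(k, d))   # assumed 'ones' init
        self.bias = nn.Parameter(torch.zeros(k, d)) if bias else None

    def forward(self, x: Tensor) -> Tensor:
        # (B, K, D) * (K, D) broadcasts over the batch dimension.
        x = x * self.weight
        if self.bias is not None:
            x = x + self.bias
        return x


affine = AffineEnsembleSketch(k=3, d=5)
with torch.no_grad():
    affine.bias.fill_(1.0)
out = affine(torch.zeros(2, 3, 5))
```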

class LinearEfficientEnsemble(nn.Module)

Efficient ensemble linear layer with configurable scaling.

Parameters:

in_features (int) - Input feature dimension.
out_features (int) - Output feature dimension.
bias (bool, optional) - Whether to use bias. Default is True.
k (int) - Number of ensemble members.
ensemble_scaling_in (bool) - Whether each ensemble member has its own input scaling.
ensemble_scaling_out (bool) - Whether each ensemble member has its own output scaling.
ensemble_bias (bool) - Whether each ensemble member has its own bias.
scaling_init (str) - Scaling initialization method.

Input:

x (Tensor) - Input tensor of shape (B, K, D).

Output:

Tensor - Linear transformation result.
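
The sketch below illustrates the BatchEnsemble idea behind a layer like this: one weight matrix is shared by all members, while each member has its own input scaling, output scaling, and bias. It is a simplified illustration with all three options enabled, not the library's implementation; the class name, attribute names r and s, and the initialization choices are assumptions.

```python
import torch
from torch import Tensor, nn


class BatchEnsembleLinearSketch(nn.Module):
    """Hypothetical sketch: y_k = ((x_k * r_k) @ W.T) * s_k + b_k."""

    def __init__(self, in_features: int, out_features: int, k: int) -> None:
        super().__init__()
        # Shared across all k members.
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        bound = in_features ** -0.5
        nn.init.uniform_(self.weight, -bound, bound)
        # Per-member parameters (assumed 'ones' scaling and zero bias).
        self.r = nn.Parameter(torch.ones(k, in_features))
        self.s = nn.Parameter(torch.ones(k, out_features))
        self.bias = nn.Parameter(torch.zeros(k, out_features))

    def forward(self, x: Tensor) -> Tensor:
        if x.ndim != 3:
            raise ValueError("expected input of shape (B, K, D)")
        x = x * self.r            # per-member input scaling
        x = x @ self.weight.T     # shared linear map: (B, K, in) -> (B, K, out)
        return x * self.s + self.bias


lin = BatchEnsembleLinearSketch(in_features=8, out_features=3, k=4)
x = torch.randn(2, 4, 8)
y = lin(x)
```

The per-member parameters add only k * (in_features + 2 * out_features) values on top of the shared matrix, which is what makes this form of ensembling parameter-efficient.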

def make_efficient_ensemble(module: nn.Module, **kwargs) -> None

Converts a module in place so that its layers use efficient ensembling.

Parameters:

module (nn.Module) - Module to convert.
kwargs - Ensemble configuration parameters.
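
A conversion of this kind can be sketched as a recursive walk that swaps every nn.Linear for an ensemble-aware layer. Everything below is an assumption for illustration: the helper replace_linear_sketch, the stand-in layer TinyEnsembleLinear, and the choice to replace only nn.Linear are not confirmed by this page.

```python
import torch
from torch import Tensor, nn


class TinyEnsembleLinear(nn.Module):
    """Hypothetical stand-in layer: shared weight, per-member output scaling."""

    def __init__(self, in_features: int, out_features: int, k: int) -> None:
        super().__init__()
        self.weight = nn.Parameter(
            torch.randn(out_features, in_features) * in_features ** -0.5
        )
        self.scale = nn.Parameter(torch.ones(k, out_features))

    def forward(self, x: Tensor) -> Tensor:
        return (x @ self.weight.T) * self.scale


def replace_linear_sketch(module: nn.Module, k: int) -> None:
    """Hypothetical sketch: swap nn.Linear children for ensemble layers, in place."""
    # Snapshot the children so the module can be modified while looping.
    for name, child in list(module.named_children()):
        if isinstance(child, nn.Linear):
            new_layer = TinyEnsembleLinear(child.in_features, child.out_features, k)
            setattr(module, name, new_layer)
        else:
            replace_linear_sketch(child, k)


mlp = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
replace_linear_sketch(mlp, k=4)
y = mlp(torch.randn(2, 4, 8))  # input carries a member dimension: (B, K, D)
```

Note that this sketch creates fresh weights for the replacement layers instead of copying them from the original nn.Linear modules.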

class OneHotEncoding0d(nn.Module)

One-hot encoding for categorical features.

Parameters:

cardinalities (list[int]) - List of category counts for each feature.

Input:

x (Tensor) - Categorical feature tensor.

Output:

Tensor - One-hot encoded tensor.
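
A sketch of one-hot encoding for several categorical features at once, using torch.nn.functional.one_hot and concatenating the per-feature encodings. It assumes an integer input of shape (B, n_features) with values in [0, cardinality) and does not handle unknown or missing categories; one_hot_sketch is a hypothetical name.

```python
import torch
import torch.nn.functional as F
from torch import Tensor


def one_hot_sketch(x: Tensor, cardinalities: list[int]) -> Tensor:
    """Hypothetical sketch: concatenate per-feature one-hot encodings."""
    if x.ndim != 2 or x.shape[1] != len(cardinalities):
        raise ValueError("expected shape (B, len(cardinalities))")
    parts = [
        F.one_hot(x[:, i], num_classes=c)  # (B, c) for feature i
        for i, c in enumerate(cardinalities)
    ]
    # Output width is sum(cardinalities).
    return torch.cat(parts, dim=1).float()


x = torch.tensor([[0, 2], [1, 0]])      # two samples, two categorical features
encoded = one_hot_sketch(x, [2, 3])     # shape (2, 5)
```

For the first sample, feature 0 takes category 0 of 2 and feature 1 takes category 2 of 3, so its row is [1, 0, 0, 0, 1].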

References:

Yury Gorishniy, Akim Kotelnikov, and Artem Babenko. TabM: Advancing Tabular Deep Learning with Parameter-Efficient Ensembling. arXiv:2410.24210 [cs.LG], 2025. https://arxiv.org/abs/2410.24210