TabM
An MLP-based model that uses variations of BatchEnsemble.
Functions
def _init_scaling_by_sections(weight: Tensor, distribution: Literal['normal', 'random-signs'], init_sections: list[int]) -> None
Initializes the scaling weights of an efficient ensemble section by section. The tensor is modified in place.
Parameters:
weight (Tensor) - Weight tensor to initialize.
distribution (str) - Initialization distribution ('normal' or 'random-signs').
init_sections (list[int]) - List of section sizes.
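A minimal sketch of section-wise initialization follows. It assumes that "by sections" means each ensemble member (row) gets one random value per section, repeated across that section's columns; this reading, the helper name init_by_sections_sketch, and the (k, sum(init_sections)) weight shape are assumptions, not the library's confirmed behavior.

```python
import torch
from torch import Tensor


def init_by_sections_sketch(
    weight: Tensor, distribution: str, init_sections: list[int]
) -> None:
    """Hypothetical helper: one value per (member, section), shared within the section."""
    if weight.ndim != 2 or weight.shape[1] != sum(init_sections):
        raise ValueError("weight must have shape (k, sum(init_sections))")
    k = weight.shape[0]
    start = 0
    with torch.no_grad():
        for size in init_sections:
            # Draw one value per ensemble member for this section.
            if distribution == "normal":
                w = torch.randn(k, 1, dtype=weight.dtype, device=weight.device)
            elif distribution == "random-signs":
                w = torch.randint(0, 2, (k, 1), device=weight.device)
                w = w.to(weight.dtype) * 2 - 1
            else:
                raise ValueError(f"Unknown distribution: {distribution}")
            # Broadcast the (k, 1) draw across all columns of the section.
            weight[:, start:start + size] = w
            start += size


weight = torch.empty(4, 5)  # k=4 members, two sections of sizes 2 and 3
init_by_sections_sketch(weight, "random-signs", [2, 3])
```

Under this assumption, columns 0-1 hold identical values for each member, and so do columns 2-4.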
def init_rsqrt_uniform_(x: Tensor, d: int) -> Tensor
Initializes a tensor in place from a uniform distribution whose range scales with the reciprocal square root of d.
Parameters:
x (Tensor) - Tensor to initialize.
d (int) - Dimension for scaling.
Returns:
Tensor - Initialized tensor.
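One plausible reading of this initializer is sketched below. The symmetric bounds [-1/sqrt(d), 1/sqrt(d)] and the name rsqrt_uniform_sketch_ are assumptions; the description above only states that the scale is the reciprocal square root of d.

```python
import torch
from torch import Tensor, nn


def rsqrt_uniform_sketch_(x: Tensor, d: int) -> Tensor:
    """Hypothetical helper: in-place uniform init on [-d**-0.5, d**-0.5]."""
    if d <= 0:
        raise ValueError("d must be positive")
    bound = d ** -0.5
    # nn.init.uniform_ fills the tensor in place and returns it.
    return nn.init.uniform_(x, -bound, bound)


x = rsqrt_uniform_sketch_(torch.empty(8, 16), d=16)  # values lie within +/- 0.25
```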
def init_random_signs_(x: Tensor) -> Tensor
Fills a tensor in place with random signs (each element is -1 or 1).
Parameters:
x (Tensor) - Tensor to initialize.
Returns:
Tensor - Tensor with random signs.
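A random-sign fill can be sketched with an in-place Bernoulli draw. The equal probability of -1 and 1 and the name random_signs_sketch_ are assumptions made for illustration.

```python
import torch
from torch import Tensor


def random_signs_sketch_(x: Tensor) -> Tensor:
    """Hypothetical helper: fill x in place with -1 or +1, each with probability 0.5."""
    with torch.no_grad():
        # bernoulli_ writes 0/1 values; 2 * b - 1 maps them to -1/+1.
        return x.bernoulli_(0.5).mul_(2).sub_(1)


x = random_signs_sketch_(torch.empty(3, 4))
```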
class Identity(nn.Module)
Identity module that returns input unchanged.
Input:
x (Tensor) - Input tensor.
Output:
Tensor - Same as input.
class Mean(nn.Module)
Computes mean along specified dimension.
Parameters:
dim (int) - Dimension to compute mean along.
Input:
x (Tensor) - Input tensor.
Output:
Tensor - Mean along specified dimension.
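As an illustration, a module like this could average the predictions of the ensemble members. The class name MeanSketch and the choice of dim=1 as the member dimension of a (B, K, D) tensor are assumptions.

```python
import torch
from torch import Tensor, nn


class MeanSketch(nn.Module):
    """Hypothetical stand-in: reduces one dimension by averaging."""

    def __init__(self, dim: int) -> None:
        super().__init__()
        self.dim = dim

    def forward(self, x: Tensor) -> Tensor:
        return x.mean(dim=self.dim)


# Two samples, two ensemble members, one output each: shape (B=2, K=2, D=1).
preds = torch.tensor([[[1.0], [3.0]], [[2.0], [6.0]]])
avg = MeanSketch(dim=1)(preds)  # shape (2, 1): [[2.0], [4.0]]
```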
class ScaleEnsemble(nn.Module)
Scales ensemble members with learnable weights.
Parameters:
k (int) - Number of ensemble members.
d (int) - Feature dimension.
init (str) - Weight initialization ('ones', 'normal', or 'random-signs').
Input:
x (Tensor) - Input tensor of shape (B, K, D).
Output:
Tensor - Scaled tensor.
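The scaling can be sketched as an element-wise product with a (K, D) weight that broadcasts over the batch dimension. The class name ScaleEnsembleSketch, the weight shape, and the details of each initialization option are assumptions, not the library's confirmed implementation.

```python
import torch
from torch import Tensor, nn


class ScaleEnsembleSketch(nn.Module):
    """Hypothetical stand-in: one learnable scaling vector per ensemble member."""

    def __init__(self, k: int, d: int, init: str = "ones") -> None:
        super().__init__()
        self.weight = nn.Parameter(torch.empty(k, d))
        with torch.no_grad():
            if init == "ones":
                self.weight.fill_(1.0)
            elif init == "normal":
                self.weight.normal_()
            elif init == "random-signs":
                self.weight.bernoulli_(0.5).mul_(2).sub_(1)
            else:
                raise ValueError(f"Unknown init: {init}")

    def forward(self, x: Tensor) -> Tensor:
        if x.ndim != 3:
            raise ValueError("expected input of shape (B, K, D)")
        # (B, K, D) * (K, D) broadcasts over the batch dimension.
        return x * self.weight


layer = ScaleEnsembleSketch(k=3, d=4, init="ones")
x = torch.randn(2, 3, 4)
y = layer(x)  # with 'ones' init the output equals the input
```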
class ElementwiseAffineEnsemble(nn.Module)
Element-wise affine transformation for ensemble members.
Parameters:
k (int) - Number of ensemble members.
d (int) - Feature dimension.
bias (bool) - Whether to use bias.
weight_init (str) - Weight initialization method.
Input:
x (Tensor) - Input tensor of shape (B, K, D).
Output:
Tensor - Transformed tensor.
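An element-wise affine transformation per ensemble member can be sketched as x * weight + bias with (K, D) parameters. The class name AffineEnsembleSketch, the ones and zeros starting values, and the omission of weight_init are simplifying assumptions.

```python
import torch
from torch import Tensor, nn


class AffineEnsembleSketch(nn.Module):
    """Hypothetical stand-in: per-member element-wise scale and optional shift."""

    def __init__(self, k: int, d: int, bias: bool = True) -> None:
        super().__init__()
        # Assumed starting values: identity scale, zero shift.
        self.weight = nn.Parameter(torch.ones(k, d))
        if bias:
            self.bias = nn.Parameter(torch.zeros(k, d))
        else:
            self.register_parameter("bias", None)

    def forward(self, x: Tensor) -> Tensor:
        if x.ndim != 3:
            raise ValueError("expected input of shape (B, K, D)")
        out = x * self.weight
        return out if self.bias is None else out + self.bias


layer = AffineEnsembleSketch(k=2, d=3, bias=True)
with torch.no_grad():
    layer.weight.fill_(2.0)
    layer.bias.fill_(1.0)
x = torch.ones(4, 2, 3)
y = layer(x)  # every element equals 2 * 1 + 1 = 3
```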
class LinearEfficientEnsemble(nn.Module)
Efficient ensemble linear layer with configurable scaling.
Parameters:
in_features (int) - Input feature dimension.
out_features (int) - Output feature dimension.
bias (bool, optional, default: True) - Whether to use bias.
k (int) - Number of ensemble members.
ensemble_scaling_in (bool) - Whether the input scaling is separate for each ensemble member.
ensemble_scaling_out (bool) - Whether the output scaling is separate for each ensemble member.
ensemble_bias (bool) - Whether the bias is separate for each ensemble member.
scaling_init (str) - Scaling initialization method.
Input:
x (Tensor) - Input tensor of shape (B, K, D).
Output:
Tensor - Linear transformation result.
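The sketch below shows the general BatchEnsemble-style linear layer, in which all members share one weight matrix and differ only in cheap per-member scaling vectors and biases. It fixes all three ensemble flags to True and omits scaling_init; the class name and the parameter names r and s are assumptions borrowed from the BatchEnsemble formulation, not the library's attributes.

```python
import torch
from torch import Tensor, nn


class LinearBatchEnsembleSketch(nn.Module):
    """Hypothetical stand-in: y = ((x * r) @ W.T) * s + b with a shared W."""

    def __init__(self, in_features: int, out_features: int, k: int, bias: bool = True) -> None:
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features))  # shared
        self.r = nn.Parameter(torch.ones(k, in_features))   # per-member input scaling
        self.s = nn.Parameter(torch.ones(k, out_features))  # per-member output scaling
        if bias:
            self.bias = nn.Parameter(torch.zeros(k, out_features))
        else:
            self.register_parameter("bias", None)
        bound = in_features ** -0.5
        nn.init.uniform_(self.weight, -bound, bound)

    def forward(self, x: Tensor) -> Tensor:
        if x.ndim != 3:
            raise ValueError("expected input of shape (B, K, D)")
        x = x * self.r          # (B, K, in_features)
        x = x @ self.weight.T   # (B, K, out_features), one matmul for all members
        x = x * self.s
        return x if self.bias is None else x + self.bias


layer = LinearBatchEnsembleSketch(in_features=4, out_features=2, k=3)
x = torch.randn(5, 3, 4)
y = layer(x)  # shape (5, 3, 2)
```

The design point is parameter efficiency: the k members add only k * (in_features + 2 * out_features) parameters on top of the shared weight matrix, rather than k full matrices.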
def make_efficient_ensemble(module: nn.Module, **kwargs) -> None
Converts a module in place to use efficient ensemble methods.
Parameters:
module (nn.Module) - Module to convert.
kwargs - Ensemble configuration parameters.
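A conversion of this kind can be sketched as a recursive walk that swaps every nn.Linear child for an ensemble-aware replacement. Both make_ensemble_sketch and the replacement class ScaledLinearSketch are hypothetical; which layer types the real function converts, and which keyword arguments it accepts, are not stated above.

```python
import torch
from torch import Tensor, nn


class ScaledLinearSketch(nn.Module):
    """Hypothetical replacement: a shared nn.Linear plus a per-member output scale."""

    def __init__(self, linear: nn.Linear, k: int) -> None:
        super().__init__()
        self.linear = linear
        self.scale = nn.Parameter(torch.ones(k, linear.out_features))

    def forward(self, x: Tensor) -> Tensor:
        return self.linear(x) * self.scale  # x has shape (B, K, D)


def make_ensemble_sketch(module: nn.Module, **kwargs) -> None:
    """Replace nn.Linear children in place, recursing into other submodules."""
    # Snapshot the children so replacements do not interfere with iteration.
    for name, child in list(module.named_children()):
        if isinstance(child, nn.Linear):
            setattr(module, name, ScaledLinearSketch(child, **kwargs))
        else:
            make_ensemble_sketch(child, **kwargs)


model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
make_ensemble_sketch(model, k=3)
out = model(torch.randn(5, 3, 4))  # shape (5, 3, 2)
```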
class OneHotEncoding0d(nn.Module)
One-hot encoding for categorical features.
Parameters:
cardinalities (list[int]) - List of category counts for each feature.
Input:
x (Tensor) - Categorical feature tensor.
Output:
Tensor - One-hot encoded tensor.
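A minimal sketch of per-feature one-hot encoding follows. It assumes the input holds integer category indices of shape (..., n_features) and that the output concatenates the encodings into shape (..., sum(cardinalities)); the class name OneHotSketch, the float output type, and the absence of any handling for unknown categories are assumptions.

```python
import torch
import torch.nn.functional as F
from torch import Tensor, nn


class OneHotSketch(nn.Module):
    """Hypothetical stand-in: one-hot encode each categorical column, then concatenate."""

    def __init__(self, cardinalities: list[int]) -> None:
        super().__init__()
        self.cardinalities = list(cardinalities)

    def forward(self, x: Tensor) -> Tensor:
        if x.shape[-1] != len(self.cardinalities):
            raise ValueError("last dimension must match len(cardinalities)")
        encoded = [
            F.one_hot(x[..., i], num_classes=c)  # requires int64 indices in [0, c)
            for i, c in enumerate(self.cardinalities)
        ]
        return torch.cat(encoded, dim=-1).float()


x = torch.tensor([[0, 2], [1, 0]])  # two samples, features with 2 and 3 categories
out = OneHotSketch([2, 3])(x)
# out: [[1, 0, 0, 0, 1],
#       [0, 1, 1, 0, 0]]
```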
References:
Yury Gorishniy, Akim Kotelnikov, and Artem Babenko. TabM: Advancing Tabular Deep Learning with Parameter-Efficient Ensembling. arXiv:2410.24210 [cs.LG], 2025. https://arxiv.org/abs/2410.24210