TabICL

TabICL is a tabular foundation model for in-context learning whose performance is on par with TabPFN v2.

Classes and functions

class TabICLClassifier(ClassifierMixin, BaseEstimator)

Tabular in-context learning classifier with a scikit-learn interface.

Parameters:

  • n_estimators (int, optional, Default is 32) - Number of estimators for ensemble predictions.

  • norm_methods (Optional[str | List[str]], optional, Default is None) - Normalization methods to apply.

  • feat_shuffle_method (str, optional, Default is "latin") - Feature permutation strategy.

  • class_shift (bool, optional, Default is True) - Whether to apply cyclic shifts to class labels.

  • outlier_threshold (float, optional, Default is 4.0) - Z-score threshold for outlier detection.

  • softmax_temperature (float, optional, Default is 0.9) - Temperature for the softmax function.

  • average_logits (bool, optional, Default is True) - Whether to average the logits (True) or the probabilities (False) of the ensemble members.

  • use_hierarchical (bool, optional, Default is True) - Whether to enable hierarchical classification.

  • use_amp (bool, optional, Default is True) - Whether to use automatic mixed precision.

  • batch_size (Optional[int], optional, Default is 8) - Batch size for inference.

  • model_path (Optional[str | Path], optional, Default is None) - Path to pre-trained model.

  • allow_auto_download (bool, optional, Default is True) - Whether to allow automatic download of the pre-trained checkpoint when it is not available locally.

  • checkpoint_version (str, optional, Default is "tabicl-classifier-v1.1-0506.ckpt") - Checkpoint version.

  • device (Optional[str | torch.device], optional, Default is None) - Device for computation.

  • random_state (int | None, optional, Default is 42) - Random seed.

  • n_jobs (Optional[int], optional, Default is None) - Number of jobs for parallel processing.

  • verbose (bool, optional, Default is False) - Whether to print verbose output.

  • inference_config (Optional[InferenceConfig | Dict], optional, Default is None) - Inference configuration.
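
The interaction between softmax_temperature and average_logits can be illustrated with a small NumPy sketch. It is not the library's code: the function names, the array layout (n_estimators, n_samples, n_classes), and the exact order in which temperature scaling and averaging are applied are assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1, temperature=0.9):
    """Temperature-scaled softmax, stabilised by subtracting the max."""
    z = np.asarray(x, dtype=float) / temperature
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def combine_ensemble(logits, average_logits=True, temperature=0.9):
    """Combine member logits of shape (n_estimators, n_samples, n_classes).

    Hypothetical helper: the real classifier may combine members differently.
    """
    logits = np.asarray(logits, dtype=float)
    if average_logits:
        # Average the raw logits first, then apply softmax once.
        return softmax(logits.mean(axis=0), temperature=temperature)
    # Apply softmax per member, then average the probabilities.
    return softmax(logits, temperature=temperature).mean(axis=0)


# Two made-up ensemble members, one sample, three classes.
member_logits = np.array([[[2.0, 0.0, -1.0]], [[0.5, 1.5, 0.0]]])
probs_from_logits = combine_ensemble(member_logits, average_logits=True)
probs_from_probs = combine_ensemble(member_logits, average_logits=False)
```

Both variants return valid probability rows, but they generally differ because softmax is nonlinear, which is why the choice is exposed as a parameter.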

Methods:

  • fit(self, X, y) - Fit the classifier.

  • predict(self, X) - Predict class labels.

  • predict_proba(self, X) - Predict class probabilities.

  • _batch_forward(self, Xs, ys, shuffle_patterns=None) - Forward pass for batch processing.
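
The three public methods follow the usual scikit-learn call sequence. The sketch below shows that sequence with a hypothetical stand-in, MajorityClassifier, so that it runs without the model checkpoint. The helper fit_and_score and the toy data are also illustrative, and the import path mentioned in the comment is an assumption that the listing above does not state.

```python
import numpy as np


class MajorityClassifier:
    """Hypothetical stand-in exposing fit / predict / predict_proba."""

    def fit(self, X, y):
        self.classes_, counts = np.unique(y, return_counts=True)
        self.priors_ = counts / counts.sum()
        return self

    def predict_proba(self, X):
        # Every row receives the training class frequencies.
        return np.tile(self.priors_, (len(X), 1))

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_proba(X), axis=1)]


def fit_and_score(clf, X_train, y_train, X_test, y_test):
    """Fit on the training split; return (accuracy, probabilities) on the test split."""
    clf.fit(X_train, y_train)
    proba = clf.predict_proba(X_test)
    accuracy = float(np.mean(clf.predict(X_test) == y_test))
    return accuracy, proba


rng = np.random.default_rng(42)
X = rng.normal(size=(20, 4))
y = (np.arange(20) % 3 == 0).astype(int)

# With the package installed, the stand-in could be replaced by
# TabICLClassifier(n_estimators=32, random_state=42), assuming it is
# importable as `from tabicl import TabICLClassifier`.
accuracy, proba = fit_and_score(MajorityClassifier(), X[:15], y[:15], X[15:], y[15:])
```

Because the helper relies only on fit, predict, and predict_proba, any estimator with this interface can be passed in unchanged.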

class TransformToNumerical

Transforms categorical features to numerical representations.

Parameters:

  • norm_methods (List[str]) - List of normalization methods.

  • feat_shuffle_method (str) - Feature shuffling method.

  • class_shift (bool) - Whether to apply class shifts.

  • outlier_threshold (float) - Outlier detection threshold.

Methods:

  • transform(self, X, y) - Transform input data.
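
As a rough illustration of turning a categorical column into numbers, the sketch below uses plain ordinal encoding with NumPy. The listing above does not say which encoding the class applies, so ordinal encoding and the helper name ordinal_encode are assumptions.

```python
import numpy as np


def ordinal_encode(column):
    """Map each distinct category to an integer code (sorted category order)."""
    categories, codes = np.unique(np.asarray(column, dtype=str), return_inverse=True)
    return codes.astype(float), categories


colors = ["red", "green", "blue", "green", "red"]
codes, categories = ordinal_encode(colors)
# categories holds the sorted distinct values; codes indexes into it,
# so categories[codes.astype(int)] reproduces the original column.
```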

class EnsembleGenerator

Generates ensemble members with different transformations.

Parameters:

  • n_estimators (int) - Number of ensemble members.

  • norm_methods (List[str]) - Normalization methods.

  • feat_shuffle_method (str) - Feature shuffling method.

  • class_shift (bool) - Whether to apply class shifts.

Methods:

  • generate(self, X, y) - Generate ensemble members.
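
The idea of giving each ensemble member a different view of the same data can be sketched as follows. This is a simplified stand-in, not the class's implementation: feature orders are cyclic rotations (a simple deterministic choice that may not match the "latin" option), class shifts are cyclic offsets, and both helper names are hypothetical.

```python
import numpy as np


def make_ensemble_views(X, y, n_estimators, n_classes):
    """Return a list of (X_view, y_view, feature_perm, shift), one per member."""
    n_features = X.shape[1]
    views = []
    for i in range(n_estimators):
        # Rotate the column order by i positions (assumed strategy).
        perm = np.roll(np.arange(n_features), -i)
        # Cyclically shift the integer class labels by i.
        shift = i % n_classes
        views.append((X[:, perm], (y + shift) % n_classes, perm, shift))
    return views


def undo_class_shift(proba, shift):
    """Map probabilities over shifted classes back to the original class order."""
    # Original class c sits at column (c + shift) mod n_classes.
    return np.roll(proba, -shift, axis=1)


X = np.arange(12.0).reshape(4, 3)
y = np.array([0, 1, 2, 1])
views = make_ensemble_views(X, y, n_estimators=3, n_classes=3)
```

Predictions made on a shifted view have to be mapped back with undo_class_shift before members are averaged, otherwise probabilities for different classes would be mixed.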

class TabICL(nn.Module)

TabICL neural network model.

Parameters:

  • config - Model configuration.

Input:

  • x (Tensor) - Input tensor.

Output:

  • Tensor - Model predictions.

class InferenceConfig

Configuration for TabICL inference.

Parameters:

  • max_classes (int) - Maximum number of classes.

  • max_features (int) - Maximum number of features.

  • model_dim (int) - Model dimension.

  • num_heads (int) - Number of attention heads.

  • num_layers (int) - Number of layers.

def softmax(x, axis: int = -1, temperature: float = 0.9)

Computes softmax with temperature scaling.

Parameters:

  • x (Tensor) - Input tensor.

  • axis (int, optional, Default is -1) - Axis for softmax computation.

  • temperature (float, optional, Default is 0.9) - Temperature parameter.

Returns:

  • Tensor - Softmax output with temperature scaling.
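
Temperature scaling divides the inputs by the temperature before normalising, so output i is exp(x_i / T) divided by the sum of exp(x_j / T) over j. The NumPy sketch below matches the signature above, but it is an illustrative reimplementation rather than the library's function; using NumPy arrays instead of tensors is an assumption.

```python
import numpy as np


def softmax(x, axis: int = -1, temperature: float = 0.9):
    """Softmax of x / temperature along `axis`."""
    x = np.asarray(x, dtype=float) / temperature
    # Subtracting the max leaves the result unchanged but avoids overflow.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


logits = np.array([[2.0, 1.0, 0.0]])
sharp = softmax(logits, temperature=0.5)  # lower temperature: more peaked
flat = softmax(logits, temperature=2.0)   # higher temperature: closer to uniform
```

Temperatures below 1 sharpen the distribution and temperatures above 1 flatten it; the ranking of the classes is unchanged either way.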

References:

Jingang Qu, David Holzmüller, Gaël Varoquaux, and Marine Le Morvan. TabICL: A Tabular Foundation Model for In-Context Learning on Large Data. arXiv:2502.05564 [cs.LG], 2025. https://arxiv.org/abs/2502.05564