Utils

class TALENT.model.utils.Averager

Bases: object

A simple averager.

add(x)

x: float, value to be added

item()

class TALENT.model.utils.Timer

Bases: object

measure(p=1)

Measure the time since the last call to measure.

p: int, period of printing the time

TALENT.model.utils.ensure_path(path, remove=True)

Ensure a path exists.

path: str, path to the directory

remove: bool, whether to remove the directory if it exists

TALENT.model.utils.get_classical_args()

Get the arguments for classical models.

Returns

argparse.Namespace, arguments

TALENT.model.utils.get_deep_args()

Get the arguments for deep learning models.

Returns

argparse.Namespace, arguments

TALENT.model.utils.get_device() → torch.device

TALENT.model.utils.get_method(model)

Get the method class.

model: str, model name

Returns

class, method class

TALENT.model.utils.load_config(args, config=None, config_name=None)

Load the config file.

args: argparse.Namespace, arguments

config: dict, config file

config_name: str, name of the config file

Returns

argparse.Namespace, arguments

TALENT.model.utils.merge_sampled_parameters(config, sampled_parameters)

Merge the sampled hyper-parameters.

config: dict, configuration

sampled_parameters: dict, sampled hyper-parameters

TALENT.model.utils.mkdir(path)

Create a directory if it does not exist.

path: str, path to the directory

TALENT.model.utils.pprint(x)

TALENT.model.utils.rmse(y, prediction, y_info)

y: np.ndarray, ground truth

prediction: np.ndarray, prediction

y_info: dict, information about the target variable

Returns

float, root mean squared error

TALENT.model.utils.sample_parameters(trial, space, base_config)

Sample hyper-parameters.

trial: optuna.trial.Trial, trial

space: dict, search space

base_config: dict, base configuration

Returns

dict, sampled hyper-parameters

TALENT.model.utils.set_gpu(x)

Set environment variable CUDA_VISIBLE_DEVICES

x: str, GPU id

TALENT.model.utils.set_seeds(base_seed: int, one_cuda_seed: bool = False) → None

Set random seeds for reproducibility.

base_seed: int, base seed

one_cuda_seed: bool, whether to set one seed for all GPUs

TALENT.model.utils.show_results(args, info, metric_name, loss_list, results_list, time_list)

Show the results for deep learning models.

args: argparse.Namespace, arguments

info: dict, information about the dataset

metric_name: list, names of the metrics

loss_list: list, list of loss

results_list: list, list of results

time_list: list, list of time

TALENT.model.utils.show_results_classical(args, info, metric_name, results_list, time_list)

Show the results for classical models.

args: argparse.Namespace, arguments

info: dict, information about the dataset

metric_name: list, names of the metrics

results_list: list, list of results

time_list: list, list of time

TALENT.model.utils.tune_hyper_parameters(args, opt_space, train_val_data, info)

Tune hyper-parameters.

args: argparse.Namespace, arguments

opt_space: dict, search space

train_val_data: tuple, training and validation data

info: dict, information about the dataset

Returns

argparse.Namespace, arguments

File and Path Utilities

TALENT.model.utils.mkdir(path)

Create a directory if it does not exist.

Parameters:

  • path (str) – Path to the directory to create

Raises:

  • OSError – If directory creation fails for reasons other than already existing
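The "create if missing" behaviour described above can be sketched with the standard library. This is a minimal illustration, not TALENT's implementation: `mkdir_sketch` is a hypothetical name, and creating missing parent directories is an assumption.

```python
import os
import tempfile


def mkdir_sketch(path: str) -> None:
    """Create `path` (and any missing parents) unless it already exists."""
    # exist_ok=True suppresses the error for an existing directory only;
    # other failures (e.g. permission denied) still raise OSError.
    os.makedirs(path, exist_ok=True)


# Usage: calling twice is safe because the second call is a no-op.
base = tempfile.mkdtemp()
target = os.path.join(base, "results", "run-0")
mkdir_sketch(target)
mkdir_sketch(target)
created = os.path.isdir(target)
```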

TALENT.model.utils.set_gpu(x)

Set environment variable CUDA_VISIBLE_DEVICES to specify which GPU to use.

Parameters:

  • x (str) – GPU ID to use (e.g., "0", "1", "0,1")

Example:

set_gpu("0")  # Use GPU 0
set_gpu("0,1")  # Use GPUs 0 and 1

TALENT.model.utils.ensure_path(path, remove=True)

Ensure a path exists, optionally removing existing directory.

Parameters:

  • path (str) – Path to the directory

  • remove (bool, optional) – Whether to remove the directory if it exists. Defaults to True.

Note:

If the path exists and remove=True, the function prompts the user for confirmation before removing the directory.
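The prompt-then-remove flow can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: `ensure_path_sketch` and its `confirm` parameter are hypothetical (the real function takes only `path` and `remove`), and the prompt wording and the "anything except n means yes" rule are guesses. Injecting `confirm` lets the sketch run without blocking on keyboard input.

```python
import os
import shutil
import tempfile
from typing import Callable


def ensure_path_sketch(path: str, remove: bool = True,
                       confirm: Callable[[str], str] = input) -> None:
    """Make sure `path` exists; optionally recreate it empty after confirmation."""
    if os.path.exists(path):
        if remove:
            # Assumption: any answer other than "n" counts as consent.
            answer = confirm(f"{path} exists, remove? ([y]/n) ")
            if answer.strip().lower() != "n":
                shutil.rmtree(path)
                os.makedirs(path)
    else:
        os.makedirs(path)


workdir = os.path.join(tempfile.mkdtemp(), "exp")
ensure_path_sketch(workdir)                           # created, no prompt needed
open(os.path.join(workdir, "old.log"), "w").close()   # leave a stale file behind
ensure_path_sketch(workdir, confirm=lambda _: "y")    # directory is recreated empty
remaining = os.listdir(workdir)
```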

Random Seed and Device Management

TALENT.model.utils.set_seeds(base_seed, one_cuda_seed=False)

Set random seeds for reproducibility across all random number generators.

Parameters:

  • base_seed (int) – Base seed value (must be 0 <= base_seed < 2^32 - 10000)

  • one_cuda_seed (bool, optional) – Whether to set one seed for all GPUs. Defaults to False.

Note:

Sets seeds for Python random, NumPy, PyTorch CPU, and PyTorch CUDA generators. Each generator gets a different seed derived from the base_seed.
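One way to give each generator its own seed is to add a small fixed offset to the base seed. The sketch below assumes offsets of 0 to 3 and uses the hypothetical helper `derive_seeds`; the documentation above only states that the seeds differ, so the exact derivation is an assumption. The upper bound on `base_seed` presumably leaves headroom so derived seeds stay below 2^32. NumPy and PyTorch calls are left out so the example runs on the standard library alone.

```python
import random


def derive_seeds(base_seed: int) -> dict:
    """Return one distinct seed per generator, derived from `base_seed`."""
    if not 0 <= base_seed < 2**32 - 10000:
        raise ValueError("base_seed must satisfy 0 <= base_seed < 2**32 - 10000")
    # Assumption: consecutive offsets; the library's actual offsets may differ.
    return {
        "python": base_seed,
        "numpy": base_seed + 1,
        "torch_cpu": base_seed + 2,
        "torch_cuda": base_seed + 3,
    }


seeds = derive_seeds(42)

# Re-seeding with the same value reproduces the same random stream.
random.seed(seeds["python"])
first = random.random()
random.seed(seeds["python"])
second = random.random()
```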

TALENT.model.utils.get_device()

Get the appropriate device (GPU or CPU) for PyTorch operations.

Returns:

  • torch.device – CUDA device if available, otherwise CPU device

Evaluation Metrics

TALENT.model.utils.rmse(y, prediction, y_info)

Calculate Root Mean Squared Error (RMSE) for regression tasks.

Parameters:

  • y (np.ndarray) – Ground truth values

  • prediction (np.ndarray) – Predicted values

  • y_info (dict) – Information about the target variable, including normalization policy

Returns:

  • float – RMSE value, adjusted for normalization if applicable

Note:

If y_info['policy'] is 'mean_std', the RMSE is multiplied by the standard deviation to denormalize the result.
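The denormalization step follows from how standardization works: if targets were scaled as (y - mean) / std, every error in normalized units is 1/std of the error in original units, so multiplying the RMSE by std restores the original scale. The sketch below uses plain Python lists instead of np.ndarray to stay dependency-free. The name `rmse_sketch` and the `'std'` key in `y_info` are assumptions; only the `'policy'` key is named in the note above.

```python
import math


def rmse_sketch(y, prediction, y_info):
    """RMSE over two equal-length sequences, rescaled for 'mean_std' targets."""
    if len(y) != len(prediction) or len(y) == 0:
        raise ValueError("y and prediction must be non-empty and equally long")
    mse = sum((t - p) ** 2 for t, p in zip(y, prediction)) / len(y)
    result = math.sqrt(mse)
    if y_info.get("policy") == "mean_std":
        # Assumption: the target's standard deviation is stored under 'std'.
        result *= y_info["std"]
    return result


y_norm = [0.0, 1.0, -1.0]        # normalized ground truth
pred_norm = [0.5, 1.0, -1.5]     # normalized predictions
value = rmse_sketch(y_norm, pred_norm, {"policy": "mean_std", "std": 10.0})
raw = rmse_sketch(y_norm, pred_norm, {"policy": "none"})
```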

Configuration Management

TALENT.model.utils.load_config(args, config=None, config_name=None)

Load configuration file for model training and save current arguments.

Parameters:

  • args (argparse.Namespace) – Command line arguments

  • config (dict, optional) – Pre-loaded configuration dictionary. Defaults to None.

  • config_name (str, optional) – Name for the saved config file. Defaults to None.

Returns:

  • argparse.Namespace – Updated arguments with loaded configuration

Note:

Automatically saves the current arguments to a JSON file in the save_path directory.
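The "attach the config, then save the arguments" step might look like the sketch below. It is illustrative only: `load_config_sketch`, the `args.config` attribute, and the output file name are assumptions, and the branch that reads a config file from disk when `config` is None is omitted.

```python
import argparse
import json
import os
import tempfile


def load_config_sketch(args, config=None, config_name=None):
    """Attach `config` to `args` and dump the arguments to a JSON file."""
    if config is not None:
        # Assumption: the configuration is stored on the namespace as args.config.
        args.config = config
    os.makedirs(args.save_path, exist_ok=True)
    # Hypothetical file name; the real naming scheme is not documented here.
    file_name = f"{config_name or 'args'}.json"
    with open(os.path.join(args.save_path, file_name), "w") as fp:
        json.dump(vars(args), fp, indent=4, sort_keys=True)
    return args


args = argparse.Namespace(model_type="mlp", save_path=tempfile.mkdtemp())
args = load_config_sketch(args, config={"training": {"lr": 1e-3}},
                          config_name="mlp-run")

# Read the file back to confirm what was persisted.
with open(os.path.join(args.save_path, "mlp-run.json")) as fp:
    saved = json.load(fp)
```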

Hyperparameter Optimization

TALENT.model.utils.sample_parameters(trial, space, base_config)

Sample hyperparameters from the search space using Optuna trial.

Parameters:

  • trial (optuna.trial.Trial) – Optuna trial object for parameter sampling

  • space (dict) – Hyperparameter search space definition

  • base_config (dict) – Base configuration dictionary

Returns:

  • dict – Sampled hyperparameters

Special Distributions:

  • $mlp_d_layers – Special distribution for MLP layer dimensions

  • $d_token – Special distribution for transformer token dimensions

  • $d_ffn_factor – Special distribution for feedforward network factors

  • ? – Optional parameters with default values

TALENT.model.utils.merge_sampled_parameters(config, sampled_parameters)

Merge sampled hyperparameters into the base configuration.

Parameters:

  • config (dict) – Base configuration to update

  • sampled_parameters (dict) – Sampled parameters to merge

Note:

Recursively merges nested dictionaries and overwrites existing parameters.
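A recursive merge of this kind can be sketched in a few lines. `merge_sketch` is a hypothetical stand-in, not the library function; it updates `config` in place, which matches the absence of a documented return value but is still an assumption.

```python
def merge_sketch(config: dict, sampled: dict) -> None:
    """Merge `sampled` into `config` in place, descending into nested dicts."""
    for key, value in sampled.items():
        if isinstance(value, dict) and isinstance(config.get(key), dict):
            # Both sides are dicts: merge key by key instead of replacing.
            merge_sketch(config[key], value)
        else:
            # Leaf value (or a new key): overwrite whatever was there.
            config[key] = value


config = {"model": {"d_layers": [256, 256], "dropout": 0.1},
          "training": {"lr": 1e-3}}
sampled = {"model": {"dropout": 0.3}, "training": {"weight_decay": 1e-5}}
merge_sketch(config, sampled)
```

After the call, `config["model"]["dropout"]` holds the sampled value while `d_layers` and `lr` keep their base values.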

Argument Parsing

TALENT.model.utils.get_classical_args()

Parse command line arguments for classical machine learning models.

Returns:

  • tuple – (args, default_para, opt_space), where:

      • args: Parsed arguments

      • default_para: Default parameter configurations

      • opt_space: Hyperparameter optimization space

Supported Models:

  • LogReg, NCM, RandomForest, xgboost, catboost, lightgbm

  • svm, knn, NaiveBayes, dummy, LinearRegression

Key Parameters:

  • normalization: Data normalization method

  • num_nan_policy: Policy for handling numerical missing values

  • cat_nan_policy: Policy for handling categorical missing values

  • cat_policy: Categorical encoding policy

  • num_policy: Numerical feature processing policy

TALENT.model.utils.get_deep_args()

Parse command line arguments for deep learning models.

Returns:

  • tuple – (args, default_para, opt_space), where:

      • args: Parsed arguments

      • default_para: Default parameter configurations

      • opt_space: Hyperparameter optimization space

Supported Models:

  • mlp, resnet, ftt, node, autoint, tabpfn, tangos, saint

  • tabcaps, tabnet, snn, ptarl, danets, dcn2, tabtransformer

  • dnnr, switchtab, grownet, tabr, modernNCA, hyperfast

  • bishop, realmlp, protogate, mlp_plr, excelformer, grande

  • amformer, tabptm, trompt, tabm, PFN-v2, t2gformer

  • tabautopnpnet, tabicl

Results Display

TALENT.model.utils.show_results_classical(args, info, metric_name, results_list, time_list)

Display results for classical machine learning models.

Parameters:

  • args (argparse.Namespace) – Training arguments

  • info (dict) – Dataset information

  • metric_name (list) – Names of evaluation metrics

  • results_list (list) – List of results from multiple trials

  • time_list (list) – List of training times

Output:

Prints formatted results including mean, standard deviation, and GPU information.
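The mean and standard deviation in that summary can be computed per metric across trials. The sketch below assumes each entry of `results_list` is a sequence aligned with `metric_name`, and uses the population standard deviation; neither detail is confirmed by the description above, and `summarize_sketch` is a hypothetical helper.

```python
import statistics


def summarize_sketch(metric_name, results_list):
    """Return {metric: (mean, std)} across trials."""
    summary = {}
    for i, name in enumerate(metric_name):
        values = [trial[i] for trial in results_list]
        # Assumption: population std (ddof=0), not the sample std.
        summary[name] = (statistics.fmean(values), statistics.pstdev(values))
    return summary


metric_name = ["Accuracy", "F1"]
results_list = [[0.8, 0.7], [0.9, 0.8]]   # two trials, two metrics each
summary = summarize_sketch(metric_name, results_list)
for name, (mean, std) in summary.items():
    print(f"{name}: {mean:.4f} ± {std:.4f}")
```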

TALENT.model.utils.show_results(args, info, metric_name, loss_list, results_list, time_list)

Display results for deep learning models.

Parameters:

  • args (argparse.Namespace) – Training arguments

  • info (dict) – Dataset information

  • metric_name (list) – Names of evaluation metrics

  • loss_list (list) – List of training losses

  • results_list (list) – List of results from multiple trials

  • time_list (list) – List of training times

Output:

Prints formatted results including mean loss, metrics, and GPU information.

Hyperparameter Tuning

TALENT.model.utils.tune_hyper_parameters(args, opt_space, train_val_data, info)

Perform hyperparameter optimization using Optuna.

Parameters:

  • args (argparse.Namespace) – Training arguments

  • opt_space (dict) – Hyperparameter search space

  • train_val_data (tuple) – Training and validation data

  • info (dict) – Dataset information

Returns:

  • argparse.Namespace – Updated arguments with optimized hyperparameters

Features:

  • Uses TPE sampler for efficient optimization

  • Supports both regression (minimize) and classification (maximize) objectives

  • Automatically saves best configuration to JSON file

  • Handles model-specific parameter adjustments

Model Factory

TALENT.model.utils.get_method(model)

Get the method class for a given model name.

Parameters:

  • model (str) – Model name

Returns:

  • class – Method class for the specified model

Raises:

  • NotImplementedError – If the model is not yet implemented

Supported Models:

All deep learning and classical models supported by TALENT.
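A name-to-class factory with the documented error behaviour can be sketched with a dictionary lookup. The classes and registry below are placeholders invented for illustration; how TALENT actually maps names to method classes is not described here.

```python
class MLPMethod:
    """Placeholder standing in for a deep method class."""


class XGBoostMethod:
    """Placeholder standing in for a classical method class."""


# Hypothetical registry; the real mapping covers all supported models.
_REGISTRY = {"mlp": MLPMethod, "xgboost": XGBoostMethod}


def get_method_sketch(model: str):
    """Return the class (not an instance) registered under `model`."""
    try:
        return _REGISTRY[model]
    except KeyError:
        raise NotImplementedError(
            f"Model '{model}' is not yet implemented."
        ) from None


method_cls = get_method_sketch("mlp")
method = method_cls()   # the caller instantiates the returned class
```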

Utility Classes

class TALENT.model.utils.Averager

A simple averager for tracking running averages.

Methods:

add(x)

Add a value to the running average.

Parameters:

  • x (float) – Value to add

item()

Get the current average value.

Returns:

  • float – Current running average
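A running average can be maintained without storing past values by keeping only a count and the current mean. The class below is a minimal sketch with the same `add`/`item` interface; `AveragerSketch` and its attribute names are assumptions, not the library's code.

```python
class AveragerSketch:
    """Track the mean of all values passed to add()."""

    def __init__(self):
        self.n = 0      # number of values seen so far
        self.v = 0.0    # current mean

    def add(self, x: float) -> None:
        # Fold the new value into the mean: (old_sum + x) / (n + 1).
        self.v = (self.v * self.n + x) / (self.n + 1)
        self.n += 1

    def item(self) -> float:
        return self.v


avg = AveragerSketch()
for loss in (1.0, 2.0, 3.0):
    avg.add(loss)
mean_loss = avg.item()   # 2.0
```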

class TALENT.model.utils.Timer

A timer for measuring elapsed time.

Methods:

measure(p=1)

Measure elapsed time since timer creation.

Parameters:

  • p (int, optional) – Period for time formatting. Defaults to 1.

Returns:

  • str – Formatted time string (e.g., “30s”, “2m”, “1.5h”)
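The output format in the example above can be reproduced with a few thresholds. The sketch below follows the "since timer creation" description; the thresholds, the rounding, and dividing the elapsed time by `p` are assumptions about how such a timer might behave, and both names are hypothetical.

```python
import time


def format_elapsed(seconds: float) -> str:
    """Format a duration as whole seconds, rounded minutes, or one-decimal hours."""
    seconds = int(seconds)
    if seconds >= 3600:
        return f"{seconds / 3600:.1f}h"
    if seconds >= 60:
        return f"{round(seconds / 60)}m"
    return f"{seconds}s"


class TimerSketch:
    def __init__(self):
        self.start = time.monotonic()

    def measure(self, p: int = 1) -> str:
        # Assumption: p divides the elapsed time, e.g. to report time per period.
        return format_elapsed((time.monotonic() - self.start) / p)


timer = TimerSketch()
just_started = timer.measure()
examples = [format_elapsed(s) for s in (30, 120, 5400)]
```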

Debugging Utilities

TALENT.model.utils.pprint(x)

Pretty print an object using the PrettyPrinter.

Parameters:

  • x (any) – Object to print
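A wrapper of this kind is typically a thin layer over the standard library's `pprint.PrettyPrinter`. The sketch below assumes default printer settings; `pprint_sketch` is a hypothetical name.

```python
import pprint

# One shared printer with default settings (an assumption).
_printer = pprint.PrettyPrinter()


def pprint_sketch(x) -> None:
    """Pretty-print any object to standard output."""
    _printer.pprint(x)


config = {"model": "mlp", "training": {"lr": 0.001, "n_epochs": 100}}
pprint_sketch(config)

# pformat returns the same text as a string, which is handy for logging.
formatted = _printer.pformat(config)
```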