Utils
- class TALENT.model.utils.Averager
Bases: object
A simple averager.
- add(x)
- x
float, value to be added
- item()
- class TALENT.model.utils.Timer
Bases: object
- measure(p=1)
Measure the time elapsed since the timer was created.
- p
int, period of printing the time
- TALENT.model.utils.ensure_path(path, remove=True)
Ensure a path exists.
- path
str, path to the directory
- remove
bool, whether to remove the directory if it exists
- TALENT.model.utils.get_classical_args()
Get the arguments for classical models.
- Returns
tuple, (args, default_para, opt_space)
- TALENT.model.utils.get_deep_args()
Get the arguments for deep learning models.
- Returns
tuple, (args, default_para, opt_space)
- TALENT.model.utils.get_device() → torch.device
- TALENT.model.utils.get_method(model)
Get the method class.
- model
str, model name
- Returns
class, method class
- TALENT.model.utils.load_config(args, config=None, config_name=None)
Load the config file.
- args
argparse.Namespace, arguments
- config
dict, config file
- config_name
str, name of the config file
- Returns
argparse.Namespace, arguments
- TALENT.model.utils.merge_sampled_parameters(config, sampled_parameters)
Merge the sampled hyper-parameters.
- config
dict, configuration
- sampled_parameters
dict, sampled hyper-parameters
- TALENT.model.utils.mkdir(path)
Create a directory if it does not exist.
- path
str, path to the directory
- TALENT.model.utils.pprint(x)
- TALENT.model.utils.rmse(y, prediction, y_info)
- y
np.ndarray, ground truth
- prediction
np.ndarray, prediction
- y_info
dict, information about the target variable
- Returns
float, root mean squared error
- TALENT.model.utils.sample_parameters(trial, space, base_config)
Sample hyper-parameters.
- trial
optuna.trial.Trial, trial
- space
dict, search space
- base_config
dict, base configuration
- Returns
dict, sampled hyper-parameters
- TALENT.model.utils.set_gpu(x)
Set the CUDA_VISIBLE_DEVICES environment variable.
- x
str, GPU id
- TALENT.model.utils.set_seeds(base_seed: int, one_cuda_seed: bool = False) → None
Set random seeds for reproducibility.
- base_seed
int, base seed
- one_cuda_seed
bool, whether to set one seed for all GPUs
- TALENT.model.utils.show_results(args, info, metric_name, loss_list, results_list, time_list)
Show the results for deep learning models.
- args
argparse.Namespace, arguments
- info
dict, information about the dataset
- metric_name
list, names of the metrics
- loss_list
list, list of losses
- results_list
list, list of results
- time_list
list, list of training times
- TALENT.model.utils.show_results_classical(args, info, metric_name, results_list, time_list)
Show the results for classical models.
- args
argparse.Namespace, arguments
- info
dict, information about the dataset
- metric_name
list, names of the metrics
- results_list
list, list of results
- time_list
list, list of training times
- TALENT.model.utils.tune_hyper_parameters(args, opt_space, train_val_data, info)
Tune hyper-parameters.
- args
argparse.Namespace, arguments
- opt_space
dict, search space
- train_val_data
tuple, training and validation data
- info
dict, information about the dataset
- Returns
argparse.Namespace, arguments
File and Path Utilities
- TALENT.model.utils.mkdir(path)
Create a directory if it does not exist.
Parameters:
path (str) – Path to the directory to create
Raises:
OSError – If directory creation fails for reasons other than already existing
- TALENT.model.utils.set_gpu(x)
Set environment variable CUDA_VISIBLE_DEVICES to specify which GPU to use.
Parameters:
x (str) – GPU ID to use (e.g., "0", "1", "0,1")
Example:
set_gpu("0")    # Use GPU 0
set_gpu("0,1")  # Use GPUs 0 and 1
- TALENT.model.utils.ensure_path(path, remove=True)
Ensure a path exists, optionally removing existing directory.
Parameters:
path (str) – Path to the directory
remove (bool, optional) – Whether to remove the directory if it exists. Defaults to True.
Note:
If the path exists and remove=True, the user is prompted for confirmation before the directory is removed.
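The behaviour described for ensure_path can be sketched roughly as follows. This is a minimal illustration, not TALENT's implementation: the `confirm` parameter and the `_ask_on_console` helper are hypothetical, introduced here so the prompt can be replaced in non-interactive runs, and the prompt wording is an assumption.

```python
import os
import shutil


def _ask_on_console(path):
    # Assumed prompt wording; any answer other than "n" counts as consent.
    return input(f"{path} exists, remove? ([y]/n) ").strip().lower() != "n"


def ensure_path_sketch(path, remove=True, confirm=_ask_on_console):
    """Make sure `path` exists as a directory.

    `confirm` is a hypothetical hook (not part of TALENT): a callable that
    returns True when an existing directory may be deleted.
    """
    if os.path.exists(path):
        if remove and confirm(path):
            shutil.rmtree(path)  # drop the old directory and its contents
            os.makedirs(path)    # recreate it empty
    else:
        os.makedirs(path)
    return path
```

With remove=False an existing directory is left untouched and no prompt is shown.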
Random Seed and Device Management
- TALENT.model.utils.set_seeds(base_seed, one_cuda_seed=False)
Set random seeds for reproducibility across all random number generators.
Parameters:
base_seed (int) – Base seed value (must be 0 <= base_seed < 2^32 - 10000)
one_cuda_seed (bool, optional) – Whether to set one seed for all GPUs. Defaults to False.
Note:
Sets seeds for Python random, NumPy, PyTorch CPU, and PyTorch CUDA generators. Each generator gets a different seed derived from the base_seed.
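One way to read the note above is that every generator receives the base seed plus a small offset, which would explain why base_seed must stay below 2^32 - 10000. The sketch below only computes such seeds and applies the first one to Python's `random` module. The concrete offsets and the `derive_seeds` helper are assumptions made for illustration, not TALENT's code.

```python
import random


def derive_seeds(base_seed, n_gpus=0, one_cuda_seed=False):
    """Return one seed per generator, all derived from `base_seed`.

    The offsets (+0, +1, +2, +3 + GPU index) are an assumption of this
    sketch; the documented bound on base_seed leaves room for them.
    """
    if not 0 <= base_seed < 2**32 - 10000:
        raise ValueError("base_seed must satisfy 0 <= base_seed < 2**32 - 10000")

    seeds = {
        "python": base_seed,
        "numpy": base_seed + 1,
        "torch_cpu": base_seed + 2,
    }
    cuda_seed = base_seed + 3
    if one_cuda_seed:
        # Every GPU shares the same seed.
        seeds["torch_cuda"] = [cuda_seed] * n_gpus
    else:
        # Each GPU gets its own seed.
        seeds["torch_cuda"] = [cuda_seed + i for i in range(n_gpus)]
    return seeds


seeds = derive_seeds(42, n_gpus=2)
random.seed(seeds["python"])  # Python's own generator
```

In a real setup the remaining values would be passed to `numpy.random.seed`, `torch.manual_seed`, and `torch.cuda.manual_seed` on each device.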
- TALENT.model.utils.get_device()
Get the appropriate device (GPU or CPU) for PyTorch operations.
Returns:
torch.device – CUDA device if available, otherwise CPU device
Evaluation Metrics
- TALENT.model.utils.rmse(y, prediction, y_info)
Calculate Root Mean Squared Error (RMSE) for regression tasks.
Parameters:
y (np.ndarray) – Ground truth values
prediction (np.ndarray) – Predicted values
y_info (dict) – Information about the target variable, including normalization policy
Returns:
float – RMSE value, adjusted for normalization if applicable
Note:
If y_info['policy'] is 'mean_std', the RMSE is multiplied by the standard deviation to denormalize the result.
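The denormalization step can be illustrated with a dependency-free sketch. It works on plain Python sequences rather than np.ndarray, and it assumes the standard deviation is stored under y_info['std']; that key name is an assumption, since only the 'policy' key is documented here.

```python
import math


def rmse_sketch(y, prediction, y_info):
    """RMSE over two equal-length sequences, denormalized when needed.

    Assumes y_info["std"] holds the target's standard deviation whenever
    y_info["policy"] == "mean_std" (the key name is an assumption).
    """
    if len(y) != len(prediction):
        raise ValueError("y and prediction must have the same length")
    mse = sum((a - b) ** 2 for a, b in zip(y, prediction)) / len(y)
    result = math.sqrt(mse)
    if y_info.get("policy") == "mean_std":
        result *= y_info["std"]  # back to the original target scale
    return result
```

Because RMSE scales linearly with the target, multiplying by the standard deviation is enough; the mean shift cancels out in the difference y - prediction.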
Configuration Management
- TALENT.model.utils.load_config(args, config=None, config_name=None)
Load configuration file for model training and save current arguments.
Parameters:
args (argparse.Namespace) – Command line arguments
config (dict, optional) – Pre-loaded configuration dictionary. Defaults to None.
config_name (str, optional) – Name for the saved config file. Defaults to None.
Returns:
argparse.Namespace – Updated arguments with loaded configuration
Note:
Automatically saves the current arguments to a JSON file in the save_path directory.
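Saving an argparse.Namespace to JSON, as the note describes, might look like the sketch below. The helper name `save_args_sketch` and the default file name `config.json` are assumptions; only the use of args.save_path as the target directory comes from the description above.

```python
import json
import os


def save_args_sketch(args, config_name="config.json"):
    """Write the argument namespace to <args.save_path>/<config_name>."""
    os.makedirs(args.save_path, exist_ok=True)
    target = os.path.join(args.save_path, config_name)
    with open(target, "w") as f:
        # vars() turns the Namespace into a plain dict that json can encode
        json.dump(vars(args), f, indent=4, sort_keys=True)
    return target
```

This only works when every attribute on the namespace is JSON-serializable, so values such as devices or open file handles would need converting first.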
Hyperparameter Optimization
- TALENT.model.utils.sample_parameters(trial, space, base_config)
Sample hyperparameters from the search space using Optuna trial.
Parameters:
trial (optuna.trial.Trial) – Optuna trial object for parameter sampling
space (dict) – Hyperparameter search space definition
base_config (dict) – Base configuration dictionary
Returns:
dict – Sampled hyperparameters
Special Distributions:
$mlp_d_layers – Special distribution for MLP layer dimensions
$d_token – Special distribution for transformer token dimensions
$d_ffn_factor – Special distribution for feedforward network factors
? – Optional parameters with default values
- TALENT.model.utils.merge_sampled_parameters(config, sampled_parameters)
Merge sampled hyperparameters into the base configuration.
Parameters:
config (dict) – Base configuration to update
sampled_parameters (dict) – Sampled parameters to merge
Note:
Recursively merges nested dictionaries and overwrites existing parameters.
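A recursive in-place merge of this kind can be sketched as follows. The function name `merge_sketch` and the configuration keys in the example are hypothetical.

```python
def merge_sketch(config, sampled_parameters):
    """Recursively copy sampled values into config, modifying it in place."""
    for key, value in sampled_parameters.items():
        if isinstance(value, dict) and isinstance(config.get(key), dict):
            merge_sketch(config[key], value)  # descend into the nested section
        else:
            config[key] = value               # overwrite or add the value


# Hypothetical base configuration and sampled values
config = {"model": {"d_layers": [64, 64], "dropout": 0.1}, "training": {"lr": 1e-3}}
sampled = {"model": {"dropout": 0.25}, "training": {"lr": 3e-4, "weight_decay": 1e-5}}
merge_sketch(config, sampled)
```

After the call, untouched entries such as `d_layers` keep their base values, while `dropout` and `lr` carry the sampled ones and `weight_decay` is added.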
Argument Parsing
- TALENT.model.utils.get_classical_args()
Parse command line arguments for classical machine learning models.
Returns:
tuple – (args, default_para, opt_space) where:
args: Parsed arguments
default_para: Default parameter configurations
opt_space: Hyperparameter optimization space
Supported Models:
LogReg, NCM, RandomForest, xgboost, catboost, lightgbm
svm, knn, NaiveBayes, dummy, LinearRegression
Key Parameters:
normalization: Data normalization method
num_nan_policy: Policy for handling numerical missing values
cat_nan_policy: Policy for handling categorical missing values
cat_policy: Categorical encoding policy
num_policy: Numerical feature processing policy
- TALENT.model.utils.get_deep_args()
Parse command line arguments for deep learning models.
Returns:
tuple – (args, default_para, opt_space) where:
args: Parsed arguments
default_para: Default parameter configurations
opt_space: Hyperparameter optimization space
Supported Models:
mlp, resnet, ftt, node, autoint, tabpfn, tangos, saint
tabcaps, tabnet, snn, ptarl, danets, dcn2, tabtransformer
dnnr, switchtab, grownet, tabr, modernNCA, hyperfast
bishop, realmlp, protogate, mlp_plr, excelformer, grande
amformer, tabptm, trompt, tabm, PFN-v2, t2gformer
tabautopnpnet, tabicl
Results Display
- TALENT.model.utils.show_results_classical(args, info, metric_name, results_list, time_list)
Display results for classical machine learning models.
Parameters:
args (argparse.Namespace) – Training arguments
info (dict) – Dataset information
metric_name (list) – Names of evaluation metrics
results_list (list) – List of results from multiple trials
time_list (list) – List of training times
Output:
Prints formatted results including mean, standard deviation, and GPU information.
- TALENT.model.utils.show_results(args, info, metric_name, loss_list, results_list, time_list)
Display results for deep learning models.
Parameters:
args (argparse.Namespace) – Training arguments
info (dict) – Dataset information
metric_name (list) – Names of evaluation metrics
loss_list (list) – List of training losses
results_list (list) – List of results from multiple trials
time_list (list) – List of training times
Output:
Prints formatted results including mean loss, metrics, and GPU information.
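The per-metric aggregation behind such a report can be sketched as below. The layout of results_list (one sequence of metric values per trial, ordered like metric_name) and the use of the population standard deviation are assumptions of this sketch, and `summarize` is a hypothetical helper.

```python
import statistics


def summarize(metric_name, results_list):
    """Return {metric: (mean, std)} across trials.

    results_list is assumed to hold one sequence of metric values per
    trial, in the same order as metric_name.
    """
    summary = {}
    for i, name in enumerate(metric_name):
        values = [trial[i] for trial in results_list]
        # pstdev is the population standard deviation (divides by n)
        summary[name] = (statistics.mean(values), statistics.pstdev(values))
    return summary


for name, (mean, std) in summarize(["Accuracy"], [[0.5], [1.0]]).items():
    print(f"{name}: {mean:.4f} ± {std:.4f}")
```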
Hyperparameter Tuning
- TALENT.model.utils.tune_hyper_parameters(args, opt_space, train_val_data, info)
Perform hyperparameter optimization using Optuna.
Parameters:
args (argparse.Namespace) – Training arguments
opt_space (dict) – Hyperparameter search space
train_val_data (tuple) – Training and validation data
info (dict) – Dataset information
Returns:
argparse.Namespace – Updated arguments with optimized hyperparameters
Features:
Uses TPE sampler for efficient optimization
Supports both regression (minimize) and classification (maximize) objectives
Automatically saves best configuration to JSON file
Handles model-specific parameter adjustments
Model Factory
- TALENT.model.utils.get_method(model)
Get the method class for a given model name.
Parameters:
model (str) – Model name
Returns:
class – Method class for the specified model
Raises:
NotImplementedError – If the model is not yet implemented
Supported Models:
All deep learning and classical models supported by TALENT.
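A name-to-class factory with the documented error behaviour can be sketched with a simple registry. The placeholder classes and the dictionary lookup are illustrative choices; TALENT may resolve names differently, for example with conditional imports.

```python
class MLPMethod:
    """Placeholder standing in for a real method class."""


class XGBoostMethod:
    """Placeholder standing in for a real method class."""


# Hypothetical registry mapping model names to method classes
_REGISTRY = {"mlp": MLPMethod, "xgboost": XGBoostMethod}


def get_method_sketch(model):
    """Return the class registered for `model`, not an instance of it."""
    try:
        return _REGISTRY[model]
    except KeyError:
        raise NotImplementedError(f"Model '{model}' is not yet implemented") from None
```

The caller instantiates the returned class itself, which keeps constructor arguments out of the factory.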
Utility Classes
- class TALENT.model.utils.Averager
A simple averager for tracking running averages.
Methods:
- add(x)
Add a value to the running average.
Parameters:
x (float) – Value to add
- item()
Get the current average value.
Returns:
float – Current running average
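A running average with this interface needs only a count and the current mean. The following is a minimal sketch under that assumption, with hypothetical attribute names.

```python
class AveragerSketch:
    """Tracks the mean of all values passed to add()."""

    def __init__(self):
        self.n = 0    # number of values seen so far
        self.v = 0.0  # current mean

    def add(self, x):
        # Incremental update: no need to store the individual values
        self.v = (self.v * self.n + x) / (self.n + 1)
        self.n += 1

    def item(self):
        return self.v


avg = AveragerSketch()
for loss in (1.0, 2.0, 3.0):
    avg.add(loss)
```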
- class TALENT.model.utils.Timer
A timer for measuring elapsed time.
Methods:
- measure(p=1)
Measure elapsed time since timer creation.
Parameters:
p (int, optional) – Period for time formatting. Defaults to 1.
Returns:
str – Formatted time string (e.g., “30s”, “2m”, “1.5h”)
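The formatting suggested by the example strings can be sketched as follows. The thresholds (60 s and 3600 s), the rounding, and the way p divides the elapsed time are assumptions inferred from the examples, not taken from TALENT's source.

```python
import time


def format_elapsed(seconds):
    """Format a duration in seconds as e.g. '30s', '2m' or '1.5h'."""
    seconds = int(seconds)
    if seconds >= 3600:
        return f"{seconds / 3600:.1f}h"
    if seconds >= 60:
        return f"{round(seconds / 60)}m"
    return f"{seconds}s"


class TimerSketch:
    def __init__(self):
        self.start = time.monotonic()  # reference point: creation time

    def measure(self, p=1):
        # Dividing by p is an assumption about how the period is applied
        return format_elapsed((time.monotonic() - self.start) / p)
```

Keeping the formatting in a separate function makes it easy to check without waiting for real time to pass.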
Debugging Utilities
- TALENT.model.utils.pprint(x)
Pretty print an object using the PrettyPrinter.
Parameters:
x (any) – Object to print