Method Base
===========

.. automodule:: TALENT.model.methods.base
   :members:
   :undoc-members:
   :show-inheritance:

Utility Functions
-----------------

.. function:: check_softmax(logits)
   :noindex:

   Check if the logits are already probabilities, and if not, convert them to probabilities.
   
   **Parameters:**
   
   * **logits** (*np.ndarray*) -- Array of shape (N, C) with logits
   
   **Returns:**
   
   * **np.ndarray** -- Array of shape (N, C) with probabilities
   
   **Note:**
   
   This function checks whether every input value lies in the [0, 1] range and each row sums to 1.
   If either condition fails, it applies a numerically stable softmax (subtracting the row-wise maximum before exponentiation).
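
   The check-then-convert behavior can be sketched in NumPy as follows. This is a minimal illustration, not the library's code: the name ``check_softmax_sketch`` and the tolerance ``atol`` are assumptions made for the example.

   .. code-block:: python

      import numpy as np

      def check_softmax_sketch(logits, atol=1e-5):
          """Return probabilities, applying softmax only when needed (hypothetical helper)."""
          logits = np.asarray(logits, dtype=float)
          in_unit_range = np.all((logits >= 0) & (logits <= 1))
          rows_sum_to_one = np.allclose(logits.sum(axis=1), 1.0, atol=atol)
          if in_unit_range and rows_sum_to_one:
              return logits  # already a valid probability matrix
          # Subtract the row-wise max so that exp() cannot overflow
          shifted = logits - logits.max(axis=1, keepdims=True)
          exps = np.exp(shifted)
          return exps / exps.sum(axis=1, keepdims=True)

      # Raw scores are converted; the large value 1000.0 does not overflow
      probs = check_softmax_sketch(np.array([[2.0, 1.0, 0.1], [1000.0, 0.0, 0.0]]))
      # Rows that are already probabilities pass through unchanged
      already = check_softmax_sketch(np.array([[0.2, 0.8], [0.5, 0.5]]))

   Subtracting the maximum leaves the softmax result unchanged mathematically, because the shift cancels between numerator and denominator.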

Core Method Class
-----------------

.. class:: Method
   :noindex:

   Abstract base class for all machine learning methods in TALENT.
   
   This class provides a unified interface for training, validation, and prediction
   across all deep learning and classical machine learning models in TALENT.
   
   **Attributes:**
   
   * **args** (*argparse.Namespace*) -- Command line arguments and configuration
   * **is_regression** (*bool*) -- Whether the task is regression
   * **D** (*Dataset*) -- Dataset object containing features and labels
   * **train_step** (*int*) -- Current training step counter
   * **val_count** (*int*) -- Number of validation rounds since the last improvement
   * **continue_training** (*bool*) -- Whether to continue training
   * **timer** (*Timer*) -- Timer for tracking training time
   * **trlog** (*dict*) -- Training log containing loss, best results, etc.
   * **model** (*torch.nn.Module*) -- The neural network model (to be implemented by subclasses)
   * **optimizer** (*torch.optim.Optimizer*) -- Optimizer for training
   * **criterion** (*callable*) -- Loss function
   
   **Methods:**
   
   .. method:: __init__(args, is_regression)
      :noindex:
      
      Initialize the method with arguments and task type.
      
      **Parameters:**
      
      * **args** (*argparse.Namespace*) -- Command line arguments and configuration
      * **is_regression** (*bool*) -- Whether the task is regression
      
      **Initialization:**
      
      * Sets up training statistics and logging
      * Initializes device (CPU/GPU)
      * Sets up training log with appropriate best result tracking
   
   .. method:: reset_stats_withconfig(config)
      :noindex:
      
      Reset training statistics with a new configuration.
      
      **Parameters:**
      
      * **config** (*dict*) -- New configuration dictionary
      
      **Actions:**
      
      * Resets random seeds for reproducibility
      * Clears training step counter and validation counter
      * Resets training log with new configuration
      * Reinitializes timer
   
   .. method:: data_format(is_train=True, N=None, C=None, y=None)
      :noindex:
      
      Format and preprocess data for training or testing.
      
      **Parameters:**
      
      * **is_train** (*bool, optional*) -- Whether data is for training. Defaults to True.
      * **N** (*dict, optional*) -- Numerical features dictionary. Defaults to None.
      * **C** (*dict, optional*) -- Categorical features dictionary. Defaults to None.
      * **y** (*dict, optional*) -- Target labels dictionary. Defaults to None.
      
      **Processing Pipeline:**
      
      * **Training Mode:**

        * Handle missing values (NaN processing)
        * Process labels (standardization for regression, encoding for classification)
        * Apply numerical feature encoding (PLE, Unary, etc.)
        * Apply categorical feature encoding (ordinal, one-hot, etc.)
        * Apply normalization to numerical features
        * Create DataLoaders for training and validation
        * Set up loss function

      * **Testing Mode:**

        * Apply same preprocessing using fitted encoders and normalizers
        * Create DataLoader for testing
        * Prepare test data tensors
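
      The difference between the two modes can be illustrated with a small NumPy sketch: statistics are fitted on the training split only and then reused unchanged at test time. ``StandardScalerSketch`` is a hypothetical helper written for this example and is not a TALENT class. The actual pipeline also covers encoders and imputers, which are omitted here.

      .. code-block:: python

         import numpy as np

         class StandardScalerSketch:
             """Hypothetical normalizer: fit on training data, reuse on test data."""

             def fit(self, X_train):
                 self.mean_ = X_train.mean(axis=0)
                 std = X_train.std(axis=0)
                 # Guard against constant columns to avoid division by zero
                 self.std_ = np.where(std == 0, 1.0, std)
                 return self

             def transform(self, X):
                 return (X - self.mean_) / self.std_

         X_train = np.array([[1.0, 10.0], [3.0, 10.0], [5.0, 10.0]])
         X_test = np.array([[3.0, 12.0]])

         # Training mode: fit the statistics, then transform the training split
         scaler = StandardScalerSketch().fit(X_train)
         X_train_norm = scaler.transform(X_train)

         # Testing mode: reuse the fitted statistics without refitting
         X_test_norm = scaler.transform(X_test)

      Refitting on the test split would leak test statistics into preprocessing, which is why only the fitted objects are reused.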
   
   .. method:: fit(data, info, train=True, config=None)
      :noindex:
      
      Fit the method to the training data.
      
      **Parameters:**
      
      * **data** (*tuple*) -- Tuple of (N, C, y), where N holds the numerical features, C the categorical features, and y the labels
      * **info** (*dict*) -- Dataset information including task type and feature counts
      * **train** (*bool, optional*) -- Whether to train the model. Defaults to True.
      * **config** (*dict, optional*) -- Configuration dictionary. Defaults to None.
      
      **Returns:**
      
      * **float** -- Total training time in seconds
      
      **Training Process:**
      
      * Initialize dataset and extract features
      * Format data for training
      * Construct model (implemented by subclasses)
      * Set up optimizer (AdamW)
      * Train for specified number of epochs
      * Save best model and training log
   
   .. method:: predict(data, info, model_name)
      :noindex:
      
      Make predictions on test data.
      
      **Parameters:**
      
      * **data** (*tuple*) -- Tuple of (N, C, y) test data
      * **info** (*dict*) -- Dataset information
      * **model_name** (*str*) -- Name of the saved model file
      
      **Returns:**
      
      * **tuple** -- (loss, metrics, metric_names, predictions) where:

        * loss: Test loss value
        * metrics: List of evaluation metrics
        * metric_names: Names of the metrics
        * predictions: Model predictions
      
      **Prediction Process:**
      
      * Load trained model weights
      * Format test data using fitted preprocessors
      * Run inference on test set
      * Compute evaluation metrics
      * Return results
   
   .. method:: train_epoch(epoch)
      :noindex:
      
      Train the model for one epoch.
      
      **Parameters:**
      
      * **epoch** (*int*) -- Current epoch number
      
      **Training Loop:**
      
      * Set model to training mode
      * Iterate through training batches
      * Forward pass and compute loss
      * Backward pass and update weights
      * Log training progress
      * Update training statistics
   
   .. method:: validate(epoch)
      :noindex:
      
      Validate the model on validation set.
      
      **Parameters:**
      
      * **epoch** (*int*) -- Current epoch number
      
      **Validation Process:**
      
      * Set model to evaluation mode
      * Run inference on validation set
      * Compute validation metrics
      * Check for improvement
      * Save best model if improved
      * Stop training early after 20 epochs without improvement
      * Save training log
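
      The improvement check and early-stopping counter can be sketched in plain Python. ``EarlyStoppingSketch`` and its ``update`` method are hypothetical names for illustration. The attribute names ``val_count`` and ``continue_training`` mirror those listed above, but the exact comparison the library uses is an assumption.

      .. code-block:: python

         class EarlyStoppingSketch:
             """Hypothetical tracker for the best score and the patience counter."""

             def __init__(self, patience=20, higher_is_better=True):
                 self.patience = patience
                 self.higher_is_better = higher_is_better
                 self.best = None
                 self.val_count = 0
                 self.continue_training = True

             def update(self, score):
                 improved = (
                     self.best is None
                     or (score > self.best if self.higher_is_better else score < self.best)
                 )
                 if improved:
                     # New best result: remember it and reset the counter
                     self.best = score
                     self.val_count = 0
                 else:
                     self.val_count += 1
                     if self.val_count >= self.patience:
                         self.continue_training = False
                 return improved

         # Regression-style usage: lower validation loss is better
         stopper = EarlyStoppingSketch(patience=3, higher_is_better=False)
         history = [stopper.update(loss) for loss in [0.9, 0.7, 0.8, 0.75, 0.71]]

      In a real loop, the return value of ``update`` would decide whether to save a checkpoint, and ``continue_training`` would end the epoch loop.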
   
   .. method:: metric(predictions, labels, y_info)
      :noindex:
      
      Compute evaluation metrics based on task type.
      
      **Parameters:**
      
      * **predictions** (*np.ndarray*) -- Model predictions
      * **labels** (*np.ndarray*) -- Ground truth labels
      * **y_info** (*dict*) -- Label information including processing policy
      
      **Returns:**
      
      * **tuple** -- (metrics, metric_names) where:

        * metrics: List of computed metric values
        * metric_names: Names of the metrics
      
      **Metrics by Task Type:**
      
      * **Regression:**

        * MAE (Mean Absolute Error)
        * R² (Coefficient of determination)
        * RMSE (Root Mean Squared Error)

      * **Binary Classification:**

        * Accuracy
        * Balanced Recall
        * Macro Precision
        * F1 Score
        * Log Loss
        * AUC (Area Under ROC Curve)

      * **Multi-class Classification:**

        * Accuracy
        * Balanced Recall
        * Macro Precision
        * Macro F1 Score
        * Log Loss
        * Macro AUC (One-vs-Rest)
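
      As a worked illustration of the regression branch, the three metrics can be computed directly with NumPy. ``regression_metrics_sketch`` is a hypothetical function for this example. It omits any rescaling based on ``y_info``, and the returned names are assumptions rather than the library's exact strings.

      .. code-block:: python

         import numpy as np

         def regression_metrics_sketch(predictions, labels):
             """Return (MAE, R2, RMSE) and their names for 1-D arrays."""
             predictions = np.asarray(predictions, dtype=float)
             labels = np.asarray(labels, dtype=float)
             errors = labels - predictions
             mae = np.mean(np.abs(errors))
             rmse = np.sqrt(np.mean(errors ** 2))
             # R2 = 1 - (residual sum of squares / total sum of squares)
             ss_res = np.sum(errors ** 2)
             ss_tot = np.sum((labels - labels.mean()) ** 2)
             r2 = 1.0 - ss_res / ss_tot
             return (mae, r2, rmse), ("MAE", "R2", "RMSE")

         metrics, metric_names = regression_metrics_sketch(
             predictions=[2.5, 0.0, 2.0, 8.0],
             labels=[3.0, -0.5, 2.0, 7.0],
         )
         # metrics[0] is the MAE, which is 0.5 for this data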

Abstract Methods
----------------

The following methods must be implemented by subclasses:

.. method:: construct_model()
   :noindex:
   
   Construct the neural network model architecture.
   
   **Implementation Required:**
   
   Subclasses must implement this method to create their specific model architecture.
   The model should be assigned to ``self.model`` and should accept numerical and categorical
   features as separate inputs.
   
   **Expected Model Interface:**
   
   .. code-block:: python
      
      def forward(self, X_num, X_cat):
          # X_num: numerical features tensor or None
          # X_cat: categorical features tensor or None
          # Return: predictions tensor
          pass

Usage Example
-------------

.. code-block:: python
   
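   from TALENT.model.methods.base import Method
   import torch.nn as nn

   class NumericMLP(nn.Module):
       """Small MLP exposing the forward(X_num, X_cat) interface."""

       def __init__(self, d_in, d_out):
           super().__init__()
           self.net = nn.Sequential(
               nn.Linear(d_in, 128),
               nn.ReLU(),
               nn.Linear(128, d_out),
           )

       def forward(self, X_num, X_cat=None):
           # This sketch uses numerical features only and ignores X_cat
           return self.net(X_num)

   class MyModel(Method):
       def construct_model(self):
           # Assumes d_in and d_out were set on the instance during data formatting
           self.model = NumericMLP(self.d_in, self.d_out)

   # Usage: args, train_data, test_data, and info are prepared by the caller
   method = MyModel(args, is_regression=True)
   time_cost = method.fit(train_data, info)
   loss, metrics, metric_names, predictions = method.predict(test_data, info, 'best-val')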