Support Vector Machine
Support Vector Machine classical method implementation.
This section contains the Support Vector Machine (SVM) implementation for classification tasks. SVM is a supervised learning algorithm that finds a hyperplane to separate data points of different classes with maximum margin.
- class TALENT.model.classical_methods.svm.SvmMethod(args, is_regression)
Bases: classical_methods
- construct_model(model_config=None)
- fit(data, info, train=True, config=None)
- metric(predictions, labels, y_info)
- predict(data, info, model_name)
- class TALENT.model.classical_methods.svm.SvmMethod
Support Vector Machine method for classification tasks.
Key Features:
Uses sklearn’s SVC for classification
Finds optimal hyperplane for class separation
Supports both binary and multiclass classification
Automatically handles data preprocessing including normalization and encoding
Saves trained model to pickle file for later use
Provides probability predictions
Algorithm:
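SVM is a supervised learning algorithm that finds the hyperplane separating data points of different classes with the maximum margin; the training points closest to that hyperplane are the support vectors. With kernel functions (for example RBF or polynomial), the same formulation handles non-linear as well as linear classification.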
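The effect of the kernel can be illustrated with a small sketch. It uses sklearn's SVC directly rather than the TALENT wrapper, and the synthetic dataset, split, and hyperparameter values are assumptions chosen only for illustration. On concentric circles no straight line separates the classes, so a linear kernel is expected to do poorly while an RBF kernel should separate them.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic, non-linearly separable data: one circle inside another.
X, y = make_circles(n_samples=400, noise=0.05, factor=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Linear kernel: the decision boundary is a straight line in input space.
linear_acc = SVC(kernel="linear", C=1.0).fit(X_train, y_train).score(X_test, y_test)

# RBF kernel: the boundary is linear in an implicit feature space,
# which corresponds to a curved boundary in input space.
rbf_acc = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_train, y_train).score(X_test, y_test)

print(f"linear kernel accuracy: {linear_acc:.2f}")
print(f"rbf kernel accuracy:    {rbf_acc:.2f}")
```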
- __init__(args, is_regression)
Initialize the SVM method.
Parameters:
args (object) – Configuration arguments containing model settings
is_regression (bool) – Whether the task is regression (True) or classification (False)
- construct_model(model_config=None)
Construct the SVM model instance.
Parameters:
model_config (dict, optional) – Model configuration parameters for SVM
Model Creation:
Creates SVC classifier
Configures SVC hyperparameters such as kernel, C, and gamma
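A minimal sketch of this kind of model construction is shown here. The helper name build_svc, the default values, and the assumption that model_config maps directly onto SVC keyword arguments are all hypothetical; the actual config layout used by TALENT may differ.

```python
from sklearn.svm import SVC


def build_svc(model_config=None):
    """Hypothetical helper: build an SVC from an optional config dict.

    Assumes the dict keys are valid SVC keyword arguments.
    """
    # Assumed defaults; probability=True enables predict_proba later on.
    defaults = {"kernel": "rbf", "C": 1.0, "gamma": "scale", "probability": True}
    params = {**defaults, **(model_config or {})}
    return SVC(**params)


# Override only the settings that differ from the assumed defaults.
model = build_svc({"kernel": "linear", "C": 10.0})
print(model.get_params()["kernel"], model.get_params()["C"])
```

Merging user settings over a defaults dict keeps the call site short and leaves unspecified parameters at known values.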
- fit(data, info, train=True, config=None)
Train the SVM model on the provided data.
Parameters:
data (tuple) – Tuple containing (N, C, y) where N is numerical features, C is categorical features, y is labels
info (dict) – Dataset information
train (bool, default=True) – Whether to train the model or just load from checkpoint
config (dict, optional) – Additional configuration parameters
Returns:
time_cost (float) – Training time in seconds
Training Process:
Data Preprocessing: Handles missing values, categorical encoding, normalization
Model Training: Fits the SVM to the preprocessed training data to find the maximum-margin hyperplane
Model Saving: Saves the trained model to disk for later use
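The three training steps above can be sketched with plain sklearn and pickle. This is an illustration under assumptions, not the TALENT implementation: the imputation strategy, one-hot encoding, standardization, synthetic data, and the file name svm-demo.pkl are all made up for the example.

```python
import pickle
import tempfile
import time
from pathlib import Path

import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC

# Synthetic (N, C, y): numerical features, categorical features, labels.
rng = np.random.default_rng(0)
N = rng.normal(size=(120, 3))
y = (N[:, 0] + N[:, 1] > 0).astype(int)
N[::10, 0] = np.nan                      # inject some missing values
C = rng.integers(0, 3, size=(120, 1))    # one categorical column, 3 levels

# 1. Data preprocessing (assumed choices): mean imputation and
#    standardization for N, one-hot encoding for C.
num_pipe = make_pipeline(SimpleImputer(strategy="mean"), StandardScaler())
cat_enc = OneHotEncoder(handle_unknown="ignore")
X = np.hstack([num_pipe.fit_transform(N), cat_enc.fit_transform(C).toarray()])

# 2. Model training, timed to produce a time_cost in seconds.
model = SVC(kernel="rbf", probability=True, random_state=0)
start = time.time()
model.fit(X, y)
time_cost = time.time() - start

# 3. Model saving: keep the fitted preprocessors with the model so that
#    prediction can apply the same transformations.
save_path = Path(tempfile.mkdtemp()) / "svm-demo.pkl"
with open(save_path, "wb") as f:
    pickle.dump({"model": model, "num_pipe": num_pipe, "cat_enc": cat_enc}, f)

print(f"trained in {time_cost:.4f}s, saved to {save_path}")
```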
- predict(data, info, model_name)
Make predictions using the trained SVM model.
Parameters:
data (tuple) – Tuple containing (N, C, y) where N is numerical features, C is categorical features, y is labels
info (dict) – Dataset information
model_name (str) – Name of the model for saving/loading
Returns:
test_logit (array-like) – Test predictions (probabilities for classification)
Prediction Process:
Data Preprocessing: Applies same preprocessing as training data
Model Loading: Loads the trained SVM model
Prediction: Computes class probabilities for the test data
Output: Returns the class probabilities as test_logit
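The prediction flow can be sketched as a save-then-load round trip. The checkpoint layout (a pickled tuple of scaler and model), the file name, and the synthetic three-class dataset are assumptions for illustration only; the code calls sklearn directly rather than the predict method documented here.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.datasets import make_classification
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic three-class problem, split into train and test parts.
X, y = make_classification(
    n_samples=200, n_features=5, n_informative=3,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train = X[:150], X[150:], y[:150]

# Train and save a small model together with its fitted scaler.
scaler = StandardScaler().fit(X_train)
model = SVC(probability=True, random_state=0).fit(scaler.transform(X_train), y_train)
ckpt = Path(tempfile.mkdtemp()) / "svm-demo.pkl"
with open(ckpt, "wb") as f:
    pickle.dump((scaler, model), f)

# Model loading: restore the scaler and model from disk.
with open(ckpt, "rb") as f:
    loaded_scaler, loaded_model = pickle.load(f)

# Apply the same preprocessing as in training, then predict probabilities.
test_logit = loaded_model.predict_proba(loaded_scaler.transform(X_test))
print(test_logit.shape)  # one row per test sample, one column per class
```

Each row of test_logit sums to 1, and its columns follow the order of loaded_model.classes_.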
Evaluation Metrics:
For classification: returns Accuracy, Avg_Precision, Avg_Recall, F1 metrics
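As a rough sketch of how these four metrics could be computed from probability outputs, the example below takes the arg-max class per row and scores it with sklearn.metrics. The function name classification_metrics and the use of macro averaging for precision, recall, and F1 are assumptions; the averaging actually used by metric() is not stated here.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score


def classification_metrics(probabilities, labels):
    """Hypothetical helper: score class probabilities against true labels."""
    preds = np.argmax(probabilities, axis=1)  # most probable class per row
    return {
        "Accuracy": accuracy_score(labels, preds),
        # Macro averaging is an assumption: unweighted mean over classes.
        "Avg_Precision": precision_score(labels, preds, average="macro", zero_division=0),
        "Avg_Recall": recall_score(labels, preds, average="macro", zero_division=0),
        "F1": f1_score(labels, preds, average="macro", zero_division=0),
    }


# Toy binary example: the third sample is misclassified.
probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]])
labels = np.array([0, 1, 1, 1])
scores = classification_metrics(probs, labels)
print(scores)
```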
References:
[1] Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273-297.