Naive Bayes
Naive Bayes classical method implementation.
This section contains the Naive Bayes implementation for classification tasks. Naive Bayes is a probabilistic classifier based on applying Bayes’ theorem with the “naive” assumption of conditional independence between every pair of features.
- class TALENT.model.classical_methods.naivebayes.NaiveBayesMethod(args, is_regression)
Bases:
NCMMethod- construct_model(model_config=None)
- class TALENT.model.classical_methods.naivebayes.NaiveBayesMethod
Naive Bayes method for classification tasks.
Key Features:
Uses sklearn’s GaussianNB for classification
Probabilistic classifier based on Bayes’ theorem
Supports both binary and multiclass classification
Automatically handles data preprocessing including normalization and encoding
Saves trained model to pickle file for later use
Provides probability predictions
Algorithm:
Naive Bayes is a probabilistic classifier based on applying Bayes’ theorem with the “naive” assumption of conditional independence between every pair of features given the value of the class variable.
- __init__(args, is_regression)
Initialize the Naive Bayes method.
Parameters:
args (object) – Configuration arguments containing model settings
is_regression (bool) – Whether the task is regression (True) or classification (False)
- construct_model(model_config=None)
Construct the Naive Bayes model instance.
Parameters:
model_config (dict, optional) – Model configuration parameters for Naive Bayes
Model Creation:
Creates GaussianNB classifier
Configures parameters like priors, var_smoothing, etc.
- fit(data, info, train=True, config=None)
Train the Naive Bayes model on the provided data.
Parameters:
data (tuple) – Tuple containing (N, C, y) where N is numerical features, C is categorical features, y is labels
info (dict) – Dataset information
train (bool, default=True) – Whether to train the model or just load from checkpoint
config (dict, optional) – Additional configuration parameters
Returns:
time_cost (float) – Training time in seconds
Training Process:
Data Preprocessing: Handles missing values, categorical encoding, normalization
Model Training: Fits the Naive Bayes model
Model Saving: Saves the trained model to disk for later use
- predict(data, info, model_name)
Make predictions using the trained Naive Bayes model.
Parameters:
data (tuple) – Tuple containing (N, C, y) where N is numerical features, C is categorical features, y is labels
info (dict) – Dataset information
model_name (str) – Name of the model for saving/loading
Returns:
test_logit (array-like) – Test predictions (probabilities for classification)
Prediction Process:
Data Preprocessing: Applies same preprocessing as training data
Model Loading: Loads the trained Naive Bayes model
Prediction: Generates probability predictions using Bayes’ theorem
Output: Returns probabilities for classification
Evaluation Metrics:
For classification: returns Accuracy, Avg_Precision, Avg_Recall, F1 metrics
References:
[1] Rish, I. (2001). An empirical study of the naive Bayes classifier. In IJCAI 2001 workshop on empirical methods in artificial intelligence (Vol. 3, No. 22, pp. 41-46).