Aquileo | Hyperparameter Tuning with R

Hyperparameter tuning is the process of selecting the most effective values for a machine learning model’s hyperparameters before training starts. Hyperparameters determine how the model learns from data, including how fast it learns, how complex it can be and how well it adapts to new information. Choosing the right hyperparameters can greatly improve a model’s accuracy and its ability to make reliable predictions on unseen data.

hyperparameter_tuning_in_machine_learning — Hyperparameter Tuning

Hyperparameters include examples like learning rate, number of trees in a random forest or number of neighbors in k-NN.
Proper tuning prevents underfitting or overfitting, helping the model generalize better.
It plays a key role in improving prediction performance and overall model reliability.

Techniques for Hyperparameter Tuning

Models can have many hyperparameters and finding the best combination of parameters can be treated as a search problem.

1. Grid Search

Grid search is a method for hyperparameter tuning that systematically tests all combinations of specified hyperparameter values to find the best-performing model. Each combination is evaluated often using cross validation to identify the hyperparameters that give the highest accuracy. While effective, grid search can be slow and resource-intensive when the dataset is large or the hyperparameter grid is extensive.

Grid search works by testing all hyperparameter combinations, evaluating them and selecting the best-performing one

Define a hyperparameter grid: Specify a set of possible values for each hyperparameter you want to tune.
Train and evaluate: The model is trained for every combination of hyperparameters and each version is evaluated using cross-validation.
Select the best combination: The hyperparameter combination that achieves the highest performance score is chosen for the final model.

2. Random Search

Random search is a method for hyperparameter tuning that selects combinations of hyperparameters randomly from predefined ranges or distributions. It does not test every possible combination, which makes it faster and more efficient when the search space is large. This approach can quickly find effective hyperparameter values while exploring different regions of the search space.

Define hyperparameter distributions: Specify a range or distribution for each hyperparameter instead of listing all possible values.
Sample random combinations: The model is trained on randomly selected hyperparameter combinations instead of exhaustively testing every combination.
Evaluate and select: Each combination is evaluated (often using cross-validation) and the one with the best performance is chosen for the final model.

3. Bayesian Optimization

Bayesian optimization is an advanced method for hyperparameter tuning that builds a probabilistic model of the relationship between hyperparameters and model performance. It uses past evaluation results to predict which hyperparameter combinations are likely to improve performance and selects the next combinations to test based on this prediction.

Model the performance: Bayesian optimization creates a surrogate model (usually Gaussian Process) to approximate how hyperparameters affect performance.
Select promising hyperparameters: The method predicts which hyperparameters are most likely to improve the model and tests them.
Iteratively refine: After each evaluation, the model is updated with new results to improve future predictions and guide the search toward optimal values.

This approach makes the search more efficient than grid or random search, especially for complex models or large search spaces.

Step By Step Implementation

Here we implement hyperparameter tuning in R.

Step 1: Install and Load Required Packages

Install and load the necessary R packages. caret is used for machine learning and tuning, randomForest provides the Random Forest model and rBayesianOptimization allows performing Bayesian Optimization.

install.packages("caret", dependencies = TRUE)
install.packages("randomForest")
install.packages("rBayesianOptimization")
library(caret)
library(randomForest)
library(rBayesianOptimization)

Step 2: Load Dataset and Split into Training and Testing Sets

Here we use the iris dataset and split it into training and testing sets to train the model and evaluate its performance on unseen data.

data(iris)
set.seed(123)
train_index <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
train_data <- iris[train_index, ]
test_data  <- iris[-train_index, ]

Step 3: Grid Search for Hyperparameter Tuning

Grid Search systematically tests all combinations of specified hyperparameters to find the best-performing model. It evaluates each combination using cross-validation and selects the combination with the highest accuracy.

expand.grid() creates all possible combinations of mtry, splitrule and min.node.size.
trainControl() with method = "cv" ensures each combination is evaluated reliably across 5 folds.
train() tests all combinations and automatically identifies the best hyperparameters.

grid <- expand.grid(
  mtry = c(1, 2, 3),
  splitrule = "gini",
  min.node.size = c(1, 5)
)

ctrl <- trainControl(method = "cv", number = 5, search = "grid")
set.seed(123)
grid_model <- train(
  Species ~ .,
  data = train_data,
  method = "ranger",
  tuneGrid = grid,
  trControl = ctrl
)
cat("Best Hyperparameters (Grid Search):\n")
print(grid_model$bestTune)

Output:

Best Hyperparameters (Grid Search):
mtry splitrule min.node.size
3 2 gini 1

Step 4: Random Search for Hyperparameter Tuning

Random Search selects random combinations of hyperparameters from predefined ranges and evaluates them using cross-validation. It is faster than Grid Search, especially when the hyperparameter space is large and can still find effective hyperparameters.

trainControl() with search = "random" sets up cross-validation for random hyperparameter combinations.
tuneLength = 10 tells the model to test 10 randomly selected combinations from the hyperparameter space.
train() evaluates each combination and identifies the best-performing hyperparameters automatically.

ctrl_random <- trainControl(method = "cv", number = 5, search = "random")

set.seed(123)
random_model <- train(
  Species ~ .,
  data = train_data,
  method = "ranger",
  tuneLength = 10,   
  trControl = ctrl_random
)

cat("\nBest Hyperparameters (Random Search):\n")
print(random_model$bestTune)

Output:

Best Hyperparameters (Random Search):
mtry splitrule min.node.size
3 2 gini 5

Step 5: Bayesian Optimization for Hyperparameter Tuning

Bayesian Optimization uses a probabilistic model to predict which hyperparameter combinations are likely to improve model performance. It evaluates promising combinations iteratively and refines its predictions after each step, making it efficient for large or complex search spaces.

The rf_bayes function trains a Random Forest model for given hyperparameters and returns the training accuracy as the score to maximize.
BayesianOptimization() runs the optimization, testing initial random points (init_points) and iteratively evaluating new combinations for n_iter steps.
The method automatically identifies the hyperparameter combination that gives the best performance.

rf_bayes <- function(mtry, min.node.size){
  model <- ranger(
    Species ~ ., 
    data = train_data, 
    mtry = as.integer(mtry), 
    min.node.size = as.integer(min.node.size),
    num.trees = 100,
    classification = TRUE
  )
  pred <- predict(model, train_data)$predictions
  acc <- sum(pred == train_data$Species) / nrow(train_data)
  return(list(Score = acc, Pred = 0))
}

set.seed(123)
bayes_opt <- BayesianOptimization(
  FUN = rf_bayes,
  bounds = list(mtry = c(1L, 3L), min.node.size = c(1L, 5L)),
  init_points = 5,
  n_iter = 10,
  acq = "ucb",
  kappa = 2.576,
  verbose = TRUE
)

cat("\nBest Hyperparameters (Bayesian Optimization):\n")
print(bayes_opt$Best_Par)

Output:

Download code form here

Advantages

Hyperparameter tuning is an essential step in building high-performing machine learning models. Selecting the right hyperparameters can significantly improve a model’s accuracy, efficiency and generalization ability. Some key advantages include:

Improved Model Performance: Proper tuning helps the model learn patterns in the data more effectively, leading to higher accuracy and better predictions.
Better Generalization: Hyperparameter tuning reduces overfitting or underfitting, ensuring the model performs well on unseen data.
Efficient Learning: Optimized hyperparameters, such as learning rate or tree depth can make the model train faster and converge more reliably.
Resource Optimization: By tuning hyperparameters you can avoid wasting computational resources on models that are too complex or inefficient.
Informed Decision Making: Tuning provides insights into which hyperparameters significantly impact model performance, helping in model selection and improvement.

Hyperparameter Tuning with R

Techniques for Hyperparameter Tuning

1. Grid Search

2. Random Search

3. Bayesian Optimization

Step By Step Implementation

Step 1: Install and Load Required Packages

Step 2: Load Dataset and Split into Training and Testing Sets

Step 3: Grid Search for Hyperparameter Tuning

Step 4: Random Search for Hyperparameter Tuning

Step 5: Bayesian Optimization for Hyperparameter Tuning

Advantages

Explore