
Light Gradient Boosting







In this lesson we look at a less common Python library and explore the Light Gradient Boosting machine learning model (LightGBM). The model was developed by Microsoft, and one of its main advantages is how fast it trains.

About the Model

The decision to use LightGBM over scikit-learn's GradientBoosting comes with various advantages and considerations, depending on your machine learning task's specific requirements. LightGBM is notably faster and more efficient, thanks to its histogram-based split finding approach. This acceleration makes it an excellent option for handling large datasets and time-sensitive applications.
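To make the histogram-based idea more concrete, here is a minimal pure-Python sketch. It is not LightGBM's actual implementation: it assumes equal-width bins, a squared-error loss, and a standard second-order gain formula, and the function name best_histogram_split and its arguments are made up for illustration. The point is only that, once samples are bucketed, the split search scans a handful of bin boundaries instead of every distinct feature value.

```python
def best_histogram_split(values, gradients, hessians, max_bin=16, reg_lambda=1.0):
    """Return (gain, threshold) for the best split found on bin boundaries.

    Illustrative only: equal-width bins and a simplified gain formula.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / max_bin or 1.0  # fall back to 1.0 for a constant feature

    # Step 1: bucket every sample once, summing gradients and hessians per bin.
    grad_hist = [0.0] * max_bin
    hess_hist = [0.0] * max_bin
    for x, g, h in zip(values, gradients, hessians):
        b = min(int((x - lo) / width), max_bin - 1)
        grad_hist[b] += g
        hess_hist[b] += h

    total_g, total_h = sum(grad_hist), sum(hess_hist)

    def score(g, h):
        return g * g / (h + reg_lambda)

    # Step 2: scan the max_bin - 1 bin boundaries, not every distinct value.
    best_gain, best_threshold = 0.0, None
    left_g = left_h = 0.0
    for b in range(max_bin - 1):
        left_g += grad_hist[b]
        left_h += hess_hist[b]
        right_g, right_h = total_g - left_g, total_h - left_h
        gain = score(left_g, left_h) + score(right_g, right_h) - score(total_g, total_h)
        if gain > best_gain:
            best_gain, best_threshold = gain, lo + (b + 1) * width
    return best_gain, best_threshold


# Toy data: the target jumps from 0 to 1 at x = 0.5.
values = [i / 100 for i in range(100)]
targets = [0.0 if x < 0.5 else 1.0 for x in values]
# Squared-error loss with an initial prediction of 0: gradient = -y, hessian = 1.
gradients = [-t for t in targets]
hessians = [1.0] * len(values)

gain, threshold = best_histogram_split(values, gradients, hessians)
print(f"gain={gain:.3f}, threshold={threshold:.3f}")
```

With 100 samples and 16 bins, the loop over candidate splits runs 15 times rather than 99, which is where the speed-up comes from on large datasets.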

LightGBM (Light Gradient Boosting Machine) has a wide range of hyperparameters and arguments that you can use to configure and fine-tune your model. Below, I'll provide an overview of some of the most commonly used arguments:

  1. Objective Function Parameters:

    • objective: Specifies the learning task (e.g., 'regression', 'binary', 'multiclass', 'lambdarank', etc.).

    • num_class: Number of classes in a multiclass problem.

  2. Tree Parameters:

    • num_leaves: Maximum number of leaves for each tree (limits tree complexity).

    • max_depth: Maximum depth of the trees.

    • min_data_in_leaf: Minimum number of data points in a leaf node.

    • min_sum_hessian_in_leaf: Minimum sum of Hessian (second-order gradient) required in a leaf.

  3. Data Parameters:

    • data: The training data (in the command-line version, the path to the training data file).

    • categorical_feature: A list of indices or column names specifying categorical features.

    • weight_column: A column name or index that specifies sample weights.

  4. Boosting Parameters:

    • boosting_type: The type of boosting to use ('gbdt', 'dart', 'goss', etc.).

    • num_iterations (or num_boost_round): Number of boosting rounds.

    • learning_rate (or eta): The step size for updates.

    • early_stopping_rounds: Training stops if the validation metric has not improved for this many consecutive rounds.

  5. Regularization Parameters:

    • lambda_l1 (or reg_alpha): L1 regularization term.

    • lambda_l2 (or reg_lambda): L2 regularization term.

  6. Feature Parameters:

    • feature_fraction (or colsample_bytree): Fraction of features to use in each boosting round.

    • bagging_fraction (or subsample): Fraction of data to use in each boosting round.

    • bagging_freq: Frequency for bagging. Use 0 to disable bagging.

  7. Optimization Parameters:

    • max_bin: Maximum number of bins used for histogram-based split finding.

    • min_data_in_bin: Minimum number of data points in each bin.

    • bin_construct_sample_cnt: Number of data points sampled to construct the histogram bins.

    • sparse_threshold: Fraction of zero values above which a feature is treated as sparse (a parameter from older LightGBM releases).

  8. Objective-Specific Parameters:

    • Depending on the chosen objective function (e.g., 'poisson', 'gamma', 'lambdarank'), there are specific parameters that control the objective function's behavior.

  9. Metric Parameters:

    • metric: Specifies the evaluation metric for model performance (e.g., 'l1', 'l2', 'binary_logloss', etc.).

    • metric_freq: The frequency for metric output.

  10. Other Parameters:

    • verbosity: Controls the amount of information printed during training.

    • num_threads: Number of threads to use for training (for multi-threaded computing).

Free Python Code Example of LightGBM Machine Learning Model

A Little Bit More About LightGBM

Using LightGBM over scikit-learn's GradientBoosting implementation has several advantages, and the choice between the two often depends on the specific requirements and characteristics of your machine learning task. Here are some key reasons someone might prefer LightGBM over scikit-learn's GradientBoosting:

  1. Speed and Efficiency:

    • LightGBM is known for its exceptional speed and efficiency. It uses a histogram-based approach for split finding, which significantly accelerates the training process. This makes it a great choice for large datasets and time-sensitive applications.

  2. Low Memory Usage:

    • LightGBM is memory-efficient, thanks to techniques like histogram-based split finding and gradient-based one-side sampling (GOSS). It can handle large datasets that might not fit into memory with other gradient boosting implementations.

  3. Categorical Feature Handling:

    • LightGBM can efficiently handle categorical features without the need for one-hot encoding, reducing data dimensionality and speeding up training.

  4. Parallel and Distributed Training:

    • LightGBM supports parallelism, multi-threading, and distributed computing, making it suitable for high-performance computing environments. This can significantly reduce training time for large datasets.

  5. GPU Acceleration:

    • LightGBM offers GPU acceleration, allowing you to train models even faster if you have access to GPU hardware.

  6. Automatic Handling of Missing Data:

    • LightGBM can automatically handle missing data during training, eliminating the need for preprocessing to impute missing values.

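  7. Hyperparameter Tuning Support:

    • LightGBM does not ship its own grid search or random search, but its scikit-learn-compatible estimators (such as LGBMClassifier and LGBMRegressor) work with scikit-learn's GridSearchCV and RandomizedSearchCV, which makes it straightforward to search for good hyperparameters for your problem.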

  8. Community and Development:

    • LightGBM has an active and growing community and is actively developed by Microsoft. This means you can expect ongoing improvements, bug fixes, and support.

  9. Good Out-of-the-Box Performance:

    • LightGBM often provides competitive performance with default hyperparameters, making it a good starting point for many machine learning tasks.

  10. Scalability:

    • LightGBM is designed to scale efficiently as the dataset size grows, making it a practical choice for big data applications.


Real World Applications of LightGBM