Quantized Grad Randomly Fails best_split_info.left_count check #5994

Closed
bbstats opened this issue Jul 20, 2023 · 3 comments

bbstats commented Jul 20, 2023

Description

When using the new 'use_quantized_grad' parameter from the Python package, LightGBM fails intermittently with this error:

LightGBMError: Check failed: (best_split_info.right_count) > (0) at /__w/1/s/lightgbm-python/src/treelearner/serial_tree_learner.cpp, line 855 .

The failure is intermittent even with random_state fixed. Run the code below several times: it sometimes passes, but by my rough estimate it fails about half the time.

Reproducible example

In Google Colab:

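%pip install -U lightgbm

from lightgbm import LGBMRegressor
from sklearn.datasets import make_regression

# Generate the data once; every iteration fits the same X, y with the same seed.
X, y = make_regression()

# Loops until a fit raises LightGBMError; each successful fit prints 'pass'.
while True:
    model = LGBMRegressor(use_quantized_grad=True, random_state=4, verbose=-1)
    model.fit(X, y)
    print('pass')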
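
For anyone who wants a number rather than an open-ended loop, here is a minimal sketch of a bounded version of the same reproduction. The helper names `count_failures` and `make_quantized_fit` are hypothetical and not part of LightGBM. The model arguments are copied from the report, and the broad `except` is a deliberate simplification.

```python
from collections import Counter
from typing import Callable, Tuple


def count_failures(fit_once: Callable[[], object], n_trials: int = 20) -> Tuple[int, Counter]:
    """Call fit_once n_trials times; return (failure count, error message counts)."""
    messages: Counter = Counter()
    for _ in range(n_trials):
        try:
            fit_once()
        except Exception as exc:  # broad on purpose; the report shows LightGBMError here
            messages[str(exc)] += 1
    return sum(messages.values()), messages


def make_quantized_fit() -> Callable[[], object]:
    """Build the fit call from the report; needs lightgbm and scikit-learn installed."""
    # Imported lazily so count_failures stays usable without these packages.
    from lightgbm import LGBMRegressor
    from sklearn.datasets import make_regression

    # As in the report: data is generated once, with no seed for make_regression.
    X, y = make_regression()

    def fit_once() -> object:
        model = LGBMRegressor(use_quantized_grad=True, random_state=4, verbose=-1)
        return model.fit(X, y)

    return fit_once
```

Calling `failures, messages = count_failures(make_quantized_fit(), n_trials=20)` would then give a failure count out of 20 fits on identical data, plus the distinct error messages seen. Whether it reproduces depends on the installed LightGBM version.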

Environment info

Google Colab
Python 3.10.6
LightGBM version or commit hash: 4.0.0

Command(s) you used to install LightGBM

%pip install -U lightgbm
jameslamb added the bug label Jul 20, 2023
jameslamb (Collaborator) commented

Linking this related issue: #5982

jameslamb (Collaborator) commented Jul 21, 2023

Even LightGBM's own CI sometimes fails randomly with this error. For example, I just saw this on Windows: https://dev.azure.com/lightgbm-ci/lightgbm-ci/_build/results?buildId=14958&view=logs&j=67ae2ff8-4b33-578b-0813-9c0440091667&t=aed7750d-29a7-50d7-d7f0-fc7145577059

FAILED ..\tests\python_package_test\test_engine.py::test_quantized_training
= 1 failed, 527 passed, 33 skipped, 2 xfailed, 284 warnings in 112.45s (0:01:52) =

[LightGBM] [Fatal] Check failed: (best_split_info.left_count) > (0) at C:\Users\VssAdministrator\AppData\Local\Temp\pip-req-build-7q_w26kz\src\treelearner\serial_tree_learner.cpp, line 845 .
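
The two messages in this thread differ in which count is checked (right_count vs. left_count), in the line number (855 vs. 845), and in the build path, but both point at serial_tree_learner.cpp. A small sketch for pulling those fields out of a message, so failures from different machines can be grouped, might look like the following. The `CheckFailure` type, the regex, and `parse_check_failure` are illustrative assumptions based only on the two messages quoted here, not on any documented LightGBM log format.

```python
import re
from typing import NamedTuple, Optional


class CheckFailure(NamedTuple):
    expression: str   # e.g. "(best_split_info.left_count) > (0)"
    source_file: str  # file name only, build-specific directories dropped
    line: int


# Assumed shape: "Check failed: <expr> at <path>, line <n>"
_CHECK_RE = re.compile(r"Check failed: (?P<expr>.+?) at (?P<path>.+?), line (?P<line>\d+)")


def parse_check_failure(message: str) -> Optional[CheckFailure]:
    """Return the parsed fields, or None if the message has a different shape."""
    match = _CHECK_RE.search(message)
    if match is None:
        return None
    # Normalise Windows separators so the file name can be split off uniformly.
    path = match.group("path").replace("\\", "/")
    return CheckFailure(
        expression=match.group("expr"),
        source_file=path.rsplit("/", 1)[-1],
        line=int(match.group("line")),
    )
```

Grouping on `source_file` rather than the full path puts the Colab and Windows CI failures together even though their build directories differ.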

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 13, 2023
Ten0 pushed a commit to Ten0/LightGBM that referenced this issue Jan 12, 2024
…5994) (microsoft#6092)

* fix leaf splits update after split in quantized training

* fix preparation ordered gradients for quantized training

* remove force_row_wise in distributed test for quantized training

* Update src/treelearner/leaf_splits.hpp

---------

Co-authored-by: James Lamb <jaylamb20@gmail.com>