-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ci] Add CI job running rchk on the R package (fixes #4400) #4449
Conversation
@@ -94,7 +94,12 @@ fi | |||
|
|||
# Manually install Depends and Imports libraries + 'testthat' | |||
# to avoid a CI-time dependency on devtools (for devtools::install_deps()) | |||
packages="c('data.table', 'jsonlite', 'Matrix', 'R6', 'testthat')" | |||
# NOTE: testthat is not required when running rchk |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{testthat}
has a lot of dependencies, and since the rchk
job runs on Linux they have to be compiled from source (since CRAN does not provide Linux binaries).
This additional if
statement adds a bit of new complexity to this CI script, but I think it's worth it because I found in testing on my fork that this saves 1-2 minutes on the run time of the job.
RCHK_LOG_FILE="rchk-logs.txt" | ||
docker run \ | ||
-v $(pwd)/packages:/rchk/packages \ | ||
kalibera/rchk:latest \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This docker image doesn't have any already configured R environment, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that it does, but the way that it is set up, it expects that you've built the package outside and then passed it into a container running the image.
https://github.com/kalibera/rchk/blob/master/doc/DOCKER.md#checking-a-package-from-a-tarball
that's why I chose to use docker run
here after our usual R setup stuff on an ubuntu-latest
VM, instead of running a containerized job like
LightGBM/.github/workflows/r_package.yml
Lines 167 to 171 in ec1debc
test-r-sanitizers: | |
name: r-package (ubuntu-latest, R-devel, GCC ASAN/UBSAN) | |
timeout-minutes: 60 | |
runs-on: ubuntu-latest | |
container: rhub/rocker-gcc-san |
I think it's ok since docker
is available by default in the ubuntu-latest
image: https://github.com/actions/virtual-environments/blob/7d0d3aa82ab8565cc43684bd08bf72fb6f5cd9f7/images/linux/Ubuntu2004-README.md
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! That's exactly what I wanted to clarify.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you see that?
https://hub.docker.com/r/rhub/ubuntu-rchk
https://github.com/r-hub/rhub-linux-builders/blob/master/ubuntu-rchk/Dockerfile
Maybe this image can help us to run containerized job or we don't want to do that?..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh interesting, no I didn't know about that. I can try it tonight or tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright I tested this tonight but couldn't quite get it working.
I tried the following from the root of the LightGBM
repo.
# get a shell inside an ubuntu-rchk container
docker run \
-v $(pwd):/opt/LightGBM \
-w /opt/LightGBM \
-it rhub/ubuntu-rchk \
/bin/bash
# run the following inside the container
sh build-cran-package.sh
Rscript -e "install.packages(c('R6', 'data.table', 'jsonlite', 'Matrix'), repos = 'https://cran.r-project.org')"
rchk.sh lightgbm_*.tar.gz
Compilation of {lightgbm}
and its dependencies succeeded, but running this failed with an error I don't understand.
Here is the tail of the logs:
/usr/local/bin/wllvm++ -stdlib=libc++ -std=gnu++11 -shared -L/usr/local/lib -o lightgbm.so boosting/boosting.o boosting/gbdt.o boosting/gbdt_model_text.o boosting/gbdt_prediction.o boosting/prediction_early_stop.o io/bin.o io/config.o io/config_auto.o io/dataset.o io/dataset_loader.o io/file_io.o io/json11.o io/metadata.o io/parser.o io/train_share_states.o io/tree.o metric/dcg_calculator.o metric/metric.o objective/objective_function.o network/ifaddrs_patch.o network/linker_topo.o network/linkers_mpi.o network/linkers_socket.o network/network.o treelearner/data_parallel_tree_learner.o treelearner/feature_parallel_tree_learner.o treelearner/gpu_tree_learner.o treelearner/linear_tree_learner.o treelearner/serial_tree_learner.o treelearner/tree_learner.o treelearner/voting_parallel_tree_learner.o c_api.o lightgbm_R.o -pthread
installing to /home/docker/R-svn/packages/lib/00LOCK-lightgbm/00new/lightgbm/libs
** R
** data
** demo
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
*** copying figures
** building package indices
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
- DONE (lightgbm)
/home/docker/rchk/scripts/check_package.sh: line 100: 2577 Killed $RCHK/src/$T $RBC $F > $FOUT 2>&1
So I think this PR should be merged in its current state. It would have been great to have a fully-containerized job so that we wouldn't have to spend CI time installing R for this, but I don't think investigating this further is the best use of our time, given that the current run time of the job is already not too bad (around 13 minutes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, got it! Thanks a lot for the try! If you have some time, I guess it worth to report this error to https://github.com/r-hub/rhub-linux-builders with example of successful run directly in https://github.com/kalibera/rchk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very for this cool CI enhancement!
LGTM! But please check #4449 (comment) before merging.
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this. |
Fixes #4400
This PR proposes adding a new R CI job that tests the R package with
rchk
. Such a check can be used to catch bugs of the form "C/C++ code failed to protect an object from R's garbage collection".Please see #4400 for a detailed description and links to other documentation explaining the value of this check.
Notes for Reviewers
I tested this job on my fork before opening this PR. jameslamb#45
This job is taking around 13 minutes to run, so I'm proposing that it just be added to the regular set of checks run on every commit and not require manual triggering by a comment.
As an added bonus, this PR tests that LightGBM can be compiled with
wllvm++
.master
: https://github.com/jameslamb/LightGBM/pull/45/checks?check_run_id=2995447785Example error caught by this job: