-
Notifications
You must be signed in to change notification settings - Fork 10
Description
As of #121 LightGBM tests can still fail after the first run with a serious C++ error from the LightGBM side:
[ERROR] munmap_chunk(): invalid pointer
[ERROR] Aborted (core dumped)After that the following messages will show up:
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Openml LightGBM lib 3.0.1-with_model_locale_fix_for_java_and_streaming SUCCESS [ 1.659 s]
[INFO] openml-lightgbm-module 1.0.12-SNAPSHOT ............. SUCCESS [ 0.378 s]
[INFO] openml-lightgbm 1.0.12-SNAPSHOT .................... FAILURE [ 9.691 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 12.312 s
[INFO] Finished at: 2022-05-27T14:48:38+01:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M5:test (default-test) on project openml-lightgbm: There are test failures.
[ERROR]
[ERROR] Please refer to /home/alberto.ferreira/code/external/feedzai-openml-java/openml-lightgbm/lightgbm-provider/target/surefire-reports for the individual test results.
[ERROR] Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
[ERROR] The forked VM terminated without properly saying goodbye. VM crash or System.exit called?
[ERROR] Command was /bin/sh -c cd /home/alberto.ferreira/code/external/feedzai-openml-java/openml-lightgbm/lightgbm-provider && /home/alberto.ferreira/.sdkman/candidates/java/8.0.302-open/jre/bin/java -javaagent:/home/alberto.ferreira/.m2/repository/org/jacoco/org.jacoco.agent/0.8.4/org.jacoco.agent-0.8.4-runtime.jar=destfile=/home/alberto.ferreira/code/external/feedzai-openml-java/openml-lightgbm/lightgbm-provider/target/jacoco.exec -jar /home/alberto.ferreira/code/external/feedzai-openml-java/openml-lightgbm/lightgbm-provider/target/surefire/surefirebooter2472734679044173114.jar /home/alberto.ferreira/code/external/feedzai-openml-java/openml-lightgbm/lightgbm-provider/target/surefire 2022-05-27T14-48-29_842-jvmRun1 surefire8810960277057352224tmp surefire_07484658466683880775tmp
[ERROR] Error occurred in starting fork, check output in log
[ERROR] Process Exit Code: 134
[ERROR] Crashed tests:
[ERROR] com.feedzai.openml.provider.lightgbm.LightGBMBinaryClassificationModelTrainerTest
[ERROR] org.apache.maven.surefire.booter.SurefireBooterForkException: The forked VM terminated without properly saying goodbye. VM crash or System.exit called?
[ERROR] Command was /bin/sh -c cd /home/alberto.ferreira/code/external/feedzai-openml-java/openml-lightgbm/lightgbm-provider && /home/alberto.ferreira/.sdkman/candidates/java/8.0.302-open/jre/bin/java -javaagent:/home/alberto.ferreira/.m2/repository/org/jacoco/org.jacoco.agent/0.8.4/org.jacoco.agent-0.8.4-runtime.jar=destfile=/home/alberto.ferreira/code/external/feedzai-openml-java/openml-lightgbm/lightgbm-provider/target/jacoco.exec -jar /home/alberto.ferreira/code/external/feedzai-openml-java/openml-lightgbm/lightgbm-provider/target/surefire/surefirebooter2472734679044173114.jar /home/alberto.ferreira/code/external/feedzai-openml-java/openml-lightgbm/lightgbm-provider/target/surefire 2022-05-27T14-48-29_842-jvmRun1 surefire8810960277057352224tmp surefire_07484658466683880775tmp
[ERROR] Error occurred in starting fork, check output in log
[ERROR] Process Exit Code: 134
[ERROR] Crashed tests:
[ERROR] com.feedzai.openml.provider.lightgbm.LightGBMBinaryClassificationModelTrainerTest
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:748)
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:305)
[ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:265)
[ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1314)
[ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:1159)
[ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:932)
[ERROR] at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:137)
[ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:210)
[ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:156)
[ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:148)
[ERROR] at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:117)
[ERROR] at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:81)
[ERROR] at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:56)
[ERROR] at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
[ERROR] at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:305)
[ERROR] at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:192)
[ERROR] at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:105)
[ERROR] at org.apache.maven.cli.MavenCli.execute(MavenCli.java:957)
[ERROR] at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:289)
[ERROR] at org.apache.maven.cli.MavenCli.main(MavenCli.java:193)
[ERROR] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[ERROR] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[ERROR] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[ERROR] at java.lang.reflect.Method.invoke(Method.java:498)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:282)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:225)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:406)
[ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:347)
How to reproduce
# Go to repo root and run:
mvn clean install -DskipTests
cd openml-lightgbm # Go to provider directory to run its tests.Tests will always run with:
mvn clean testHowever, they will always fail (at least after the first run) with:
mvn testConclusion
surefire settings: How to have mvn test work all the time
By simply changing maven-surefire-plugin's forking settings (see [1]) it's possible to have mvn test work all the time too.
This is the observed behaviour regarding mvn test:
- default settings fail (forkCount=1, reuseForks=true)
- disabling forks works (forkCount=0)
- not reusing forks works (reuseForks=false)
Update: All fork settings can still fail after all. The failure odds change though. The only command that works all the time is still mvn clean test. This means the approaches described below don't fix the issue.
Analysis
Given that disabling forks altogether vs running the tests in a forked process (default) causes the issues, the problem seems caused by Java's test stack and not by the LightGBM C++ side (which always works with mvn clean test anyway).
Note: Upgrading to the latest versions of jacoco 0.8.6 and maven-surefire-plugin to 3.0.0-M6 didn't fix the issue either.
Possible solutions
Now, we can disable forks altogether, instead of forking the process before starting the tests once (default, which is failing); or we can set tests not to reuse forks. The latter option is slower as it forks the process for every test class.
Are there any downsides to doing these approaches? And which of those approaches is best?